Hugging Face Tutorial: Your Gateway to Open AI
A complete beginner's guide to Hugging Face. Learn to download AI models, use pipelines, work with datasets, and deploy your first AI application.
The first time I opened Hugging Face, I was completely overwhelmed. Thousands of models. Datasets everywhere. Terminology I didn’t understand. It felt like walking into a library where the books were in a language I should know but didn’t.
Now, a few years later, I use Hugging Face almost daily. It’s become the center of my AI workflow—the place where I find models, test ideas, and deploy applications. And honestly? It’s not nearly as complicated as it first seemed.
This tutorial is what I wish I had when I started: a practical, jargon-light guide to getting real work done with Hugging Face. By the end, you’ll know how to find and download AI models, run them locally, work with datasets, and even deploy interactive demos. Let’s fix that overwhelming first-visit feeling.
What Is Hugging Face? (The GitHub of AI)
Think of Hugging Face as GitHub, but specifically for AI. It’s where the machine learning community shares their work so others can use it.
The Core Components
The Model Hub
This is the main attraction—over 2 million pre-trained AI models, freely available. Want a model that translates text? Summarizes articles? Generates code? Someone has probably trained one and shared it here.

The Datasets Library
Over 500,000 datasets for training and testing AI models. Everything from Wikipedia text to image collections to code repositories.

Spaces
Interactive demos where you can try models without installing anything. About 1 million of these exist, and they’re great for testing before you commit to downloading.

The Transformers Library
This is the software that makes it all work. It’s a Python library that standardizes how you load, run, and train AI models. One consistent interface for thousands of different models.
Why Developers Love It
The magic of Hugging Face is how much complexity it hides. Before it existed, using a pre-trained model might require reading an academic paper, finding the researcher’s code, fixing compatibility issues, and hoping you got the preprocessing right.
Now? It’s often three lines of code:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love using Hugging Face!")
That’s a complete sentiment analysis application. The model downloads automatically. Preprocessing happens automatically. You just… use it.
Setting Up Your Hugging Face Account
You don’t strictly need an account to download public models, but you’ll want one. Here’s why and how to set it up.
Why Create an Account?
- Access private or gated models (like Llama, which requires accepting a license)
- Upload your own models to share or keep private
- Track your downloads and favorites
- Deploy Spaces for interactive demos
Creating Your Account
- Go to huggingface.co
- Click “Sign Up” in the top right
- Use email or GitHub/Google authentication
- Verify your email
Getting Your Access Token
This is important for programmatic access:
- Log in to Hugging Face
- Click your profile picture → Settings
- Navigate to “Access Tokens”
- Click “New token”
- Name it (e.g., “my-laptop”) and select permissions
- Copy and save it somewhere secure
You’ll use this token to authenticate when downloading gated models or uploading your work.
Logging In from Python
from huggingface_hub import login
# Interactive login (prompts you for your token)
login()
# Or use your token directly
login(token="your_token_here")
Once logged in, your credentials are cached, so you won’t need to do this every time.
Installing the Essential Libraries
Let’s get your environment set up. I’ll show you the minimum needed, then the full toolkit.
Minimum Setup
pip install transformers torch
This gets you the Transformers library and PyTorch as the backend. You can substitute TensorFlow if you prefer:
pip install transformers tensorflow
Recommended Full Setup
pip install transformers torch datasets accelerate huggingface_hub
What each does:
- transformers: The core library for loading and using models
- torch: PyTorch deep learning framework
- datasets: Easy access to Hugging Face datasets
- accelerate: Optimizes training and inference
- huggingface_hub: CLI tools and upload functionality
Verifying Installation
import transformers
import torch
print(f"Transformers: {transformers.__version__}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
If you have an NVIDIA GPU and CUDA is available, you can run models on it for much faster inference—for example, by passing device=0 to a pipeline or moving a model with .to("cuda").
Setting Up Your Cache
By default, models download to ~/.cache/huggingface/. This can get large (tens of GB). To change it:
import os
# Set this before importing transformers or huggingface_hub
os.environ["HF_HOME"] = "/path/to/your/cache"
Or set it permanently in your shell profile.
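If you are not sure how much room the cache is already taking, huggingface_hub includes a scan utility. A quick sketch (the repositories it lists will depend on what you have downloaded):

from huggingface_hub import scan_cache_dir

# Inspect the current cache: total size plus a per-repo breakdown
cache_info = scan_cache_dir()
print(f"Total cache size: {cache_info.size_on_disk / 1e9:.2f} GB")
for repo in cache_info.repos:
    print(f"{repo.repo_id}: {repo.size_on_disk / 1e9:.2f} GB")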
Your First Model: Using Pipelines
Pipelines are the easiest way to use Hugging Face models. They handle all the details—you just describe what you want to do.
Available Pipeline Tasks
Here are some common tasks you can run with one line:
| Task | What It Does |
|---|---|
| sentiment-analysis | Determines if text is positive/negative |
| text-generation | Continues or generates text |
| summarization | Condenses long text |
| translation_en_to_fr | Translates English to French |
| question-answering | Answers questions about text |
| fill-mask | Predicts missing words |
| image-classification | Classifies images |
| text-to-speech | Converts text to audio |
Sentiment Analysis Example
from transformers import pipeline
# Load the default sentiment model
classifier = pipeline("sentiment-analysis")
# Analyze some text
results = classifier([
"I love this product!",
"This is the worst experience ever.",
"It's okay, nothing special."
])
for result in results:
    print(f"{result['label']}: {result['score']:.3f}")
Output:
POSITIVE: 0.999
NEGATIVE: 0.998
NEGATIVE: 0.623
Text Generation Example
from transformers import pipeline
generator = pipeline("text-generation", model="gpt2")
result = generator(
"The future of AI is",
max_length=50,
num_return_sequences=1
)
print(result[0]['generated_text'])
Summarization Example
from transformers import pipeline
summarizer = pipeline("summarization")
long_text = """
Hugging Face is a machine learning platform that has revolutionized
how developers access and use AI models. With over 2 million models
available on their hub, it has become the de facto standard for
sharing and discovering pre-trained models. The company was founded
in 2016 and has grown to support a massive community of AI developers.
"""
summary = summarizer(long_text, max_length=50, min_length=25)
print(summary[0]['summary_text'])
Choosing a Specific Model
Pipelines use default models, but you can specify which one:
# Use a specific model
generator = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")
# Use a specific model with GPU
generator = pipeline("text-generation", model="gpt2", device=0) # GPU 0
Downloading Models with from_pretrained()
For more control, use from_pretrained() directly. This is how most production code works.
Basic Pattern
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "meta-llama/Llama-3.2-3B-Instruct"
# Load tokenizer and model separately
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Tokenize input
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# Generate
outputs = model.generate(**inputs, max_new_tokens=50)
# Decode output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
The Auto Classes
Hugging Face has “Auto” classes that figure out the right model type:
- AutoTokenizer - Loads the right tokenizer
- AutoModel - Loads base models
- AutoModelForCausalLM - For text generation (GPT-style)
- AutoModelForSequenceClassification - For classification
- AutoModelForQuestionAnswering - For Q&A
Using Auto classes means your code works with many different models without changes.
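To see what that buys you, here’s a small sketch: one helper classifies text with two unrelated public checkpoints, and nothing in it is model-specific (the two model names are just examples I picked—swap in any sequence-classification checkpoint):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

def classify(model_name, text):
    # The Auto classes pick the right tokenizer and architecture for any checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(dim=-1).item()]

# Same function, two different checkpoints
print(classify("distilbert-base-uncased-finetuned-sst-2-english", "Great tutorial!"))
print(classify("cardiffnlp/twitter-roberta-base-sentiment-latest", "Great tutorial!"))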
Downloading to Specific Locations
from transformers import AutoModel, AutoTokenizer, AutoModelForCausalLM

# Download to a specific folder
model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="./my_models")

# Download and save locally
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.save_pretrained("./local_gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.save_pretrained("./local_gpt2")

# Later, load from the local folder (works offline)
local_model = AutoModelForCausalLM.from_pretrained("./local_gpt2")
Handling Gated Models
Some models (like Llama) require accepting a license:
- Visit the model page on Hugging Face
- Read and accept the license
- Request access (often granted instantly)
- Then download normally (you’ll need to be logged in)
# This works after accepting the license
from huggingface_hub import login
login()
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
Working with Hugging Face Datasets
The datasets library makes it easy to access training and evaluation data.
Loading Datasets
from datasets import load_dataset
# Load a popular dataset
dataset = load_dataset("imdb") # Movie reviews
# Access splits
train_data = dataset["train"]
test_data = dataset["test"]
# Look at a sample
print(train_data[0])
Available Datasets
Browse at huggingface.co/datasets. Some popular ones:
| Dataset | Description |
|---|---|
| imdb | Movie reviews for sentiment |
| squad | Question-answering pairs |
| wikitext | Wikipedia text for language modeling |
| cnn_dailymail | News articles for summarization |
| common_voice | Audio for speech recognition |
Working with Data
from datasets import load_dataset
# Load and preview
dataset = load_dataset("imdb")
# Filter
positive_reviews = dataset["train"].filter(lambda x: x["label"] == 1)
# Map a function
def add_length(example):
    example["text_length"] = len(example["text"])
    return example
dataset = dataset.map(add_length)
# Convert to pandas
df = dataset["train"].to_pandas()
Loading Custom Data
from datasets import load_dataset
# From CSV
dataset = load_dataset("csv", data_files="my_data.csv")
# From JSON
dataset = load_dataset("json", data_files="my_data.json")
# From local directory
dataset = load_dataset("imagefolder", data_dir="./images/")
Fine-Tuning Models (Beginner Introduction)
Fine-tuning means training a pre-trained model on your specific data. Here’s a simplified introduction.
When to Fine-Tune
- The model is good but not perfect for your use case
- You have labeled data specific to your domain
- You need consistent outputs or specific knowledge
Basic Fine-Tuning Flow
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import Trainer, TrainingArguments
from datasets import load_dataset
# Load dataset
dataset = load_dataset("imdb")
# Load model and tokenizer
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
# Tokenize data
def tokenize(example):
    return tokenizer(example["text"], truncation=True, padding="max_length")
tokenized_dataset = dataset.map(tokenize, batched=True)
# Define training arguments
training_args = TrainingArguments(
output_dir="./results",
num_train_epochs=3,
per_device_train_batch_size=8,
eval_strategy="epoch",  # called evaluation_strategy in older transformers versions
)
# Create trainer
trainer = Trainer(
model=model,
args=training_args,
train_dataset=tokenized_dataset["train"].shuffle(seed=42).select(range(1000)),
eval_dataset=tokenized_dataset["test"].shuffle(seed=42).select(range(200)),
)
# Train!
trainer.train()
This is a simplified example—production fine-tuning involves more careful data preparation and hyperparameter tuning.
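One upgrade worth making even in a toy run is reporting a metric during evaluation. Here’s a sketch using the separate evaluate library (install it with pip install evaluate), which you would wire into the Trainer above through its compute_metrics argument:

import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # The Trainer passes raw logits and the true labels at each evaluation
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

# trainer = Trainer(..., compute_metrics=compute_metrics)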
Saving Your Fine-Tuned Model
# Save locally
model.save_pretrained("./my_fine_tuned_model")
tokenizer.save_pretrained("./my_fine_tuned_model")
# Push to Hugging Face Hub
model.push_to_hub("my-username/my-fine-tuned-model")
tokenizer.push_to_hub("my-username/my-fine-tuned-model")
Hugging Face Spaces: Deploy Your AI
Spaces let you create interactive demos that run in the cloud—no infrastructure needed.
What You Can Build
- Chat interfaces with Gradio
- Web apps with Streamlit
- Custom HTML/JS applications
- Docker-based applications
Creating a Simple Space
- Go to huggingface.co/spaces
- Click “Create new Space”
- Choose a framework (Gradio is easiest)
- Give it a name
- Add your code
Example Gradio App
Create a file called app.py:
import gradio as gr
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
def analyze_sentiment(text):
    result = classifier(text)[0]
    return f"{result['label']}: {result['score']:.2%}"
demo = gr.Interface(
fn=analyze_sentiment,
inputs=gr.Textbox(placeholder="Enter text to analyze..."),
outputs="text",
title="Sentiment Analyzer",
description="Analyze the sentiment of any text."
)
demo.launch()
Push this to your Space, and you’ll have a live demo anyone can use.
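One detail that trips up first-time Space builders: the Space only installs what you tell it to. Alongside app.py, add a requirements.txt listing the libraries your app imports (Gradio itself ships with the Gradio SDK). For the demo above, something like this should do:

transformers
torch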
Best Practices and Tips
After using Hugging Face for years, here’s what I’ve learned:
Performance Tips
- Use quantization for large models:

  model = AutoModelForCausalLM.from_pretrained(
      "model-name",
      load_in_8bit=True  # Reduces memory significantly
  )

- Enable Flash Attention when available:

  model = AutoModelForCausalLM.from_pretrained(
      "model-name",
      attn_implementation="flash_attention_2"
  )
- Use batching for multiple inputs (see the sketch below)
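A minimal batching sketch: passing a list of inputs (optionally with a batch_size) lets the pipeline process them together, which is often noticeably faster than looping over single strings, especially on a GPU:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")

texts = ["Great service!", "Terrible support.", "Average at best."]

# One call with a list and a batch_size instead of one call per string
results = classifier(texts, batch_size=8)
for text, result in zip(texts, results):
    print(f"{text}: {result['label']} ({result['score']:.3f})")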
Storage Tips
- Set HF_HOME to a drive with space
- Clean the cache periodically: huggingface-cli delete-cache
- Use local_files_only=True once downloaded for offline use
Model Selection Tips
- Check the model card for intended use cases
- Look at download counts and likes as quality signals (see the sketch after this list)
- Read the “Limitations” section carefully
- Test with your specific data before committing
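If you’d rather check those signals from code, the Hub exposes them through huggingface_hub. A rough sketch (the attribute names are my reading of the current ModelInfo object, so verify against the docs):

from huggingface_hub import model_info

# Fetch Hub metadata for a model without downloading any weights
info = model_info("distilbert-base-uncased-finetuned-sst-2-english")
print(f"Downloads: {info.downloads}")
print(f"Likes: {info.likes}")
print(f"Task: {info.pipeline_tag}")
print(f"Tags: {info.tags[:5]}")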
Frequently Asked Questions
Is Hugging Face free?
Yes, the platform, libraries, and most models are completely free. Some enterprise features and private Spaces have paid tiers.
How much storage do I need?
It depends on the models you use. A single large model can be 10-40GB. Clear unused models regularly.
Can I use Hugging Face models commercially?
Usually yes, but check each model’s license. Most use Apache 2.0 or MIT licenses. Some (like Llama) have specific commercial terms.
Do I need a GPU?
Not for small models or testing. Larger models benefit from GPUs. Cloud options like Colab give free GPU access.
How do I find the best model for my task?
Use the Model Hub filters: select your task (e.g., “text-generation”), sort by downloads or likes, and read the model cards.
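You can also run the same search programmatically. A rough sketch with list_models from huggingface_hub (the filter, sort, and direction arguments reflect my understanding of the current API, so double-check the docs):

from huggingface_hub import list_models

# The five most-downloaded text-generation models on the Hub
for model in list_models(filter="text-generation", sort="downloads", direction=-1, limit=5):
    print(model.id)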
Conclusion
Hugging Face has genuinely democratized AI. What once required a research team, custom infrastructure, and months of work now takes a few lines of Python and a good internet connection.
Start simple: use pipelines to test ideas quickly. Graduate to from_pretrained() when you need more control. Eventually, try fine-tuning on your own data. The learning curve is gentle, and the community is helpful.
The best way to learn is to experiment. Pick a model that sounds interesting, load it up, and see what it can do. You might be surprised how much you can accomplish in an afternoon.
Ready to go deeper? Check out our guides on:
- Best Open Source LLMs Ranked - Which models are worth your time
- Run AI Locally with Ollama - Simpler local deployment
- Llama 3 Guide - Deep dive on Meta’s flagship model