Hugging Face Tutorial: Your Gateway to Open AI
A complete beginner's guide to Hugging Face. Learn to download AI models, use pipelines, work with datasets, and deploy your first AI application.
The first time I opened Hugging Face, I was completely overwhelmed. Thousands of models. Datasets everywhere. Terminology I didn’t understand. It felt like walking into a library where the books were in a language I should know but didn’t.
Now, a few years later, I use Hugging Face almost daily. It’s become the center of my AI workflow—the place where I find models, test ideas, and deploy applications. And honestly? It’s not nearly as complicated as it first seemed.
This tutorial is what I wish I had when I started: a practical, jargon-light guide to getting real work done with Hugging Face. By the end, you’ll know how to find and download AI models, run them locally, work with datasets, and even deploy interactive demos. Let’s fix that overwhelming first-visit feeling.
What Is Hugging Face? (The GitHub of AI)
Think of Hugging Face as GitHub, but specifically for AI. It’s where the machine learning community shares their work so others can use it.
The Core Components
The Model Hub
This is the main attraction—over 2 million pre-trained AI models, freely available. Want a model that translates text? Summarizes articles? Generates code? Someone has probably trained one and shared it here.

The Datasets Library
Over 500,000 datasets for training and testing AI models. Everything from Wikipedia text to image collections to code repositories.

Spaces
Interactive demos where you can try models without installing anything. About 1 million of these exist, and they’re great for testing before you commit to downloading.

The Transformers Library
This is the software that makes it all work. It’s a Python library that standardizes how you load, run, and train AI models. One consistent interface for thousands of different models.
Why Developers Love It
The magic of Hugging Face is how much complexity it hides. Before it existed, using a pre-trained model might require reading an academic paper, finding the researcher’s code, fixing compatibility issues, and hoping you got the preprocessing right.
Now? It’s often three lines of code:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love using Hugging Face!")
That’s a complete sentiment analysis application. The model downloads automatically. Preprocessing happens automatically. You just… use it.
Setting Up Your Hugging Face Account
You don’t strictly need an account to download public models, but you’ll want one. Here’s why and how to set it up.
Why Create an Account?
- Access private or gated models (like Llama, which requires accepting a license)
- Upload your own models to share or keep private
- Track your downloads and favorites
- Deploy Spaces for interactive demos
Creating Your Account
- Go to huggingface.co
- Click “Sign Up” in the top right
- Use email or GitHub/Google authentication
- Verify your email
Getting Your Access Token
This is important for programmatic access:
- Log in to Hugging Face
- Click your profile picture → Settings
- Navigate to “Access Tokens”
- Click “New token”
- Name it (e.g., “my-laptop”) and select permissions
- Copy and save it somewhere secure
You’ll use this token to authenticate when downloading gated models or uploading your work.
Logging In from Python
from huggingface_hub import login
# Interactive login (prompts you for your token)
login()
# Or use your token directly
login(token="your_token_here")
Once logged in, your credentials are cached, so you won’t need to do this every time.
Installing the Essential Libraries
Let’s get your environment set up. I’ll show you the minimum needed, then the full toolkit.
Minimum Setup
pip install transformers torch
This gets you the Transformers library and PyTorch as the backend. You can substitute TensorFlow if you prefer:
pip install transformers tensorflow
Recommended Full Setup
pip install transformers torch datasets accelerate huggingface_hub
What each does:
- transformers: The core library for loading and using models
- torch: PyTorch deep learning framework
- datasets: Easy access to Hugging Face datasets
- accelerate: Optimizes training and inference
- huggingface_hub: CLI tools and upload functionality
Verifying Installation
import transformers
import torch
print(f"Transformers: {transformers.__version__}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
If you have an NVIDIA GPU and CUDA is available, you can run models on it for much faster inference—for example, by passing device=0 to a pipeline or moving a model with .to("cuda").
Setting Up Your Cache
By default, models download to ~/.cache/huggingface/. This can get large (tens of GB). To change it:
import os
# Set this before importing transformers or huggingface_hub
os.environ["HF_HOME"] = "/path/to/your/cache"
Or set it permanently in your shell profile.
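If you are not sure how much room the cache is already taking, huggingface_hub includes a scan utility. A quick sketch (the repositories it lists will depend on what you have downloaded):

from huggingface_hub import scan_cache_dir

# Inspect the current cache: total size plus a per-repo breakdown
cache_info = scan_cache_dir()
print(f"Total cache size: {cache_info.size_on_disk / 1e9:.2f} GB")
for repo in cache_info.repos:
    print(f"{repo.repo_id}: {repo.size_on_disk / 1e9:.2f} GB")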
Your First Model: Using Pipelines
Pipelines are the easiest way to use Hugging Face models. They handle all the details—you just describe what you want to do.
Available Pipeline Tasks
Here are some common tasks you can run with one line:
| Task | What It Does |
|---|---|
| sentiment-analysis | Determines if text is positive/negative |
| text-generation | Continues or generates text |
| summarization | Condenses long text |
| translation_en_to_fr | Translates English to French |
| question-answering | Answers questions about text |
| fill-mask | Predicts missing words |
| image-classification | Classifies images |
| text-to-speech | Converts text to audio |
Sentiment Analysis Example
from transformers import pipeline
# Load the default sentiment model
classifier = pipeline("sentiment-analysis")
# Analyze some text
results = classifier([
"I love this product!",
"This is the worst experience ever.",
"It's okay, nothing special."
])
for result in results:
    print(f"{result['label']}: {result['score']:.3f}")
Output:
POSITIVE: 0.999
NEGATIVE: 0.998
NEGATIVE: 0.623
Text Generation Example
from transformers import pipeline
generator = pipeline("text-generation", model="gpt2")
result = generator(
"The future of AI is",
max_length=50,
num_return_sequences=1
)
print(result[0]['generated_text'])
Summarization Example
from transformers import pipeline
summarizer = pipeline("summarization")
long_text = """
Hugging Face is a machine learning platform that has revolutionized
how developers access and use AI models. With over 2 million models
available on their hub, it has become the de facto standard for
sharing and discovering pre-trained models. The company was founded
in 2016 and has grown to support a massive community of AI developers.
"""
summary = summarizer(long_text, max_length=50, min_length=25)
print(summary[0]['summary_text'])
Choosing a Specific Model
Pipelines use default models, but you can specify which one:
# Use a specific model
generator = pipeline("text-generation", model="meta-llama/Llama-3.2-3B-Instruct")
# Use a specific model with GPU
generator = pipeline("text-generation", model="gpt2", device=0) # GPU 0
Downloading Models with from_pretrained()
For more control, use from_pretrained() directly. This is how most production code works.
Basic Pattern
from transformers import AutoTokenizer, AutoModelForCausalLM
model_name = "meta-llama/Llama-3.2-3B-Instruct"
# Load tokenizer and model separately
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Tokenize input
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# Generate
outputs = model.generate(**inputs, max_new_tokens=50)
# Decode output
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
The Auto Classes
Hugging Face has “Auto” classes that figure out the right model type:
- AutoTokenizer - Loads the right tokenizer
- AutoModel - Loads base models
- AutoModelForCausalLM - For text generation (GPT-style)
- AutoModelForSequenceClassification - For classification
- AutoModelForQuestionAnswering - For Q&A
Using Auto classes means your code works with many different models without changes.
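To see what that buys you, here’s a small sketch: one helper classifies text with two unrelated public checkpoints, and nothing in it is model-specific (the two model names are just examples I picked—swap in any sequence-classification checkpoint):

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

def classify(model_name, text):
    # The Auto classes pick the right tokenizer and architecture for any checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(dim=-1).item()]

# Same function, two different checkpoints
print(classify("distilbert-base-uncased-finetuned-sst-2-english", "Great tutorial!"))
print(classify("cardiffnlp/twitter-roberta-base-sentiment-latest", "Great tutorial!"))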
Downloading to Specific Locations
from transformers import AutoModel, AutoTokenizer, AutoModelForCausalLM

# Download to a specific folder
model = AutoModel.from_pretrained("bert-base-uncased", cache_dir="./my_models")

# Download and save locally
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.save_pretrained("./local_gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.save_pretrained("./local_gpt2")

# Later, load from the local folder (works offline)
local_model = AutoModelForCausalLM.from_pretrained("./local_gpt2")
Handling Gated Models
Some models (like Llama) require accepting a license:
- Visit the model page on Hugging Face
- Read and accept the license
- Request access (often granted instantly)
- Then download normally (you’ll need to be logged in)
# This works after accepting the license
from huggingface_hub import login
login()
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
Working with Hugging Face Datasets
The datasets library makes it easy to access training and evaluation data.
Loading Datasets
from datasets import load_dataset
# Load a popular dataset
dataset = load_dataset("imdb") # Movie reviews
# Access splits
train_data = dataset["train"]
test_data = dataset["test"]
# Look at a sample
print(train_data[0])
Available Datasets
Browse at huggingface.co/datasets. Some popular ones:
| Dataset | Description |
|---|---|
| imdb | Movie reviews for sentiment |
| squad | Question-answering pairs |
| wikitext | Wikipedia text for language modeling |
| cnn_dailymail | News articles for summarization |
| common_voice | Audio for speech recognition |
Working with Data
from datasets import load_dataset
# Load and preview
dataset = load_dataset("imdb")
# Filter
positive_reviews = dataset["train"].filter(lambda x: x["label"] == 1)
# Map a function
def add_length(example):
    example["text_length"] = len(example["text"])
    return example
dataset = dataset.map(add_length)
# Convert to pandas
df = dataset["train"].to_pandas()
Loading Custom Data
from datasets import load_dataset
# From CSV
dataset = load_dataset("csv", data_files="my_data.csv")
# From JSON
dataset = load_dataset("json", data_files="my_data.json")
# From local directory
dataset = load_dataset("imagefolder", data_dir="./images/")
Fine-Tuning Models (Beginner Introduction)
Fine-tuning means training a pre-trained model on your specific data. Here’s a simplified introduction.
When to Fine-Tune
- The model is good but not perfect for your use case
- You have labeled data specific to your domain
- You need consistent outputs or specific knowledge
Basic Fine-Tuning Flow
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import Trainer, TrainingArguments
from datasets import load_dataset
# Load dataset
dataset = load_dataset("imdb")
# Load model and tokenizer
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
# Tokenize data
def tokenize(example):
    return tokenizer(example["text"], truncation=True, padding="max_length")
tokenized_dataset = dataset.map(tokenize, batched=True)
# Define training arguments
training_args = TrainingArguments(
output_dir="./results",
num_train_epochs=3,
per_device_train_batch_size=8,
eval_strategy="epoch",  # called evaluation_strategy in older transformers versions
)
# Create trainer
trainer = Trainer(
model=model,
args=training_args,
train_dataset=tokenized_dataset["train"].shuffle(seed=42).select(range(1000)),
eval_dataset=tokenized_dataset["test"].shuffle(seed=42).select(range(200)),
)
# Train!
trainer.train()
This is a simplified example—production fine-tuning involves more careful data preparation and hyperparameter tuning.
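One upgrade worth making even in a toy run is reporting a metric during evaluation. Here’s a sketch using the separate evaluate library (install it with pip install evaluate), which you would wire into the Trainer above through its compute_metrics argument:

import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # The Trainer passes raw logits and the true labels at each evaluation
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

# trainer = Trainer(..., compute_metrics=compute_metrics)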
Saving Your Fine-Tuned Model
# Save locally
model.save_pretrained("./my_fine_tuned_model")
tokenizer.save_pretrained("./my_fine_tuned_model")
# Push to Hugging Face Hub
model.push_to_hub("my-username/my-fine-tuned-model")
tokenizer.push_to_hub("my-username/my-fine-tuned-model")
Hugging Face Spaces: Deploy Your AI
Spaces let you create interactive demos that run in the cloud—no infrastructure needed.
What You Can Build
- Chat interfaces with Gradio
- Web apps with Streamlit
- Custom HTML/JS applications
- Docker-based applications
Creating a Simple Space
- Go to huggingface.co/spaces
- Click “Create new Space”
- Choose a framework (Gradio is easiest)
- Give it a name
- Add your code
Example Gradio App
Create a file called app.py:
import gradio as gr
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
def analyze_sentiment(text):
    result = classifier(text)[0]
    return f"{result['label']}: {result['score']:.2%}"
demo = gr.Interface(
fn=analyze_sentiment,
inputs=gr.Textbox(placeholder="Enter text to analyze..."),
outputs="text",
title="Sentiment Analyzer",
description="Analyze the sentiment of any text."
)
demo.launch()
Push this to your Space, and you’ll have a live demo anyone can use.
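One detail that trips up first-time Space builders: the Space only installs what you tell it to. Alongside app.py, add a requirements.txt listing the libraries your app imports (Gradio itself ships with the Gradio SDK). For the demo above, something like this should do:

transformers
torch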
Best Practices and Tips
After using Hugging Face for years, here’s what I’ve learned:
Performance Tips
- Use quantization for large models:

  model = AutoModelForCausalLM.from_pretrained(
      "model-name",
      load_in_8bit=True  # Reduces memory significantly
  )

- Enable Flash Attention when available:

  model = AutoModelForCausalLM.from_pretrained(
      "model-name",
      attn_implementation="flash_attention_2"
  )
- Use batching for multiple inputs (see the sketch below)
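A minimal batching sketch: passing a list of inputs (optionally with a batch_size) lets the pipeline process them together, which is often noticeably faster than looping over single strings, especially on a GPU:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")

texts = ["Great service!", "Terrible support.", "Average at best."]

# One call with a list and a batch_size instead of one call per string
results = classifier(texts, batch_size=8)
for text, result in zip(texts, results):
    print(f"{text}: {result['label']} ({result['score']:.3f})")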
Storage Tips
- Set HF_HOME to a drive with space
- Clean the cache periodically: huggingface-cli delete-cache
- Use local_files_only=True once downloaded for offline use
Model Selection Tips
- Check the model card for intended use cases
- Look at download counts and likes as quality signals (see the sketch after this list)
- Read the “Limitations” section carefully
- Test with your specific data before committing
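If you’d rather check those signals from code, the Hub exposes them through huggingface_hub. A rough sketch (the attribute names are my reading of the current ModelInfo object, so verify against the docs):

from huggingface_hub import model_info

# Fetch Hub metadata for a model without downloading any weights
info = model_info("distilbert-base-uncased-finetuned-sst-2-english")
print(f"Downloads: {info.downloads}")
print(f"Likes: {info.likes}")
print(f"Task: {info.pipeline_tag}")
print(f"Tags: {info.tags[:5]}")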
Frequently Asked Questions
Is Hugging Face free?
Yes, the platform, libraries, and most models are completely free. Some enterprise features and private Spaces have paid tiers.
How much storage do I need?
It depends on the models you use. A single large model can be 10-40GB. Clear unused models regularly.
Can I use Hugging Face models commercially?
Usually yes, but check each model’s license. Most use Apache 2.0 or MIT licenses. Some (like Llama) have specific commercial terms.
Do I need a GPU?
Not for small models or testing. Larger models benefit from GPUs. Cloud options like Colab give free GPU access.
How do I find the best model for my task?
Use the Model Hub filters: select your task (e.g., “text-generation”), sort by downloads or likes, and read the model cards.
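You can also run the same search programmatically. A rough sketch with list_models from huggingface_hub (the filter, sort, and direction arguments reflect my understanding of the current API, so double-check the docs):

from huggingface_hub import list_models

# The five most-downloaded text-generation models on the Hub
for model in list_models(filter="text-generation", sort="downloads", direction=-1, limit=5):
    print(model.id)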
Conclusion
Hugging Face has genuinely democratized AI. What once required a research team, custom infrastructure, and months of work now takes a few lines of Python and a good internet connection.
Start simple: use pipelines to test ideas quickly. Graduate to from_pretrained() when you need more control. Eventually, try fine-tuning on your own data. The learning curve is gentle, and the community is helpful.
The best way to learn is to experiment. Pick a model that sounds interesting, load it up, and see what it can do. You might be surprised how much you can accomplish in an afternoon.
Ready to go deeper? Check out our guides on:
- Best Open Source LLMs Ranked - Which models are worth your time
- Run AI Locally with Ollama - Simpler local deployment
- Llama 3 Guide - Deep dive on Meta’s flagship model