May 21, 2026

Getting Started with Hugging Face

Photo: Priscilla Du Preez 🇨🇦 / Unsplash

The Hub for Open-Source AI

Hugging Face has become the go-to platform for sharing, discovering, and using pre-trained machine learning models. Its Transformers library lets you run state-of-the-art NLP, vision, and audio models in just a few lines of code.

Installation and Setup

pip install transformers datasets torch
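
Once the install finishes, it can help to confirm which versions ended up in the environment. The snippet below is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this example and is not part of any Hugging Face package.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the three packages from the install command above
for name in ("transformers", "datasets", "torch"):
    version = installed_version(name)
    print(f"{name}: {version or 'not installed'}")
```

If any line prints "not installed", the pip command most likely ran in a different environment from the one running this script.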

Quick Start: Text Classification

The pipeline API is the fastest way to get started. It abstracts away tokenization, model loading, and inference into a single function call.

from transformers import pipeline

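# Create a pipeline with a pre-trained model. Naming the checkpoint explicitly
# keeps results reproducible; with no model argument the library picks a default.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)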

result = classifier([
    "I absolutely love this new update!",
    "This is the worst experience I've ever had.",
    "The weather is okay, nothing special."
])

for item in result:
    print(f"{item['label']}: {item['score']:.4f}")
# Example output (exact scores depend on the model and library versions):
# POSITIVE: 0.9998
# NEGATIVE: 0.9987
# POSITIVE: 0.5842
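
To make the score column less of a black box, here is a rough sketch of the last step a text-classification pipeline performs: turning the model's raw outputs (logits) into a label and a probability with softmax. It is plain Python, not the library's own code, and the logit values and the id2label mapping are hypothetical values chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw model scores (logits) into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits, id2label):
    """Pick the most probable class and report it as a label/score dictionary."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


# Hypothetical values for illustration only
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
example_logits = [-2.0, 2.0]

prediction = to_prediction(example_logits, id2label)
print(f"{prediction['label']}: {prediction['score']:.4f}")
# POSITIVE: 0.9820
```

With two classes, a score near 0.5 means the two logits were almost equal, so a result like the one for the third sentence above is better read as "undecided" than as a confident positive.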

Named Entity Recognition

Extract structured information from unstructured text:

from transformers import pipeline

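# aggregation_strategy="simple" merges word pieces into whole entities
# (it replaces the deprecated grouped_entities=True flag)
ner = pipeline("ner", aggregation_strategy="simple")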
text = "Elon Musk founded SpaceX in 2002. He is also the CEO of Tesla."

entities = ner(text)
for entity in entities:
    print(f"  {entity['word']}: {entity['entity_group']} ({entity['score']:.4f})")
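# Illustrative output (scores omitted). The default English NER checkpoint is
# trained on CoNLL-2003, whose labels are PER, ORG, LOC and MISC, so the year
# is not tagged:
#   Elon Musk: PER
#   SpaceX: ORG
#   Tesla: ORG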
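
The NER pipeline above returns whole entities such as "Elon Musk" rather than one result per token. The sketch below shows the general idea behind that grouping, assuming token-level predictions with BIO tags ("B-" starts an entity, "I-" continues it, "O" is outside any entity). It is a simplified illustration, not the library's implementation: the real pipeline works on subword tokens and character offsets, and the function name, input format, and averaged score here are assumptions.

```python
def group_entities(tokens):
    """Merge consecutive B-/I- tagged tokens into whole entities.

    `tokens` is a list of (word, tag, score) tuples using BIO tags.
    """
    groups = []
    current = None
    for word, tag, score in tokens:
        if tag == "O":
            current = None  # an untagged token closes any open entity
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "I" and current is not None and current["entity_group"] == label:
            # Continuation of the entity that is currently open
            current["words"].append(word)
            current["scores"].append(score)
        else:
            # A "B-" tag (or a stray "I-" tag) starts a new entity
            current = {"entity_group": label, "words": [word], "scores": [score]}
            groups.append(current)
    return [
        {
            "entity_group": g["entity_group"],
            "word": " ".join(g["words"]),
            "score": sum(g["scores"]) / len(g["scores"]),  # mean token score
        }
        for g in groups
    ]


# Hypothetical token-level predictions for illustration
token_predictions = [
    ("Elon", "B-PER", 0.99),
    ("Musk", "I-PER", 0.97),
    ("founded", "O", 0.99),
    ("SpaceX", "B-ORG", 0.95),
]

entities = group_entities(token_predictions)
for entity in entities:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")
```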

Fine-Tuning a Model

Pre-trained models are powerful, but fine-tuning on your own data unlocks domain-specific performance.

from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load a pre-trained model and tokenizer
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Prepare your dataset (the CSV files are assumed to have "text" and "label" columns)
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    eval_strategy="epoch",  # named evaluation_strategy before transformers 4.41
)

# Train
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

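trainer.train()

# Save the model and tokenizer together so the directory can be reloaded later
trainer.save_model("./fine-tuned-model")
tokenizer.save_pretrained("./fine-tuned-model")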
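
The warmup_steps=500 setting is easier to judge once you know how many optimizer steps the run will take. The sketch below works that out and models the learning rate over time. It assumes a linear warmup followed by linear decay, a peak learning rate of 5e-5, a single device, and no gradient accumulation; the dataset size of 10,000 examples is hypothetical, and the function names are made up for this example.

```python
import math


def total_training_steps(num_examples, batch_size, num_epochs):
    """Number of optimizer steps on a single device with no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * num_epochs


def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear ramp-up during warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical dataset size; batch size, epochs and warmup match the arguments above
num_examples = 10_000
steps = total_training_steps(num_examples, batch_size=16, num_epochs=3)
peak_lr = 5e-5  # assumed peak learning rate

print(f"total steps: {steps}")
for step in (0, 250, 500, steps):
    lr = linear_schedule_lr(step, peak_lr, warmup_steps=500, total_steps=steps)
    print(f"step {step}: lr = {lr:.2e}")
```

Under these assumptions, 10,000 examples give 1,875 steps, so warmup covers roughly the first quarter of training. A set of 2,000 examples would give only 375 steps, fewer than the 500 warmup steps, and the learning rate would never reach its peak; in that case a smaller warmup value is worth considering.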

The Hugging Face Hub

The Hugging Face Hub hosts over 900,000 models across NLP, computer vision, audio, and multimodal tasks. You can:

  • Search for models by task, language, or license
  • Upload your own models and datasets
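  • Run inference directly in the browser
  • Collaborate with the community

import os

from huggingface_hub import login, list_models

# Log in with a token read from the environment instead of hardcoding it
# (HF_TOKEN is the variable name assumed here). Browsing public models
# works without logging in.
login(token=os.environ["HF_TOKEN"])

# Browse the five most-downloaded text-classification models.
# limit=5 stops the listing early instead of fetching every match.
models = list_models(task="text-classification", sort="downloads", direction=-1, limit=5)
for model in models:
    print(model.id)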

Beyond NLP: Vision and Audio

Hugging Face supports much more than text:

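from transformers import pipeline

# Image classification (reading an image file requires Pillow)
image_classifier = pipeline("image-classification")
predictions = image_classifier("dog.jpg")
for prediction in predictions:
    print(f"{prediction['label']}: {prediction['score']:.4f}")

# Speech recognition (decoding audio files typically needs ffmpeg installed)
transcriber = pipeline("automatic-speech-recognition")
result = transcriber("audio.wav")
print(result["text"])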

Conclusion

Hugging Face has democratized access to cutting-edge AI. Whether you're a beginner trying your first inference or a researcher fine-tuning a large model, the platform provides the tools and community to get started quickly. Explore the Hub, experiment with different models, and contribute back to the open-source AI ecosystem.