Featured image for The LLM Landscape: Complete Model Family Tree (2026)
LLMs · · 12 min read · Updated

The LLM Landscape: Complete Model Family Tree (2026)

A comprehensive guide to all major large language model families. From GPT to Claude to Llama, understand the evolution and relationships of AI models.

When someone asks me “which AI model should I use?”, my answer is always: “It depends.” And understanding why requires knowing the landscape—where these models came from, how they relate to each other, and what makes each family unique.

The world of large language models has exploded. What started with a handful of research models in 2018 has grown into a complex ecosystem with dozens of significant players. Keeping track of it all has become its own challenge.

This is my attempt at a comprehensive map. I’ll walk you through every major LLM family, their evolution, and how they compare. By the end, you’ll understand not just which models exist, but why they exist and where they’re heading.

If you’re new to the concept of language models, start with our guide on what LLMs are before diving in.

A Brief History: 2017-2026

The modern LLM era began with a 2017 paper titled “Attention Is All You Need.” That paper introduced the transformer architecture that underlies every major model today.

The Key Milestones

2018: OpenAI releases GPT-1. It’s modest by today’s standards—117 million parameters—but proves the concept works.

2019: GPT-2 arrives with 1.5 billion parameters. OpenAI initially withholds it, fearing misuse. This seems almost quaint now.

2020: GPT-3 launches with 175 billion parameters and demonstrates few-shot learning. The age of practical LLMs begins.

2022: ChatGPT brings LLMs to the mainstream. Within months, everyone knows what AI chatbots are.

2023: The competition heats up. Claude, Bard (now Gemini), and open-source alternatives emerge. GPT-4 pushes boundaries further.

2024: Multimodal models become standard. Vision, audio, and text merge.

2025-2026: We see GPT-5, Claude 4, Gemini 3, and Llama 4. The gap between open and closed source narrows. Models become more efficient.

I’ve watched this evolution happen and still find the pace difficult to believe. Every time I think we’ve plateaued, something new arrives.

The OpenAI Family: GPT Lineage

OpenAI kickstarted the modern LLM era and remains a dominant player. Understanding their model family is essential.

The Genealogy

GPT-1 (2018) → GPT-2 (2019) → GPT-3 (2020)

                              GPT-3.5 (2022)

                               GPT-4 (2023)

                         GPT-4 Turbo (2023)

                             GPT-4o (2024)

                               GPT-5 (2025)

GPT-5: The Current Flagship

GPT-5 represents OpenAI’s current state-of-the-art, released in December 2025.

Key specs:

  • Context window: 128K tokens
  • Max output: 16K tokens
  • Multimodal: Text, vision, audio
  • Pricing: $0.01/1K input, $0.03/1K output

Variants:

  • GPT-5: Full model for complex reasoning
  • GPT-5-Turbo: Faster, cheaper for production use
  • GPT-5-Mini: Efficient model for simpler tasks

What OpenAI Does Well

In my experience, GPT models excel at:

  • General knowledge and broad capabilities
  • Following complex instructions
  • Code generation (especially with GPT-5-Codex)
  • Wide availability and ecosystem integration

Limitations to Know

  • Can be verbose
  • Less transparent about reasoning process
  • Higher costs at scale
  • Closed source limits customization

The ChatGPT Product

ChatGPT is OpenAI’s consumer product built on GPT models:

  • Free tier: GPT-5-Mini
  • Plus subscription: Full GPT-5 access
  • Custom GPTs: User-created specialized versions

For the detailed comparison with competitors, see our GPT-5 vs Claude 4 vs Gemini 3 breakdown.

The Anthropic Family: Claude Lineage

Anthropic, founded by former OpenAI researchers, has taken a different approach—prioritizing safety and helpfulness.

The Genealogy

Claude (2023) → Claude 2 (2023) → Claude 2.1 (2023)

                                  Claude 3 (2024)
                                  [Opus/Sonnet/Haiku]

                                Claude 3.5 Sonnet (2024)

                                  Claude 4 (2026)
                                  [Opus/Sonnet/Haiku]

Claude 4: The Current State

Claude 4 launched in January 2026 and represents a significant leap.

Key specs:

  • Context window: 200K tokens (expandable to 1M with extended context)
  • Max output: 8K tokens
  • Multimodal: Text, vision
  • Pricing (Sonnet): $0.003/1K input, $0.015/1K output

Variants:

  • Claude 4 Opus: Most capable, for complex tasks
  • Claude 4 Sonnet: Balanced performance/cost (default)
  • Claude 4 Haiku: Fast, lightweight, economical

What Claude Does Well

I find Claude particularly strong at:

  • Long-form analysis and writing
  • Nuanced, balanced perspectives
  • Following formatting instructions precisely
  • Being honest about uncertainty
  • Safer, less likely to produce harmful content

Limitations to Know

  • Smaller ecosystem than OpenAI
  • Sometimes overly cautious
  • Limited tool integrations compared to GPT
  • Vision capabilities slightly behind GPT-5

Constitutional AI

Anthropic’s training approach—Constitutional AI—shapes Claude’s behavior. The model is trained to be helpful, harmless, and honest. This creates a noticeably different “personality” than GPT models.

The Google Family: PaLM to Gemini

Google’s LLM journey has been eventful—from Bard’s rocky launch to Gemini’s strong competitive position.

The Genealogy

LaMDA (2021) → Bard (2023) [deprecated]

PaLM (2022) → PaLM 2 (2023)

            Gemini 1.0 (2023)

            Gemini 1.5 (2024)

            Gemini 2.0 (2024)

            Gemini 3 (2026)
            [Ultra/Pro/Flash]

Gemini 3: The Current State

Gemini 3, Google’s latest, launched in January 2026.

Key specs:

  • Context window: 2M tokens (Pro), 1M tokens (Flash)
  • Max output: 8K tokens
  • Multimodal: Text, vision, video, audio
  • Platform: Google AI Studio / Vertex AI

Variants:

  • Gemini 3 Ultra: Most capable, for advanced tasks
  • Gemini 3 Pro: Balanced general-purpose model
  • Gemini 3 Flash: Fast, efficient for production

What Gemini Does Well

Google’s integration advantages are real:

  • Massive context windows (up to 2M tokens)
  • Deep integration with Google services
  • Strong multimodal capabilities
  • Competitive pricing
  • Enterprise features through Vertex AI

Limitations to Know

  • Developer experience less polished than OpenAI
  • Consumer product (Gemini app) inconsistent
  • Behind Claude on nuanced analysis
  • Enterprise focus can neglect individual developers

The Bard Legacy

Google’s initial chatbot, Bard, launched prematurely and stumbled publicly. The rebrand to Gemini marked a fresh start. Bard is now fully deprecated.

The Meta Family: LLaMA to Llama 4

Meta has taken a radically different approach: open source. This has made Llama perhaps the most influential model family for the broader AI ecosystem.

The Genealogy

LLaMA (2023) → Llama 2 (2023) → Llama 3 (2024)
                  ↓                   ↓
            [Community forks]   Llama 3.1 (2024)
            - Vicuna                  ↓
            - Alpaca            Llama 4 (2025)
            - WizardLM          [405B/70B/8B]

Llama 4: The Current State

Llama 4 represents the current state of open-source LLMs.

Key specs:

  • Context window: 128K tokens
  • License: Meta Llama License (open for most uses)
  • Sizes: 8B, 70B, 405B parameters
  • Platform: Local, Hugging Face, Ollama

Variants:

  • Llama 4 405B: Competitive with commercial models
  • Llama 4 70B: Strong balance for capable hardware
  • Llama 4 8B: Runs on consumer hardware
  • Llama 4 Instruct: Fine-tuned for following instructions

What Llama Does Well

The open-source advantage creates unique benefits:

  • Free to use for most applications
  • Full customization through fine-tuning
  • Privacy—runs entirely on your infrastructure
  • No API costs at scale
  • Active community creating derivatives

Learn more in our best open source LLMs guide.

Limitations to Know

  • Requires technical expertise to deploy
  • Smaller models lag commercial offerings
  • No hosted API (must self-host or use third-party)
  • Commercial restrictions above certain user thresholds

The Open Source Ecosystem

Llama’s open release spawned an entire ecosystem. See the “Open Source Derivatives” section below for the family tree of community models.

Other Major Players

Beyond the big four, several other companies have built significant LLMs.

Mistral AI

The French startup has become the European leader in LLMs.

Key models:

  • Mistral Large 2: Competitive flagship model
  • Mixtral 8x22B: Efficient mixture-of-experts architecture
  • Mistral Small: Cost-effective option

Strengths: European data sovereignty, efficient architectures, competitive performance

Context: 128K tokens for Mistral Large 2

Cohere

Focused on enterprise applications with strong RAG (retrieval-augmented generation) capabilities.

Key models:

  • Command R+: Enterprise flagship
  • Command R: Balanced option
  • Embed: Specialized for embeddings

Strengths: Enterprise focus, multilingual support, RAG optimization

xAI (Grok)

Elon Musk’s AI company, integrated with X (Twitter).

Key models:

  • Grok 2: Current flagship
  • Grok 1.5: Previous version

Strengths: Real-time information access through X, distinct personality

Platform: Available through X Premium+

Others to Watch

  • Amazon Bedrock: AWS’s LLM services (not a model but provides model access)
  • Alibaba Qwen: Strong in Chinese and increasingly competitive globally
  • Baidu ERNIE: China’s leading commercial LLM
  • 01.ai Yi: Open-source Chinese models

Open Source Derivatives and Fine-Tunes

The open-source ecosystem has created an explosion of specialized models built on Llama and other foundations.

Llama Derivatives

Built on Meta’s Llama base:

ModelBased OnSpecialty
VicunaLlamaChatGPT-like behavior
WizardLMLlamaInstruction following
CodeLlamaLlamaCode generation
AlpacaLlamaInstruction following
OrcaLlamaReasoning

Mistral Derivatives

Built on Mistral’s open models:

ModelBased OnSpecialty
MixtralMistralMixture of experts
ZephyrMistralHelpfulness
Neural ChatMistralConversation

Quantized Models

For running on limited hardware, quantized versions reduce memory requirements:

  • Q4: 4-bit quantization, smallest but some quality loss
  • Q5: 5-bit, good balance
  • Q8: 8-bit, minimal quality loss

These enable running models like Llama 4 70B on consumer GPUs.

Where to Find Open Models

Quick Reference Comparison

Current Flagship Models (January 2026)

ModelProviderContextBest ForPricing (per 1M tokens)
GPT-5OpenAI128KGeneral tasks, code$10 in / $30 out
Claude 4 OpusAnthropic200K+Writing, analysis$15 in / $75 out
Gemini 3 ProGoogle2MLong context, Google integrationVaries
Llama 4 405BMeta128KSelf-hosted, privacyFree (compute costs)
Mistral Large 2Mistral128KEuropean enterpriseCompetitive

Best For Specific Use Cases

Use CaseRecommended ModelWhy
CodingGPT-5 or Claude 4Strong code understanding
Long documentsGemini 3 Pro2M context window
Privacy-sensitiveLlama 4Self-hosted
WritingClaude 4Nuanced, follows instructions
EnterpriseVaries by regionConsider data residency
BudgetGPT-5-Mini or HaikuEfficient models

Speed vs. Quality Trade-offs

PriorityModel Choice
Maximum qualityClaude 4 Opus, GPT-5, Gemini 3 Ultra
BalancedClaude 4 Sonnet, GPT-5-Turbo, Gemini 3 Pro
Speed/costClaude 4 Haiku, GPT-5-Mini, Gemini 3 Flash

Frequently Asked Questions

What is the best LLM in 2026?

There’s no single “best” LLM—it depends on your needs. GPT-5 offers the most polished general experience. Claude 4 excels at writing and analysis. Gemini 3 Pro has unmatched context length. Llama 4 is best for self-hosting and customization. Each family has models optimized for different trade-offs between capability, speed, and cost.

How do closed and open source LLMs differ?

Closed-source models (GPT-5, Claude 4, Gemini 3) are accessed via API—you pay per use but get the latest capabilities without infrastructure management. Open-source models (Llama 4, Mistral) can be self-hosted for free, offering privacy and customization but requiring technical expertise and compute resources.

Which model has the largest context window?

Gemini 3 Pro leads with up to 2 million tokens of context—enough to process multiple books at once. Claude 4 offers 200K tokens standard (expandable to 1M with extended context). GPT-5 and Llama 4 provide 128K tokens. Larger context windows enable processing longer documents but typically cost more.

Can I run LLMs locally on my own computer?

Yes, with appropriate hardware. Models like Llama 4 8B run on consumer hardware with 8-16GB RAM. Larger models need more resources—Llama 4 70B typically requires 40+ GB of GPU memory. Quantized versions reduce requirements. See our Ollama guide for getting started.

How often do new models release?

Major updates happen every 6-12 months for leading providers, with minor updates more frequently. The pace has been accelerating—we saw GPT-5 in late 2025, Claude 4 and Gemini 3 in early 2026. Expect continued rapid evolution through 2026.

Which LLM is most cost-effective?

For API usage, Claude 4 Haiku and GPT-5-Mini offer excellent value for simpler tasks. For high-volume usage, self-hosted Llama 4 can be most economical (though you pay for compute instead of tokens). The “cheapest” option depends on your volume, complexity needs, and infrastructure capabilities.

Conclusion

The LLM landscape in 2026 offers more choice than ever. Whether you need the polish of GPT-5, the nuance of Claude 4, the massive context of Gemini 3, or the freedom of Llama 4—there’s a model family suited to your needs.

My practical advice: don’t get too attached to any single model. The differences between leading models are smaller than they used to be, and the best choice often depends on specific use cases. Build your applications to be model-agnostic where possible.

The landscape will continue evolving. New families will emerge. Existing ones will improve. What matters is understanding the fundamental trade-offs—capability versus cost, closed versus open, speed versus quality—so you can make informed choices as the technology progresses.

Whether you’re building applications, exploring AI capabilities, or just trying to understand this space, I hope this map helps you navigate. The territory keeps changing, but now you know who’s in it and how they relate.

Found this helpful? Share it with others.

Vibe Coder avatar

Vibe Coder

AI Engineer & Technical Writer
5+ years experience

AI Engineer with 5+ years of experience building production AI systems. Specialized in AI agents, LLMs, and developer tools. Previously built AI solutions processing millions of requests daily. Passionate about making AI accessible to every developer.

AI Agents LLMs Prompt Engineering Python TypeScript