April 29, 2026

Why I Don’t Use LangChain (And What I Use Instead)

LangChain is popular. It solves a real problem — orchestrating multiple LLM calls in a sequence. But popularity and usefulness aren’t the same. For production systems, I avoid it. Here’s why, and what I use instead.

What LangChain Does

LangChain abstracts over LLMs and their APIs. It gives you a unified interface for Claude, GPT, Llama, etc., “chains” for multi-step sequences, “agents” for agentic workflows, memory management, prompt templating, and document loaders.

In theory, this is great. In practice, it’s a problem.

Problem 1: Abstraction Leakage

LangChain abstracts over APIs that aren’t actually the same. Claude and GPT have different input/output formats, pricing models, token limits, features (structured outputs, vision, tool use), and error handling.

When you use LangChain, you’re forced to fit every model into the same interface. That means:

  1. You can’t take full advantage of model-specific features.
  2. You spend time learning the abstraction, not the actual API.
  3. When the abstraction leaks (and it will), you’re debugging a layer of indirection instead of understanding the real problem.

Example: providers ship features like structured outputs, vision, and tool use at different times, with different request shapes and guarantees. LangChain doesn’t surface those differences well, so you end up handling them manually anyway.
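
And if you genuinely need two providers, plain Python dispatch keeps the differences visible instead of papering over them. A minimal sketch, assuming the Anthropic SDK; extract_fields and the provider argument are hypothetical:

from anthropic import Anthropic

def extract_fields(document: str, provider: str = "anthropic") -> str:
    """Hypothetical helper: one explicit branch per provider."""
    if provider == "anthropic":
        client = Anthropic()
        response = client.messages.create(
            model='claude-opus-4-1-20250805',
            max_tokens=1000,
            messages=[{"role": "user", "content": f"Extract the key fields: {document}"}]
        )
        return response.content[0].text
    # Each new provider gets its own branch, written against its own
    # native API, features, and error semantics.
    raise ValueError(f"Unknown provider: {provider}")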

Problem 2: It Hides Complexity Instead of Managing It

The job of a framework is to manage complexity. LangChain tries to hide it.

Real example: I need to call Claude twice in sequence, with the second call depending on the first’s output. What does that actually look like?

With LangChain

from langchain.chains import LLMChain, SequentialChain
from langchain.prompts import PromptTemplate
from langchain.llms import Anthropic

llm = Anthropic()
prompt1 = PromptTemplate(...)
prompt2 = PromptTemplate(...)

chain1 = LLMChain(llm=llm, prompt=prompt1)
chain2 = LLMChain(llm=llm, prompt=prompt2)

sequential_chain = SequentialChain(
    chains=[chain1, chain2],
    input_variables=["input"],
    output_variables=["output"]
)

result = sequential_chain({"input": data})

Looks simple. But what’s actually happening? What errors can occur? How do you debug it?

With direct API calls

from anthropic import Anthropic

client = Anthropic()

# First call
response1 = client.messages.create(
    model='claude-opus-4-1-20250805',
    max_tokens=1000,
    messages=[{"role": "user", "content": f"Analyze this: {data}"}]
)
analysis = response1.content[0].text

# Second call, using the first result
response2 = client.messages.create(
    model='claude-opus-4-1-20250805',
    max_tokens=500,
    messages=[{"role": "user", "content": f"Summarize: {analysis}"}]
)
summary = response2.content[0].text

It’s longer. But it’s explicit. I can see exactly what’s happening. I can add error handling exactly where I need it. I can inspect intermediate values. If something breaks, I know where.
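
Concretely, here is what targeted error handling looks like around a single call. A minimal sketch, assuming the anthropic SDK’s exported exception types; call_with_retry is a hypothetical helper:

import time
from anthropic import Anthropic, APIConnectionError, RateLimitError

client = Anthropic()

def call_with_retry(prompt: str, max_tokens: int, retries: int = 3) -> str:
    for attempt in range(retries):
        try:
            response = client.messages.create(
                model='claude-opus-4-1-20250805',
                max_tokens=max_tokens,
                messages=[{"role": "user", "content": prompt}]
            )
            return response.content[0].text
        except (RateLimitError, APIConnectionError):
            # Retry transient failures with a simple backoff;
            # anything else propagates so you see the real error.
            time.sleep(2 ** attempt)
    raise RuntimeError("Exhausted retries")

The retry policy lives in your code, where you can see it and tune it.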

Problem 3: Debugging a Multilayer Abstraction Is Hell

I have a production job failing. LangChain says “error in chain execution.” Where’s the error?

With LangChain, you dig through layers of framework code to find out. With direct API calls, you print the response and see it immediately.

At 2am, when a job is failing and you need to fix it now, layers of abstraction become layers of pain.

Problem 4: You Pay the Cost Without Reaping the Benefit

LangChain adds overhead. Each call goes through an abstraction wrapper, request serialization, the API call, response deserialization, and type parsing.

And what do you get? The option to swap Claude for GPT without rewriting code. In practice, you pick a model and stick with it, so you pay the abstraction cost without ever collecting the benefit.

When LangChain Actually Makes Sense

  1. Prototyping quickly when you don’t know which model you’ll use.
  2. Learning about LLM concepts (agents, chains, memory).
  3. Building demos that need to “just work.”

It’s useful for exploration. For production, it gets in the way.

What I Use Instead

For Orchestration: Explicit Python

from anthropic import Anthropic

class InvoiceProcessor:
    def __init__(self):
        self.client = Anthropic()

    def process(self, invoice_pdf: bytes) -> dict:
        """Process invoice: extract data, validate, store."""
        # Step 1: Extract structured data
        extraction = self._extract_data(invoice_pdf)
        if not extraction:
            return {'status': 'failed', 'reason': 'extraction_failed'}

        # Step 2: Validate
        validation = self._validate(extraction)
        if not validation['valid']:
            return {'status': 'invalid', 'errors': validation['errors']}

        # Step 3: Store
        self._store(extraction)
        return {'status': 'success', 'data': extraction}

Explicit. Debuggable. Testable.
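
Testable in the ordinary sense, too: the steps are plain methods, so standard tooling can stub them. A quick sketch with unittest.mock, assuming the class above with its _extract_data, _validate, and _store helpers:

from unittest.mock import MagicMock

def test_invalid_invoice_short_circuits():
    # Bypass __init__ so the test never constructs a real API client.
    processor = InvoiceProcessor.__new__(InvoiceProcessor)
    processor._extract_data = MagicMock(return_value={'amount': -1})
    processor._validate = MagicMock(return_value={'valid': False, 'errors': ['negative amount']})
    processor._store = MagicMock()

    result = processor.process(b'fake-pdf-bytes')

    assert result['status'] == 'invalid'
    processor._store.assert_not_called()  # the pipeline stops before storage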

For Prompt Templates: f-strings

from anthropic import Anthropic

client = Anthropic()

def generate_analysis(data: dict) -> str:
    prompt = f"""Analyze this data:

Customer: {data['customer']}
Orders: {len(data['orders'])}
Total Spend: ${data['total']}

What trends do you see? What's at risk?"""

    response = client.messages.create(
        model='claude-opus-4-1-20250805',
        max_tokens=500,
        messages=[{'role': 'user', 'content': prompt}]
    )
    return response.content[0].text

Simple. Clear. No special syntax.

For Agentic Workflows: Explicit Loop

def agentic_loop(goal: str, max_iterations: int = 5):
    # Assumes the module-level `client` defined above.
    context = []
    for _ in range(max_iterations):
        # Ask the model for its next action, replaying the conversation so far.
        response = client.messages.create(
            model='claude-opus-4-1-20250805',
            max_tokens=500,
            system='You are an agent. Decide your next action.',
            messages=[{'role': 'user', 'content': goal}, *context]
        )
        action = parse_action(response.content[0].text)
        if action['type'] == 'done':
            return action['result']
        # Execute the action and feed the result back into the conversation.
        result = execute_action(action)
        context.append({'role': 'assistant', 'content': response.content[0].text})
        context.append({'role': 'user', 'content': f"Action result: {result}"})
    return 'Max iterations reached'

You see exactly what’s happening. The agent loop is explicit. Debugging is straightforward.
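
parse_action and execute_action are whatever your domain needs. One hypothetical shape for the parser, assuming the system prompt asks the model to reply with a single JSON object:

import json

def parse_action(text: str) -> dict:
    """Hypothetical parser: expects JSON like {"type": "done", "result": "..."}."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Unparseable output becomes a no-op action the loop can recover from.
        return {'type': 'invalid', 'raw': text}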

The Trade-Off

LangChain buys you abstraction over models and built-in features (memory, prompt templates, etc.). You pay with complexity hiding, debugging difficulty, slower and less direct code, and vendor lock-in — not just to the LLM, but to LangChain’s patterns themselves.

For small scripts and prototypes, the trade-off isn’t worth it. Direct API calls are cleaner. For large, complex systems, you might want abstraction. But even then, I’d build a thin, custom layer over the API rather than use LangChain.
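
To make “thin, custom layer” concrete: a few dozen lines that pin down the model name, defaults, and a single entry point, and nothing more. A sketch under those assumptions, not a prescription; the LLM class is mine:

from anthropic import Anthropic

DEFAULT_MODEL = 'claude-opus-4-1-20250805'

class LLM:
    """Thin wrapper: one place for the model name, defaults, and future logging."""

    def __init__(self, model: str = DEFAULT_MODEL):
        self.client = Anthropic()
        self.model = model

    def ask(self, prompt: str, max_tokens: int = 1000, system: str = '') -> str:
        kwargs = {
            'model': self.model,
            'max_tokens': max_tokens,
            'messages': [{'role': 'user', 'content': prompt}],
        }
        if system:
            kwargs['system'] = system
        response = self.client.messages.create(**kwargs)
        return response.content[0].text

Swapping models later means editing one class, not hunting framework calls across the codebase.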

My Stack for Production AI

No frameworks. No hidden layers. Just code. It’s more lines than LangChain, but every line does exactly what I expect.
