langchain#
What it is#
langchain is the Python framework for composing LLM calls into pipelines: prompts, models, parsers, retrievers, tools, and memory connected through the LangChain Expression Language (LCEL). It is among the most widely installed LLM frameworks on PyPI, and it is what most third-party tutorials and SDK examples target.
Since 2024 the project has been split into many packages rather than one monolith. The top-level langchain distribution is now mostly a thin aggregator that re-exports from langchain-core, plus assorted legacy chains and helpers. Real work happens in langchain-core and the per-provider partner packages.
Install#
pip install langchain
Output: installs the aggregator + core, but no model providers
pip install langchain-core langchain-openai
Output: the minimal modern stack — core abstractions + one provider
uv add langchain-core langchain-anthropic langchain-community
Output: dependencies resolved + added to pyproject.toml
poetry add langchain langchain-openai langchain-chroma
Output: updated lockfile + virtualenv install
pip install "langchain[all]" # NOT recommended — pulls hundreds of deps
Output: mega-install of every partner package the metapackage knows about
Versioning & Python support#
- Three breaking major lines in rapid succession: 0.1.x (early 2024), 0.2.x (mid 2024), 0.3.x (late 2024+). Each bump shuffled deprecations and partner-package boundaries.
- The 0.x lines are pre-1.0, so minor bumps regularly remove deprecated symbols. Pin tight (==) or narrow (~=) in production.
- Python 3.9+ on current releases; 3.10+ is the practical floor for most partner packages.
- langchain-core follows its own version cadence, independent of langchain. Every partner package depends on a langchain-core range, and skew between them is the #1 source of ImportError at runtime.
- LCEL (Runnable) replaced the legacy Chain / Agent / LLMChain classes in 0.1; those still import but emit LangChainDeprecationWarning.
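Because version skew between langchain-core and the partner packages is called out as the main source of import failures, it can help to print what is actually installed before debugging anything else. The following is a minimal sketch using only the standard library; the FAMILY list is an assumption and should be adjusted to the packages your project uses.

```python
from importlib import metadata

# Distribution names to inspect (an assumed list; extend with your own partners).
FAMILY = ["langchain", "langchain-core", "langchain-openai", "langchain-anthropic"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(FAMILY).items():
    print(f"{name:24} {version or 'not installed'}")
```

Comparing this listing against the langchain-core range each partner package declares is usually enough to spot a stale pin.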
Package metadata#
- Maintainer: LangChain Inc. + community (the langchain-ai GitHub org)
- Project home: github.com/langchain-ai/langchain
- Docs: python.langchain.com
- PyPI: pypi.org/project/langchain
- License: MIT
- Governance: commercially-backed (LangChain Inc.), open-source codebase, very active issue tracker
- First released: 2022
- Downloads: tens of millions per month across the family
Optional dependencies & extras#
The “family” is now what matters. Key partner packages on PyPI:
| Package | Purpose |
|---|---|
| langchain-core | The pure abstractions: Runnable, prompts, messages, tools. Every other package depends on this. |
| langchain | Aggregator + legacy chains/agents. Useful for tutorials, optional in production. |
| langchain-community | 200+ community-maintained integrations (vector stores, document loaders, niche LLMs). Heavy and fast-moving; install only when you need it. |
| langchain-openai | OpenAI and Azure OpenAI chat, completion, embeddings. |
| langchain-anthropic | Anthropic Claude chat + tool use. |
| langchain-google-genai | Google Gemini via the google-generativeai SDK. |
| langchain-google-vertexai | Gemini and PaLM via Vertex AI (GCP auth). |
| langchain-chroma | ChromaDB vector store integration. |
| langchain-text-splitters | Chunking utilities pulled out of langchain proper. |
| langgraph | Sibling project for stateful graph-based agents (separate repo, separate release line). |
langchain itself defines minor [extra]s ([llms], [embeddings], [vectorstores], [all]) but these are mostly legacy — install partner packages directly rather than relying on extras.
Alternatives#
| Package | Trade-off |
|---|---|
| llama-index | RAG-first framework. More opinionated index/query abstractions; smaller agent surface. |
| haystack-ai | Production-grade pipeline framework from deepset. Pipeline graph is more explicit than LCEL. |
| dspy | Declarative LLM programming that optimises prompts and few-shots automatically. Different paradigm entirely. |
| langgraph | Same maintainers; stateful agent graphs. Use alongside LangChain when LCEL is too linear. |
| Provider SDKs directly (openai, anthropic, google-generativeai) | No abstraction overhead. Use when you only need one provider and don’t want a framework. |
Common gotchas#
- API breakage between 0.1 / 0.2 / 0.3. Code from a 2024 tutorial may not import at all under current langchain: class names moved, modules vanished, partner packages were extracted. Always check that the docs version matches your install.
- Partner-package version drift. A langchain-openai that was current six months ago may require an older langchain-core than your langchain resolves. Re-install the family together: pip install -U langchain langchain-core langchain-openai.
- Deprecation warnings are deafening. Old LLMChain, ConversationChain, initialize_agent, etc. all still work but emit LangChainDeprecationWarning when used. Either migrate to LCEL Runnable chains or silence with warnings.filterwarnings("ignore", category=LangChainDeprecationWarning).
- langchain-community is huge. It carries optional dependencies for hundreds of integrations and frequently breaks on Python version bumps. Prefer the specific partner package (langchain-chroma, langchain-pinecone) where one exists.
- LCEL learning curve. The | pipe operator returns Runnable objects with .invoke(), .stream(), .batch(), .ainvoke(), etc. Mixing LCEL with legacy .run() / .predict() calls is a frequent source of confusion.
- trust_remote_code and tool-use security. Tools loaded from langchain-community may execute arbitrary code (shell tools, Python REPL tools). Treat them like any other code-execution path.
- Caching defaults are off. set_llm_cache(InMemoryCache()) must be called explicitly; otherwise every identical call re-hits the model provider.
Ecosystem integrations#
The LangChain ecosystem is now a constellation of partner packages rather than a single library. Knowing which package owns a given integration saves a lot of pip install thrash.
| Domain | Packages |
|---|---|
| Model providers | langchain-openai, langchain-anthropic, langchain-google-genai, langchain-google-vertexai, langchain-cohere, langchain-mistralai, langchain-aws (Bedrock), langchain-fireworks, langchain-together, langchain-groq, langchain-ollama, langchain-huggingface |
| Vector stores | langchain-chroma, langchain-pinecone, langchain-weaviate, langchain-qdrant, langchain-postgres (pgvector), langchain-milvus, langchain-elasticsearch, langchain-redis, langchain-mongodb, langchain-astradb |
| Document loaders / chunkers | langchain-text-splitters, langchain-unstructured, langchain-community (the long tail) |
| Agents / graphs | langgraph, langgraph-checkpoint-postgres, langgraph-checkpoint-sqlite |
| Observability | langsmith (vendor SDK); OpenTelemetry instrumentation through community packages |
| Experimental | langchain-experimental — agents, parsers, and chains that aren’t ready for the stable surface |
Rule of thumb: if a partner package exists for your integration, use it. The same class re-exported from langchain-community is older and tends to drag in heavier deps.
# A typical RAG stack today
pip install langchain-core langchain-openai langchain-anthropic \
langchain-postgres langchain-text-splitters langsmith
Output: minimal modern stack — no langchain-community, no langchain aggregator
Real-world recipes#
These are LCEL patterns that show up over and over in production projects. Each recipe leans on Runnable composition — the same primitives — so combinations come together cleanly.
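To make "Runnable composition" concrete before the recipes, here is a toy sketch of what the pipe operator does conceptually. It uses only the standard library and is not LangChain's implementation: the Step class, the fake model, and every name in it are hypothetical stand-ins chosen for illustration.

```python
class Step:
    """Toy stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

    def batch(self, values):
        # Sequential here; the real framework can run batch items concurrently.
        return [self.invoke(v) for v in values]


prompt = Step(lambda d: f"Summarise: {d['text']}")  # stands in for a prompt template
fake_model = Step(lambda p: p.upper())              # pretend LLM, no network call
parser = Step(lambda s: s.strip())                  # stands in for an output parser

chain = prompt | fake_model | parser
print(chain.invoke({"text": "hello"}))  # SUMMARISE: HELLO
```

The point of the sketch is only that each `|` returns another object with the same invoke / batch surface, which is why the recipes below combine freely.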
Recipe: RunnableParallel for fan-out / fan-in#
RunnableParallel (also written as a dict literal in LCEL) runs sub-chains concurrently and gathers results into a dict. Use it when an upstream input feeds multiple branches whose outputs you then merge.
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
model = ChatAnthropic(model="claude-sonnet-4-6")
summarise = ChatPromptTemplate.from_template("Summarise: {text}") | model | StrOutputParser()
classify = ChatPromptTemplate.from_template("Topic of: {text}") | model | StrOutputParser()
pipeline = RunnableParallel(
    summary=summarise,
    topic=classify,
    original=RunnablePassthrough(),
)
print(pipeline.invoke({"text": "..."}))
Output: {"summary": "...", "topic": "...", "original": {...}} with both LLM calls issued in parallel.
Recipe: structured output via with_structured_output#
Most modern providers expose native tool calling that LangChain wraps as model.with_structured_output(schema). Skip JSON-mode hacks where you can.
from pydantic import BaseModel, Field
from langchain_anthropic import ChatAnthropic
class Invoice(BaseModel):
    vendor: str = Field(description="Issuing company name")
    total_cents: int = Field(description="Total in cents")
extractor = ChatAnthropic(model="claude-sonnet-4-6").with_structured_output(Invoice)
print(extractor.invoke("Receipt: Acme Inc., $14.99 charged to card."))
Output: Invoice(vendor='Acme Inc.', total_cents=1499) — provider-native function calling under the hood.
Recipe: agent loop with LangGraph#
create_tool_calling_agent + AgentExecutor is the legacy agent API; new code should use LangGraph when the agent has any state beyond “tool/no-tool”. The simplest LangGraph agent takes only a few lines:
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
@tool
def search(query: str) -> str:
    """Search the web. Returns top-3 snippets."""
    return f"results for {query}"
agent = create_react_agent(ChatAnthropic(model="claude-sonnet-4-6"), tools=[search])
out = agent.invoke({"messages": [("user", "What was the closing price of META yesterday?")]})
print(out["messages"][-1].content)
Output: model issues a search tool call, observes the stub result, and produces a final answer in out["messages"][-1].
Recipe: streaming an LCEL chain to a FastAPI endpoint#
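from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain_anthropic import ChatAnthropic
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
app = FastAPI()
# The original snippet did not define prompt; this one-variable template is an assumed placeholder.
prompt = ChatPromptTemplate.from_template("{input}")
chain = prompt | ChatAnthropic(model="claude-sonnet-4-6") | StrOutputParser()
@app.post("/chat")
async def chat(payload: dict):
    # Stream string chunks to the client as the model produces them.
    async def gen():
        async for chunk in chain.astream({"input": payload["msg"]}):
            yield chunk
    return StreamingResponse(gen(), media_type="text/plain")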
Output: every connected client receives tokens incrementally; the chain’s astream is the only async/streaming surface needed.
Recipe: batched parallel invocation with bounded concurrency#
results = chain.batch(
    inputs=[{"text": t} for t in many_documents],
    config={"max_concurrency": 8},  # cap simultaneous in-flight requests
    return_exceptions=True,  # don't fail the whole batch on one error
)
Output: results list aligns one-to-one with many_documents; exceptions are returned in-place instead of raising.
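For intuition about what a concurrency cap combined with in-place exceptions buys you, the same behaviour can be sketched with plain asyncio. This is an illustration of the idea under stated assumptions, not how LangChain implements batch; bounded_batch and fake_call are hypothetical names.

```python
import asyncio


async def bounded_batch(func, inputs, max_concurrency):
    """Run `func` over `inputs` with at most `max_concurrency` calls in flight.

    Results keep input order; exceptions are returned in place, not raised.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:
            return await func(item)

    return await asyncio.gather(*(run_one(i) for i in inputs), return_exceptions=True)


async def fake_call(text):
    # Hypothetical stand-in for a provider request.
    await asyncio.sleep(0.01)
    if not text:
        raise ValueError("empty document")
    return len(text)


results = asyncio.run(bounded_batch(fake_call, ["abc", "", "hello"], max_concurrency=2))
print(results)
```

The failing middle input shows up as a ValueError object at index 1 while its neighbours still return normally, which mirrors the one-to-one alignment described above.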
Cost & rate-limit management#
LangChain inherits cost dynamics from whichever provider it’s wrapping; the framework’s job is to make per-call cost observable and to keep retries from melting your budget.
- Set max_tokens on every chat model. The default for some providers is unbounded; a single runaway agent loop can burn through dollars before you see the bill.
- Use get_openai_callback() / equivalent. For OpenAI-class models, with get_openai_callback() as cb: accumulates token usage across a chain invocation. Equivalents exist for Anthropic via callback handlers.
- Cache identical prompts. langchain_community.cache.SQLiteCache and RedisCache are drop-in. Wire once with set_llm_cache(SQLiteCache(database_path=".lc-cache.db")) and every identical (model, prompt, params) tuple skips the provider.
- Model-selection ladders. Route easy prompts to a small/cheap model, hard ones to a flagship. The pattern is a RunnableBranch over a cheap classifier.
- Exploit provider prompt caching. Anthropic, Gemini, and OpenAI all support some form of prompt caching for stable prefixes. Put your large system prompt first and keep it byte-stable across calls to maximize cache hits.
- Bound concurrency in batch(). Without max_concurrency, batch can put far more requests in flight than your per-minute quota allows. Set it to your provider’s safe parallel limit.
- Retry with backoff via model.with_retry(): exponential backoff, capped attempts. By default it retries on any exception; pass retry_if_exception_type to restrict retries to errors that are actually transient.
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(
    model="claude-sonnet-4-6",
    max_tokens=512,
    timeout=30,
).with_retry(stop_after_attempt=4, wait_exponential_jitter=True)
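Output: model now retries failed calls with jittered exponential backoff, up to 4 attempts in total. With no retry_if_exception_type given, any exception triggers a retry, not only transient 5xx / rate-limit errors.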
Version migration guide#
LangChain went through three breaking lines (0.1 → 0.2 → 0.3) in roughly twelve months. The high-level direction was always the same — pull provider integrations out of the monolith — but it broke imports each time.
| Era | What changed | What to watch for |
|---|---|---|
| 0.0.x | Pre-LCEL: LLMChain, ConversationChain, initialize_agent were the public API. | Pre-2024 tutorials are this; almost nothing imports the same way on modern releases. |
| 0.1.x (early 2024) | LCEL became the primary surface. Provider classes started moving to partner packages. | Imports like from langchain.chat_models import ChatOpenAI were re-pointed to langchain_openai. |
| 0.2.x (mid 2024) | Aggressive partner-package extraction. langchain itself thinned out. | Many tools and retrievers moved to langchain-community; community moved a lot to dedicated partner packages. |
| 0.3.x (late 2024+) | Pydantic v2 internals, more partner packages, removal of long-deprecated symbols. | langchain_core.pydantic_v1 shim removed in some paths; migrate to native Pydantic v2. |
Migration rules of thumb:
- Re-install the family in one shot (pip install -U langchain langchain-core <partner>...) to avoid stale partner pins.
- Search the codebase for from langchain. imports; most current code prefers from langchain_core., from langchain_openai., etc.
- Replace LLMChain(prompt=..., llm=...) with prompt | llm | parser.
- Replace initialize_agent(...) with create_tool_calling_agent for stable use cases, or migrate to LangGraph for stateful agents.
- If RunnableWithMessageHistory complains about Pydantic v1 schemas, regenerate any subclasses against Pydantic v2.
Hedge: exact removal points for individual deprecated symbols are best confirmed against the current langchain release notes — the project has been known to keep deprecations longer than initially announced.
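The "search the codebase" rule of thumb can be automated with a small scanner. The sketch below uses only the standard library; the regular expression is an assumption about what counts as a legacy-style import, and a hit is a prompt for review rather than proof of a problem, since some from langchain. imports (for example init_chat_model) remain valid.

```python
import re

# Matches `from langchain.<module>` and `import langchain.<module>`, but not
# `langchain_core`, `langchain_openai`, or other split packages.
LEGACY_IMPORT = re.compile(r"^\s*(?:from|import)\s+langchain\.[\w.]+", re.MULTILINE)


def find_legacy_imports(source: str) -> list[str]:
    """Return the module part of every legacy-style import found in `source`."""
    return [match.group(0).strip() for match in LEGACY_IMPORT.finditer(source)]


sample = """\
from langchain.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import langchain.agents
from langchain_openai import ChatOpenAI
"""

print(find_legacy_imports(sample))
# ['from langchain.chat_models', 'import langchain.agents']
```

Running find_legacy_imports over each file's text gives a worklist to check against the current docs.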
Troubleshooting common errors#
langchain failure modes cluster around partner-package skew, deprecated APIs, and silent type mismatches in LCEL chains. The shortlist below covers the noisy ~80%.
- ImportError: cannot import name 'X' from 'langchain.Y'. Almost always a partner-package rename. Search the current docs for the symbol; the answer is usually from langchain_<provider>.Y import X or from langchain_core.Y import X.
- pydantic.v1.error_wrappers.ValidationError vs pydantic.ValidationError. You’re mixing Pydantic v1 shim classes (langchain_core.pydantic_v1) with native v2 models. Pick a side; convert with .model_dump() at the boundary.
- AttributeError: 'AIMessage' object has no attribute 'strip'. A parser expected a string but the previous step returned a chat message. Either add StrOutputParser() upstream or unwrap with .content.
- Chain hangs in batch(). Provider rate-limit lockout. Lower max_concurrency and add .with_retry(...).
- KeyError in RunnableParallel. A downstream step references a key that an upstream branch didn’t produce. print(chain.input_schema.schema()) / output_schema.schema() shows the wire-format dict.
- Could not resolve type ... Happens with custom subclasses of Runnable under Pydantic v2. Add model_config = ConfigDict(arbitrary_types_allowed=True).
- Deprecation noise floods stderr. Wrap import sites with warnings.filterwarnings("ignore", category=LangChainDeprecationWarning) or migrate. Don’t pin to an old release just to silence warnings.
When NOT to use this#
langchain earns its keep when you genuinely benefit from provider abstraction, LCEL composition, or the partner-package ecosystem. It is overkill — and frankly slower to build — for several common shapes:
- Single-call use cases. “Take this string, send to one model, get a string back” is two lines with the provider SDK. LangChain adds three packages and indirection.
- Custom protocols. If you’re building an unusual wire protocol on top of a single provider, the framework’s abstractions get in the way more than they help.
- Pure RAG with a single vector store. A short script reading from one vector store, embedding once, and prompting once is often clearer without LCEL.
- Stateful agents with intricate control flow. Reach for LangGraph directly instead. LangChain’s AgentExecutor is the legacy path and is harder to debug than LangGraph’s explicit graph nodes.
- Optimised prompts. DSPy treats prompts and few-shots as optimisable artefacts; LangChain treats them as templates.
- Production inference at extreme scale. Provider SDKs (or vLLM behind your own server) skip framework overhead and give you tighter control of batching.
A practical heuristic: if your chain is two Runnable nodes long, you probably don’t need LangChain; if it is six nodes with a retriever, a parser, a tool branch, and conditional routing, LangChain probably saves a week.
Production deployment#
LangChain itself is a library — it does not impose a deployment model. The real questions are: where do chains live, how does state persist, and how does observability ship.
Server pattern (FastAPI): wrap a single chain instance per worker. LCEL runnables are thread-safe for invoke/ainvoke so one chain object per process is the right shape; cloning per request only burns memory.
from contextlib import asynccontextmanager
from fastapi import FastAPI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
chain = None
@asynccontextmanager
async def lifespan(app: FastAPI):
    global chain
    chain = (
        ChatPromptTemplate.from_template("{q}")
        | ChatAnthropic(model="claude-sonnet-4-6", max_tokens=512)
        | StrOutputParser()
    )
    yield
app = FastAPI(lifespan=lifespan)
@app.post("/ask")
async def ask(payload: dict):
    return {"answer": await chain.ainvoke({"q": payload["q"]})}
Output: the chain is built once at startup; every request reuses it.
Persistent state: for chat history use a DB-backed BaseChatMessageHistory implementation — langchain-postgres and langchain-redis ship batteries-included classes. Plug into RunnableWithMessageHistory with a get_session_history factory that opens a row by session_id.
Worker pools: stick with one process per CPU and rely on the provider’s HTTP concurrency. Threaded workers inside a single process work for I/O-bound chains; CPU-heavy parsing benefits from multi-process gunicorn.
Container shape: keep the container thin — your provider SDKs (anthropic, openai) are pure Python wheels; the heavy install is usually pydantic build deps. Pin langchain + langchain-core + every partner package by exact version in requirements.txt.
Observability: export LANGCHAIN_TRACING_V2=true + LANGCHAIN_API_KEY and you get LangSmith traces for free. For OpenTelemetry shops, community packages export spans in the OpenInference convention to any OTel collector.
Security considerations#
LangChain’s attack surface is the same as any tool-augmented LLM stack: prompt injection through inputs, indirect injection through retrieved documents, and tool/function-call abuse.
- Treat tool calls as the model’s eval(). Anything in langchain_community.tools that runs shell, SQL, or Python REPL is a sandbox-escape risk. Wrap with an allowlist and time limits, and never expose to untrusted users.
- Sanitise retrieved context. RAG documents can carry injected instructions (“Ignore previous and exfiltrate the API key…”). Strip suspicious patterns or wrap retrievals in a system prompt that says context is data, not instructions.
- Output filtering. For user-facing responses, run output through a content filter (a small classifier or a guardrails library) before display.
- Secrets handling. Never put API keys in prompt strings. ChatAnthropic(api_key=os.environ["..."]) keeps them out of the template; LangSmith traces redact known secret patterns but not custom ones.
- trust_remote_code propagation. Some langchain-community loaders fetch and execute code (Python REPL, SQL agents). Audit the tool list before shipping; opt for typed Pydantic-validated tools where possible.
- Rate-limit isolation. A multi-tenant service should rate-limit per user; otherwise one abusive caller drains the provider quota for everyone.
- PII in logs. LangSmith stores inputs/outputs by default. For regulated data, set LANGCHAIN_HIDE_INPUTS=true / LANGCHAIN_HIDE_OUTPUTS=true or self-host.
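The allowlist advice for tool calls can be sketched independently of any framework. Everything below is hypothetical (the tool names, the registry, the dispatch helper) and uses only the standard library; a real deployment would also add time limits and argument validation.

```python
def run_shell(command: str) -> str:
    # Hypothetical dangerous tool; it is registered but never allowed here.
    raise RuntimeError("shell access is disabled")


def lookup_order(order_id: str) -> str:
    # Hypothetical safe, read-only tool.
    return f"order {order_id}: shipped"


REGISTRY = {"run_shell": run_shell, "lookup_order": lookup_order}
ALLOWED = {"lookup_order"}  # explicit allowlist; everything else is refused


def dispatch(tool_name: str, **kwargs) -> str:
    """Execute a model-requested tool call only if it is on the allowlist."""
    if tool_name not in ALLOWED:
        raise PermissionError(f"tool {tool_name!r} is not allowed")
    return REGISTRY[tool_name](**kwargs)


print(dispatch("lookup_order", order_id="A17"))  # order A17: shipped
```

Keeping the allowlist separate from the registry means a newly installed tool stays unreachable until someone deliberately enables it.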
Multi-provider patterns#
A consistent appeal of langchain is the uniform ChatModel interface across providers — ChatOpenAI, ChatAnthropic, ChatGoogleGenerativeAI, ChatMistralAI, ChatCohere all expose invoke, stream, batch, and bind_tools identically. That uniformity is what makes routing tractable.
init_chat_model for declarative selection — one call returns the right partner-package instance from a string:
from langchain.chat_models import init_chat_model
model = init_chat_model("claude-sonnet-4-6", model_provider="anthropic", temperature=0)
# or
model = init_chat_model("gpt-4o-mini", model_provider="openai")
Output: behaves as the corresponding ChatAnthropic / ChatOpenAI — useful for config-driven model selection.
configurable_alternatives for runtime swapping:
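from langchain.chat_models import init_chat_model
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import ConfigurableField
# The original snippet did not define prompt; a one-variable template is assumed here.
prompt = ChatPromptTemplate.from_template("{q}")
multi = init_chat_model(
    "claude-sonnet-4-6", model_provider="anthropic"
).configurable_alternatives(
    ConfigurableField(id="model"),
    default_key="claude",
    gpt=init_chat_model("gpt-4o-mini", model_provider="openai"),
)
chain = prompt | multi | StrOutputParser()
print(chain.invoke({"q": "..."}))  # claude
print(chain.invoke({"q": "..."}, config={"configurable": {"model": "gpt"}}))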
Output: same chain, two model backends, no rebuilding.
Failover with with_fallbacks:
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
primary = ChatAnthropic(model="claude-sonnet-4-6")
backup = ChatOpenAI(model="gpt-4o-mini")
resilient = primary.with_fallbacks([backup])
Output: if the primary errors (rate-limit, 5xx), the call falls through to the backup transparently.
LiteLLM proxy for organization-wide control. When you want central rate-limiting, key rotation, and cost dashboards across teams, deploy a LiteLLM proxy and point all LangChain ChatOpenAI-compatible clients at its base URL. LangChain talks OpenAI’s wire format to anything that speaks it.
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="claude-sonnet-4-6", base_url="http://litellm-proxy/v1", api_key="sk-team-x")
Output: every team’s traffic flows through one proxy that maps claude-sonnet-4-6 to Anthropic, gpt-4o-mini to OpenAI, etc., with per-team quotas.
Provider-agnostic embeddings: OpenAIEmbeddings, VoyageAIEmbeddings, HuggingFaceEmbeddings, BedrockEmbeddings all share the Embeddings interface, so you can swap freely. Just re-index your vector store when changing models: vectors from different models are not comparable, and dimensions often differ.
See also#
- AI: LangChain — LCEL chains, agents, RAG patterns, streaming
- Packages: pip-langsmith — observability SDK from the same vendor
- Concept: agents — agent loop fundamentals
- Concept: rag — retrieval-augmented generation
- Concept: api — client-library design patterns