Runnables & LCEL
1. Why this matters
LangChain has hundreds of components. Without a unified interface, you'd need to remember a different method name for each: model.complete(), parser.parse(), retriever.get_documents(), prompt.format()...
Runnable solves that: one interface for every component, with the same methods everywhere (.invoke, .stream, .batch). That's why prompt | model | parser just works: all three implement the same protocol.
You'll touch Runnables directly when you need to wrap custom Python functions, run things in parallel, add retries/fallbacks, or pass data sideways through a chain.
2. Mental model
Think of a Runnable as a typed function with four superpowers:
| Method | What it does | When to use |
|---|---|---|
| `.invoke(input)` | Call once, return result | Default |
| `.stream(input)` | Yield chunks as they arrive | Chat UIs |
| `.batch([in1, in2, ...])` | Run many inputs in parallel | Bulk processing |
| `.ainvoke` / `.astream` / `.abatch` | Async variants | Inside FastAPI / async apps |
LCEL is just the composition layer: how to wire many Runnables together.
```mermaid
flowchart LR
    subgraph SG1 [Every Runnable]
        I[Input] --> R[Runnable<br/>.invoke .stream .batch<br/>.ainvoke .astream .abatch]
        R --> O[Output]
    end
```
3. Architecture / Flow
LCEL composition primitives:
```mermaid
flowchart TB
    subgraph SG1 [RunnableSequence a / b / c]
        A1[a] --> B1[b] --> C1[c]
    end
    subgraph SG2 [RunnableParallel x: a, y: b]
        I[input] --> A2[a]
        I --> B2[b]
        A2 --> M[x, y dict]
        B2 --> M
    end
    subgraph SG3 [RunnableLambda fn]
        I3[input] --> F[fn] --> O3[fn output]
    end
    subgraph SG4 [RunnablePassthrough]
        I4[input] --> O4[input unchanged]
    end
    subgraph SG5 [RunnableBranch]
        I5[input] --> R{condition}
        R -->|true| Y[then]
        R -->|false| N[else]
    end
```
4. Core concepts
- `Runnable[Input, Output]`: the base class. Every component subclasses this.
- `.invoke(input)`: synchronous, single call. Returns the typed Output.
- `.stream(input)`: yields partial chunks (`AIMessageChunk`s for models, strings for parsers).
- `.batch(inputs)`: runs many inputs concurrently with a thread/async pool. `max_concurrency` controls parallelism.
- `a | b`: composition. Equivalent to `RunnableSequence(a, b)`. Auto-extended for `a | b | c`.
- `RunnableLambda(fn)`: wrap any Python callable. Use it to insert logic mid-chain.
- `RunnablePassthrough()`: identity. Carries the input forward unchanged.
- `RunnablePassthrough.assign(k=fn)`: passes input through AND adds `k=fn(input)` to it.
- `.with_retry()`, `.with_fallbacks()`, `.with_config()`, `.bind()`: wrappers that return modified Runnables.
- Type coercion: a plain `dict` literal in a chain auto-converts to `RunnableParallel`; a plain function auto-converts to `RunnableLambda`.
5. Code: minimal working example
```python
from langchain_core.runnables import RunnableLambda

# Any function becomes a Runnable
square = RunnableLambda(lambda x: x * x)
print(square.invoke(5))         # 25
print(square.batch([1, 2, 3]))  # [1, 4, 9]

# Compose with `|`
add_one = RunnableLambda(lambda x: x + 1)
chain = square | add_one
print(chain.invoke(3))          # 10 (3² + 1)
```
Stream a model:
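```python
from langchain_openai import ChatOpenAI

# Requires OPENAI_API_KEY to be set in the environment
model = ChatOpenAI(model="gpt-4o-mini")

# Each chunk is a partial message; print tokens as they arrive
for chunk in model.stream("Count to 5 slowly."):
    print(chunk.content, end="", flush=True)
```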
6. Code: real-world pattern
A RAG-shaped chain that combines several of the primitives. Note how the dict literal becomes parallel branches automatically:
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_openai import ChatOpenAI

retriever = ...  # any retriever (next chapters)
model = ChatOpenAI(model="gpt-4o-mini")
parser = StrOutputParser()

prompt = ChatPromptTemplate.from_template(
    "Answer using context.\n\nContext: {context}\nQ: {question}"
)


def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)


# Dict literal auto-becomes RunnableParallel
# Plain function auto-becomes RunnableLambda
chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt
    | model
    | parser
)

print(chain.invoke("What is our refund window?"))
```
Add resilience without changing the chain shape:
```python
robust_chain = chain.with_retry(
    stop_after_attempt=3,
    wait_exponential_jitter=True,
).with_fallbacks([
    # If the main chain still fails after retries, run a backup chain
    # (here the same pipeline with a pinned model snapshot)
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini-2024-07-18")
    | parser
])
```
Use RunnablePassthrough.assign to enrich the dict mid-chain:
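```python
from langchain_core.runnables import RunnablePassthrough

chain = (
    # Keep the input dict and add two computed keys to it
    RunnablePassthrough.assign(
        word_count=lambda x: len(x["text"].split()),
        upper=lambda x: x["text"].upper(),
    )
    # Plain lambda is coerced to a RunnableLambda
    | (lambda x: f"{x['upper']} ({x['word_count']} words)")
)

print(chain.invoke({"text": "hello world"}))  # "HELLO WORLD (2 words)"
```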
7. Common pitfalls
- ❗ Forgetting input-type compatibility. If chain step N expects a dict but step N-1 outputs a string, you must adapt with a `RunnableLambda(lambda s: {"key": s})`.
- ❗ Using `.invoke()` inside another `.invoke()` instead of composing. Composing is the point; manual nesting kills streaming and tracing.
- ❗ Mixing sync and async carelessly. If your outer chain is `.ainvoke`d, every step runs through its async path. Most components implement it natively, but a custom `RunnableLambda(sync_fn)` is delegated to a thread-pool executor by default rather than awaited; for I/O-bound steps, pass an async function instead.
- ❗ Branches sharing references. `RunnableParallel` runs branches concurrently, so never have them mutate a shared list/dict.
- ❗ Big anonymous lambdas in production chains. They're hard to trace and debug. Use named functions for anything > 1 line.
8. When to use vs not use
| Pattern | When |
|---|---|
| Pipe `a \| b \| c` | Default for any sequence |
| `RunnableLambda` | Need custom Python logic in the middle of a chain |
| `RunnableParallel` (or dict literal) | Independent computations on the same input |
| `RunnablePassthrough.assign(...)` | Adding computed fields to an input dict |
| `.with_retry(...)` | Flaky API or model |
| `.with_fallbacks([...])` | Provider outage tolerance, model A/B |
| Raw `Runnable` subclass | Building reusable components for a library/SDK |
9. Cheatsheet
```python
from langchain_core.runnables import (
    Runnable,
    RunnableSequence,
    RunnableParallel,
    RunnableLambda,
    RunnablePassthrough,
    RunnableBranch,
    RunnableConfig,
)

# Invoke styles
r.invoke(x)                          # one input
r.batch([x1, x2])                    # many inputs concurrently
r.batch(xs, config={"max_concurrency": 10})
list(r.stream(x))                    # chunks
await r.ainvoke(x)                   # async

# Wrap fn → Runnable
RunnableLambda(my_fn)
# Or just put `my_fn` directly in `|` — auto-coerced

# Parallel
RunnableParallel({"a": chainA, "b": chainB})
{"a": chainA, "b": chainB}           # auto-coerced in a pipe

# Branching
RunnableBranch(
    (lambda x: x["t"] == "a", chainA),
    (lambda x: x["t"] == "b", chainB),
    default_chain,                   # last arg
)

# Pass-through with enrichment
RunnablePassthrough()                # input → input
RunnablePassthrough.assign(k=fn)     # input → {**input, "k": fn(input)}
RunnablePassthrough.assign(k=chain)  # same, fn can be another Runnable

# Modifiers
r.with_retry(stop_after_attempt=3)
r.with_fallbacks([backup_chain])
r.with_config(tags=["prod"], metadata={"user": uid})
r.bind(stop=["\n\n"])                # bind partial args

# Inspect
chain.get_graph().print_ascii()
chain.input_schema.schema()          # Pydantic schema of the input
```
10. Q&A: recall test
- Q: What does it mean that "everything is a Runnable"?
  A: Every LangChain component (model, prompt, parser, retriever, tool) implements `Runnable[Input, Output]`, with the same `.invoke` / `.stream` / `.batch` / async interface. That's why `a | b | c` works regardless of the specific components.
- Q: What does `a | b` compile to under the hood?
  A: `RunnableSequence(a, b)`. The `|` is just Python's `__or__` operator overloaded on Runnables.
- Q: Difference between `RunnablePassthrough()` and `RunnablePassthrough.assign(k=fn)`?
  A: Plain `RunnablePassthrough` returns the input unchanged. `.assign(k=fn)` returns the input with an extra key `k` added.
- Q: Why does `{"a": chainA, "b": chainB}` work in a chain?
  A: LCEL auto-coerces a plain dict to `RunnableParallel({"a": chainA, "b": chainB})`. Same for functions → `RunnableLambda`.
- Q: How do you make a chain survive transient API failures?
  A: `chain.with_retry(stop_after_attempt=3, wait_exponential_jitter=True)`. Combine with `.with_fallbacks([backup])` for total provider outages.
- Q: What's the difference between `.stream()` and `.batch()`?
  A: `.stream(input)` yields chunks of ONE response as it arrives (for chat UIs). `.batch(inputs)` runs MANY inputs concurrently and returns all final results.
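To make the `__or__` answer concrete, here is a toy, pure-Python sketch of the operator-overloading idea. `MiniRunnable` is a hypothetical teaching class, not LangChain's implementation; it only shows how `|` can build a sequence and coerce plain callables.

```python
from typing import Any, Callable


class MiniRunnable:
    """Toy stand-in for a Runnable (illustration only)."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: Any) -> "MiniRunnable":
        # Coerce plain callables, then run self first and `other` second.
        nxt = other if isinstance(other, MiniRunnable) else MiniRunnable(other)
        return MiniRunnable(lambda value: nxt.invoke(self.invoke(value)))

    def __ror__(self, other: Any) -> "MiniRunnable":
        # Handles `plain_callable | mini_runnable`.
        return MiniRunnable(other) | self


square = MiniRunnable(lambda x: x * x)
chain = square | (lambda x: x + 1) | str
print(repr(chain.invoke(3)))  # '10'

reversed_chain = (lambda x: x + 1) | square
print(reversed_chain.invoke(2))  # 9
```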
Practice
What does this print?
Expected: True
Use batch (concurrent) instead of a Python loop for multiple invocations
Expected: True
Quiz: Quick check
What you remember
Q1. What's the difference between invoke and batch?
- `invoke` processes one input synchronously; `batch` processes multiple concurrently
- No difference
- `batch` is for async only
- `invoke` is deprecated

Why: LLM calls are I/O-bound. Running them concurrently with `batch` can be several times faster than serial `invoke` calls for the same total work, up to the limits set by `max_concurrency` and provider rate limits.
Q2. When should you use stream vs invoke?
- `stream` for UIs where you want tokens to appear progressively; `invoke` for backend/automation where only the final answer matters
- `stream` is faster
- `invoke` is for very short responses
- They're identical

Why: Streaming improves perceived latency in chat UIs. For non-interactive backend code, just `invoke` and process the complete response.
Q3. What does RunnableLambda do?
- Wraps any Python function so it can be piped into an LCEL chain
- A lazy evaluation primitive
- Required for streaming
- Same as `lambda` in Python

Why: Lets you insert arbitrary logic into chains. Useful for preprocessing input, postprocessing output, or branching on conditions.
Common doubts
What's the difference between LCEL and LangGraph?
LCEL handles linear pipelines and branched DAGs. LangGraph is for stateful, cyclic workflows: loops, retries, conditional routing, human-in-the-loop. Use LCEL for simple Q&A and RAG; switch to LangGraph when the flow needs cycles or persistent state.
Why do all my LCEL chains start with a prompt?
Most chains start by converting raw input into a prompt for the LLM. But they don't have to — you can start with a retriever (retriever | prompt | model) or any other Runnable. The pattern is just very common.
How do I add error handling to a chain?
Use chain.with_fallbacks([backup_chain]) — runs backup if the main chain fails. Or wrap in a RunnableLambda that catches the exception and returns a default. Or implement structured retries with LangGraph.