Integrating Agents with MCP: Introduction to LangChain
Building a LangChain Agent
Now that we understand the pieces, let's build a working agent from scratch. We will start with the simplest possible version — a conversational agent that remembers what you said earlier in the session — and verify that it runs before we connect it to MCP in the next section.
Step 1: Create the Project
Start by creating a dedicated directory for this project and initialising a Python environment with uv.
mkdir -p $HOME/workspace/langchain/langchain_agent
cd $HOME/workspace/langchain/langchain_agent
uv init --bare --python 3.12
The --bare flag tells uv to create just the environment without scaffolding extra files.
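After this command the directory should contain only a pyproject.toml, along the lines of the following (the exact fields uv generates may differ slightly between versions; the name is derived from the directory):

```toml
[project]
name = "langchain-agent"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = []
```

Dependencies added with uv add in the next step will be recorded in this file automatically.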
Step 2: Install Dependencies
We need four packages. langchain gives us the agent API. langchain-openai is the connector that lets LangChain talk to OpenAI models. langgraph powers the underlying agent runtime and the in-memory checkpointer we will use for conversation history. python-dotenv loads environment variables from a .env file so we never hard-code API keys.
uv add \
"langchain==1.2.10" \
"langchain-openai==1.1.10" \
"langgraph==1.0.9" \
"python-dotenv==1.2.1"
Step 3: Add Your API Key
Create a .env file in the project directory and paste your OpenAI API key between the quotes. The agent will not start without it.
cat > $HOME/workspace/langchain/langchain_agent/.env << EOL
OPENAI_API_KEY=""
EOL
Step 4: Write the Agent
Create a new file called agent.py. Rather than presenting the whole file at once, we will build it up piece by piece so every decision is clear. The final, complete file is shown at the end of this section.
Imports
Open agent.py and start with the imports.
import os
from dotenv import load_dotenv
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import MemorySaver
create_agent is LangChain's high-level factory that builds a full ReAct-style agent loop for us. SummarizationMiddleware is a built-in middleware component that automatically compresses old conversation history when it gets too long; we will configure it in a moment. MemorySaver is a LangGraph checkpointer that stores every message in memory so the agent can recall earlier parts of the conversation.
Load Configuration
Below the imports, load your environment variables and read the model name.
# Load OPENAI_API_KEY from the .env file into the process environment.
# The langchain-openai connector picks it up automatically from there.
load_dotenv()
# Allow the model to be overridden at runtime via an environment variable,
# which is useful for testing different models without editing source code.
LLM = os.getenv("LLM", "gpt-4o-mini")
load_dotenv() reads the .env file we created in Step 3 and writes each key into os.environ. We never need to pass the API key around manually — the LangChain connector will find it on its own.
Create the Agent with Memory
Now create the agent. This is the minimal version — memory only, no middleware yet.
agent = create_agent(
    f"openai:{LLM}",
    # MemorySaver persists the full message history across turns. Every call
    # to agent.invoke() with the same thread_id reads and updates this history.
    checkpointer=MemorySaver(),
)
create_agent accepts a model identifier in the format "provider:model-name". The checkpointer argument wires in the persistent memory store. With just these two arguments, the agent already remembers the full conversation for as long as the process is running.
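To build intuition for what the checkpointer is doing, here is a toy, stdlib-only model (not LangGraph's actual implementation): each thread_id maps to its own message list, and every turn appends to it.

```python
class ToyCheckpointer:
    """Conceptual model of MemorySaver: per-thread message histories
    kept in a plain dict for the lifetime of the process."""

    def __init__(self) -> None:
        self._threads: dict[str, list[dict]] = {}

    def history(self, thread_id: str) -> list[dict]:
        # Reading a thread creates it on first use.
        return self._threads.setdefault(thread_id, [])

    def append(self, thread_id: str, message: dict) -> None:
        self.history(thread_id).append(message)


memory = ToyCheckpointer()
memory.append("main", {"role": "user", "content": "My name is Ada."})
memory.append("main", {"role": "assistant", "content": "Nice to meet you, Ada!"})
memory.append("other", {"role": "user", "content": "Hello?"})
# "main" and "other" are fully isolated conversations.
```

This also explains why everything is lost when the process exits: the store is just an in-process dictionary. Swapping in a database-backed checkpointer would make the memory durable without changing the agent code.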
Add Summarization Middleware
Long conversations accumulate tokens quickly and can exceed the model's context window. The SummarizationMiddleware solves this by watching the running token count and, once a threshold is crossed, replacing older messages with a compact summary while keeping the most recent exchanges verbatim.
Add the middleware argument to your create_agent call:
agent = create_agent(
    f"openai:{LLM}",
    middleware=[
        # When the conversation history exceeds 1000 tokens, the middleware
        # calls the model to produce a summary of the older messages, then
        # replaces those messages with that summary. The newest messages are
        # left untouched so context is never lost abruptly.
        SummarizationMiddleware(
            model=f"openai:{LLM}",
            trigger=("tokens", 1000),
        ),
    ],
    checkpointer=MemorySaver(),
)
The trigger=("tokens", 1000) argument tells the middleware to activate once the accumulated history crosses 1,000 tokens. We pass the same model we use for chatting so the summary style is consistent, but in a production system you might use a cheaper, faster model here to reduce costs.
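The mechanics can be sketched in plain Python. The sketch below is an illustration of the idea, not the middleware's real implementation: it approximates token counts by word count and substitutes a placeholder string where the middleware would call the model to write the summary.

```python
def summarize_if_needed(messages: list[dict], max_tokens: int = 1000,
                        keep_last: int = 4) -> list[dict]:
    """Illustrative compaction: once the (approximate) token count of the
    history crosses max_tokens, collapse everything except the newest
    keep_last messages into a single summary message."""
    # Crude proxy for a tokenizer: count whitespace-separated words.
    total = sum(len(m["content"].split()) for m in messages)
    if total <= max_tokens or len(messages) <= keep_last:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    # The real middleware asks the LLM for a summary of `old` here;
    # we stand in a placeholder instead.
    summary = {"role": "system",
               "content": f"[Summary of {len(old)} earlier messages]"}
    return [summary] + recent


# Eight messages of ~200 "tokens" each crosses the 1000-token threshold,
# so the oldest four collapse into one summary message.
history = [{"role": "user", "content": "word " * 200} for _ in range(8)]
compacted = summarize_if_needed(history)
```

The key property to notice is that compaction is lossy for the old turns but lossless for the recent ones, which is why the agent never suddenly forgets what you said a moment ago.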
Write the Conversation Loop
Finally, add the main() function that drives the interactive session.
def main() -> None:
    print("Agent ready. Type 'exit' or 'quit' to stop.\n")
    while True:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            # Ctrl+C or a closed stdin pipe — exit gracefully.
            print("\nGoodbye!")
            break
        if not user_input:
            continue
        if user_input.lower() in {"exit", "quit"}:
            print("Goodbye!")
            break
        # Run one full turn of the ReAct loop. Passing the same thread_id
        # every call is what links all turns into a single conversation.
        response = agent.invoke(
            {"messages": [{"role": "user", "content": user_input}]},
            config={"configurable": {"thread_id": "main"}},
        )
        # The agent's reply is the last message in the returned state.
        print(f"Agent: {response['messages'][-1].content}\n")


if __name__ == "__main__":
    main()