Practical MCP with FastMCP & LangChain

Engineering the Agentic Experience

Integrating Agents with MCP: Introduction to LangChain

Building a LangChain Agent

Now that we understand the pieces, let's build a working agent from scratch. We will start with the simplest possible version — a conversational agent that remembers what you said earlier in the session — and verify that it runs before we connect it to MCP in the next section.

Step 1: Create the Project

Start by creating a dedicated directory for this project and initialising a Python environment with uv.

mkdir -p $HOME/workspace/langchain/langchain_agent
cd $HOME/workspace/langchain/langchain_agent

uv init --bare --python 3.12

The --bare flag tells uv to create only a minimal pyproject.toml, without scaffolding the sample files (such as a README and starter script) it would otherwise generate.

Step 2: Install Dependencies

We need four packages. langchain gives us the agent API. langchain-openai is the connector that lets LangChain talk to OpenAI models. langgraph powers the underlying agent runtime and the in-memory checkpointer we will use for conversation history. python-dotenv loads environment variables from a .env file so we never hard-code API keys.

uv add \
    "langchain==1.2.10" \
    "langchain-openai==1.1.10" \
    "langgraph==1.0.9" \
    "python-dotenv==1.2.1"
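After this command, uv records the four pinned packages in pyproject.toml. The result should look roughly like the fragment below (field values such as the project name are illustrative, and exact formatting may differ between uv versions):

```toml
[project]
name = "langchain-agent"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "langchain==1.2.10",
    "langchain-openai==1.1.10",
    "langgraph==1.0.9",
    "python-dotenv==1.2.1",
]
```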

Step 3: Add Your API Key

Create a .env file in the project directory and paste in your OpenAI API key. The agent will not start without it.

cat > $HOME/workspace/langchain/langchain_agent/.env <<EOL
OPENAI_API_KEY=""
EOL

Step 4: Write the Agent

Create a new file called agent.py. Rather than presenting the whole file at once, we will build it up piece by piece so that every decision is clear.

Imports

Open agent.py and start with the imports.

import os

from dotenv import load_dotenv
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import MemorySaver

  • create_agent is LangChain's high-level factory that builds a full ReAct-style agent loop for us.

  • SummarizationMiddleware is a built-in middleware component that automatically compresses old conversation history when it gets too long — we will configure it in a moment.

  • MemorySaver is a LangGraph checkpointer that stores every message in memory so the agent can recall earlier parts of the conversation.

Load Configuration

Below the imports, load your environment variables and read the model name.

# Load OPENAI_API_KEY from the .env file into the process environment.
# The langchain-openai connector picks it up automatically from there.
load_dotenv()

# Allow the model to be overridden at runtime via an environment variable,
# which is useful for testing different models without editing source code.
LLM = os.getenv("LLM", "gpt-4o-mini")

load_dotenv() reads the .env file we created in Step 3 and writes each key into os.environ. We never need to pass the API key around manually — the LangChain connector will find it on its own.
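The fallback pattern in the LLM line is worth internalizing: an exported environment variable always wins over the hard-coded default. A minimal sketch of the same logic (the resolve_model helper is purely illustrative, not part of the agent):

```python
import os

def resolve_model(default: str = "gpt-4o-mini") -> str:
    # Mirrors LLM = os.getenv("LLM", "gpt-4o-mini") from agent.py:
    # an exported LLM variable wins, otherwise the default is used.
    return os.getenv("LLM", default)

# No LLM variable set: the default comes back.
os.environ.pop("LLM", None)
print(resolve_model())  # gpt-4o-mini

# An exported variable (e.g. `LLM=gpt-4o uv run agent.py`) takes priority.
os.environ["LLM"] = "gpt-4o"
print(resolve_model())  # gpt-4o
```

This is why you can switch models for a single run from the shell without touching the source file.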

Create the Agent with Memory

Now create the agent. This is the minimal version — memory only, no middleware yet.

agent = create_agent(
    f"openai:{LLM}",
    # MemorySaver persists the full message history across turns. Every call
    # to agent.invoke() with the same thread_id reads and updates this history.
    checkpointer=MemorySaver(),
)

create_agent accepts a model identifier in the format "provider:model-name". The checkpointer argument wires in the persistent memory store. With just these two lines, the agent already remembers the full conversation for as long as the process is running.
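To build intuition for what the checkpointer is doing, here is a toy sketch of its semantics (purely an illustration, not LangGraph's implementation): state lives under a thread_id, so calls that share an id share history.

```python
# Toy illustration of checkpointer semantics (not LangGraph internals):
# each thread_id keys its own growing message history, which is why
# reusing a thread_id lets the agent see earlier turns.
class ToyCheckpointer:
    def __init__(self) -> None:
        self._threads: dict[str, list[str]] = {}

    def append(self, thread_id: str, message: str) -> list[str]:
        history = self._threads.setdefault(thread_id, [])
        history.append(message)
        return history

saver = ToyCheckpointer()
saver.append("chat-1", "My name is Ada.")
history = saver.append("chat-1", "What is my name?")
print(len(history))  # 2: the same thread_id sees earlier turns

other = saver.append("chat-2", "Hello")
print(len(other))    # 1: a fresh thread_id starts with no history
```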

Add Summarization Middleware

Long conversations accumulate tokens quickly and can exceed the model's context window. The SummarizationMiddleware solves this by watching the running token count and, once a threshold is crossed, replacing older messages with a compact summary while keeping the most recent exchanges verbatim.

Add the middleware argument to your create_agent call:

agent = create_agent(
    f"openai:{LLM}",
    middleware=[
        # When the conversation history exceeds 1000 tokens, the middleware
        # calls the model to produce a summary of the older messages, then
        # replaces those messages with that summary. The newest messages are
        # left untouched so context is never lost abruptly.
        SummarizationMiddleware(
            model=f"openai:{LLM}",
            trigger=("tokens", 1000),
        )
    ],
    checkpointer=MemorySaver(),
)

The trigger=("tokens", 1000) argument tells the middleware to activate once the accumulated history crosses 1,000 tokens. We pass the same model we use for chatting so the summary style is consistent, but in a production system you might use a cheaper, faster model here to reduce costs.
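To get a feel for when the trigger fires, here is a back-of-the-envelope check using the common rough estimate of about four characters per token. This is an illustration only; the middleware does its own token counting, and should_summarize is a hypothetical helper, not part of LangChain.

```python
# Rough illustration: approximate tokens as len(text) / 4 and compare
# against the 1000-token trigger used in create_agent above.
def should_summarize(messages: list[str], threshold: int = 1000) -> bool:
    approx_tokens = sum(len(m) for m in messages) // 4
    return approx_tokens > threshold

short_history = ["Hi!", "Hello! How can I help you today?"]
long_history = ["A long assistant reply. " * 50] * 4  # ~1200 approx tokens

print(should_summarize(short_history))  # False: nowhere near the threshold
print(should_summarize(long_history))   # True: summarization would kick in
```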

Write the Conversation Loop

Finally, add the main() function that drives the interactive session.

def main() -> None:
    print("Agent ready. Type 'exit' or 'quit' to stop.\n")

    while True:
        try:
            user_input = input("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            # Ctrl+C or a closed stdin pipe — exit gracefully.
            print("\nGoodbye!")
            break

        if not user_input:
            continue

        if user_input.lower() in {"exit", "quit"}:
            print("Goodbye!")
            break

        # Run one full turn of the ReAct loop. Passing the same thread_id
        # every call is what links all turns into a single conversation.
        response = agent.invoke(
            {"messages": [{"role": "user", "content": user_input}]},
            # "cli-session" is an arbitrary id; any stable string works.
            {"configurable": {"thread_id": "cli-session"}},
        )

        # The final message in the returned state is the agent's reply.
        print(f"Agent: {response['messages'][-1].content}\n")


if __name__ == "__main__":
    main()

Run the agent with uv run agent.py. Try telling it your name, then asking for it a few turns later to confirm the memory is working.
