Practical MCP with FastMCP & LangChain

Engineering the Agentic Experience

Building an Advanced Netflix MCP: Client Implementation Guide
Main Client: client/main.py

Imports

The imports fall into three groups: standard-library modules, third-party clients (python-dotenv and the OpenAI SDK), and FastMCP's Client alongside our local handler modules.

# main.py
import asyncio
import json
import logging
import os

from dotenv import load_dotenv
from openai import OpenAI

from fastmcp import Client

from handlers import elicitation_handler
from handlers import log_handler
from handlers import progress_handler
from handlers import sampling_handler
  • asyncio provides the event loop that drives all the async calls in the client.

  • json is used to decode the argument payloads that OpenAI sends back when it decides to call a tool.

  • logging controls how much diagnostic output the client prints.

  • The four from handlers import ... lines pull in the callback functions we register on the Client so that server-initiated events — asking the user a question, requesting an LLM completion, sending a log line, reporting progress — are routed to our own code rather than silently ignored.

Configuration

With all imports in place, we load the .env file and read every configurable value the client needs.

load_dotenv()

MCP_SERVER_URL = os.getenv("MCP_SERVER_URL", "http://localhost:8000/mcp")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
MODEL = os.getenv("MODEL", "gpt-5-mini")

DEBUG_MODE = os.getenv("DEBUG_MODE", "false").lower() == "true"
logging.basicConfig(
    level=logging.DEBUG if DEBUG_MODE else logging.WARNING,
    format="%(levelname)s - %(message)s",
)

MAX_HISTORY = int(os.getenv("MAX_HISTORY", "10"))

All configuration comes from environment variables, so no credentials are hard-coded in the source.

  • MCP_SERVER_URL points at the HTTP endpoint the server is exposing.

  • OPENAI_API_KEY is required for the sampling handler to call OpenAI's API.

  • MODEL names the OpenAI model that will interpret user questions and decide which tools to call.

  • MAX_HISTORY caps how many messages we keep in the conversation buffer; once the limit is hit, history is trimmed to avoid ballooning token costs on long sessions.

  • The logging level switches between DEBUG and WARNING based on DEBUG_MODE, which means debug output is opt-in rather than always-on.
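Putting these together, a typical .env for local development might look like this (illustrative values only; the API key placeholder must be replaced with a real key):

```shell
# .env — example values for local development
MCP_SERVER_URL=http://localhost:8000/mcp
OPENAI_API_KEY=sk-...
MODEL=gpt-5-mini
DEBUG_MODE=false
MAX_HISTORY=10
```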

Creating the MCP Client

We create a single Client instance at module level, wiring in all four handler callbacks so the client knows how to respond to server-initiated events.

mcp_client = Client(
    MCP_SERVER_URL,
    elicitation_handler=elicitation_handler,
    sampling_handler=sampling_handler,
    log_handler=log_handler,
    progress_handler=progress_handler,
)

Client is the FastMCP class that manages the HTTP connection to the server. We pass the server URL as the first argument and then wire up all four handler callbacks. Each handler corresponds to a type of server-initiated request.

  • elicitation_handler is called when the server needs the user to make a choice — for example, selecting which of several matching movies they meant.

  • sampling_handler is called when the server wants an LLM to generate text, such as the summarize_movie tool asking for a synopsis.

  • log_handler receives structured log messages the server emits.

  • progress_handler receives progress notifications so the client can display a progress indicator during long-running operations like get_top_movies.

We define this at module level (not inside a function) so it is shared across the whole session. The actual HTTP connection is only opened when we enter async with mcp_client: later in the REPL.

Tool Discovery: get_tools_for_openai()

OpenAI's chat completions API and the MCP protocol both describe tools, but they use different schemas. This function bridges that gap.

async def get_tools_for_openai(client: Client) -> list:
    """Fetch tools from MCP server and convert to OpenAI format."""
    mcp_tools = await client.list_tools()

    openai_tools = []
    for tool in mcp_tools:
        openai_tools.append(
            {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.inputSchema,
                },
            }
        )

    return openai_tools

client.list_tools() sends a tools/list request to the server and returns a list of MCP tool objects. Each one has a name, a description, and an inputSchema — which is already a JSON Schema object describing the tool's parameters.

Reminder: OpenAI expects tools in the shape {type: "function", function: {name, description, parameters}}. The conversion is a simple reshape: the MCP inputSchema maps directly to OpenAI's parameters field without any further transformation.

We call this once at startup and pass the result into every chat() call rather than fetching it on every turn. This avoids a round-trip to the server on every user message and keeps the tool list stable for the duration of the session.
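To make the reshape concrete, here is the same conversion applied to a stand-in tool object. SimpleNamespace mimics the attributes of an MCP tool object, and the schema shown is illustrative, not the server's real get_top_movies schema:

```python
from types import SimpleNamespace

# Stand-in for one entry of client.list_tools(); fields mirror an MCP tool
mcp_tool = SimpleNamespace(
    name="get_top_movies",
    description="Return the top-rated Netflix movies.",
    inputSchema={
        "type": "object",
        "properties": {"limit": {"type": "integer"}},
        "required": ["limit"],
    },
)

openai_tool = {
    "type": "function",
    "function": {
        "name": mcp_tool.name,
        "description": mcp_tool.description,
        "parameters": mcp_tool.inputSchema,  # JSON Schema passes through unchanged
    },
}

print(openai_tool["function"]["name"])  # → get_top_movies
```

Because both sides speak JSON Schema, the inputSchema dict is reused as-is; only the surrounding envelope changes.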

Let's move to chat() next.

The Agentic Loop: chat()

This is the core of the client. It implements the tool-calling loop that lets an LLM progressively call tools until it has enough information to answer the user's question.

The function starts by appending the user's message to the conversation history:

async def chat(
    user_question: str, openai_client: OpenAI, tools: list, messages: list
) -> str:
    """Handle one conversation turn with OpenAI and MCP tools."""
    # Add user's question to the conversation
    messages.append({"role": "user", "content": user_question})

messages is passed by reference and shared across calls, so the model always has the full history of the conversation. This allows it to handle follow-up questions like "add the first one to my favourites" correctly after a previous answer listed several movies.
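Because the list is shared by reference, any MAX_HISTORY trimming should mutate it in place rather than rebind it; a hypothetical helper (the course's own trimming code is not shown in this excerpt) could look like:

```python
def trim_history(messages: list, max_history: int) -> None:
    # Trim in place so every caller holding a reference to the same list
    # sees the change; drop the oldest messages first.
    overflow = len(messages) - max_history
    if overflow > 0:
        del messages[:overflow]

history = [{"role": "user", "content": str(i)} for i in range(12)]
trim_history(history, 10)
print(len(history))  # → 10, oldest two messages dropped
```

A real implementation would likely also preserve a leading system prompt, but the in-place deletion is the important part.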

Next comes the loop that drives tool use:

    while True:
        response = openai_client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=tools,
            tool_choice="auto",
        )

        assistant_message = response.choices[0].message
        messages.append(assistant_message)

        if not assistant_message.tool_calls:
            # No tool calls: the model has its final answer
            return assistant_message.content

        # Otherwise, execute each requested tool on the MCP server and
        # append the results so the next iteration can use them
        for tool_call in assistant_message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            result = await mcp_client.call_tool(tool_call.function.name, arguments)
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result),
                }
            )

The loop exits as soon as the model replies without requesting any tools; that final text is what chat() returns to the caller.