What Middleware Is Good For

FastMCP ships with production-ready implementations of the most common patterns. Here is a quick example of each.

Logging — record every request and response, optionally including the full payload:

from fastmcp import FastMCP
from fastmcp.server.middleware.logging import LoggingMiddleware

mcp = FastMCP("My Server")
mcp.add_middleware(LoggingMiddleware(include_payloads=True, max_payload_length=1000))

Timing — measure how long each operation takes:

from fastmcp.server.middleware.timing import TimingMiddleware

mcp.add_middleware(TimingMiddleware())
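
For intuition about what a timing layer does, here is a minimal standalone sketch in plain Python. It is not FastMCP's implementation: the helper `timed_call` and the sample function `add` are hypothetical names chosen for illustration. The idea is to wrap an operation, read a monotonic clock before and after, and report the difference.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(operation: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run `operation` and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()  # high-resolution monotonic clock
    result = operation(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def add(a: int, b: int) -> int:
    # Hypothetical stand-in for a tool handler.
    return a + b


if __name__ == "__main__":
    result, elapsed_ms = timed_call(add, 2, 3)
    print(f"add returned {result} in {elapsed_ms:.3f} ms")
```

A real middleware would do the same thing around the next handler in the chain, typically in an async function, and send the measurement to a logger instead of printing it.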

Rate limiting — reject requests that arrive too fast using a token-bucket algorithm:

from fastmcp.server.middleware.rate_limiting import RateLimitingMiddleware

mcp.add_middleware(RateLimitingMiddleware(max_requests_per_second=10.0, burst_capacity=20))
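
To make the token-bucket idea concrete, here is a minimal standalone sketch. It is an illustration under stated assumptions, not FastMCP's own code: the class name `TokenBucket`, its `allow` method, and the injectable `clock` parameter are all hypothetical. The bucket starts full, refills at a steady rate, and each request spends one token; a request that finds the bucket empty is rejected.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token bucket: `rate` tokens per second, at most `capacity` stored."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the sketch can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self) -> bool:
        """Return True and spend a token if one is available, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        refill = (now - self.last_refill) * self.rate
        self.tokens = min(float(self.capacity), self.tokens + refill)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=10.0, capacity=20)
    accepted = sum(bucket.allow() for _ in range(25))
    print(f"accepted {accepted} of 25 back-to-back requests")
```

With values like those in the snippet above (10 requests per second, burst capacity 20), a client can send a short burst of up to 20 requests at once, after which the sustained rate is capped at roughly 10 per second.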

Caching — return a stored response rather than running the tool again, with per-operation TTL control:

from fastmcp.server.middleware.caching import ResponseCachingMiddleware, CallToolSettings

mcp.add_middleware(ResponseCachingMiddleware(
    call_tool_settings=CallToolSettings(ttl=60, included_tools=["expensive_query"])
))
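
The TTL behaviour can be sketched without any framework. The following is a minimal illustration, not how FastMCP stores cached responses: `TTLCache`, `get_or_compute`, and the `clock` parameter are hypothetical names. A stored value is returned while its entry is younger than the TTL; once it expires, the underlying function runs again and the entry is refreshed.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Illustrative cache whose entries expire `ttl` seconds after being stored."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so the sketch can be tested
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self.ttl, value)  # store with expiry time
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl=60)
    calls = []

    def expensive_query() -> str:
        calls.append(1)  # count how often the real work runs
        return "result"

    cache.get_or_compute("expensive_query", expensive_query)
    cache.get_or_compute("expensive_query", expensive_query)
    print(f"underlying function ran {len(calls)} time(s)")
```

The key here is just the operation name; a cache for tool calls would normally also include the call arguments in the key so that different inputs do not share a result.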

Response size limiting — truncate oversized tool outputs before they overwhelm an LLM context window:

from
