FastMCP Middleware
What Middleware Is Good For
FastMCP ships with production-ready implementations of the most common middleware patterns: logging, timing, rate limiting, caching, and response size limiting. Here is a quick example of each.
Logging — record every request and response, optionally including the full payload:
from fastmcp import FastMCP
from fastmcp.server.middleware.logging import LoggingMiddleware
mcp = FastMCP("My Server")
mcp.add_middleware(LoggingMiddleware(include_payloads=True, max_payload_length=1000))
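To make the effect of the two options concrete, here is a minimal stdlib-only sketch of payload logging with a length cap. It is not FastMCP's implementation: `format_payload` and `log_request` are hypothetical helpers, and the parameter names are borrowed from the call above purely for illustration.

```python
import json
import logging

logger = logging.getLogger("middleware_sketch")


def format_payload(payload: dict, max_payload_length: int = 1000) -> str:
    """Serialize a payload and cut it off at max_payload_length characters."""
    text = json.dumps(payload, default=str)
    if len(text) > max_payload_length:
        # Mark the cut so a reader of the log knows data is missing.
        return text[:max_payload_length] + "..."
    return text


def log_request(method: str, payload: dict, include_payloads: bool = True,
                max_payload_length: int = 1000) -> str:
    """Build and emit one log line for a request (hypothetical helper)."""
    line = f"request method={method}"
    if include_payloads:
        line += f" payload={format_payload(payload, max_payload_length)}"
    logger.info(line)
    return line
```

Capping the payload keeps a single large tool argument from flooding the log, at the cost of losing the tail of the data.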
Timing — measure how long each operation takes:
from fastmcp.server.middleware.timing import TimingMiddleware
mcp.add_middleware(TimingMiddleware())
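The idea behind timing middleware can be sketched without FastMCP: wrap the handler, read a monotonic clock before and after, and report the difference. The names `timed_call`, `slow_tool`, and `record` below are hypothetical and are not part of the FastMCP API.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable


async def timed_call(
    name: str,
    handler: Callable[[], Awaitable[Any]],
    record: Callable[[str, float], None],
) -> Any:
    """Await handler() and report its duration in milliseconds to record()."""
    start = time.perf_counter()
    try:
        return await handler()
    finally:
        # The finally block runs on success and on failure, so failed
        # operations are timed as well.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        record(name, elapsed_ms)


async def slow_tool() -> str:
    await asyncio.sleep(0.01)  # stand-in for real work
    return "done"


timings: list[tuple[str, float]] = []
result = asyncio.run(
    timed_call("slow_tool", slow_tool, lambda n, ms: timings.append((n, ms)))
)
```

After the run, `timings` holds one `(name, milliseconds)` pair; a real middleware would typically send this to a logger or metrics backend instead of a list.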
Rate limiting — reject requests that arrive too fast using a token-bucket algorithm:
from fastmcp.server.middleware.rate_limiting import RateLimitingMiddleware
mcp.add_middleware(RateLimitingMiddleware(max_requests_per_second=10.0, burst_capacity=20))
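For readers unfamiliar with the token-bucket algorithm, the following is a minimal sketch of the general technique, not FastMCP's code. The `TokenBucket` class is hypothetical; its `rate` and `capacity` arguments are assumed to play the roles of `max_requests_per_second` and `burst_capacity` in the call above. A fake clock is injected so the demo is deterministic.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, holds at most `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last_refill = clock()

    def try_consume(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request is allowed, else False."""
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        earned = (now - self.last_refill) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Deterministic demo with a fake clock instead of real time.
fake_now = [0.0]
bucket = TokenBucket(rate=10.0, capacity=20, clock=lambda: fake_now[0])

burst = [bucket.try_consume() for _ in range(21)]      # 20 allowed, 21st rejected
fake_now[0] += 0.5                                     # half a second earns 5 tokens
after_wait = [bucket.try_consume() for _ in range(6)]  # 5 allowed, 6th rejected
```

The bucket lets a client spend saved-up capacity in a short burst while still holding the long-run average to `rate` requests per second.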
Caching — return a stored response rather than running the tool again, with per-operation TTL control:
from fastmcp.server.middleware.caching import ResponseCachingMiddleware, CallToolSettings
mcp.add_middleware(ResponseCachingMiddleware(
    call_tool_settings=CallToolSettings(ttl=60, included_tools=["expensive_query"])
))
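The behavior configured above (a time-to-live plus an allow-list of tools) can be illustrated with a small stdlib-only sketch. This is an assumption about the general pattern, not a description of how `ResponseCachingMiddleware` works internally; `TTLToolCache` and its `call` method are hypothetical names, and the TTL is assumed to be in seconds.

```python
import json
import time
from typing import Any, Callable, Dict, Optional, Set, Tuple


class TTLToolCache:
    """Cache tool results for `ttl` seconds, only for tools in `included_tools`."""

    def __init__(self, ttl: float, included_tools: Optional[Set[str]] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.included_tools = included_tools  # None means "cache every tool"
        self.clock = clock
        self._store: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def call(self, tool_name: str, args: dict, run: Callable[[], Any]) -> Any:
        if self.included_tools is not None and tool_name not in self.included_tools:
            return run()  # tool is not on the allow-list: always execute
        # Same tool with the same arguments maps to the same cache key.
        key = (tool_name, json.dumps(args, sort_keys=True))
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now < hit[0]:
            return hit[1]  # entry exists and has not expired
        value = run()
        self._store[key] = (now + self.ttl, value)
        return value


# Deterministic demo with a fake clock.
fake_now = [0.0]
calls = []


def expensive_query() -> int:
    calls.append(fake_now[0])
    return len(calls)


cache = TTLToolCache(ttl=60, included_tools={"expensive_query"},
                     clock=lambda: fake_now[0])
first = cache.call("expensive_query", {"q": "a"}, expensive_query)   # runs the tool
second = cache.call("expensive_query", {"q": "a"}, expensive_query)  # served from cache
fake_now[0] = 61.0
third = cache.call("expensive_query", {"q": "a"}, expensive_query)   # expired, runs again
```

Keying on both the tool name and its arguments matters: caching on the name alone would return the wrong result for a different query.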
Response size limiting — truncate oversized tool outputs before they overwhelm an LLM context window:
from ...
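The truncation idea itself can be sketched independently of FastMCP. The function below is a hypothetical helper, not a FastMCP API; it assumes the limit is measured in UTF-8 bytes and that a short marker should be appended when output is cut.

```python
def limit_response_size(text: str, max_bytes: int,
                        suffix: str = "\n[truncated]") -> str:
    """Return text unchanged if it fits in max_bytes of UTF-8, else a truncated copy."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text
    suffix_raw = suffix.encode("utf-8")
    if len(suffix_raw) > max_bytes:
        # No room for the marker: hard-cut the text itself.
        return raw[:max_bytes].decode("utf-8", errors="ignore")
    budget = max_bytes - len(suffix_raw)
    # errors="ignore" drops a multi-byte character that was cut in half.
    kept = raw[:budget].decode("utf-8", errors="ignore")
    return kept + suffix
```

For example, `limit_response_size("a" * 100, 20, suffix="...")` keeps 17 characters and appends the marker, so the result is exactly 20 bytes long.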
