Practical MCP with FastMCP & LangChain

Engineering the Agentic Experience

Primitives, Capabilities & Utilities in MCP

Sampling (Client Side)

In a common MCP pattern, the server does not run an LLM itself. Instead, it asks the host application (the client) to perform a model generation on its behalf. This is what sampling enables.

Sampling is delegation. The server sends a request for a completion, and the client decides whether to allow it, which model to use, and what permissions or user approvals apply.

For example, a server might analyze financial data and compute trends. To produce a human-readable summary, it requests sampling with a message like: "Summarize these results for a human reader." The host runs the LLM and returns the generated text.

This keeps servers model-independent. They can leverage AI capabilities without embedding a model SDK or managing model API keys, while the client remains in control of model access and policy.
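On the wire, sampling is an ordinary JSON-RPC request sent from the server to the client; per the MCP specification, the method is `sampling/createMessage`. A minimal sketch of such a request for the financial-summary example above (the `id` and `maxTokens` values are illustrative):

```python
import json

# Server -> client sampling request. The method name and field layout
# follow the MCP specification's sampling/createMessage message.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {
                "role": "user",
                "content": {
                    "type": "text",
                    "text": "Summarize these results for a human reader.",
                },
            }
        ],
        # A hint, not a command: the client stays in control of which
        # model actually runs and how the request is fulfilled.
        "maxTokens": 250,
    },
}

print(json.dumps(request, indent=2))
```

The client can approve, modify, or reject this request before any model is invoked, which is exactly what keeps model access and policy on the client side.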

Let's take a second example to illustrate sampling. Imagine a server connected to a project management system like Jira. It can read tickets, deadlines, and status updates. After gathering all open tasks for the week, it wants to produce a concise executive summary.

Instead of generating the summary itself, the server sends a sampling request to the host with structured information about the tasks.

Here are your tasks for the week:
- Task 1: Fix login bug (due Tuesday)
- Task 2: Prepare presentation for client (due Thursday)
- Task 3: Update documentation (due Friday)
Summarize these tasks in a brief report for my manager.

The host runs the LLM and returns a short report such as: "This week focuses on three main tasks: fixing a critical login bug, preparing a client presentation, and updating documentation. The login bug is the highest priority with a Tuesday deadline."

The MCP server handles data retrieval and computation. The host handles language generation. Sampling bridges the two from the client side.

Elicitation (Client Side)

Elicitation is how an MCP server can pause and ask the user for something it cannot safely guess.

Sometimes the server is missing a key detail. Sometimes it is about to do something risky. In both cases, elicitation gives control back to the user instead of letting the AI improvise.
