What Is Ollama?
What Ollama Is, and What It Is Not
This is the part that confuses newcomers, so it is worth pinning down at the start of this book. Local AI is not one tool; it is a layered stack, and Ollama occupies only one of those layers.
The following table lists these layers from the bottom of the stack upward, with what each one does and some examples.
| Layer | What it does | Examples |
|---|---|---|
| Model weights | What the model learned, stored as a file | Llama, Qwen, Gemma, Mistral (distributed as GGUF, safetensors, etc.) |
| Inference engine | Does the actual math on CPU or GPU | llama.cpp, MLX |
| Runtime / server | Manages models, exposes APIs, handles loading and unloading | Ollama, LM Studio, llama-server, vLLM |
| Client / interface |
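
To make the layering concrete, here is a minimal sketch of a client-layer program talking to the runtime layer over HTTP. It assumes Ollama is running locally on its default port (11434) and uses its `/api/generate` endpoint; the helper names `build_generate_request` and `generate` are illustrative and not part of any library. The client never touches the weights or the inference engine directly; it only sends a request to the server.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    # "stream": False asks the server for one JSON object
    # instead of a stream of partial responses.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request to the runtime layer and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The non-streaming reply carries the generated text in "response".
    return body["response"]
```

With the server running and a model already pulled, a call such as `generate("llama3.2", "Say hello in one short sentence.")` would return the model's reply as a string. The model name here is a placeholder; substitute whichever model you have installed.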
Local AI Engineering with Ollama
Run, understand, customize, fine-tune, and build agentic apps on your own hardware
