
What Is Ollama?

Ollama is an open-source tool that lets you run large language models on your own machine. You give it a model name, and it downloads the weights (the parameters learned during training), manages memory and GPU usage, and exposes a local HTTP API plus a CLI. From the outside, it looks and feels like Docker for language models. That comparison is not accidental, and we will come back to why.
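To make the "local HTTP API" concrete, here is a minimal sketch of a single request, using only the Python standard library. It assumes Ollama is running on its default address (http://localhost:11434) and that the model has already been downloaded. The model name llama3.2 and the helper names build_generate_request and generate are illustrative choices, not something this chapter prescribes.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,    # e.g. "llama3.2" (example name, must be pulled first)
        "prompt": prompt,
        "stream": False,   # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read())
    # With streaming disabled, the generated text is in the "response" field.
    return body["response"]
```

Calling generate("llama3.2", "Why is the sky blue?") only works while the Ollama server is running and the model is available locally. Otherwise urlopen raises an error, which this sketch deliberately leaves unhandled.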

Ollama was founded in 2021 by Jeffrey Morgan and Michael Chiang in Palo Alto, and went through Y Combinator's Winter 2021 batch. The first public release on GitHub came in July 2023, once Meta's Llama models and llama.cpp's quantized model formats (GGML, soon succeeded by GGUF) made local inference genuinely viable on consumer hardware.

Note: GGUF is the file format used to store language models for local inference. It packages everything needed to run the model into a single self-contained file.

The founders' background matters, because it shows up in the design. Morgan and Chiang previously worked on Kitematic, an early graphical interface for Docker that Docker eventually acquired and folded into what became Docker Desktop. If you have ever wondered why Ollama uses verbs like pull

Local AI Engineering with Ollama

Run, understand, customize, fine-tune, and build agentic apps on your own hardware
