What's the Point of Local AI?

Should You Run AI Locally, or Just Use an API?

You have two ways to put an AI model to work. Run it on your own machine with a tool like Ollama, or send your requests to a service over the internet (OpenAI, Anthropic, Google, and others).
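To make the local option concrete, here is a minimal sketch of a request to Ollama running on your own machine. It assumes Ollama is serving on its default address (`http://localhost:11434`) and that a model has already been pulled; the model name `llama3.2` and the helper names `build_local_request` and `ask_local` are illustrative assumptions, not part of the original text.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_local_request(prompt, model="llama3.2"):
    """Build (but do not send) a POST request for a local Ollama server.

    The model name is an assumption; use whichever model you have pulled.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt, model="llama3.2"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_local_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `ask_local("Summarize this paragraph: ...")` would return the model's reply as a string. A hosted service is called in much the same way, except that the request goes to the provider's URL over the internet and typically carries an API key.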

The honest starting point is this: the model you can run at home is good, but it isn't as smart as the biggest models, because it's smaller. A home-sized model handles everyday work well, like summarizing, drafting, simple coding, and answering questions. On hard, multi-step problems the biggest models pull ahead, and the harder the task, the wider the gap. A faster computer doesn't close it, because the limit is the size of the model, not the speed of the machine.
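A rough back-of-the-envelope sketch shows why size, not speed, is the limit: a model's weights have to fit in memory before it can run at all. The estimate below counts only the weights (parameters times bits per weight) and ignores the extra memory needed for context and activations, so real requirements are higher. The parameter counts, the 4-bit setting, and the 24 GB GPU are illustrative assumptions, not figures from the text.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_memory(params_billions, bits_per_weight, memory_gb):
    """True if the weights alone fit in the given amount of memory."""
    return weight_memory_gb(params_billions, bits_per_weight) <= memory_gb


# Hypothetical sizes: a small home-sized model and a very large one,
# both stored at 4 bits per weight, checked against an assumed 24 GB GPU.
for size in (8, 400):
    needed = weight_memory_gb(size, 4)
    verdict = "fits" if fits_in_memory(size, 4, 24) else "does not fit"
    print(f"{size}B parameters: about {needed:.0f} GB of weights, {verdict} in 24 GB")
```

Under these assumptions the small model needs a few gigabytes while the large one needs hundreds, and no amount of extra processing speed changes that.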

The gap is about size, not about who made the model. The best freely available open models now match or beat the best paid, closed ones on many coding tests. But those open champions are huge, far too big for a home GPU. To run one you rent a large cloud machine or pay a service per request to host it, which costs about the same as a closed paid API. So the real choice isn't open versus closed. It's a smaller model on your own machine versus a bigger model over the internet, whoever built it.

That splits into two questions, in order.
