Local AI Engineering with Ollama

Run, understand, customize, fine-tune, and build agentic apps on your own hardware

Keep-Alive and Memory Control

Setting Keep-Alive Globally

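For a quick test or a debugging session, you can set the OLLAMA_KEEP_ALIVE environment variable when starting ollama serve by hand. Stop the running service first, then start the server with one of the following settings: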

# Stop the server first
systemctl stop ollama

# Keep models loaded for 30 minutes after last use
OLLAMA_KEEP_ALIVE=30m ollama serve

# Keep models loaded indefinitely until explicitly stopped
OLLAMA_KEEP_ALIVE=-1 ollama serve

# Unload immediately after every request (no caching)
OLLAMA_KEEP_ALIVE=0 ollama serve

Accepted values are a duration string such as 30s, 5m, or 2h; the special value -1, which keeps the model loaded indefinitely; or 0, which unloads the model immediately after each request.
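To make the three cases concrete, here is a minimal sketch of a shell function that classifies a value before you export it. The name keep_alive_kind is a hypothetical helper, not part of Ollama, and it checks only the forms listed in this section; Ollama itself may accept other forms.

```shell
# Classify a keep-alive value using only the forms named in this section.
# keep_alive_kind is a hypothetical helper, not an Ollama command.
keep_alive_kind() {
  case "$1" in
    -1) echo "forever" ;;
    0)  echo "immediate-unload" ;;
    *)
      # A duration here means digits followed by s, m, or h (e.g. 30s, 5m, 2h).
      if printf '%s' "$1" | grep -Eq '^[0-9]+[smh]$'; then
        echo "duration"
      else
        echo "unrecognized"
      fi
      ;;
  esac
}

keep_alive_kind 30m    # prints: duration
keep_alive_kind -1     # prints: forever
keep_alive_kind soon   # prints: unrecognized
```

A check like this can catch a typo in a startup script before the server is launched with it.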

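If Ollama runs as a systemd service (the recommended setup), set the variable under [Service] instead of passing it on the command line. The systemctl edit command does not modify the packaged unit file; it opens a drop-in override that systemd merges with it.

Run the following command to open the override in your editor: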

systemctl edit ollama

Update the [Service] section with the variable:

[Service]
Environment="OLLAMA_KEEP_ALIVE=30m"
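The running service only picks up the new value after a restart. A minimal sketch of that follow-up step is below; apply_ollama_override is a hypothetical wrapper name, the service is assumed to be called ollama as in the commands above, and the commands need root privileges. systemctl edit normally reloads the unit configuration on save, so the daemon-reload is a harmless safeguard.

```shell
# Reload unit definitions, restart Ollama, and print the service environment.
# apply_ollama_override is a hypothetical wrapper; run it as root.
apply_ollama_override() {
  systemctl daemon-reload &&
  systemctl restart ollama &&
  systemctl show ollama --property=Environment
}
```

Calling apply_ollama_override should end with an Environment= line that includes OLLAMA_KEEP_ALIVE=30m. If it does not, check that the override was saved under the [Service] section.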
