Claude Code without the cloud: Ollama adds Anthropic Messages API compatibility
If you’ve been enjoying Claude Code, you’ve probably also heard that tiny voice in the back of your head:
- “This is amazing… but tokens aren’t free.”
- “Also… my code is definitely leaving my laptop.”
Well, Ollama just shipped a very “hold my coffee” update.
The change
Ollama now supports the Anthropic Messages API.
Geek translation: tools that expect to talk to Anthropic-style Messages endpoints can point at Ollama instead. That includes the Claude Code workflow and its whole agent-y toolbelt.
So you keep the experience (agent loops, tool use, terminal commands, coding flows)… but you can swap the engine.
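To make "Anthropic-style Messages endpoint" concrete, here's a sketch of a raw request against a local Ollama server. It assumes `ollama serve` is running on the default port (11434) and that a model named `qwen3-coder` has been pulled; the `x-api-key` value is arbitrary for local use.

```shell
# Sketch: a raw Messages-API-shaped request to a local Ollama server.
# Assumes `ollama serve` on the default port and a pulled qwen3-coder model.
BODY='{
  "model": "qwen3-coder",
  "max_tokens": 256,
  "messages": [{"role": "user", "content": "Write a one-line Python hello world."}]
}'
curl -s http://localhost:11434/v1/messages \
  -H "x-api-key: ollama" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d "$BODY" || echo "(no local Ollama server reachable)"
```

Any client that speaks this wire format, Claude Code included, can be pointed at that URL.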
Why this is a big deal
Because now the “brain” behind Claude Code can be:
- local open-weight models (Llama, Mistral, Qwen, etc.)
- running on your GPU/CPU
- with no per-token bill for local runs
- and way better privacy posture (your repo doesn’t have to be shipped to a cloud API)
It’s basically: agentic UX + your hardware = “my code stays here” energy.
Quick start (local, simple)
1) Pull a coding-capable model
Pick whatever you like. Example:
ollama pull qwen3-coder
# or a different model you trust for coding
2) Point Claude Code at Ollama’s endpoint
The common pattern: point the Anthropic-style base URL at your local Ollama server, and set a placeholder auth token (Claude Code expects one to be set; the local server doesn’t validate it).
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
3) Run Claude Code with a local model
claude --model qwen3-coder
That’s it. Same workflow. Different brain.
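The three steps above fold neatly into one small shell helper. The function name and the default model are my own convention, not part of Claude Code:

```shell
# Hypothetical convenience wrapper (the name is mine, not an official command):
# point Claude Code at a local Ollama server and launch it in one step.
claude_local() {
  export ANTHROPIC_AUTH_TOKEN=ollama               # value is arbitrary for local use
  export ANTHROPIC_BASE_URL=http://localhost:11434 # default Ollama port
  if command -v claude >/dev/null 2>&1; then
    claude --model "${1:-qwen3-coder}"             # default model is my assumption
  else
    echo "claude CLI not found; install Claude Code first"
  fi
}
```

Then `claude_local` starts with the default model, and `claude_local some-other-model` swaps brains per invocation.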
Tip: Agentic tools love context. If your model supports a larger context window (and you actually give it one), it usually feels a lot less like it has goldfish memory.
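If you want a bigger window server-wide, Ollama reads an environment variable at startup. Treat this as a sketch and check your version’s docs for the exact name and default:

```shell
# Assumption: OLLAMA_CONTEXT_LENGTH sets the default context window for
# models served by this instance (verify against your Ollama version's docs).
OLLAMA_CONTEXT_LENGTH=32768 ollama serve
```

Bigger windows cost more VRAM, so size this to your hardware, not your optimism.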
Ops/SRE angle (because we’re adults now)
If you’re thinking “cool demo” — sure.
But the real win is operational:
- Data residency / compliance: you can keep sensitive code and logs on-prem or on-device.
- Cost control: choose the cheapest model that does the job (and upgrade only when needed).
- Standardize dev UX: keep one workflow across the team while letting infra decide where the model runs.
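That last point can literally be a one-flag switch in the team’s shell profile: same `claude` workflow for everyone, backend chosen by infra. The flag name here is my invention, not an official setting:

```shell
# Sketch: one env flag decides where the model runs.
# USE_LOCAL_LLM is a made-up team convention, not an Ollama/Claude setting.
if [ "${USE_LOCAL_LLM:-0}" = "1" ]; then
  export ANTHROPIC_AUTH_TOKEN=ollama
  export ANTHROPIC_BASE_URL=http://localhost:11434
else
  # No overrides: Claude Code talks to Anthropic's hosted API as usual.
  unset ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN
fi
```

Developers keep one muscle memory; ops flips the flag per machine, per CI runner, or per compliance zone.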
Reality check (no hype)
- Local models vary. Some are fantastic for day-to-day coding and refactors; some will hallucinate like it’s their job.
- Tool access is powerful. If your agent can run terminal commands, treat it like automation:
- least privilege
- guardrails
- review changes (especially in prod repos)
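One concrete guardrail that covers all three bullets: give the agent its own git worktree, so your main checkout stays untouched and every change gets reviewed before it goes anywhere. This is a self-contained demo (the repo, paths, and branch names are made up):

```shell
# Self-contained sketch of the "separate worktree" guardrail.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo && cd repo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
# Give the agent its own worktree; the main checkout stays untouched.
git worktree add -q -b agent/experiment ../agent-sandbox
echo "print('agent was here')" > ../agent-sandbox/patch.py
# A human reviews everything the agent did before merging:
git -C ../agent-sandbox status --short
# Throw the sandbox away if it misbehaved:
git worktree remove --force ../agent-sandbox
```

The agent gets full freedom inside the sandbox, zero freedom outside it, and `git diff` is your audit log.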
Bottom line
This is the combo a lot of teams have wanted: Claude Code-style agent workflows paired with local, private, open-weight models.
If you’ve been waiting for “agentic coding, but keep the repo inside the org” — this is very close to that moment.
Sources
- Ollama blog post: https://ollama.com/blog/claude
- Ollama docs: Claude Code integration: https://docs.ollama.com/integrations/claude-code
- Ollama docs: Anthropic compatibility: https://docs.ollama.com/api/anthropic-compatibility