
Claude Code without the cloud: Ollama adds Anthropic Messages API compatibility

If you’ve been enjoying Claude Code, you’ve probably also heard that tiny voice in the back of your head:

  • “This is amazing… but tokens aren’t free.”
  • “Also… my code is definitely leaving my laptop.”

Well, Ollama just shipped a very “hold my coffee” update.

The change

Ollama now supports the Anthropic Messages API.

Geek translation: tools that expect to talk to Anthropic-style Messages endpoints can point at Ollama instead. That includes the Claude Code workflow and its whole agent-y toolbelt.

So you keep the experience (agent loops, tool use, terminal commands, coding flows)… but you can swap the engine.
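To make that concrete, here’s roughly what an Anthropic Messages-style request against a local Ollama server looks like. The model name is just an example, and the `/v1/messages` path and headers follow the standard Anthropic convention — check your Ollama version’s docs for the exact endpoint details:

```shell
# Build an Anthropic Messages-style request body.
# Model name is an example -- use whatever you've pulled.
cat > /tmp/messages_request.json <<'EOF'
{
  "model": "qwen3-coder",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Write a shell one-liner that counts files in the current directory."}
  ]
}
EOF

# With a local Ollama server running, send it to the Messages endpoint
# (path/headers assume the standard Anthropic convention):
# curl http://localhost:11434/v1/messages \
#   -H "content-type: application/json" \
#   -H "x-api-key: ollama" \
#   -d @/tmp/messages_request.json
cat /tmp/messages_request.json
```

Same request shape a cloud Anthropic client would send — just pointed at localhost.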

Why this is a big deal

Because now the “brain” behind Claude Code can be:

  • local open-weight models (Llama, Mistral, Qwen, etc.)
  • running on your GPU/CPU
  • with no per-token bill for local runs
  • and way better privacy posture (your repo doesn’t have to be shipped to a cloud API)

It’s basically: agentic UX + your hardware = “my code stays here” energy.

Quick start (local, simple)

1) Pull a coding-capable model

Pick whatever you like. Example:

ollama pull qwen3-coder
# or a different model you trust for coding

2) Point Claude Code at Ollama’s endpoint

The common pattern: point the Anthropic-style environment variables at your local Ollama server.

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434

3) Run Claude Code with a local model

claude --model qwen3-coder

That’s it. Same workflow. Different brain.

Tip: Agentic tools love context. If your model supports larger context windows, it usually feels less “goldfish memory.”

Ops/SRE angle (because we’re adults now)

If you’re thinking “cool demo” — sure.
But the real win is operational:

  • Data residency / compliance: you can keep sensitive code and logs on-prem or on-device.
  • Cost control: choose the cheapest model that does the job (and upgrade only when needed).
  • Standardize dev UX: keep one workflow across the team while letting infra decide where the model runs.

Reality check (no hype)

  • Local models vary. Some are fantastic for day-to-day coding and refactors; some will hallucinate like it’s their job.
  • Tool access is powerful. If your agent can run terminal commands, treat it like automation:
    • least privilege
    • guardrails
    • review changes (especially in prod repos)
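One low-effort guardrail is to run the whole agent inside a container, so its terminal commands only see the mounted repo. This is an illustrative sketch, not a blessed setup: `my-claude-image` is a hypothetical image with `claude` and your toolchain preinstalled, and `--network host` assumes Linux (on macOS you’d point at `host.docker.internal` instead):

```shell
# Illustrative sandbox: the agent's shell access is confined to the container,
# which only sees the current repo mounted at /workspace.
docker run --rm -it \
  --network host \
  -v "$PWD:/workspace" -w /workspace \
  -e ANTHROPIC_AUTH_TOKEN=ollama \
  -e ANTHROPIC_BASE_URL=http://localhost:11434 \
  my-claude-image \
  claude --model qwen3-coder
```

It’s not a security boundary against a determined attacker, but it keeps “oops, the agent ran rm in my home directory” off the table.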

Bottom line

This is the combo a lot of teams have wanted: Claude Code-style agent workflows paired with local, private, open-weight models.

If you’ve been waiting for “agentic coding, but keep the repo inside the org” — this is very close to that moment.
