
Claude Code without the cloud: Ollama adds Anthropic Messages API compatibility

If you’ve been enjoying Claude Code, you’ve probably also heard that tiny voice in the back of your head:

  • “This is amazing… but tokens aren’t free.”
  • “Also… my code is definitely leaving my laptop.”

Well, Ollama just shipped a very “hold my coffee” update.

The change

Ollama now supports the Anthropic Messages API.

Geek translation: tools that expect to talk to Anthropic-style Messages endpoints can point at Ollama instead. That includes the Claude Code workflow and its whole agent-y toolbelt.

So you keep the experience (agent loops, tool use, terminal commands, coding flows)… but you can swap the engine.
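To make that concrete, here’s roughly what an Anthropic Messages-style request against a local Ollama server looks like. The model name is just an example, and the `/v1/messages` path and headers follow the standard Anthropic convention — check your Ollama version’s docs for the exact endpoint details:

```shell
# Build an Anthropic Messages-style request body.
# Model name is an example -- use whatever you've pulled.
cat > /tmp/messages_request.json <<'EOF'
{
  "model": "qwen3-coder",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Write a shell one-liner that counts files in the current directory."}
  ]
}
EOF

# With a local Ollama server running, send it to the Messages endpoint
# (path/headers assume the standard Anthropic convention):
# curl http://localhost:11434/v1/messages \
#   -H "content-type: application/json" \
#   -H "x-api-key: ollama" \
#   -d @/tmp/messages_request.json
cat /tmp/messages_request.json
```

Same request shape a cloud Anthropic client would send — just pointed at localhost.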

Why this is a big deal

Because now the “brain” behind Claude Code can be:

  • local open-weight models (Llama, Mistral, Qwen, etc.)
  • running on your GPU/CPU
  • with no per-token bill for local runs
  • and way better privacy posture (your repo doesn’t have to be shipped to a cloud API)

It’s basically: agentic UX + your hardware = “my code stays here” energy.

Quick start (local, simple)

1) Pull a coding-capable model

Pick whatever you like. Example:

ollama pull qwen3-coder
# or a different model you trust for coding

2) Point Claude Code at Ollama’s endpoint

The common pattern: point the Anthropic-style environment variables at your local Ollama server.

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434

3) Run Claude Code with a local model

claude --model qwen3-coder

That’s it. Same workflow. Different brain.

Tip: Agentic tools love context. If your model supports larger context windows, it usually feels less “goldfish memory.”

Ops/SRE angle (because we’re adults now)

If you’re thinking “cool demo” — sure.
But the real win is operational:

  • Data residency / compliance: you can keep sensitive code and logs on-prem or on-device.
  • Cost control: choose the cheapest model that does the job (and upgrade only when needed).
  • Standardize dev UX: keep one workflow across the team while letting infra decide where the model runs.

Reality check (no hype)

  • Local models vary. Some are fantastic for day-to-day coding and refactors; some will hallucinate like it’s their job.
  • Tool access is powerful. If your agent can run terminal commands, treat it like automation:
    • least privilege
    • guardrails
    • review changes (especially in prod repos)
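One low-effort guardrail is to run the whole agent inside a container, so its terminal commands only see the mounted repo. This is an illustrative sketch, not a blessed setup: `my-claude-image` is a hypothetical image with `claude` and your toolchain preinstalled, and `--network host` assumes Linux (on macOS you’d point at `host.docker.internal` instead):

```shell
# Illustrative sandbox: the agent's shell access is confined to the container,
# which only sees the current repo mounted at /workspace.
docker run --rm -it \
  --network host \
  -v "$PWD:/workspace" -w /workspace \
  -e ANTHROPIC_AUTH_TOKEN=ollama \
  -e ANTHROPIC_BASE_URL=http://localhost:11434 \
  my-claude-image \
  claude --model qwen3-coder
```

It’s not a security boundary against a determined attacker, but it keeps “oops, the agent ran rm in my home directory” off the table.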

Bottom line

This is the combo a lot of teams have wanted: Claude Code-style agent workflows paired with local, private, open-weight models.

If you’ve been waiting for “agentic coding, but keep the repo inside the org” — this is very close to that moment.
