So I've had FOMO about Claude Code, but I refuse to give them my prompts or pay $100-$200 a month. Two days ago I saw that Moonshot provides an Anthropic-compatible API for Kimi K2, so folks can use it with Claude Code. Well, many folks are already doing the same thing with local models. So if you don't know, now you know. This is how I did it on Linux; it should be easy to replicate on macOS or on Windows with WSL.
Start your local LLM API
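For example, with llama.cpp you can expose an OpenAI-compatible endpoint like this (the model path and port here are just placeholders; any OpenAI-compatible server works):
llama-server -m /models/mistral-small-24b-q4.gguf --host 0.0.0.0 --port 8083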
Install Claude Code
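If you don't have it yet, the usual install is via npm (assumes Node.js is already installed):
npm install -g @anthropic-ai/claude-code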
Install a proxy: https://github.com/1rgs/claude-code-proxy
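Getting the proxy is just a clone; check its README for the dependency setup:
git clone https://github.com/1rgs/claude-code-proxy
cd claude-code-proxy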
Edit the proxy's server.py and point it at your OpenAI-compatible endpoint; that could be llama.cpp, Ollama, vLLM, or whatever you are running.
Add this line above load_dotenv:
+litellm.api_base = "http://yokujin:8083/v1"  # use your own host name/IP and port
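For context, the top of server.py ends up looking roughly like this after the edit (the surrounding lines are only illustrative; the litellm.api_base line is the one you actually add):
import litellm
from dotenv import load_dotenv

litellm.api_base = "http://yokujin:8083/v1"  # your local OpenAI-compatible endpoint
load_dotenv()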
Start the proxy according to its docs; it will run on localhost:8082.
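I'm assuming the usual FastAPI/uvicorn setup here, so the start command looks something like this (check the repo's README for the exact invocation):
uvicorn server:app --host 0.0.0.0 --port 8082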
export ANTHROPIC_BASE_URL=http://localhost:8082
export ANTHROPIC_AUTH_TOKEN="sk-localkey"
Run Claude Code with the claude command.
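Or, putting the last three steps together, you can set the variables just for a single run:
ANTHROPIC_BASE_URL=http://localhost:8082 ANTHROPIC_AUTH_TOKEN="sk-localkey" claude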
I just generated my first bit of code with it, then decided to post this. I'm running the latest mistral-small-24b on that host. I'm going to try driving it with various models: gemma3-27b, qwen3-32b/235b, deepseekv3, etc.
Running Claude Code with a local model or Groq
I've been absolutely amazed by Claude Code; it's like travelling to the future.
But the price is insane. Their claim of $100/day is not a lie; once you get going, the cost can be crazy.
Has anyone figured out a way to get it to talk to a local model (and if so, which models would work well), or to the Groq API?
I tried searching Reddit and Google, asking Perplexity, and asking OpenAI Deep Research, and so far nothing, so I don't hold out much hope, but I'm asking just in case.
Thanks!