Codex CLI vs Claude Code (adding features to a 500k codebase)
Codex vs Claude Code
Which CLI AI coding tool to use right now? Codex CLI vs. Claude Code vs. something else?
A few thoughts on Codex CLI vs. Claude Code
I've been testing OpenAI's Codex CLI against Claude Code in a 500k codebase with a React + Vite frontend, an ASP.NET 9 API, and a MySQL DB hosted on Azure. My takeaways from my use cases (or watch them in the YT video linked in the comments):
- Boy oh boy, Codex CLI has caught up BIG time with GPT-5 High Reasoning; I even preferred it to Claude Code in some implementations
- Codex uses GPT-5 MUCH better than other AI coding tools like Cursor do
- Vid: https://youtu.be/MBhG5__15b0
- Codex was lacking a simple YOLO mode when I tested. You had to acknowledge running without a sandbox AND tell it to never ask for approvals, which is a bit annoying, but you can just create an alias like codex-yolo for it
- Claude Code actually took more shots (error feedback/turns) than Codex to get things done
- Claude Code still has more useful features, like subagents and hooks. Notifications from Codex are still somewhat in beta
- GPT-5 in Codex stops to ask questions less often than in other AI tools, probably because of the official GPT-5 Prompting Guide that OpenAI released
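For the codex-yolo shortcut mentioned in the list above, here is a minimal sketch. It assumes a POSIX-style shell and that your Codex CLI version accepts the `--dangerously-bypass-approvals-and-sandbox` flag; flags have changed between releases, so check `codex --help` first. The name `codex_yolo` is just illustrative, not something Codex ships.

```shell
# Hypothetical wrapper for a "YOLO" Codex session.
# ASSUMPTION: the installed Codex CLI supports this flag; verify with `codex --help`.
# It disables both the sandbox and approval prompts, so only use it in a
# throwaway checkout or container you are happy to have modified.
codex_yolo() {
  codex --dangerously-bypass-approvals-and-sandbox "$@"
}
```

Put it in your `~/.bashrc` or `~/.zshrc` and call it like `codex_yolo "fix the failing tests"`. In an interactive bash or zsh session, a plain alias named `codex-yolo` pointing at the same command should work just as well.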
What is your experience with both tools?
For those who have already tested Codex, what do you think?
I have mostly used Windsurf and Kilo Code to build around 8 projects; the most complicated one is a Flutter iOS & Android app with approx. 750 test users, using Firebase as the backend and Gemini 2.5 Flash for AI functionality.
Now I would like to start learning CLI AI coding tools. Two months ago the choice would obviously have been Claude Code (I have the Pro subscription), but I've seen the hype around OpenAI's Codex CLI these days.
It would be great to hear about your experience:
- What is the difference between these two right now besides the LLM models?
- What are the usage limits for a mix of planning / coding / debugging usage? (for the Claude Pro and OpenAI Plus subscriptions)
- Any tips for switching from editor-based coding to terminal-based? I am slightly hesitant because I am a visual person and am afraid that I will lose the overview using the terminal. Or do you guys use the terminal and an editor at the same time?
- Are there any other options you recommend?
Opus 4.1 is a beast of a coding model, but I'd suggest to any Claude Max user to at least try Codex CLI for a day. It can also use your ChatGPT subscription now and I've been getting a ton of usage out of my Plus tier. Even with Sonnet, Claude Pro would have limited me LONG ago.
A few thoughts:
- While I still prefer CC + Opus 4.1 overall, I actually prefer the code that Codex CLI + GPT-5 writes. It's closer to the code I'd write myself.
- I've used CC over Bedrock and Vertex for work and the rate limits were getting really ridiculous. Not sure whether this also happens with the Anthropic API, but it's really refreshing how quickly and stably GPT-5 performs over Codex CLI.
- As of today, Claude Code is a much more feature-rich and complete tool compared to Codex. I miss quite a few things coming from CC, but the core functionality is there and works well.
- GPT-5 seems to have a very clear edge on debugging.
- GPT-5 finds errors/bugs while working on something else, something I haven't noticed as strongly with Claude.
- Codex CLI now also supports MCP, although support for image inputs doesn't seem to work.
- Codex doesn't ship with fetch or search, so be sure to add those via MCP. I'm using my own
- If your budget ends at $20 per month, I think ChatGPT might be the best value for your money.
What's your experience?
I'm not ready to call Codex a "Claude killer" just yet, but I'm definitely impressed with what I've seen over the past six hours of use.
I'm currently on Anthropic's $200/month plan (Claude's highest tier) and ChatGPT's $20 Plus plan. Since this was my first time trying ChatGPT, I started with the Plus tier to get a feel for it. There is also a $200 Pro tier available for ChatGPT.
This past week, Claude has been underperforming significantly, and I'm not alone in noticing this. After seeing many users discuss ChatGPT's coding capabilities, I decided to give Codex a shot, and I was impressed. I had two persistent coding issues that Claude couldn't resolve, and ChatGPT fixed both of them easily, in one prompt.
There are also a few other things I like about Codex so far. It has better listening skills: it pays closer attention to my specific requests, it admits mistakes, it collaborates better on troubleshooting by asking clarifying questions about my code, and its responses are noticeably quicker than Claude Opus's.
However, ChatGPT isn't perfect either. I'm currently dealing with a state persistence issue that neither AI has been able to solve. Additionally, since I've only used ChatGPT for six hours, compared to months with Claude, I may have given it tasks it excels at.
Bottom line: I'm genuinely impressed with ChatGPT's performance, but I'm not abandoning Claude just yet. However, if you haven't tried ChatGPT for coding, I'd definitely recommend giving it a shot; it performed exceptionally well for my specific use cases. It may be that going forward I use both to finish my projects.
Edit: to install, make sure you have Node.js installed on your computer, then run
npm install -g @openai/codex
You can also install it using Homebrew by running
brew install codex
From what I’ve seen so far, Claude Code seems to have the best overall reviews in terms of quality and performance. The main downside for me is that it’s locked behind a company and not open source (I know about the leak, but I’m more interested in something officially open and actively maintained).
Codex, on the other hand, looks really appealing because it’s open source and allows for forks, which gives it a lot more flexibility and long-term potential.
Then there’s OpenCode, probably the most interesting of the three. It has a huge community and a lot of momentum, but I’m not sure if it’s actually on par with the others in real-world use.
Curious to hear your thoughts, how do these compare in practice? Is OpenCode actually competitive, or is it more hype than substance?
Oh, and by Claude I'm referring to the open-source forks that are coming, which we don't know will be kept updated; I'm not using the proprietary one, ever.