Opus follows instructions better than Sonnet for big tasks, but it can sometimes overthink or overengineer things. Sonnet tends to be more direct but can sometimes not follow instructions as well as Opus. I'd say Opus is great at planning and executing big tasks, while Sonnet is best for simpler tasks that don't need the intelligence of Opus. Or, put another way: plan with Opus, code with Sonnet. Answer from Hauven on reddit.com
I'm on the $100/month plan. One or two prompts in, I hit my limit on Opus; then I spend most of my coding day on Sonnet.
Whenever I am on Opus, it isn't obvious it's writing code that Sonnet can't. I see a bigger difference between prompts that do vs. do not include "ultrathink" than between Sonnet and Opus.
Does anyone with more experience have a clear perspective on Sonnet vs Opus? Even on the benchmarks they are about the same.
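For context on the "ultrathink" remark above: in Claude Code, keywords like "think", "think harder", and "ultrathink" reportedly map to increasingly large extended-thinking budgets, and the API-level analog is the thinking parameter on a message request. Below is a minimal sketch of the comparison the poster describes, via the Anthropic Python SDK; the model ID strings and token budgets are illustrative assumptions, not values from the discussion.

```python
# Minimal sketch: same prompt across Sonnet and Opus, with and without an
# extended-thinking budget. Model IDs and budgets are assumptions; substitute
# whatever is current for your account.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
PROMPT = "Write a Python function that merges overlapping intervals."

for model in ("claude-sonnet-4-20250514", "claude-opus-4-20250514"):
    for thinking in (None, {"type": "enabled", "budget_tokens": 2048}):
        kwargs = {"thinking": thinking} if thinking else {}
        resp = client.messages.create(
            model=model,
            max_tokens=4096,  # must exceed budget_tokens when thinking is on
            messages=[{"role": "user", "content": PROMPT}],
            **kwargs,
        )
        # With thinking enabled, the response also contains thinking blocks;
        # keep only the final text blocks.
        text = "".join(b.text for b in resp.content if b.type == "text")
        label = "with thinking" if thinking else "no thinking"
        print(f"{model} ({label}): {len(text)} chars")
```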
These models advance our customers' AI strategies across the board: Opus 4 pushes boundaries in coding, research, writing, and scientific discovery, while Sonnet 4 brings frontier performance to everyday use cases as an instant upgrade from Sonnet 3.7. Claude 4 models lead on SWE-bench Verified, ...
Discussions
Is Opus significantly better than Sonnet for software development?
Here's my workflow: a custom ChatGPT "Claude Prompt Generator" that uses Anthropic's prompt engineering documentation (uploaded as reference material) to craft prompts for Claude from my natural language. GPT generates XML-formatted, structured instructions and tasks for Claude to easily digest and provide optimal output.
Step 1: Flesh out an idea and ask Opus to create a detailed explanation of the task at hand and propose a potential workflow to build a solution.
Step 2: Feed Opus' idea to my ChatGPT prompt generator and have it produce a prompt in XML format with code snippets as example outputs, roles ("you are a senior software dev"), and structured tasks and contexts. ChatGPT is surprisingly good at generating Claude XML if you give it the documentation (see the sketch after this comment).
Step 3: Get Sonnet to generate the initial solution and code with the ChatGPT-formatted prompt.
Step 4: Feed the Sonnet code back to my ChatGPT prompt generator to construct an XML prompt asking Claude to verify the code against the initial Sonnet prompt and review any errors, improvements, inaccuracies, or other observations.
Step 5: Feed the validation prompt, the initial prompt, and the code into Opus. The XML-formatted GPT prompt is essential for making sure Opus understands what each file is and what to do with it.
Step 6: Use Opus to regenerate certain parts of the code based on the observations for improvement it made in Sonnet's code, using a many-shot approach.
Step 7: If any issues are not making progress, just fix and touch them up myself.
Step 8: Verify the finished code between a non-custom GPT and Opus simultaneously, multiple times. You'll know the models can't do much more for you when they both start suggesting the same minor improvements. They'll usually suggest different improvements, which is good. I find that ChatGPT can sometimes spot things Opus can't, but with that information I can instruct Opus to correct the problem, and it does so better than GPT.
In summary, GPT and Opus are a strong tag team for planning, small logical revisions, and debugging, but you're wasting tokens using Opus to generate code, and you're wasting time using GPT to generate code. They also work very well together if you explain that you are using both of them to collaborate on a project; they seem to understand the pitfalls and areas to focus on when they understand the context of being paired with each other. For example, for GPT: "You generated this prompt for Claude, and Claude responded with this output."
Sonnet is quite capable and fast, too. For less complex projects, even Haiku is very reliable. Opus acts as a project director and supervisor, GPT acts as a manager, and Sonnet and Haiku act as the developers.
I don't really care what benchmarks say, because the benchmarked GPT models are definitely not what you get with a GPT subscription or API key. Anthropic's public models seem to be more aligned with their benchmarked models. Perhaps context window is key, or perhaps quality of training data surpasses quantity, and perhaps the benchmarks we currently have are not as applicable for assisting developers who aren't PhD AI researchers conducting benchmark tests.
Claude just has more energy. He's like that guy who wants to help and puts his hand up to answer questions in class. GPT acts like I'm not paying it enough to be at work. Even if GPT were benchmarked significantly higher than Claude, you're still going to get more done with the enthusiastic guy.
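As a rough illustration of Steps 2-3 above, here is a minimal sketch of sending an XML-structured prompt of that kind to Sonnet through the Anthropic Python SDK. The tag names, file name, and model ID are illustrative assumptions, not the commenter's actual prompt.

```python
# Minimal sketch: an XML-structured prompt of the kind a ChatGPT "prompt
# generator" might emit, sent to Sonnet. All tag names, the file name, and
# the model ID are assumptions for illustration.
import anthropic

client = anthropic.Anthropic()

prompt = """\
<role>You are a senior software developer.</role>
<context>
  <file name="parser.py">
    # ...existing code the model should build on...
  </file>
</context>
<task>
  Implement a CSV parsing module that handles quoted fields.
  Return only the code, wrapped in <solution> tags.
</task>
<example_output>
  <solution>def parse_row(line: str) -> list[str]: ...</solution>
</example_output>
"""

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)
print(response.content[0].text)
```

The same structure would work for the Step 4 validation pass: swap the <task> to a review of the code against the original prompt, and include the prompt and Sonnet's output as separate tagged blocks.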
I just wish these AI platforms would start adopting subscription models where you can pay exorbitant fees to avoid being stuck on the same hardware as everybody else paying 20 dollars or using their API balance. Finally: to review a completed code base, use Greptile. Not Cursor, not "aids" or whatever it's called, not Codeium. Currently, codebases will fuck with the quality of your output; multiple files, specifically. It's worth aggregating everything into one or two files and then modularising it manually later (see the sketch after this comment). Greptile is the only platform that can actually productively use an entire code base. I highly suggest using Greptile at all advanced stages of your project's development, as Claude and GPT are not even close to Greptile's ability to contextualise code. Greptile can also help generate prompts with contextual reminders. More on reddit.com
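On the aggregation point above: a minimal sketch of collapsing a multi-file project into a single context file before handing it to a model. The root path and extension list are assumptions; adjust for your repo.

```python
# Minimal sketch: concatenate a repo's source files into one paste-able file,
# with per-file headers so the model can tell the files apart.
from pathlib import Path

SRC_ROOT = Path("my_project")         # assumed project root
EXTENSIONS = {".py", ".md", ".toml"}  # assumed file types worth including

with open("aggregated_context.txt", "w", encoding="utf-8") as out:
    for path in sorted(SRC_ROOT.rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            out.write(f"\n===== FILE: {path} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
```

The per-file headers matter: without them, models tend to blur file boundaries, which is exactly the multi-file quality loss the comment describes.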
r/ClaudeAI · April 25, 2024
Claude Opus 4 and Claude Sonnet 4 officially released
"We've significantly reduced behavior where the models use shortcuts or loopholes to complete tasks. Both models are 65% less likely to engage in this behavior than Sonnet 3.7 on agentic tasks that are particularly susceptible to shortcuts and loopholes." This is a very welcome improvement. More on reddit.com
r/ClaudeAI · May 22, 2025
Opus 4 vs Sonnet 4
I love reading these. All of you guys use AI for super important and hard stuff, and I'm over here using it to play, like, Dungeons and Dragons. More on reddit.com
r/ClaudeAI · May 26, 2025
Claude Opus 4.1 is now available!
Hey all, Anthropic has just released Claude Opus 4.1, and the model is already available for you in Cursor to try! While its behaviour should be unchanged from 4.0, Anthropic reports a higher intelligence against benchmarks. Note: Claude Opus is a very expensive model to run, and will likely ... More on forum.cursor.com
Claude Opus 4 and Claude Sonnet 4 are both strong performers, but with distinct strengths. Claude Opus 4 outperforms Sonnet 4 in several key areas: it achieves higher accuracy in terminal coding (50.0% vs. 41.3%), high school math competition ...
September 23, 2025 - This isn't just fast code completion—it's thoughtful problem-solving over time, showcasing its strength in memory retention and contextual awareness. Claude Opus 4 is now available on major cloud platforms: ... These integrations make it easy for enterprises to embed Opus into production workflows—whether in data science, enterprise automation, or software pipelines. While Sonnet 4 isn't as advanced as Opus in reasoning depth, it delivers high-quality performance on the majority of general-purpose tasks, including summarization, conversation, content generation, and lightweight coding.
May 24, 2025 - For high-compute results, multiple ... tasks are approached at scale. Claude Opus 4 represents a major leap forward in building digital systems that can handle deep, uninterrupted thinking....
October 3, 2025 - A: Yes, the data shows clear gains on software-focused evaluations for Claude Sonnet 4.5. It retains context for much longer, which is critical for complex debugging and large code refactoring projects.
May 23, 2025 - Claude 4 Sonnet is the zippier, more affordable sibling. It's a major upgrade from Sonnet 3.7 and scores a whopping 72.7% on SWE-bench. If Opus is the marathon runner, Sonnet's your sprinter — fast, efficient, and still wicked smart.
January 31, 2025 - In those internal evaluations, Sonnet solved 64% of coding problems, significantly outperforming Opus, which solved just 38% (a weak result, in my opinion). This makes Sonnet particularly useful for developers who are working on ...
May 29, 2025 -Opus 4 can “work continuously for several hours” on coding or problem-solving tasks, “dramatically outperforming all earlier Sonnet models” on long, complex coding problems.
I've been playing with the free version, giving it some requirements and testing the code it produces. While it's pretty cool to see it understand the requirements and produce OK code, I have to go through a lot of iterations to get it to what I expect.
I wonder if Opus is significantly "smarter" when writing software based on vague-ish requirements. Please share your experience.
While it provides standard code completion and simple restructuring, it falls short in handling more intricate development needs. This version generally operates slower than Claude 3.7 Sonnet and can struggle with highly complex problem-solving ...
I work in quantitative finance, so most of my programming revolves around building financial tools that detect and exploit market anomalies. The coding I do is highly theoretical and often based on insights from academic finance research.
I'm currently exploring different models to help me reason through and validate my approaches. Does anyone have experience using Opus 4 or Sonnet 4 for this kind of work? I'm trying to figure out which is the best fit for my use case.
Claude Opus 4.5 catches more issues in code reviews without sacrificing precision. For production code review at scale, that reliability matters. Based on testing with Junie, our coding agent, Claude Opus 4.5 outperforms Sonnet 4.5 across all benchmarks. It requires fewer steps to solve tasks ...
September 29, 2025 -Sonnet 4.5 excels in computer use capabilities, reliably handling any browser-based task from competitive analysis to procurement workflows to customer onboarding. Sonnet 3.5 was the first frontier AI model to be able to use computers in this way. Sonnet 4.5 uses computers even more accurately ...
Windsurf reports Opus 4.1 delivers a one standard deviation improvement over Opus 4 on their junior developer benchmark, showing roughly the same performance leap as the jump from Sonnet 3.7 to Sonnet 4. We recommend upgrading from Opus 4 to Opus 4.1 for all uses.
The difference using Claude Code with Opus vs. Sonnet is insane for me. Opus can figure out most things given sufficient time, even if inefficiently, but Sonnet runs around, generates a bunch of cruft, and doesn't really get anywhere.
July 13, 2025 -Sonnet 4 is perfect for business needs like chatbots, customer support, and technical text generation, combining speed and reasoning. Haiku 3.5 is irreplaceable when speed of response over large volumes or handling simple, repetitive tasks becomes ...