Opus follows instructions better than Sonnet for big tasks, but it can sometimes overthink or overengineer things. Sonnet tends to be more direct but can sometimes not follow instructions as well as Opus. I'd say Opus is great at planning and executing big tasks, while Sonnet is best for simpler tasks that don't need the intelligence of Opus. Or, put another way: plan with Opus, code with Sonnet. Answer from Hauven on reddit.com
I'm on the $100/month plan. One or two prompts in, I hit my limit on Opus; then I spend most of my coding day on Sonnet.
Whenever I am on Opus, it isn't obvious it's writing code that Sonnet can't. I see a bigger difference between prompts that do vs. do not include "ultrathink" than between Sonnet and Opus.
Does anyone with more experience have a clear perspective on Sonnet vs Opus? Even on the benchmarks they are about the same.
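For context on the "ultrathink" remark above: in Claude Code, keywords like "think", "think harder", and "ultrathink" reportedly map to increasingly large extended-thinking budgets, and the API-level analog is the thinking parameter on a message request. Below is a minimal sketch of the comparison the poster describes, via the Anthropic Python SDK; the model ID strings and token budgets are illustrative assumptions, not values from the discussion.

```python
# Minimal sketch: same prompt across Sonnet and Opus, with and without an
# extended-thinking budget. Model IDs and budgets are assumptions; substitute
# whatever is current for your account.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
PROMPT = "Write a Python function that merges overlapping intervals."

for model in ("claude-sonnet-4-20250514", "claude-opus-4-20250514"):
    for thinking in (None, {"type": "enabled", "budget_tokens": 2048}):
        kwargs = {"thinking": thinking} if thinking else {}
        resp = client.messages.create(
            model=model,
            max_tokens=4096,  # must exceed budget_tokens when thinking is on
            messages=[{"role": "user", "content": PROMPT}],
            **kwargs,
        )
        # With thinking enabled, the response also contains thinking blocks;
        # keep only the final text blocks.
        text = "".join(b.text for b in resp.content if b.type == "text")
        label = "with thinking" if thinking else "no thinking"
        print(f"{model} ({label}): {len(text)} chars")
```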
These models advance our customers' AI strategies across the board: Opus 4 pushes boundaries in coding, research, writing, and scientific discovery, while Sonnet 4 brings frontier performance to everyday use cases as an instant upgrade from Sonnet 3.7. Claude 4 models lead on SWE-bench Verified, ...
Discussions
Is Opus significantly better than Sonnet for software development?
Here's my workflow: a custom ChatGPT "Claude Prompt Generator" that uses Anthropic's prompt engineering documentation (uploaded as reference material) to craft prompts for Claude from my natural language. GPT generates XML-formatted, structured instructions and tasks for Claude to easily digest and provide optimal output.
Step 1: Flesh out an idea and ask Opus to create a detailed explanation of the task at hand and propose a potential workflow to build a solution.
Step 2: Feed Opus' idea to my ChatGPT prompt generator and have it produce a prompt in XML format with code snippets as example outputs, roles ("you are a senior software dev"), and structured tasks and contexts. ChatGPT is surprisingly good at generating Claude XML if you give it the documentation (see the sketch after this comment).
Step 3: Get Sonnet to generate the initial solution and code with the ChatGPT-formatted prompt.
Step 4: Feed the Sonnet code back to my ChatGPT prompt generator to construct an XML prompt asking Claude to verify the code against the initial Sonnet prompt and review any errors, improvements, inaccuracies, or other observations.
Step 5: Feed the validation prompt, the initial prompt, and the code into Opus. The XML-formatted GPT prompt is essential for making sure Opus understands what each file is and what to do with it.
Step 6: Use Opus to regenerate certain parts of the code based on the observations for improvement it made in Sonnet's code, using a many-shot approach.
Step 7: If any issues are not making progress, just fix and touch them up myself.
Step 8: Verify the finished code between a non-custom GPT and Opus simultaneously, multiple times. You'll know the models can't do much more for you when they both start suggesting the same minor improvements. They'll usually suggest different improvements, which is good. I find that ChatGPT can sometimes spot things Opus can't, but with that information I can instruct Opus to correct the problem, and it does so better than GPT.
In summary, GPT and Opus are a strong tag team for planning, small logical revisions, and debugging, but you're wasting tokens using Opus to generate code, and you're wasting time using GPT to generate code. They also work very well together if you explain that you are using both of them to collaborate on a project; they seem to understand the pitfalls and areas to focus on when they understand the context of being paired with each other. For example, for GPT: "You generated this prompt for Claude, and Claude responded with this output."
Sonnet is quite capable and fast, too. For less complex projects, even Haiku is very reliable. Opus acts as a project director and supervisor, GPT acts as a manager, and Sonnet and Haiku act as the developers.
I don't really care what benchmarks say, because the benchmarked GPT models are definitely not what you get with a GPT subscription or API key. Anthropic's public models seem to be more aligned with their benchmarked models. Perhaps context window is key, or perhaps quality of training data surpasses quantity, and perhaps the benchmarks we currently have are not as applicable for assisting developers who aren't PhD AI researchers conducting benchmark tests.
Claude just has more energy. He's like that guy who wants to help and puts his hand up to answer questions in class. GPT acts like I'm not paying it enough to be at work. Even if GPT were benchmarked significantly higher than Claude, you're still going to get more done with the enthusiastic guy.
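As a rough illustration of Steps 2-3 above, here is a minimal sketch of sending an XML-structured prompt of that kind to Sonnet through the Anthropic Python SDK. The tag names, file name, and model ID are illustrative assumptions, not the commenter's actual prompt.

```python
# Minimal sketch: an XML-structured prompt of the kind a ChatGPT "prompt
# generator" might emit, sent to Sonnet. All tag names, the file name, and
# the model ID are assumptions for illustration.
import anthropic

client = anthropic.Anthropic()

prompt = """\
<role>You are a senior software developer.</role>
<context>
  <file name="parser.py">
    # ...existing code the model should build on...
  </file>
</context>
<task>
  Implement a CSV parsing module that handles quoted fields.
  Return only the code, wrapped in <solution> tags.
</task>
<example_output>
  <solution>def parse_row(line: str) -> list[str]: ...</solution>
</example_output>
"""

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)
print(response.content[0].text)
```

The same structure would work for the Step 4 validation pass: swap the <task> to a review of the code against the original prompt, and include the prompt and Sonnet's output as separate tagged blocks.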
I just wish these AI platforms would start adopting subscription models where you can pay exorbitant fees to avoid being stuck on the same hardware as everybody else paying 20 dollars or using their API balance. Finally: to review a completed code base, use Greptile. Not Cursor, not "aids" or whatever it's called, not Codeium. Currently, codebases will fuck with the quality of your output; multiple files, specifically. It's worth aggregating everything into one or two files and then modularising it manually later (see the sketch after this comment). Greptile is the only platform that can actually productively use an entire code base. I highly suggest using Greptile at all advanced stages of your project's development, as Claude and GPT are not even close to Greptile's ability to contextualise code. Greptile can also help generate prompts with contextual reminders. More on reddit.com
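On the aggregation point above: a minimal sketch of collapsing a multi-file project into a single context file before handing it to a model. The root path and extension list are assumptions; adjust for your repo.

```python
# Minimal sketch: concatenate a repo's source files into one paste-able file,
# with per-file headers so the model can tell the files apart.
from pathlib import Path

SRC_ROOT = Path("my_project")         # assumed project root
EXTENSIONS = {".py", ".md", ".toml"}  # assumed file types worth including

with open("aggregated_context.txt", "w", encoding="utf-8") as out:
    for path in sorted(SRC_ROOT.rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            out.write(f"\n===== FILE: {path} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
```

The per-file headers matter: without them, models tend to blur file boundaries, which is exactly the multi-file quality loss the comment describes.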
r/ClaudeAI · April 25, 2024
Claude Opus 4 and Claude Sonnet 4 officially released
"We've significantly reduced behavior where the models use shortcuts or loopholes to complete tasks. Both models are 65% less likely to engage in this behavior than Sonnet 3.7 on agentic tasks that are particularly susceptible to shortcuts and loopholes." This is a very welcome improvement. More on reddit.com
r/ClaudeAI · May 22, 2025
Opus 4 vs Sonnet 4
I love reading these. All of you guys use AI for super important and hard stuff, and I'm over here using it to play, like, Dungeons and Dragons. More on reddit.com
r/ClaudeAI · May 26, 2025
Claude Opus 4.1 is now available!
Hey all, Anthropic has just released Claude Opus 4.1, and the model is already available for you in Cursor to try! While its behaviour should be unchanged from 4.0, Anthropic reports a higher intelligence against benchmarks. Note: Claude Opus is a very expensive model to run, and will likely ... More on forum.cursor.com
Claude Opus 4 and Claude Sonnet 4 are both strong performers, but with distinct strengths. Claude Opus 4 outperforms Sonnet 4 in several key areas: it achieves higher accuracy in terminal coding (50.0% vs. 41.3%), high school math competition ...
September 23, 2025 - This isn't just fast code completion—it's thoughtful problem-solving over time, showcasing its strength in memory retention and contextual awareness. Claude Opus 4 is now available on major cloud platforms: ... These integrations make it easy for enterprises to embed Opus into production workflows—whether in data science, enterprise automation, or software pipelines. While Sonnet 4 isn't as advanced as Opus in reasoning depth, it delivers high-quality performance on the majority of general-purpose tasks, including summarization, conversation, content generation, and lightweight coding.
May 24, 2025 - For high-compute results, multiple ... tasks are approached at scale. Claude Opus 4 represents a major leap forward in building digital systems that can handle deep, uninterrupted thinking....
October 3, 2025 - A: Yes, the data shows clear gains on software-focused evaluations for Claude Sonnet 4.5. It retains context for much longer, which is critical for complex debugging and large code refactoring projects.
May 23, 2025 - Claude 4 Sonnet is the zippier, more affordable sibling. It's a major upgrade from Sonnet 3.7 and scores a whopping 72.7% on SWE-bench. If Opus is the marathon runner, Sonnet's your sprinter — fast, efficient, and still wicked smart.
January 31, 2025 - In those internal evaluations, Sonnet solved 64% of coding problems, significantly outperforming Opus, which solved just 38% (a weak result, in my opinion). This makes Sonnet particularly useful for developers who are working on ...
May 29, 2025 -Opus 4 can “work continuously for several hours” on coding or problem-solving tasks, “dramatically outperforming all earlier Sonnet models” on long, complex coding problems.
I've been playing with the free version, giving it some requirements and testing the code it produces. While it's pretty cool to see it understand the requirements and produce OK code, I have to go through a lot of iterations to get it to what I expect.
I wonder if Opus is significantly "smarter" when writing software based on vague-ish requirements. Please share your experience.
While it provides standard code completion and simple restructuring, it falls short in handling more intricate development needs. This version generally operates slower than Claude 3.7 Sonnet and can struggle with highly complex problem-solving ...
I work in quantitative finance, so most of my programming revolves around building financial tools that detect and exploit market anomalies. The coding I do is highly theoretical and often based on insights from academic finance research.
I'm currently exploring different models to help me reason through and validate my approaches. Does anyone have experience using Opus 4 or Sonnet 4 for this kind of work? I'm trying to figure out which is the best fit for my use case.
Claude Opus 4.5 catches more issues in code reviews without sacrificing precision. For production code review at scale, that reliability matters. Based on testing with Junie, our coding agent, Claude Opus 4.5 outperforms Sonnet 4.5 across all benchmarks. It requires fewer steps to solve tasks ...
September 29, 2025 -Sonnet 4.5 excels in computer use capabilities, reliably handling any browser-based task from competitive analysis to procurement workflows to customer onboarding. Sonnet 3.5 was the first frontier AI model to be able to use computers in this way. Sonnet 4.5 uses computers even more accurately ...
Windsurf reports Opus 4.1 delivers a one standard deviation improvement over Opus 4 on their junior developer benchmark, showing roughly the same performance leap as the jump from Sonnet 3.7 to Sonnet 4. We recommend upgrading from Opus 4 to Opus 4.1 for all uses.
The difference using Claude Code with Opus vs. Sonnet is insane for me. Opus can figure out most things given sufficient time, even if inefficiently, but Sonnet runs around, generates a bunch of cruft, and doesn't really get anywhere.
July 13, 2025 -Sonnet 4 is perfect for business needs like chatbots, customer support, and technical text generation, combining speed and reasoning. Haiku 3.5 is irreplaceable when speed of response over large volumes or handling simple, repetitive tasks becomes ...