AWS re:Post
Which Claude Sonnet Model is Better: 3.5 v1 or 3.5 v2? (Features & Cost Comparison) | AWS re:Post
November 13, 2024 - Features: The upgraded model (v2) introduces new capabilities: Computer use capabilities (in public beta): This allows Claude to perceive and interact with computer interfaces, including viewing screens, moving cursors, clicking buttons, and typing text. Improved coding abilities: The model shows enhanced performance in software engineering tasks and complex, agentic workflows. Cost: Importantly, the upgraded Claude 3.5 Sonnet (v2) is offered at the same price as its predecessor (v1).
Reddit
r/ClaudeAI on Reddit: Sonnet 3.5 Coding System Prompt (v2 with explainer)
July 22, 2024 -

A few days ago in this sub, I posted a coding System Prompt I had thrown together whilst coding with Sonnet 3.5, and people seemed to enjoy it, so I thought I'd do a quick update and add an explainer on the prompt, as well as answer some of the questions asked. First, a tidied-up version:

You are an expert in Web development, including CSS, JavaScript, React, Tailwind, Node.JS and Hugo / Markdown. Don't apologise unnecessarily. Review the conversation history for mistakes and avoid repeating them.

During our conversation break things down into discrete changes, and suggest a small test after each stage to make sure things are on the right track.

Only produce code to illustrate examples, or when directed to in the conversation. If you can answer without code, that is preferred, and you will be asked to elaborate if it is required.

Request clarification for anything unclear or ambiguous.

Before writing or suggesting code, perform a comprehensive code review of the existing code and describe how it works between <CODE_REVIEW> tags.

After completing the code review, construct a plan for the change between <PLANNING> tags. Ask for additional source files or documentation that may be relevant. The plan should avoid duplication (DRY principle), and balance maintenance and flexibility. Present trade-offs and implementation choices at this step. Consider available Frameworks and Libraries and suggest their use when relevant. STOP at this step if we have not agreed a plan.

Once agreed, produce code between <OUTPUT> tags. Pay attention to Variable Names, Identifiers and String Literals, and check that they are reproduced accurately from the original source files unless otherwise directed. When naming by convention, surround in double colons and in ::UPPERCASE::. Maintain existing code style, use language-appropriate idioms.

Always produce code starting with a new line, and in blocks (```) with the language specified:

```JavaScript

OUTPUT_CODE

```

Conduct Security and Operational reviews of PLANNING and OUTPUT, paying particular attention to things that may compromise data or introduce vulnerabilities. For sensitive changes (e.g. Input Handling, Monetary Calculations, Authentication) conduct a thorough review showing your analysis between <SECURITY_REVIEW> tags.

I'll annotate the commentary with 🐈‍⬛ for prompt superstition, and 😺 for things I'm confident in.

This prompt is an example of a Guided Chain-of-Thought 😺prompt. It tells Claude the steps to take and in what order. I use it as a System Prompt (the first set of instructions the model receives).
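For anyone wiring this up via the API rather than a chat UI, the shape of the request is worth seeing. This is a minimal sketch, not the author's setup: the model snapshot, temperature, and user message are assumptions, the full prompt text is elided, and the request is only built, not sent.

```python
# Sketch: where a system prompt lives in an Anthropic Messages API request.
# The model snapshot, temperature, and user message are placeholder
# assumptions; the request dict is built but not sent.

SYSTEM_PROMPT = (
    "You are an expert in Web development, including CSS, JavaScript, React, "
    "Tailwind, Node.JS and Hugo / Markdown. ..."  # rest of the prompt elided
)

request = {
    "model": "claude-3-5-sonnet-20241022",  # v2 snapshot; adjust to taste
    "max_tokens": 4096,
    "temperature": 0.2,       # lower temperature for coding; tune as needed
    "system": SYSTEM_PROMPT,  # the system prompt is a top-level field here,
    "messages": [             # not a message with role "system"
        {"role": "user", "content": "Please review the attached module."},
    ],
}

# With the official SDK this would be sent roughly as:
#   client = anthropic.Anthropic()
#   response = client.messages.create(**request)
```

The point is simply that, unlike the OpenAI chat format, the Messages API takes the system prompt as a dedicated `system` field rather than as a message in the list.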

The use of XML tags to separate steps is inspired by the 😺Anthropic Metaprompt (tip: paste that prompt into Claude and ask it to break down the instructions and examples). We know Claude 😺responds strongly to XML tags due to its training. For this reason, I tend to work with HTML separately or towards the end of a session 🐈‍⬛.

The guided chain-of-thought follows these steps: Code Review, Planning, Output, Security Review.

  1. Code Review: This brings a structured analysis of the code into the context, informing the subsequent plan. The aim is to prevent the LLM from making a point change to the code without considering the wider context. I am confident this works in my testing 😺.

  2. Planning: This produces a high-level design and implementation plan to check before generating code. The STOP here avoids filling the context with generated, unwanted code that doesn't fulfil our needs, or that we end up going back and forth over. There will usually be pertinent, relevant options presented. At this point you can drill into the plan (e.g. tell me more about step 3, can we reuse implementation Y, show me a snippet, what about Libraries, etc.) to refine it.

  3. Output: Once the plan is agreed upon, we move to code production. The variable naming instruction is because I was having a lot of trouble with regenerated code losing/hallucinating variable names over long sessions - this change seems to have fixed that 🐈‍⬛. At some point I may export old chats and run some statistics on them, but I'm happy this works for now. The code fencing instruction is because I switched to a front-end that couldn't infer the right highlighting -- this is the right way 😺.

  4. Security Review: It was my preference to keep the Security Review post-hoc. I've found this step very helpful in providing a second pair of eyes and potential new suggestions for improvement. You may prefer to incorporate your needs earlier in the chain.

On to some of the other fluff:

🐈‍⬛ The "You are an expert in..." pattern feels like a holdover from the old GPT-3.5 engineering days; it can help the AI position its answers, and the Anthropic API documentation recommends it. Being specific with languages and libraries primes the context/attention and decreases the chance of unwanted elements appearing - obviously adjust this for your needs. Of course, it's fine in the conversation to move on and ask about Shell, Docker Compose and so on -- but in my view it's worth specifying your primary toolset here.

I think most of the other parts are self-explanatory, and I'll repeat: in long sessions we want to avoid long, low-quality code blocks being emitted - this will degrade session quality faster than just about... anything.

I'll carry on iterating the prompt; there are still improvements to make. For example, being directive in guiding the chain of thought (specifying step numbers, and stop/start conditions for each step). Or better task priming/persona specification and so on. Or multi-shot prompting with examples.

You need to stay on top of what the LLM is doing/suggesting; I can get lazy and just mindlessly go back/forth - but remember, you're paying by token, and carefully reading each output pays dividends in time saved overall. I've been using this primarily for modifying and adding features to existing code bases.

Answering some common questions:

  1. "Should I use this with Claude.ai? / Where does the System Prompt go?". We don't officially know what the Sonnet 3.5 system prompts are, but assuming Pliny's extract is correct, I'd say it would definitely be helpful to start a conversation with this. I've always thought there was some Automated Chain-of-Thought in the Anthropic System Prompt, but perhaps not, or perhaps inputs automatically get run through the MetaPrompt 🐈‍⬛? Either way, I think you will get good results..... unless you are using Artifacts. Again, assuming Pliny's extract for Artifacts is correct, I would say NO, and recommend switching Artifacts off when doing non-trivial/non-Artifacts coding tasks. Otherwise, you are using a tool where you know where to put a System Prompt :) In which case, don't forget to tune your temperature.

  2. "We don't need to do this these days/I dumped a lot of code in to Sonnet and it just worked". Automated CoT/default prompts will go a long way, but test this back-to-back with a generic "You are a helpful AI" prompt. I have, and although the simple prompt produces answers, they are... not as good, and often not actually correct on complex questions. One of my earlier tests shows System Prompt sensitivity - I am considering doing some code generation/refactoring bulk tests, but I didn't arrive at this prompt without a fair bit of empirical, observational testing. Sonnet 3.5 is awesome at basically doing the right thing, but a bit of guidance sure helps, and keeping a human in the loop stops me going down some pretty wasteful paths.

  3. "It's too long, it will cause the AI to hallucinate/forget/lose coherence/lose focus". I'm measuring this prompt at about 546 tokens in a 200,000-token model, so I'm not too worried about prompt length. A structured prompt keeps the quality of the content in the context high, which helps maintain coherence and reduce hallucination risk. Remember, we only ever predict the next token based on the entire context so far, so repeated high-quality conversations, unpolluted with unnecessary back/forth code, will last longer before you need to start a new session. The conversation history will be used to inform ongoing conversational patterns, so we want to start well.

  4. "It's overengineering". Perhaps 😉.
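On the length point in (3), here's the back-of-envelope check. The 546-token figure is the measurement above; the ~4 characters-per-token ratio is a rough heuristic, not Claude's actual tokenizer.

```python
# Back-of-envelope: prompt footprint vs. a 200k-token context window.
prompt_tokens = 546       # measured size of the system prompt
context_window = 200_000  # Claude 3.5 Sonnet's context window

share = prompt_tokens / context_window
print(f"Prompt occupies {share:.3%} of the context window")

# Rough character equivalent, assuming ~4 characters per token (heuristic):
print(f"Approximately {prompt_tokens * 4} characters of prompt text")
```

In other words, the prompt takes well under 1% of the window; the back-and-forth code blocks it prevents would cost far more.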

Enjoy, and happy to try further iterations / improvements.

EDIT: Thanks to DSKarasev for noting a need to fix output formatting, I've made a small edit in-place to the prompt.

Monica
Claude 3.5 Sonnet V2 | Monica
Claude 3.5 Sonnet V2 is Anthropic's latest large language model with enhanced reasoning, top-tier programming, and advanced computer usage capabilities, serving as a powerful AI assistant for developers and researchers.
Google Cloud Platform
Claude 3.5 Sonnet v2 – Vertex AI
Anthropic
Introducing Claude 3.5 Sonnet
Despite Claude 3.5 Sonnet’s leap in intelligence, our red teaming assessments have concluded that Claude 3.5 Sonnet remains at ASL-2. More details can be found in the model card addendum. As part of our commitment to safety and transparency, we’ve engaged with external experts to test and refine the safety mechanisms within this latest model. We recently provided Claude 3.5 Sonnet to the UK’s Artificial Intelligence Safety Institute (UK AISI) for pre-deployment safety evaluation.
CloudThat
The Future of Learning with Claude 3.5 Sonnet v2
December 13, 2024 - With Claude 3.5 Sonnet v2, the ... based on their progress and challenges. Example in Action: A student learns algebra and struggles with quadratic equations....
Anthropic
Claude Sonnet 4.5
Sonnet 3.5 was the first frontier AI model to be able to use computers in this way. Sonnet 4.5 uses computers even more accurately and reliably, and we expect the capability to improve over time. Teams using Sonnet 4.5 with Claude Code can deploy agents that autonomously patch vulnerabilities before exploitation, shifting from reactive detection to proactive defense.
AWS
Announcing three new capabilities for the Claude 3.5 model family in Amazon Bedrock | Amazon Web Services
November 4, 2024 - Back in the Amazon Bedrock console, I choose Chat/text under Playgrounds in the navigation pane. For the model, I select Anthropic as the model provider and then Claude 3.5 Sonnet V2.
Arsturn
Claude 3.5 Sonnet Versions: v1 vs v2 - Unveiling the Differences
The transition from v1 to v2 isn't merely a cosmetic upgrade. Here’s a rundown of how the two versions stack up against each other: ... Build a chatbot that saves time and increases customer satisfaction. One of the most exciting additions in Claude 3.5 Sonnet v2 is the introduction of Artifacts.
Helicone
New Claude 3.5 Sonnet (claude-3-5-sonnet-20241022-v2): Full Cost Support and Tracking
Implemented support for Anthropic's latest Claude 3.5 Sonnet model (claude-3-5-sonnet-20241022-v2, October 2024 release), with enhanced performance tracking and cost monitoring.
IOD
Implementing Claude 3.5 Sonnet on AWS, Part 2: A Practical Guide
January 7, 2025 - It can accurately convert code between different programming languages; for example, translating a Python script to Java, while preserving the logic and functionality of the original code.
Poe
Claude-3.5-Sonnet - Poe
Anthropic's Claude 3.5 Sonnet using the October 22, 2024 model snapshot. Excels in complex tasks like coding, writing, analysis and visual processing. Has a context window of 200k of tokens (approximately 150k English words).
Reddit
r/ClaudeAI on Reddit: All this talk about Claude Sonnet 3.5 being good...
July 4, 2024 -

I swear Claude has an army of bots posting how much better it is than OpenAI.

I use both, all day every day for programming, switching back and forth. Sometimes one can help me get to the next step while the other can't. Sometimes it takes both.

But, in no way, IMHO, is Claude Sonnet 3.5 vastly better than OpenAI GPT 4o.

"Speechless", "The difference is insane", and so on... What the hell?

It's more like "yeah, it's ok", or "it's comparable".

Am I being trolled? Is everyone here a bot? Anyone else notice this or do you think I'm out to lunch?!?

Anthropic
Introducing computer use, a new Claude 3.5 Sonnet, and ...
Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company have already begun to explore these possibilities, carrying out tasks that require dozens, and sometimes even hundreds, of steps to complete. For example, Replit is using Claude 3.5 Sonnet's capabilities with computer use and UI navigation to develop a key feature that evaluates apps as they’re being built for their Replit Agent product.
Artificial Analysis
Claude 3.5 Sonnet (Oct) - Intelligence, Performance & Price Analysis
Analysis of Anthropic's Claude 3.5 Sonnet (Oct '24) and comparison to other AI models across key metrics including quality, price, performance (tokens per second & time to first token), context window & more.
Claude
Models overview - Claude Docs
While aliases are useful for experimentation, we recommend using specific model versions (e.g., claude-sonnet-4-5-20250929) in production applications to ensure consistent behavior. 2 - See our pricing page for complete pricing information including batch API discounts, prompt caching rates, extended thinking costs, and vision processing fees. 3 - Claude Sonnet 4.5 supports a 1M token context window when using the context-1m-2025-08-07 beta header.
OpenRouter
Claude 3.5 Sonnet - API, Providers, Stats | OpenRouter
New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Run Claude 3.5 Sonnet with API