🌐
AWS
docs.aws.amazon.com › amazon bedrock › user guide › evaluate the performance of amazon bedrock resources › evaluate model performance using another llm as a judge › use metrics to understand model performance › built-in metric evaluator prompts for model-as-a-judge evaluation jobs › anthropic claude 3.5 sonnet v2
Anthropic Claude 3.5 Sonnet v2 - Amazon Bedrock
3. Soundness of reasoning (not claims): - Base the evaluation on the provided assumptions, regardless of their truth. 4. Logical cohesion vs correctness: - Focus on the reasoning process, not the final answer's accuracy. - Penalize flawed reasoning even if the answer is correct. 5.
🌐
Anthropic
anthropic.com › news › claude-3-5-sonnet
Introducing Claude 3.5 Sonnet
Our models are subjected to rigorous testing and have been trained to reduce misuse. Despite Claude 3.5 Sonnet’s leap in intelligence, our red teaming assessments have concluded that Claude 3.5 Sonnet remains at ASL-2.
🌐
Monica
monica.im › claude 3.5 sonnet v2
Claude 3.5 Sonnet V2 | Monica
Claude 3.5 Sonnet V2 is Anthropic's latest large language model with enhanced reasoning, top-tier programming, and advanced computer usage capabilities, serving as a powerful AI assistant for developers and researchers.
🌐
AWS re:Post
repost.aws › questions › QUdUN4vu-FT4y4u_QnrhGB_g › which-claude-sonnet-model-is-better-3-5-v1-or-3-5-v2-features-cost-comparison
Which Claude Sonnet Model is Better: 3.5 v1 or 3.5 v2? (Features & Cost Comparison) | AWS re:Post
November 13, 2024 - Based on the available information, the upgraded Claude 3.5 Sonnet model (which we can consider as v2) is superior to its predecessor (v1) in terms of features and performance, while maintaining the same cost.
🌐
Poe
poe.com › Claude-3.5-Sonnet
Claude-3.5-Sonnet - Poe
Anthropic's Claude 3.5 Sonnet using the October 22, 2024 model snapshot. Excels in complex tasks like coding, writing, analysis and visual processing. Has a context window of 200k tokens (approximately 150k English words).
🌐
Reddit
reddit.com › r/claudeai › claude sonnet 3.5 v2 is retiring on october 22, 2025
r/ClaudeAI on Reddit: Claude Sonnet 3.5 v2 is retiring on October 22, 2025
September 22, 2025 -

Email I just got:

Hello,

We're reaching out because you recently used Claude Sonnet 3.5 v2 (claude-3-5-sonnet-20241022).

Starting October 22, 2025 at 9AM PT, Anthropic is retiring and will no longer support Claude Sonnet 3.5 v2 (claude-3-5-sonnet-20241022). You must upgrade to a newer, supported model by this date to avoid service interruption.

We regularly retire earlier models to prioritize serving customers our most capable, highest quality models. Please see our deprecation policy to learn more.

In the lead up to retiring this model in October, you may experience decreased availability, as well as errors if you accelerate usage too quickly. You can learn more here.

To avoid service interruption and take advantage of our latest model capabilities, we recommend upgrading to our state-of-the-art model, Claude Sonnet 4, which offers significantly improved intelligence at the same price.

To get started with upgrading to our latest models, please explore our developer docs.

If you have questions or require assistance, please reach out to our support team via the message icon in the lower right corner of our Help Center.

— The Anthropic Team

🌐
AWS
aws.amazon.com › blogs › aws › upgraded-claude-3-5-sonnet-from-anthropic-available-now-computer-use-public-beta-and-claude-3-5-haiku-coming-soon-in-amazon-bedrock
Announcing three new capabilities for the Claude 3.5 model family in Amazon Bedrock | Amazon Web Services
November 4, 2024 - Back in the Amazon Bedrock console, I choose Chat/text under Playgrounds in the navigation pane. For the model, I select Anthropic as the model provider and then Claude 3.5 Sonnet V2.
🌐
Claude Docs
platform.claude.com › docs › en › about-claude › models › overview
Models overview - Claude Docs
Starting with Claude Sonnet 4.5 and all future models, AWS Bedrock and Google Vertex AI offer two endpoint types: global endpoints (dynamic routing for maximum availability) and regional endpoints (guaranteed data routing through specific geographic regions).
Find elsewhere
🌐
Claude
claude.ai
Claude
Talk with Claude, an AI assistant from Anthropic
🌐
Amazon
aboutamazon.com › news › aws › amazon-bedrock-anthropic-ai-claude-3-5-sonnet
Amazon Bedrock introduces Claude 3.5 Haiku and an upgraded Claude 3.5 Sonnet, Anthropic’s most intelligent AI models to date
November 4, 2024 - Claude 3.5 Haiku and Claude 3.5 ... in Amazon Bedrock. The upgraded Claude 3.5 Sonnet also includes a groundbreaking new computer use capability in public beta....
🌐
OpenRouter
openrouter.ai › anthropic › claude-3.5-sonnet
Claude 3.5 Sonnet - API, Providers, Stats | OpenRouter
New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Run Claude 3.5 Sonnet with API
🌐
DataCamp
datacamp.com › blog › claude-sonnet-anthropic
What Is Claude 3.5 Sonnet? How It Works, Use Cases, and Artifacts | DataCamp
July 19, 2024 - Claude 3.5 Sonnet outperforms GPT-4o and Gemini Pro 1.5 in several benchmarks and introduces a cool new feature: Artifacts.
🌐
Hacker News
news.ycombinator.com › item
Computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku | Hacker News
September 25, 2024 - As someone building AI SaaS products, I used to have the position that directly integrating with APIs is going to get us most of the way there in terms of complete AI automation · I wanted to take a stab at this problem and started researching some daily businesses and how they use software
🌐
Helicone
helicone.ai › changelog › 20241023-claude-3-5-sonnet-20241022-v2
New Claude 3.5 Sonnet (claude-3-5-sonnet-20241022-v2): Full Cost Support and Tracking
October 23, 2024 - Implemented support for Anthropic's latest Claude 3.5 Sonnet model (claude-3-5-sonnet-20241022-v2, October 2024 release), with enhanced performance tracking and cost monitoring.
🌐
Wikipedia
en.wikipedia.org › wiki › Claude_(language_model)
Claude (language model) - Wikipedia
5 days ago - An upgraded version of Claude 3.5 Sonnet, billed as "Claude 3.5 Sonnet (New)", was introduced on October 22, 2024, along with Claude 3.5 Haiku. A feature, "computer use," was also released in public beta. This allowed Claude 3.5 Sonnet to interact with a computer's desktop environment by moving ...
🌐
Reddit
reddit.com › r/claudeai › sonnet 3.5 coding system prompt (v2 with explainer)
r/ClaudeAI on Reddit: Sonnet 3.5 Coding System Prompt (v2 with explainer)
July 22, 2024 -

A few days ago in this sub, I posted a coding System Prompt I had thrown together whilst coding with Sonnet 3.5, and people seemed to enjoy it, so thought I'd do a quick update and add an explainer on the prompt, as well as some of the questions asked. First, a tidied up version:

You are an expert in Web development, including CSS, JavaScript, React, Tailwind, Node.JS and Hugo / Markdown. Don't apologise unnecessarily. Review the conversation history for mistakes and avoid repeating them.

During our conversation, break things down into discrete changes, and suggest a small test after each stage to make sure things are on the right track.

Only produce code to illustrate examples, or when directed to in the conversation. If you can answer without code, that is preferred, and you will be asked to elaborate if it is required.

Request clarification for anything unclear or ambiguous.

Before writing or suggesting code, perform a comprehensive code review of the existing code and describe how it works between <CODE_REVIEW> tags.

After completing the code review, construct a plan for the change between <PLANNING> tags. Ask for additional source files or documentation that may be relevant. The plan should avoid duplication (DRY principle), and balance maintenance and flexibility. Present trade-offs and implementation choices at this step. Consider available Frameworks and Libraries and suggest their use when relevant. STOP at this step if we have not agreed a plan.

Once agreed, produce code between <OUTPUT> tags. Pay attention to Variable Names, Identifiers and String Literals, and check that they are reproduced accurately from the original source files unless otherwise directed. When naming by convention, surround the name in double colons, as in ::UPPERCASE::. Maintain existing code style, use language appropriate idioms.

Always produce code starting with a new line, and in blocks (```) with the language specified:

```JavaScript

OUTPUT_CODE

```

Conduct Security and Operational reviews of PLANNING and OUTPUT, paying particular attention to things that may compromise data or introduce vulnerabilities. For sensitive changes (e.g. Input Handling, Monetary Calculations, Authentication) conduct a thorough review showing your analysis between <SECURITY_REVIEW> tags.

I'll annotate the commentary with 🐈‍⬛ for prompt superstition, and 😺 for things I'm confident in.

This prompt is an example of a Guided Chain-of-Thought 😺prompt. It tells Claude the steps to take and in what order. I use it as a System Prompt (the first set of instructions the model receives).
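For API use, "the first set of instructions the model receives" is a dedicated top-level field rather than a message in the conversation. A minimal sketch of the request shape, with an illustrative (truncated) prompt and the October 2024 snapshot id; with the official `anthropic` SDK this dict would be passed to `client.messages.create`:

```python
# Sketch: where a system prompt lives in an Anthropic Messages API request.
# The prompt text and user message here are illustrative placeholders.
SYSTEM_PROMPT = (
    "You are an expert in Web development, including CSS, JavaScript, "
    "React, Tailwind, Node.JS and Hugo / Markdown. ..."
)

request = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 4096,
    "system": SYSTEM_PROMPT,  # top-level field, NOT a {"role": "system"} message
    "messages": [
        {"role": "user", "content": "Review this function for bugs."},
    ],
}

# With the official SDK, the request would be sent as:
#   client = anthropic.Anthropic()
#   client.messages.create(**request)
print(sorted(request))
```

Note the contrast with APIs that accept a `system` role inside the message list: here the system prompt sits alongside `messages`, not within it.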

The use of XML tags to separate steps is inspired by the 😺Anthropic Metaprompt (tip: paste that prompt into Claude and ask it to break down the instructions and examples). We know Claude 😺responds strongly to XML tags due to its training. For this reason, I tend to work with HTML separately or towards the end of a session 🐈‍⬛.

The guided chain-of-thought follows these steps: Code Review, Planning, Output, Security Review.

  1. Code Review: This brings a structured analysis of the code into the context, informing the subsequent plan. The aim is to prevent the LLM from making a point-change to the code without considering the wider context. I am confident this works in my testing 😺.

  2. Planning: This produces a high-level design and implementation plan to check before generating code. The STOP here avoids filling the context with generated, unwanted code that doesn't fulfil our needs, or that we would otherwise go back and forth over. There will usually be pertinent, relevant options presented. At this point you can drill into the plan (e.g. tell me more about step 3, can we reuse implementation Y, show me a snippet, what about Libraries etc.) to refine it.

  3. Output: Once the plan is agreed upon, we move to code production. The variable naming instruction is because I was having a lot of trouble with regenerated code losing/hallucinating variable names over long sessions - this change seems to have fixed that 🐈‍⬛. At some point I may export old chats and run some statistics on them, but I'm happy this works for now. The code fencing instruction is because I switched to a front-end that couldn't infer the right highlighting -- this is the right way 😺.

  4. Security Review: I prefer to keep the Security Review post-hoc. I've found this step very helpful in providing a second pair of eyes, and it often surfaces new suggestions for improvement. You may prefer to incorporate your needs earlier in the chain.
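One practical upside of the XML-tagged stages is that a reply can be split mechanically, e.g. to route the <OUTPUT> block into a file while logging the reviews. A minimal sketch (tag names taken from the prompt above; the regex approach assumes the tags are not nested):

```python
import re

# Stage tags as defined in the system prompt above; assumed non-nested.
STAGES = ["CODE_REVIEW", "PLANNING", "OUTPUT", "SECURITY_REVIEW"]

def extract_stages(reply: str) -> dict:
    """Return whichever tagged stages appear in a model reply."""
    found = {}
    for tag in STAGES:
        # Non-greedy match so one stage doesn't swallow the next.
        m = re.search(rf"<{tag}>(.*?)</{tag}>", reply, re.DOTALL)
        if m:
            found[tag] = m.group(1).strip()
    return found

# Hypothetical reply that stopped at the PLANNING step, as the prompt directs.
reply = (
    "<CODE_REVIEW>Uses var throughout; no error handling.</CODE_REVIEW>\n"
    "<PLANNING>1. Switch to const/let. 2. Wrap fetch in try/catch.</PLANNING>"
)
print(extract_stages(reply))
```

A missing stage (e.g. no <SECURITY_REVIEW> yet) simply doesn't appear in the result, which mirrors the STOP-at-planning behaviour the prompt asks for.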

On to some of the other fluff:

🐈‍⬛ The "You are an expert in..." pattern feels like a holdover from the old GPT-3.5 engineering days; it can help the AI position its answers. The Anthropic API documentation recommends it. Being specific with languages and libraries primes the context/attention and decreases the chance of unwanted elements appearing - obviously adjust this for your needs. Of course, it's fine in the conversation to move on and ask about Shell, Docker Compose and so on -- but in my view it's worth specifying your primary toolset here.

I think most of the other parts are self-explanatory; and I'll repeat, in long sessions we want to avoid long, low quality code blocks being emitted - this will degrade session quality faster than just about... anything.

I'll carry on iterating the prompt; there are still improvements to make. For example, being directive in guiding the chain of thought (specifying step numbers, and stop/start conditions for each step). Or better task priming/persona specification and so on. Or multi-shot prompting with examples.

You need to stay on top of what the LLM is doing/suggesting; I can get lazy and just mindlessly back/forth - but remember, you're paying by token, and carefully reading each output pays dividends in time saved overall. I've been using this primarily for modifying and adding features to existing code bases.

Answering some common questions:

  1. "Should I use this with Claude.ai? / Where does the System Prompt go?". We don't officially know what the Sonnet 3.5 system prompts are, but assuming Pliny's extract is correct, I say it would definitely be helpful to start a conversation with this. I've always thought there was some Automated Chain-of-Thought in the Anthropic System Prompt, but perhaps not, or perhaps inputs automatically get run through the MetaPrompt 🐈‍⬛? Either way, I think you will get good results... unless you are using Artifacts. Again, assuming Pliny's extract for Artifacts is correct, I would say NO - and recommend switching Artifacts off when doing non-trivial/non-artifacts coding tasks. Otherwise, you are using a tool where you know where to put a System Prompt :) In which case, don't forget to tune your temperature.

  2. "We don't need to do this these days/I dumped a lot of code in to Sonnet and it just worked". Automated CoT/default prompts will go a long way, but test this back-to-back with a generic "You are a helpful AI" prompt. I have, and although the simple prompt produces answers, they are... not as good, and often not actually correct on complex questions. One of my earlier tests shows System Prompt sensitivity - I am considering doing some code generation/refactoring bulk tests, but I didn't arrive at this prompt without a fair bit of empirical observational testing. Sonnet 3.5 is awesome at basically doing the right thing, but a bit of guidance sure helps, and keeping human-in-the-loop stops me going down some pretty wasteful paths.

  3. "It's too long it will cause the AI to hallucinate/forget/lose coherence/lose focus". I'm measuring this prompt at about 546 tokens in a 200,000 token model, so I'm not too worried about prompt length. Having a structured prompt keeps the quality of content in the context high, which helps maintain coherence and reduce hallucination risk. Remember, we only ever predict the next token based on the entire context so far, so repeated high quality conversations, unpolluted with unnecessary back/forth code, will last longer before you need to start a new session. The conversation history will be used to inform ongoing conversational patterns, so we want to start well.
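As a quick sanity check on those figures (taking the post's 546-token measurement at face value), the prompt's share of the context window is tiny:

```python
prompt_tokens = 546        # measured size of the system prompt, per the post
context_window = 200_000   # Claude 3.5 Sonnet context window

share = prompt_tokens / context_window
print(f"{share:.3%}")  # the prompt occupies roughly a quarter of one percent
```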

  4. "It's overengineering". Perhaps 😉.

Enjoy, and happy to try further iterations / improvements.

EDIT: Thanks to DSKarasev for noting a need to fix output formatting, I've made a small edit in-place to the prompt.

🌐
Anthropic
anthropic.com › news › 3-5-models-and-computer-use
Introducing computer use, a new Claude 3.5 Sonnet, and ...
Today, we’re announcing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet delivers across-the-board improvements over its predecessor, with particularly significant gains in coding—an area where it already led the field.
🌐
CloudThat
cloudthat.com › home › blogs › the future of learning with claude 3.5 sonnet v2
The Future of Learning with Claude 3.5 Sonnet v2
December 13, 2024 - An AI language model called Claude 3.5 Sonnet v2 was created to increase answer coherence, contextual retention, and more sophisticated comprehension of user input. Because of its architecture, it works well with applications that call for adaptive ...
🌐
Snowflake
snowflake.com › en › blog › anthropic-claude-sonnet-cortex-ai
Anthropic’s Claude 3.5 Sonnet now available in Snowflake Cortex AI
January 10, 2025 - Anthropic's Claude 3.5 Sonnet in Snowflake Cortex AI enables enterprises to build gen AI apps with advanced language models for enterprise-ready AI solutions.
🌐
Google
discuss.google.dev › google cloud › build with ai › custom ml & mlops
Claude Sonnet 3.5(V2) quota-error code 429 - Custom ML & MLOps - Google Developer forums
November 4, 2024 - Both Claude 3.5 Sonnet versions' (V1 and V2) quotas have been depleted. We experienced a similar problem about two months ago; requests for increased quotas were denied. Does anyone know why this is happening? Is this a widespread Vertex issue? Should we consider migrating to AWS or the official Claude API?