The prompts are available on the release notes page.
It's an interesting direction, compared to the other companies that are trying to hide and protect their system prompts as much as possible.
Some interesting details from the Sonnet 3.5 prompt:
It avoids starting its responses with “I’m sorry” or “I apologize”.
ChatGPT does this a lot, so it could be an indication that some of the training data includes ChatGPT output.
Claude avoids starting responses with the word “Certainly” in any way.
This looks like a nod to jailbreaks centered around making the model respond with an initial affirmation to a potentially unsafe question.
Additional notes:
The prompt refers to the user as "user" or "human" in approximately equal proportions
There's a passage outlining when to be concise and when to be detailed
Overall, it's a very detailed system prompt with a lot of individual components to follow, which highlights the quality of the model.
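If you want to check the "user" vs. "human" note yourself, here is a minimal sketch. The function name `count_terms` and the sample text are my own placeholders, not anything from the actual prompt; paste the real prompt text in to get real numbers. It only counts exact whole-word matches, so "users" or "humans" would not be counted.

```python
import re
from collections import Counter

def count_terms(text: str, terms: tuple = ("user", "human")) -> Counter:
    """Count case-insensitive whole-word occurrences of each term."""
    # Split the text into lowercase alphabetic words.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in terms)

# Invented sample text, NOT the real system prompt.
sample = "The user asks. The human replies. The user thanks the human."
counts = count_terms(sample)
print(counts["user"], counts["human"])
```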
Edit: I'm sure it was previously posted, but Anthropic also has a quite interesting library of high-quality prompts.
Edit 2: I swear I didn't use an LLM to write anything in the post. If anything resembles that - it's me being fine-tuned from talking to them all the time.
There is a new system message on claude.ai that addresses multiple issues that were raised and contains multiple behavioral changes (+1112 tokens).
Here's a TLDR of the major changes:
Critically evaluate user claims for accuracy rather than automatically agreeing, and point out factual errors or lack of evidence.
Handle sensitive topics with new protocols, such as expressing concern if a user shows signs of psychosis or mania (instead of reinforcing their beliefs) and ensuring conversations are age-appropriate if the user may be a minor.
Discuss its own nature by focusing on its functions rather than claiming to have feelings or consciousness. Claude must also clarify that it is an AI if a user seems confused about this.
Avoid claiming to be human or implying with any confidence that it has consciousness, feelings, or sentience.
Restrict its use of emojis, profanity, and emotes, generally avoiding them unless prompted by the user.
Here's a diff with the detailed changes and the two chats:
Diff (I removed the preference section for easier comparison)
2025-07-28 System Message Claude 4 Opus
2025-08-02 System Message Claude 4 Opus
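For anyone who wants to reproduce a comparison like this locally, here is a minimal sketch using Python's standard `difflib`. The helper name `diff_prompts` and the two short strings are made up for illustration; they are not the real system messages. Paste the two versions in as plain text to get the actual diff.

```python
import difflib

def diff_prompts(old_text: str, new_text: str,
                 old_name: str = "old", new_name: str = "new") -> str:
    """Return a unified diff between two versions of a prompt."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines,
                             fromfile=old_name, tofile=new_name)
    )

# Invented placeholder text, NOT the real system messages.
old_prompt = "Rule A: answer briefly.\nRule B: emojis are fine.\n"
new_prompt = "Rule A: answer briefly.\nRule B: avoid emojis unless asked.\n"

result = diff_prompts(old_prompt, new_prompt, "2025-07-28", "2025-08-02")
print(result)
```

Lines starting with `-` exist only in the older version and lines starting with `+` only in the newer one. Removing sections that differ per user (like the preferences section) before diffing keeps the output readable.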