I'm on the 20x Max plan. I get that Opus will use tokens faster, and Anthropic acknowledged this by scaling total usage limits up to be roughly equivalent to what you'd get with Haiku (whether that's really true remains to be seen). However, they didn't raise the 200k context window limit (I don't have access to the 1M window).
I just used my first prompt (which was a pretty standard one for me) to help find an issue that threw an error on my front-end, and after its response (which wasn't that helpful), I'm already down to 9% context remaining before auto-compacting.
If Anthropic is going to acknowledge that token consumption will be higher with Opus and scale some of the limits up accordingly, they really should increase the context limit as well.
I use AI a lot in cases where I need a bit more than 16k of input (GPT-3.5's context window limit). GPT-3.5's performance is normally fine for me, but I have to use GPT-4 to get a longer context window, at a much higher inference price across the many queries I rack up over a long session.
The Claude 3 family is the first one that seems to have very respectable performance and a long (200k) context window across all three models (Opus + Sonnet + Haiku). So I'm very excited about Sonnet, the middle-tier model.
TLDR: It's exciting to see the benchmark results of Opus, but I think Sonnet might enable more new real world use cases than Opus, when considering the context window and the relatively low cost.
With the introduction of Opus 4.5, Anthropic just updated the Claude Apps (Web, Desktop, Mobile):
For Claude app users, long conversations no longer hit a wall—Claude automatically summarizes earlier context as needed, so you can keep the chat going.
This is so amazing. It fixes the only gripe I had with Claude (besides limits) and the reason I kept using ChatGPT (for its rolling context window).
Anyone as happy as I am?
I am on the $100 plan using Opus 4.5. Good experience so far, but I'm noticing that I'm running out of context WAY faster. Not sure if this is because of Opus, because of my project, or because I downgraded from the $200 plan. Any ideas?
I like Claude 3.7 a lot, but context size was the only downside. Well, looks like we need to wait one more year for a 1M-context model.
Even 400k would be a massive improvement! Why only 200k?
We have to be doing better than FELON TUSK, right? Right?
I've tested it a few times, and when using Claude 3 Opus through Perplexity, it absolutely limits the context length from 200k to ~30k.
On a codebase of 110k tokens, Claude 3 Opus through Perplexity would consistently (and I mean every time, across 5 attempts) say that the last function in the program was one located about 30k tokens in.
When using Anthropic's API and their web chat, it consistently located the actual final function and could clearly see and recall all 110k tokens of the code.
I also tested this with 3 different books and 2 different codebases and received the same results across the board.
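For anyone who wants to reproduce this kind of check against the raw API, here's a minimal sketch of the last-function probe using the Anthropic Python SDK (the file name is a placeholder for your own ~110k-token codebase dump, and the model ID may need adjusting):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder path: a single-file dump of the codebase you want to probe.
with open("codebase_dump.py", "r", encoding="utf-8") as f:
    code = f.read()

response = client.messages.create(
    model="claude-3-opus-20240229",  # Claude 3 Opus snapshot; adjust as needed
    max_tokens=200,
    messages=[{
        "role": "user",
        "content": (
            "Here is a codebase:\n\n" + code +
            "\n\nWhat is the name of the very last function defined in this file?"
        ),
    }],
)
print(response.content[0].text)
```

Run the same prompt through Perplexity's UI and through the API, and the answers make the truncation obvious: the API names the actual final function, while a ~30k-token window names something roughly 30k tokens in.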
I understand if they have to limit context in order to offer it unlimited, but not saying so anywhere is a very disappointing marketing strategy. I've seen rumors of this, but I just wanted to add another data point confirming that the context window is limited to ~30k tokens.
Unlimited access to Claude 3 Opus is pretty awesome still, as long as you aren't hitting that context window, but this gives me misgivings about what else Perplexity is doing to my prompts under the hood in the name of saving costs.
I’m on Opus 4.5 with Max. Every time I add an image or try a slightly serious multi-step task, I get hit with a “Context size exceeds the limit” error. I even tested with a single simple image and still had issues, which is super frustrating, and I also tried reducing the number of files and the amount of content in the conversation. The error is followed by the “Compacting our conversation so we can keep chatting…” spinner after just a few messages.
It was absolutely on fire the last few days – long, complex sessions with multiple files, no issues at all. Then, out of nowhere, it started compacting almost immediately, even when I was only working off a single image. With a supposed 200k+ context window, this makes zero sense from the user side.
I’ve tried pretty much everything: Opus 4.5 on Max, desktop app, web app, different projects/folders, disabling connectors, restarting, fresh chats, different prompt styles. Same story every time: the convo quickly gets butchered by aggressive compaction and length-limit warnings.
Is this some bug, server-side issue, or a quiet change to how they’re counting tokens, especially for images and file attachments? Anyone figured out a reliable workaround beyond “new chat every few minutes” or stripping everything down to plain text?
Would love to hear if others are seeing the same thing or if there’s a smarter way to work around these context shenanigans.
I've been using Claude Code for some time now on a smallish project, and I'm finding that as of recently the context window seems much smaller than it used to be (Max plan). It compacts, then about a minute later it is auto-compacting again. My CLAUDE.md is trim, and most tasks are delegated to worker sub-agents.
Out of the gate, Claude is using 35% of the context, with 22.5% reserved for auto-compact.
In contrast, Codex (which I use for QA) is able to get a lot more done before its context window becomes an issue.
Are there any tricks I am not aware of to reduce or optimize the context usage with Claude Code?
Signed, everyone who has used Claude to write software. At least give us an option to pay for it.
Edit: thank you Anthropic!
I Finally Cracked My Claude Code Context Window Strategy (200k Is Not the Problem)
I’ve been meaning to share this for a while: here’s my personal Claude Code context window strategy that completely changed how I code with LLMs.
If you’ve ever thought “200k tokens isn’t enough” – this post is for you. Spoiler: the problem usually isn’t the window size, it’s how we burn tokens.
1 – Context Token Diet: Turn OFF Auto-Compact
Most people keep all the “convenience” features on… and then wonder where their context went.
The biggest hidden culprit for me was Auto Compact.
With Auto Compact ON, my session looked like this:
85k / 200k tokens (43%)
After I disabled it in /config:
38k / 200k tokens (19%)
That’s more than half the initial context usage gone, just by turning off a convenience feature.
My personal rule:
🔴 The initial context usage should never exceed 20% of the total context window.
If your model starts the session already half-full with “helpful” summaries and system stuff, of course it’ll run out of room fast.
“But I Need Auto Compact To Keep Going…?”
Here’s how I work without it.
When tokens run out, most people:
1. Hit /compact
2. Let Claude summarize the whole messy conversation
3. Continue on top of that lossy, distorted summary
The problem: If the model misunderstands your intent during that summary, your next session is built on contaminated context. Results start drifting. Code quality degrades. You feel like the model is “getting dumber over time”.
So I do this instead:
1. Use /export to copy the entire conversation to clipboard
2. Use /clear to start a fresh session
3. Paste the full history in
4. Tell Claude something like: “Continue from here and keep working on the same task.”
This way:
• No opaque auto-compacting in the background
• No weird, over-aggressive summarization ruining your intent
• You keep rich context, but with a clean, fresh session state
Remember: the 200k “used tokens” you see isn’t the same as the raw text tokens of your conversation. In practice, the conversation content is often ~100k tokens or less, so you do still have room to work.
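If you want to check that claim on your own transcript before pasting it back in, a crude character-count estimate is enough. In this sketch, the file name is a placeholder and the 4-characters-per-token figure is only a rough English-text heuristic, not a real tokenizer count:

```python
# Rough, tokenizer-free sanity check of an exported transcript's size.
with open("exported_conversation.md", "r", encoding="utf-8") as f:
    text = f.read()

approx_tokens = len(text) / 4  # ~4 chars per token is a crude heuristic
print(f"{len(text):,} characters ≈ {approx_tokens:,.0f} tokens")
print(f"≈ {approx_tokens / 200_000:.0%} of a 200k window")
```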
Agentic coding is about productivity and quality. Auto Compact often kills both.
2 – Kill Contaminated Context: One Mission = One Session
The second rule I follow:
🟢 One mission, one 200k session. Don’t mix missions.
If the model goes off the rails because of a bad prompt, I don’t “fight” it with more prompts.
Instead, I use a little trick:
• When I see clearly wrong output, I hit ESC + ESC
• That jumps me back to the previous prompt
• I fix the instruction
• Regenerate
Result: the bad generations disappear, and I stay within a clean, focused conversation without polluted context hanging around.
Clean session → clean reasoning → clean code. In that environment, Claude + Alfred can feel almost “telepathic” with your intent.
3 – MCP Token Discipline: On-Demand Only
Now let’s talk MCP.
Take a look at what happens when you just casually load up a bunch of MCP tools:
• Before MCPs: 38k / 200k tokens (19%)
• After adding commonly used MCPs: 133k / 200k tokens (66%)
That’s two-thirds of your entire context gone before you even start doing real work.
My approach:
• Install MCPs you genuinely need
• Keep them OFF by default
• When needed:
  1. Type @
  2. Choose the MCP from the list
  3. Turn it ON, use it
  4. Turn it OFF again when done
Don’t let “cool tools” silently eat 100k+ tokens of your context just by existing.
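If you want to see this effect in numbers, Anthropic's token-counting endpoint will report how much tool definitions alone add to a prompt. This is just a sketch: the tool schema is invented, the model ID is an assumption, and real MCP servers expose their own schemas, but the shape of the comparison is the point:

```python
import anthropic

client = anthropic.Anthropic()
messages = [{"role": "user", "content": "List the open issues in my repo."}]

# A made-up tool definition standing in for whatever your MCP servers expose.
example_tool = {
    "name": "search_issues",
    "description": (
        "Search issues in a repository by free-text query, label, assignee, "
        "state, and date range. Returns a paginated list of matching issues "
        "with full metadata."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "labels": {"type": "array", "items": {"type": "string"}},
            "state": {"type": "string", "enum": ["open", "closed", "all"]},
        },
        "required": ["query"],
    },
}

# Simulate a server that exposes 20 similar tools (distinct names).
tools = []
for i in range(20):
    tool = dict(example_tool)
    tool["name"] = f"search_issues_{i}"
    tools.append(tool)

# Model ID is an assumption; use whichever model you actually run.
without_tools = client.messages.count_tokens(model="claude-opus-4-5", messages=messages)
with_tools = client.messages.count_tokens(model="claude-opus-4-5", messages=messages, tools=tools)
print("input tokens without tools:", without_tools.input_tokens)
print("input tokens with 20 tools:", with_tools.input_tokens)
```

Every one of those tool descriptions rides along with every single request, which is why a handful of chunky MCP servers can swallow a big slice of the window before you type anything.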
“But What About 1M Token Models Like Gemini?”
I’ve tried those too.
Last month I burned through 1M tokens in a single day using Claude Code API. I’ve also tested Codex, Gemini, Claude with huge contexts.
My conclusion:
🧵 As context gets massive, the “needle in a haystack” problem gets worse. Recall gets noisy, accuracy drops, and the model struggles to pick the right pieces from the pile.
So my personal view:
✅ 200k is actually a sweet spot for practical coding sessions if you manage it properly.
If the underlying “needle in a haystack” issue isn’t solved, throwing more tokens at it just makes a bigger haystack.
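If you want to sanity-check this yourself, a toy version of the needle-in-a-haystack probe is easy to run. This sketch uses the Anthropic Python SDK; the filler text, needle, and model ID are all placeholders, and repeated filler is a much easier haystack than real code or prose:

```python
import anthropic

client = anthropic.Anthropic()

# Build a ~100k-token haystack from filler text (rough 4-chars/token estimate)
# and bury one odd fact ("the needle") 75% of the way in.
filler = "The quick brown fox jumps over the lazy dog. " * 9_000
needle = "The secret deployment code is PLUM-4471. "
pos = int(len(filler) * 0.75)
haystack = filler[:pos] + needle + filler[pos:]

response = client.messages.create(
    model="claude-opus-4-5",  # model ID is an assumption; use whatever you run
    max_tokens=50,
    messages=[{
        "role": "user",
        "content": haystack + "\n\nWhat is the secret deployment code?",
    }],
)
print(response.content[0].text)  # a correct answer mentions PLUM-4471
```

Vary the haystack size and the needle depth and you can watch recall get flakier as the pile grows, which is exactly the effect described above.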
So instead of waiting for some future magical 10M-token model, I’d rather:
• Upgrade my usage patterns
• Optimize how I structure sessions
• Treat context as a scarce resource, not an infinite dump
My Setup: Agentic Coding with MoAI-ADK + Claude Code
If you want to turn this into a lifestyle instead of a one-off trick, I recommend trying MoAI-ADK with Claude Code for agentic coding workflows.
👉 GitHub: https://github.com/modu-ai/moai-adk
If you haven’t tried it yet, give it a spin. You’ll feel the difference in how Claude Code behaves once your context is:
• Lean (no unnecessary auto compact)
• Clean (no contaminated summaries)
• Controlled (MCPs only when needed)
• Focused (one mission per session)
If this was helpful at all, I’d really appreciate an upvote or a share so more people stop wasting their context windows. 🙏
#ClaudeCode #agenticCoding #MCP
Claude Opus 4.5, our frontier coding model, is now available in Claude Code for Pro users. Pro users can select Opus 4.5 using the /model command in their terminal.
Opus 4.5 will consume rate limits faster than Sonnet 4.5. We recommend using Opus for your most complex tasks and using Sonnet for simpler tasks.
To get started:
* Run claude update
* /model opus
I'm curious if anyone has tried this. Please share your experiences and whether it resolved your issues. I hope they bring this to Claude Code as well.
Hi everyone,
I'm considering using the Claude-3-Opus model on Poe, but I have a question about the context window size for the 1,000-credit "shortened" version compared to the full 200k-token version that costs 6,000 credits per message.
Since I'm located in Europe, I don't have direct access to Anthropic to use the full Opus model. So I'm trying to determine if the Poe version with the smaller context window will still meet my needs.
Does anyone happen to know approximately how many tokens the context window is limited to for Claude-3-Opus on Poe? Any insight would be greatly appreciated as I try to decide if it will be suitable for my use case.
Thanks so much for any info you can provide!
Poe just doubled the credits for Claude 3. 🤬 Now Claude-3-Opus-200k requires 12,000 credits and Claude-3-Opus requires 2,000 credits per message.
I have the same question regarding the nebulous context window when not using Opus-200k. Poe upped the price per message for Claude 3 Opus-200k a few days after its launch. It used to be 1,750 credits/message and now it's 6,000/message. The context window is important. It should be communicated...
While I was working inside Claude, I noticed that file uploads that normally take up half the knowledge-file limit were taking up much less space, and that there's now a "Retrieving" indicator off to the right. As a sanity check, I uploaded a file that, based on the old context window, should have been 900% of the limit of Claude's input capabilities. Instead, it says I've only used 88% of the context limit. When I asked it questions about the massive file, it seemed to be able to answer intelligently. It appears Claude has found a way to now accept 7x the content it used to, which is HUGE! Are others seeing the same thing?
TechCrunch:
“There are improvements we made on general long context quality in training with Opus 4.5, but context windows are not going to be sufficient by themselves,” Dianne Na Penn, Anthropic’s head of product management for research, told TechCrunch. “Knowing the right details to remember is really important in complement to just having a longer context window.”
Those changes also enabled a long-requested “endless chat” feature for paid Claude users, which will allow chats to proceed without interruption when the model hits its context window. Instead, the model will compress its context memory without alerting the user.
I am using Opus 4.5 but I am still getting the same old context behaviour.