I use AI a lot in cases where I need a bit more than 16k of input (GPT-3.5's context window limit). GPT-3.5's performance is normally fine for me, but I have to use GPT-4 to get a longer context window, at a much higher inference price across the many queries I rack up over a long session.
The Claude 3 family is the first that seems to combine very respectable performance with long (200k) context windows across the entire lineup (Opus + Sonnet + Haiku). So I'm especially excited about the 'Sonnet' model (the middle-tier one).
TLDR: Opus's benchmark results are exciting, but I think Sonnet might enable more new real-world use cases than Opus once you factor in the context window and its relatively low cost.
I've tested it a few times, and when using Claude 3 Opus through Perplexity, the context length is definitely limited from 200k down to roughly 30k tokens.
On a 110k-token codebase, Claude 3 Opus through Perplexity would consistently (and I mean every one of 5 attempts) say that the last function in the program was one located about 30k tokens in.
When using Anthropic's API and their web chat, it consistently located the actual final function and could clearly see and recall all 110k tokens of the code.
I also tested this with 3 different books and 2 different codebases and received the same results across the board.
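For anyone who wants to reproduce this kind of probe against the API directly, here is a minimal sketch, assuming the anthropic Python SDK and an ANTHROPIC_API_KEY in the environment; the file path and the exact probe wording are placeholders, not the original poster's setup.

```python
# Minimal sketch of the "last function" probe described above.
# Assumption: the anthropic Python SDK is installed and ANTHROPIC_API_KEY is set.
import anthropic

client = anthropic.Anthropic()

# Placeholder path: dump the large codebase (or book) into a single file first.
with open("codebase_dump.py", encoding="utf-8") as f:
    code = f.read()

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=200,
    messages=[{
        "role": "user",
        "content": code
        + "\n\nWhat is the name of the last function defined in the code above? "
          "Reply with the function name only.",
    }],
)

# If a provider silently truncates the prompt, the answer will name a function
# that appears early in the file instead of the true final one.
print(response.content[0].text)
```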
I understand if they have to limit the context to offer unlimited usage, but not saying that anywhere is a very disappointing marketing strategy. I've seen rumors of this, but I just wanted to add another data point confirming that the context window is limited to ~30k tokens.
Unlimited access to Claude 3 Opus is still pretty awesome, as long as you aren't hitting that context limit, but this gives me misgivings about what else Perplexity is doing to my prompts under the hood in the name of saving costs.
Hi everyone,
I'm considering using the Claude-3-Opus model on Poe, but I have a question about the context window size of the 1000-credit "shortened" version compared to the full 200k-token version (Claude-3-Opus-200k) that costs 6000 credits per message.
Since I'm located in Europe, I don't have direct access to Anthropic to use the full Opus model. So I'm trying to determine if the Poe version with the smaller context window will still meet my needs.
Does anyone happen to know approximately how many tokens the context window is limited to for Claude-3-Opus on Poe? Any insight would be greatly appreciated as I try to decide if it will be suitable for my use case.
Thanks so much for any info you can provide!
Poe just doubled the credit cost for Claude 3. 🤬 Now Claude-3-Opus-200k requires 12000 credits and Claude-3-Opus requires 2000 credits per message.
I have the same question about the nebulous context window when not using Opus-200k. Poe upped the price per message for Claude 3 Opus-200k a few days after its launch: it used to be 1750 credits/message and now it's 6000. The context window is important. It should be communicated...
I’m on Opus 4.5 with Max. Every time I add an image or try to do a slightly serious multi-step task, I get hit with “Context size exceeds the limit,” followed by the “Compacting our conversation so we can keep chatting…” spinner after just a few messages. I even tested with a simple single image and still had issues, which is super frustrating. I also tried reducing the number of files and the amount of content in the conversation.
It was absolutely on fire the last few days – long, complex sessions with multiple files, no issues at all. Then out of nowhere, it starts compacting almost immediately, even if I’m only working off a single image. With a supposed 200k+ context window, this makes zero sense from the user side.
I’ve tried pretty much everything: Opus 4.5 on Max, desktop app, web app, different projects/folders, disabling connectors, restarting, fresh chats, different prompt styles. Same story every time: the convo quickly gets butchered by aggressive compaction and length-limit warnings.
Is this some bug, server-side issue, or a quiet change to how they’re counting tokens, especially for images and file attachments? Anyone figured out a reliable workaround beyond “new chat every few minutes” or stripping everything down to plain text?
Would love to hear if others are seeing the same thing or if there’s a smarter way to work around these context shenanigans.
I am on the $100 plan using opus 4.5. Good experience so far but I am noticing that I am running out of context WAY faster. Not sure if this is because of Opus, because of my project, or because I downgraded from the $200 plan. Any ideas?
After the issues that had been plaguing me due to the general laziness of GPT-4, I let my subscription lapse and purchased a Claude 3 Opus subscription from Anthropic. At first I was simply amazed at how accurate the model was compared to the then-gimped GPT-4, though I quickly realized that the model and the underlying service had some key issues, such as the usage policy, which limits the number of prompts in a 5-hour period (at the time I signed up it was 8) if you upload certain files to it. Which I do quite frequently, since uploading a file makes it easier to provide some context for any task. So your 45-message limit can quickly become 10 if you don't understand how context affects the message limit. Furthermore, one of Claude's primary selling points, its large context, is effectively a curse of Tantalus in the sense that it is so close yet so far: we have 200k of context to play with, but due to the aforementioned usage policy we cannot make practical use of it.
Many will say to just use the API, but the costs are simply absurd if you intend to make the API version of Claude your daily driver. Also, Claude tends to be very verbose in its replies, and the UI of their flagship app leaves much to be desired. Finally, the lack of web browsing in Claude means you have to verify the output manually, and since Claude is so highly regarded for its intellect, you may end up trusting output you shouldn't.
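To put a rough number on those API costs, here is a back-of-the-envelope sketch. The pricing figures are assumptions based on Claude 3 Opus's launch-era list prices (about $15 per million input tokens and $75 per million output tokens), and the workflow assumed is a chat-style one that resends the growing conversation with every message.

```python
# Back-of-the-envelope estimate of using the Claude 3 Opus API as a daily driver.
# Assumed launch-era list prices: ~$15 / million input tokens, ~$75 / million output tokens.
INPUT_PER_MTOK = 15.00
OUTPUT_PER_MTOK = 75.00

def message_cost(context_tokens: int, output_tokens: int) -> float:
    """Cost of one chat turn that resends `context_tokens` of history."""
    return (context_tokens * INPUT_PER_MTOK + output_tokens * OUTPUT_PER_MTOK) / 1_000_000

# Example: a conversation carrying ~100k tokens of uploaded files and history,
# with ~1k tokens of (verbose) output per reply.
per_turn = message_cost(100_000, 1_000)
print(f"~${per_turn:.2f} per message")           # roughly $1.6 per turn
print(f"~${per_turn * 50:.2f} for 50 messages")  # roughly $79 for a day of heavy use
```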
Through it all I was prepared to keep my subscription, until the king returned with GPT-4 Turbo with Vision (2024-04-09), which fixed every major issue I had with the previous GPT-4 model I had originally left for Claude: clear, capable code and the ability to read files with an expanded context without issue. It became clear that even though Claude may be superior to GPT-4 in some ways, the scale of the underlying companies makes GPT-4 the superior choice. Not to mention, if it took the other companies this long to surpass a GPT-4 that was trained on lackluster hardware, what will GPT-5 look like?
I want to use Claude 3 Opus with the full 200k context window. I feel like no other AI even comes close to it in terms of creativity and realistic narration, which is what I want as a novelist wishing for an AI assistant. So far, though, I have been using C3O through Perplexity Pro only. The official Claude website doesn't have the option to edit/delete sent messages, so it is a big pain in the ass for my approach, as I try multiple prompts one by one to play out scenes and see which one is more suitable for the chapter and so on. Because of that, I opted for Perplexity over the official platform to access Claude 3 Opus. But Perplexity doesn't seem to have the 200k context window, and it caps C3O at 50 uses per day.
So I am now looking for an alternative way to access Claude 3 Opus with the full 200k context window, an edit/delete message option, and more than 50 uses per day (unlimited is the dream, but I know that is not an option yet for Opus).
There have been quite a few threads discussing how these two models perform, so I thought I'd share my experience.
I've managed to get a subscription for Claude via VPN, even though Europe is not yet supported. Thanks to Apple Pay, as I was unable to pay by card directly due to address restrictions.
After using it for a couple of days, mostly for Python coding and a little for writing, to my surprise, I actually found it better than GPT-4.
Since I can't think of any negative aspects, I'll list what I liked instead:
- I only hit the cap limit once, and it was probably due to a large context within the sessions rather than the message cap.
- It always performs as you ask it. It always returns complete functions, etc.
- The code it outputs almost always works the first time. Really nice!
- The code, overall, looks a bit nicer and is better organized, with more relevant variable names, etc.
Given that the price is the same, I think it's a much better deal. Well, at least for now, until we have GPT-4.5.
It's really worth a shot :)
I find myself reaching that cursed "Message limit reached for Claude 3 Opus" too often, and it's really frustrating because I've found Claude quite pleasant to interact and work with. I'm wondering, can't Anthropic at least provide the option to pay extra when needing to go over quota, rather than just being forced to stop in the middle of a productive conversation? Kind of like what phone companies do when you need more data than the package you've paid for allows...
Signed, everyone who uses Claude to write software. At least give us an option to pay for it.
Edit: thank you Anthropic!
I've looked around for an answer and I see a lot of conflicting claims. Some people say you only actually get the 200k context with Opus if you're using the API, and that normal Pro users just using Claude on the website only get a portion of that. I'm just curious if anyone knows the real answer.
The reason I ask is, I'm using Opus as a Pro member on the website, and even when I'm only like 15k tokens into a conversation, it starts telling me the conversation is getting too long and I should start a new one. 15k seems like a very tiny piece of 200k.
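If you want to check how many tokens a conversation actually contains before the "too long" warning appears, here is a rough sketch. It assumes a recent version of the anthropic Python SDK that exposes a token-counting endpoint; the model ID and message contents are placeholders, and if your SDK version lacks count_tokens, the common ~4-characters-per-token heuristic gives a ballpark instead.

```python
# Rough sketch: measure how many input tokens a conversation actually uses.
# Assumes a recent anthropic Python SDK that exposes messages.count_tokens;
# the model ID and message contents below are placeholders.
import anthropic

client = anthropic.Anthropic()

conversation = [
    {"role": "user", "content": "First long message of the conversation..."},
    {"role": "assistant", "content": "First long reply..."},
    {"role": "user", "content": "Latest message..."},
]

count = client.messages.count_tokens(
    model="claude-3-opus-20240229",
    messages=conversation,
)
print(count.input_tokens)

# Fallback heuristic if count_tokens isn't available in your SDK version:
approx = sum(len(m["content"]) for m in conversation) // 4
print(f"~{approx} tokens (rough 4-chars-per-token estimate)")
```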
With the introduction of Opus 4.5, Anthropic just updated the Claude Apps (Web, Desktop, Mobile):
For Claude app users, long conversations no longer hit a wall—Claude automatically summarizes earlier context as needed, so you can keep the chat going.
This is amazing. Hitting that wall was the only gripe I had with Claude (besides the usage limits), and the reason I kept using ChatGPT (for its rolling context window).
Anyone as happy as I am?
I’ve been using GPT-4 for quite a long time, and from what I observe it’s just stupid when it comes to the context window and memory handling, whereas Claude is actually brilliant.
I'm on the 20x Max plan. I get that Opus will use tokens faster, and Anthropic acknowledged this by increasing the total token allowance to something supposedly equivalent to the usage you'd get with Haiku (whether that's really true remains to be seen). However, they didn't raise the 200k context window limit (I don't have access to the 1M window).
I just used my first prompt (which was a pretty standard one for me) to help find an issue that threw an error on my front-end, and after its response (which wasn't that helpful), I'm already down to 9% context remaining before auto-compacting.
If Anthropic is going to acknowledge that token consumption will be higher with Opus and scale some of the limits up accordingly, they really should increase the context limit as well.
We have to be doing better than FELON TUSK, right? Right?
I'm wondering when it makes sense to access Claude 3 Opus directly through Anthropic or via Perplexity.
My main priority is the context window. If I can get the maximum context window via Perplexity, that seems like a good deal to me.
Edit: Perplexity answered this question. Context window is limited when accessing Claude via Perplexity: https://www.perplexity.ai/search/What-is-the-lLCnpBrnQJuNttxgo8aXbw