Really, ChatGPT has an output limit of much more than 4k, while all versions of Claude are still stuck at 4k. I know about telling it to "Continue", but that burns through the message limit much faster. Please increase it substantially.
Recently I've had to compress 2MB files as hard as possible just to get more than four messages into a chat. It seems like Claude is functionally useless for anything other than as an alternative to Google.
I will literally hit the limit if I attach more than 3 files to a chat. What is going on?
I'm cancelling my subscription and moving back to OpenAI, even though I hate its guts.
For context, I'm a software engineering student, and this particular chat contained, I kid you not, three messages and a single 175KB file; on the fourth message I was trying to attach two 2MB PDF files, using 3.5 Sonnet. Even after compressing the files down to 600KB it STILL won't work, even with a SINGLE file.
I'm getting "Your message will exceed the length limit, make a new chat". It's so damn awful.
EDIT: So it turns out that Claude is absolute TRASH at PDFs, wasting all my tokens and capacity trying to process the company logo that appears on each of the 90 pages of the PDF. After fiddling around I finally got a different message, something like "this message exceeds image limits". What a shame.
EDIT 2: People don't seem to understand that Claude advertises file uploads of up to 20 files at 30MB EACH. Hitting the limit with a 600KB file should not be possible and is an enormous oversight.
So I've just done the unthinkable (according to the current mood of the sub) and paid to go pro. I have a single task that I need right now, and it's something that's worked in the past on a free account - discussing a full manuscript.
Months ago, I uploaded a draft in 3 parts (because 1 was too big for a single upload). It accepted the parts totaling 150k words, and we had a very productive discourse on the contents, which was extremely helpful.
Now, I'm trying to upload 130k words for the next draft. It wouldn't even let me upload part 2 as a free customer for the past week (and sometimes not even part 1). Today, after going pro, it's telling me I'm still 14% over the limit for part 3. So going pro clearly upped the tokens, but it's still too small.
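For a rough sense of why even the Pro window feels tight, a back-of-envelope conversion helps. This is a sketch under an assumed ratio of about 1.4 tokens per English word - the real ratio depends on Claude's tokenizer, which isn't public:

```python
def words_to_tokens(words, tokens_per_word=1.4):
    """Estimate token count from a word count.

    The 1.4 tokens-per-word ratio is an assumption for English prose,
    not an official figure; adjust it to taste.
    """
    return round(words * tokens_per_word)

print(words_to_tokens(150_000))  # the draft that fit months ago -> ~210000
print(words_to_tokens(130_000))  # the current draft -> ~182000
```

At that ratio, the 150k-word draft lands around 210k tokens and the current 130k-word draft around 182k - right at the edge of a 200k-token window before the system prompt or any replies are counted.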
Is it possible that the 200k token limit isn't working? Or do even pro users get throttled with tiny token limits during busy times? Is the solution to try this when the US is asleep?
Thanks
EDIT: I cut the manuscript down to 127K words, cutting out all the interludes, and it manages to upload! And after a single reply, it tells me that the conversation is full and I must start a new one. Wow - so I just snuck in under the wire and can only get one paragraph of insight. I don't understand how this was working so well in July and now it seems like there's almost half the capacity, despite doubling the tokens!
It's beyond frustrating how low the limit is, especially since I'm paying for the service. If you forget to open a new chat every 5 minutes, well, goodbye for the next few hours. And not only do you get blocked from one chat - it offers to let you use Haiku instead, but that one is blocked as well. Like... thanks a lot.
Here is the transcribed conversation from claude.ai: https://pastebin.com/722g7ubz
Here is a screenshot of the last response: https://imgur.com/a/kBZjROt
As you can see, it is cut off as being "over the maximum length".
I replicated the same conversation in the API workbench (including the system prompt), with 2048 max output tokens and 4096 max output tokens respectively.
Here are the responses.
2048 max output length: https://pastebin.com/3x9HWHnu
4096 max output length: https://pastebin.com/E8n8F8ga
Since Claude's tokenizer isn't public, I'm relying on OpenAI's, but whether the counts are perfectly accurate is irrelevant - I'm comparing between the responses. You can estimate the Claude token count by adding about 20%.
Note: I am comparing just the code blocks, since they make up the VAST majority of the length.
Web UI response: 1626 OAI tokens = around 1950 claude tokens
API response (2048): 1659 OAI tokens = around 1990 claude tokens
API response (4096): 3263 OAI tokens = around 3910 claude tokens
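The +20% estimate used for the numbers above can be sketched as a one-liner; the 1.2 inflation factor is this post's assumption, not an official conversion:

```python
def estimate_claude_tokens(oai_tokens, inflation=1.2):
    """Convert an OpenAI-tokenizer count to a rough Claude 3 count
    by inflating ~20% (an assumption, since Claude's tokenizer is private)."""
    return round(oai_tokens * inflation)

for label, oai in [("Web UI", 1626), ("API (2048)", 1659), ("API (4096)", 3263)]:
    print(f"{label}: {oai} OAI tokens ~= {estimate_claude_tokens(oai)} Claude tokens")
```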
I would call this irrefutable evidence that the web UI is now limited to 2048 output tokens (roughly 1600 OAI tokens is likely about 2000 Claude 3 tokens).
I have been sent (and have found on my account) examples of old responses that were obviously 4096 tokens in length, meaning this is a new change.
I have seen reports of people being able to get responses over 2048 tokens, which makes me think this is A/B testing.
This means that, if you're working with a long block of code, your cap is effectively HALVED, as you need to ask claude to continue twice as often.
This is absolutely unacceptable. I would understand if this was a limit imposed on free users, but I have Claude Pro.
EDIT: I am almost certain this is an A/B test, now. u/Incenerer posted a comment down below with instructions on how to check which "testing buckets" you're in.
https://www.reddit.com/r/ClaudeAI/comments/1f4xi6d/the_maximum_output_length_on_claudeai_pro_has/lkoz6y3/
So far, both I and another person who's limited to 2048 output tokens have this gate set to true:
{
"gate": "segment:pro_token_offenders_2024-08-26_part_2_of_3",
"gateValue": "true",
"ruleID": "id_list"
}
Please test this yourself and report back!
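If you want to automate the check, the idea boils down to scanning the gates payload for your segment name. A minimal sketch, assuming the gates arrive as a JSON array shaped like the snippets above (the real response shape in the browser's network tab may differ):

```python
import json

# Hypothetical excerpt of the gates payload -- the two entries mirror the
# gate objects quoted in this post; the wrapping key is an assumption.
payload = json.loads("""
{
  "feature_gates": [
    {"gate": "segment:pro_token_offenders_2024-08-26_part_2_of_3",
     "gateValue": "true",
     "ruleID": "id_list"},
    {"gate": "segment:inas9yh4296j1g41",
     "gateValue": "false",
     "ruleID": "default"}
  ]
}
""")

def gate_value(gates, name):
    """Return the gateValue string for the named gate, or None if absent."""
    for g in gates:
        if g["gate"] == name:
            return g["gateValue"]
    return None

print(gate_value(payload["feature_gates"],
                 "segment:pro_token_offenders_2024-08-26_part_2_of_3"))
```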
EDIT2: They've since hashed/encrypted the name of the bucket. Look for this instead:
{
"gate": "segment:inas9yh4296j1g41",
"gateValue": "false",
"ruleID": "default"
}
EDIT3: The gates and limit are now gone: https://www.reddit.com/r/ClaudeAI/comments/1f5rwd3/the_halved_output_length_gate_name_has_been/lkysj3d/
This is a good step forward, but it doesn't address the main question: why were they implemented in the first place? I think we should still demand an answer, because it feels like they're only sorry they got caught.
Claude also has a 45-message limit per 5 hours for Pro subscribers. Is there any way to get around it?
Claude has 3 models and I have mostly been using Sonnet. From my initial observations, these limits apply to all the models at once.
I.e., if I exhaust the limit with Sonnet, does that also restrict me from using Opus and Haiku? Is there any way to get around it?
I can also use API keys if there's a really trusted integrator - any help?
Update on documentation: From what I've seen so far, the docs don't call out the limitations prominently; they mention that there is a limit, but only vaguely refer to its dynamic nature.
Edit (18 July, 2025):
Claude has silently tightened the limits of Claude Code; people are repeatedly facing this issue: "Invalid model. Claude Pro users are not currently able to use Opus 4 in Claude Code" - see also https://github.com/anthropics/claude-code/issues/3566
Make no mistake, I love Claude to the core. I was probably among the mid-early adopters of Claude, and I love the Artifact generation more than anything. But these limitations are really bad. Some power users are really happy on the Claude Max plan because they were able to get it to work precisely; I think that has more to do with prompt engineering and context engineering. I hope that sooner or later Claude can be as accessible as ChatGPT is nowadays.
Edit ( 7 sept, 2025):
The fact that this post is still getting so much attention is a testament to Claude not listening to its users. I love Claude and Claude Code too much, and I'm a fan of Anthropic adding new features. Unfortunately, Claude Code also hits "Compacting conversation" too quickly - for me at least - and while the limits are honestly a little better now, the cooldown period is painful.
I find myself reaching that cursed "Message limit reached for Claude 3 Opus" too often, and it's really frustrating because I've found Claude quite pleasant to interact and work with. I'm wondering, can't Anthropic at least provide the option to pay extra when needing to go over quota, rather than just being forced to stop in the middle of a productive conversation? Kind of like what phone companies do when you need more data than the package you've paid for allows...