OpenRouter Cost Comparison Across Different AI Models
Price / Cost comparison of popular LLMs from an OpenRouter/API perspective. Thoughts on monthly fees. (long)
Which API is more cost-effective? Direct DeepSeek API, OpenRouter, or Chutes?
So I'm beginning to use the different AIs on OpenRouter.
The main ones I've been using are GPT-4o, Claude Sonnet, Claude Haiku, and Command R Plus.
-GPT-4o and Command R Plus give great responses, but cost a lot.
-Claude Sonnet gives longer responses than Claude Haiku, but both work fairly well unless it's NSFW stuff (for which I use a JB), and they don't cost as much. I tried Claude Opus, but man, that was expensive as shit and I probably won't use it again.
So from what I've been seeing, the costs are about this:
-GPT-4o (~$0.015)
-Command R Plus (~$0.015)
-Claude Sonnet (~$0.010)
-Claude Haiku (~$0.001)
Does that seem around what anyone else spends per response? Note, I keep my min tokens at 300, with max tokens at 350.
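For anyone who wants to sanity-check their own numbers, here's a minimal sketch of how a per-response cost falls out of per-million-token prices; the prompt size and both prices below are made-up placeholders to swap for the figures on the model's OpenRouter page, not real quotes:

```python
# Rough per-response cost from OpenRouter-style $/1M-token prices.
# Every number below is a made-up placeholder, not a real quote.
def response_cost(prompt_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Dollar cost of one chat turn given input/output prices per 1M tokens."""
    return (prompt_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1_000_000

# Hypothetical example: a 1,500-token prompt plus a 350-token reply (my max tokens),
# at assumed prices of $5/M input and $15/M output.
print(f"${response_cost(1_500, 350, 5.0, 15.0):.4f} per response")
```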
Hello everyone,
Given the discussion of pricing, tiers, and costs of these AI sites going around, I figured some hard numbers would be helpful so people can understand what we're looking at. It might give some perspective.
As someone who uses OpenRouter for API key access to LLMs, I pay money and choose which models I want to interact with, and each chat in/out costs money based on that specific model. While we don't know what kind of pricing a chatbot site will get if they host their own model, rent servers, negotiate bulk discounts, etc., it gives us an understanding of the cost of these LLMs from a highly popular site.
TLDR:
If you come home at night, chat a bit, have a 300-response chat, and do this daily:
Monthly cost - Personal - This is the price it COSTS ME to chat straight with the LLM, daily at 300 chat responses. I can do WAY more than this, though. It's why I don't have an issue paying $10 for a service (or more, for good models); I would pay that through the API if I weren't giving it to a good site which provided me all kinds of other services along with the access to the LLM.
DS V3 = $2.61/m
2.0 Flash = $1.49/m
Llama = $1.149/m
2.5 Pro = $28.98/m
Nemo = $0.0051/m
Tokens vs Characters
The first thing everyone has to understand is tokens vs characters.
Tokens are units of text (words, subwords, or punctuation) used by LLMs, where 1 token ≈ 4 characters or 0.75 words in English. Characters are individual letters, numbers, or symbols. Tokens group characters for processing.
Just remember: "hello" = 1 token and 5 characters.
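If you just want to eyeball token counts without running a real tokenizer, the ~4-characters-per-token rule above is easy to script; this is only a rough estimate, and actual tokenizers vary by model:

```python
# Rough token estimate from character count, using the ~4 chars per token rule of thumb.
# Real tokenizers (tiktoken, SentencePiece, etc.) will give different counts per model.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

print(estimate_tokens("hello"))                                    # ~1 token for 5 characters
print(estimate_tokens("a typical 300-word roleplay reply " * 10))  # rough guess for a longer reply
```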
These are the top 20 most popular LLMs being used on OpenRouter for roleplay.
Now, here is a chatroom where I dropped a query to these bots and got varied response lengths, in order to map tokens to costs so we have an easily organized section to view.
Costs
If we take an average response length of 300 tokens from a chatbot during a conversation, let's look at some numbers. For this example we will ignore the cost IN, which is usually pretty minimal, and focus on the cost OUT.
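Every worked example below uses the same arithmetic: take one sampled response's cost, scale it to a ~300-token reply, then multiply by 300 replies for a session. A minimal sketch of just that formula (the 275-token / $.000266 pair is the DeepSeek sample used below):

```python
# Scale a sampled response's cost up/down to a ~300-token reply,
# then multiply by 300 replies to get the cost of one chat session.
def session_cost(sample_tokens, sample_cost, avg_tokens=300, responses=300):
    per_reply = sample_cost / sample_tokens * avg_tokens
    return per_reply, per_reply * responses

per_reply, per_session = session_cost(275, 0.000266)  # the DeepSeek V3 sample below
print(f"${per_reply:.5f} per response, ${per_session:.3f} per 300-response session")
```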
Let's start out with DeepSeek, 2.0 Flash, and Llama 3.1 (widely used for finetunes). DeepSeek is widely considered the benchmark for roleplay, as we can see by how popular it is. These are popular because they offer a very high standard of roleplay at a cheaper price.
DeepSeek V3
275 tokens = $.000266. ($.000266 / 275) * 300 = $.00029 per response; * 300 chat responses = $.087 for a 300-response chat session.
Gemini 2.0 Flash
426 tokens = $.000236. ($.000236 / 426) * 300 = $.00016 per response; * 300 chat responses = $.049 for a 300-response chat session.
Llama 3.1
270 tokens = $.000115. ($.000115 / 270) * 300 = $.00012 per response; * 300 chat responses = $0.038 for a 300-response chat session.
Now let's look at your expensive options.
Claude Sonnet 4
We won't bother with the full math for Claude because it's ridiculously expensive; just check the image. It's about 2x more expensive than Gemini 2.5 Pro.
Gemini 2.5 Pro
2068 tokens = $.0222. ($.0222 / 2068) * 300 = $.0032 per response; * 300 chat responses = $.966 for a 300-response chat session.
Now a cheap option. You typically see your Mistral and Hermes models as finetunes.
Mistral Nemo
218 tokens = $.000000417. ($.000000417 / 218) * 300 ≈ $.00000057 per response (so small it barely registers); * 300 chat responses = $0.00017 for a 300-response chat session.
This is why you usually see Nemo and Hermes models for free. They're dirt cheap. But you know you're chatting with something like this rather than Flash or DeepSeek, unless they are very good finetunes.
If you come home at night, chat a bit, have a 300-response chat, and do this daily:
Monthly cost - Personal - This is the price it COSTS ME to chat straight with the LLM, daily at 300 chat responses. I can do WAY more than this, though. It's why I don't have an issue paying $10 for a service (or more, for good models).
DS V3 = $2.61
2.0 Flash = $1.49
Llama = $1.149
2.5 Pro = $28.98
Nemo = $0.0051
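For anyone who wants to re-run these against their own samples, the monthly figures are just the per-session costs multiplied by 30 days; a quick sketch using the same sample (tokens, cost) pairs as above lands within rounding of the list:

```python
# Monthly cost = (cost of one 300-response session of ~300-token replies) * 30 days.
# The (tokens, cost) pairs are the same samples measured above.
samples = {
    "DS V3":     (275,  0.000266),
    "2.0 Flash": (426,  0.000236),
    "Llama":     (270,  0.000115),
    "2.5 Pro":   (2068, 0.0222),
    "Nemo":      (218,  0.000000417),
}
for name, (tokens, cost) in samples.items():
    session = cost / tokens * 300 * 300   # one daily chat session
    print(f"{name}: ${session * 30:.4f}/m")
```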
So. People want to bitch about spending $3-4 a month (or even $10/m) and expect this service for FREE, when we can clearly see that even if a site offers Nemo/Hermes or other base models for free, that is still money straight from their pocket. And let's ignore the fact that most people want the better models for free.
I've heard Jupiter costs $50k/month.
Now, we need to understand that AI sites do NOT run their costs through a service like OpenRouter. They will typically pay for servers and host the LLM on those servers, doing their own finetune.
Many AI chatbot sites rely on their subscribers to offset the cost of their free services. Sites like CS have a good subscriber base, so they can afford to offer some more bots for free, but that is still a negative for the site. Since Xoul had such a large freemium base to start, not as many people had reason to subscribe.
Anyways. If you made it this far... what's wrong with you? Ha! Love to all and I'm excited for Xoul to return. Hope I could provide SOME perspective.
IN SUMMARY: If I'm averaging about 300 requests per day for the latest R1 version, how long will my $10 last if I use the direct DeepSeek API, and is that deal better than OpenRouter or Chutes? And is the DeepSeek portal no longer censoring their uncensored model's output?
Need help and would greatly appreciate your inputs.
Hello! I'm currently trying to compute and weigh my options for an API. Currently, I'm planning to spend $10 or less on credits, and hopefully with no repeat purchase if I can help it. This is for the DeepSeek R1 0528 model.
I'm having trouble quantifying the costs on a per-token basis. It's much easier to compute how much it costs per 100 requests or something like that. For example, how much does a person in our community usually spend on the direct DeepSeek API for R1 per month, and how long do your chats usually go? How many messages?
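To make the per-token math concrete, here's a minimal sketch of the conversion I'm trying to do; the prices and average token counts below are placeholders to fill in from DeepSeek's pricing page and my own chat logs, not confirmed figures:

```python
# Back-of-envelope: turn $/1M-token pricing into cost per 100 requests and days of budget.
# Every input below is a placeholder to replace with real DeepSeek pricing / my own chat sizes.
def budget_estimate(budget_usd, in_price_per_m, out_price_per_m,
                    avg_in_tokens, avg_out_tokens, requests_per_day):
    per_request = (avg_in_tokens * in_price_per_m + avg_out_tokens * out_price_per_m) / 1_000_000
    days = budget_usd / per_request / requests_per_day
    return per_request * 100, days   # (cost per 100 requests, days the budget lasts)

per_100, days = budget_estimate(
    budget_usd=10,
    in_price_per_m=0.5, out_price_per_m=2.0,   # hypothetical $/1M-token prices
    avg_in_tokens=2_000, avg_out_tokens=500,   # hypothetical context + reply sizes
    requests_per_day=300,
)
print(f"~${per_100:.2f} per 100 requests, ~{days:.0f} days on $10")
```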
I'm trying to compute which one is more cost-effective:
1. The 1,000 daily request limit for free models on OpenRouter, with a $10 maintained balance, and a questionable expiry date as per their TOS.
They say "reserves the right," so it's unclear whether they will actually expire it automatically after 365 days, or if I can just keep using the 1,000 daily request limit even after 365 days. Please see the attached image and kindly clarify if you know the deeper details.
2. Chutes, with a $5 one-time payment and a 200-request daily limit for free models.
I wasn't able to confirm the 200 daily request limit, as it isn't written anywhere I looked on the website (I haven't created an account yet), or whether the credits will expire as well if unused for a certain amount of time, AND whether I would have to repurchase if they do expire. To my understanding it should be a one-time payment, but I would greatly appreciate correction if this is wrong.
3. Just spend it directly on the DeepSeek API, even if it's not free, and have no limit aside from my actual credits.
I have no actual statistical data about this, hence why I would greatly appreciate it if someone could share their usage and its corresponding costs per month, if possible. I just want to know how long my $10 will last if I pay for the direct DeepSeek API. There's also that earlier discussion where some users said they experienced some form of censorship when using the direct DeepSeek API, and I would appreciate it if someone could confirm whether this is true or whether they have finally, completely removed the censorship from their servers/portal.