Maybe. But for me another good thing about this price is that it drastically cuts down on the completely idiotic "tests" that people on social media run on these models, which test nothing but their own ignorance. Answer from Necessary_Image1281 on reddit.com
r/OpenAI on Reddit: GPT-4.5 has an API price of $75/1M input and $150/1M output. ChatGPT Plus users are going to get 5 queries per month with this level of pricing.
February 27, 2025 - Because a small number of users are willing to pay an insane amount of money just to play around with the best available, it's in OpenAI's best interest to release models often as long as they're able to hype up the new releases ... From talking with people familiar with it, it is actually extremely expensive to run. It's giant by comparison to 4o and even larger than the original GPT-4.
r/ThinkingDeeplyAI on Reddit: I analyzed the AI API Price War between Open AI, Google and Anthropic. Here’s the brutal truth for devs and founders. It's the Golden Age of Cheap AI
June 15, 2025 -

I just went down a rabbit hole analyzing the 2025 AI API landscape, comparing the complicated API pricing of OpenAI, Google, and Anthropic. The competition is absolutely brutal, prices are really low right now, and capabilities are exploding!

I’ve crunched the numbers and summarized the key takeaways for everyone from indie hackers to enterprise architects. I’m attaching some of the key charts from the analysis to this post.

TL;DR: The 3 Big Takeaways

  • AI is stupidly cheap right now. For most apps, the API cost is a rounding error. Google in particular is destroying the competition on price. If you’ve been waiting to build, stop. This might be the cheapest AI will ever be.

  • There is NO single “best” provider. Anyone telling you "just use X" is wrong. The "best" model depends entirely on the specific task. The winner for summarizing a document is different from the winner for powering a chatbot.

  • The smartest strategy is a "Multi-Model World." The best companies are building a routing layer that picks the most cost-effective model for each specific API call. Vendor lock-in is the enemy.

Have a read through the 12 attached infographics, which give some great metric comparisons across the providers.

Part 1: The Three Tiers of AI: Brains, All-Rounders, and Sprinters

The market has clearly split into three categories. Knowing them is the first step to not overpaying.

  1. The Flagship Intelligence (The "Brain"): This is Anthropic's Claude 4 Opus, OpenAI's GPT-4o, and Google's Gemini 2.5 Pro. They are the most powerful, best at complex reasoning, and most expensive. Use them when quality is non-negotiable.

  2. The Balanced Workhorses (The "All-Rounder"): This is the market's sweet spot. Models like Anthropic's Claude 4 Sonnet, OpenAI's GPT-4o, and Google's Gemini 1.5 Pro offer near-flagship performance at a much lower cost. This is your default tier for most serious business apps.

  3. The Speed & Cost-Optimized (The "Sprinter"): These models are ridiculously fast and cheap. Think Anthropic's Claude 3.5 Haiku, OpenAI's GPT-4o mini, and Google's Gemini 1.5 Flash. They're perfect for high-volume, simple tasks where per-transaction cost is everything.

Part 2: The Price Isn't the Whole Story (TCO is King)

One of the biggest mistakes is picking the API with the lowest price per token. The real cost is your Total Cost of Ownership (TCO).

Consider a content marketing agency generating 150 blog posts a month.

  • Strategy A (Cheaper API): Use a workhorse model like GPT-4o. The API bill is low, maybe ~$50. But if the output is 7/10 quality, a human editor might spend 4 hours per article fixing it. At $50/hr, that's $30,000 in labor.

  • Strategy B (Premium API): Use a flagship model like Claude 4 Opus, known for high-quality writing. The API bill is higher, maybe ~$250. But if the output is 9/10 quality and only needs 2 hours of editing, the labor cost drops to $15,000.

Result: Paying 5x more for the API saved the company nearly $15,000 in total workflow cost. Don't be penny-wise and pound-foolish. Match the model quality to your workflow's downstream costs.
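The arithmetic behind that comparison is simple enough to sketch, using the post's own numbers (150 posts/month, $50/hr editing):

```python
def monthly_tco(api_bill, posts_per_month, edit_hours_per_post, editor_rate=50):
    """Total cost of ownership: API spend plus downstream editing labor."""
    labor = posts_per_month * edit_hours_per_post * editor_rate
    return api_bill + labor

# Strategy A: cheap workhorse model, ~$50 API bill, 4 editing hours/article
tco_a = monthly_tco(api_bill=50, posts_per_month=150, edit_hours_per_post=4)
# Strategy B: premium model, ~$250 API bill, 2 editing hours/article
tco_b = monthly_tco(api_bill=250, posts_per_month=150, edit_hours_per_post=2)

print(tco_a, tco_b, tco_a - tco_b)  # 30050 15250 14800
```

The API bill is a rounding error next to the labor term, which is the whole point of the TCO argument.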

Part 3: The Great Context Window Debate: RAG vs. "Prompt Stuffing"

This is a huge one for anyone working with large documents. The context window sizes alone tell a story: Google Gemini: up to 2M tokens, Anthropic Claude: 200K tokens, OpenAI GPT-4: 128K tokens.

  • The Old Way (RAG - Retrieval-Augmented Generation): You pre-process a huge document, break it into chunks, and store it in a vector database. When a user asks a question, you find the most relevant chunks and feed just those to the model.

    • Pro: Very cheap per query, fast responses.

    • Con: Complex to build and maintain. A big upfront investment in developer time.

  • The New Way (Long-Context / "Prompt Stuffing"): With models like Google's Gemini, you can just stuff the entire document (or book, or codebase) into the prompt and ask your question.

    • Pro: Incredibly simple to develop. Go from idea to production way faster.

    • Con: Can be slower and MUCH more expensive per query.

The trade-off is clear: Developer time (CapEx) vs. API bills (OpEx). The reports show for an enterprise research assistant querying a 1,000-page document 1,000 times a month, the cost difference is staggering: RAG is ~$28/month vs. the naive Long-Context approach at ~$1,680/month.
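The structure of that comparison is easy to sketch. The parameters below are assumptions chosen to be in the ballpark of the post's scenario (a ~1,000-page document, 1,000 queries/month), and the RAG figure here ignores embedding and vector-database costs, so it comes out lower than the post's ~$28:

```python
# Hypothetical parameters -- plug in your own document size and pricing.
DOC_TOKENS = 670_000        # rough token count for a 1,000-page document
QUERIES_PER_MONTH = 1_000
CHUNK_TOKENS = 4_000        # tokens of retrieved context per query under RAG

def long_context_cost(price_per_m_input):
    # Naive approach: resend the entire document with every query.
    return DOC_TOKENS * QUERIES_PER_MONTH * price_per_m_input / 1e6

def rag_cost(price_per_m_input):
    # RAG: only the retrieved chunks hit the model on each query.
    return CHUNK_TOKENS * QUERIES_PER_MONTH * price_per_m_input / 1e6

print(long_context_cost(2.50))  # assumed $2.50/M input -> about $1,675/month
print(rag_cost(0.075))          # cheap "sprinter" at $0.075/M -> about $0.30/month
```

The multiplier is just `DOC_TOKENS / CHUNK_TOKENS`: every token you avoid resending is paid for once in developer time instead of every month in API bills.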

Part 4: Who Wins for YOUR Use Case?

Let's get practical.

  • For the Hobbyist / Indie Hacker: Cost is everything. Start with Google's free tier for Gemini. If you need to pay, OpenAI's GPT-4o mini or Google's Gemini 1.5 Flash will cost you literal pennies a month.

  • For the Small Business (e.g., Customer Service Chatbot): This is the "workhorse" battleground. For a chatbot handling 5,000 conversations a month, the cost difference is stark:

    • Google Gemini 1.5 Pro: ~$38/month

    • Anthropic Claude 4 Sonnet: ~$105/month

    • OpenAI GPT-4o: ~$125/month

    • Verdict: Google is the aggressive price leader here, offering immense value.

  • For the Enterprise: It's all about architecture. For frequent tasks, a RAG system with a cheap, fast model is the most cost-effective. For one-off deep analysis of massive datasets, the development-time savings from Google Gemini's huge context window is the key selling point.
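For what it's worth, the small-business chatbot figures above are reproducible under one plausible set of assumptions: roughly 2,000 input and 1,000 output tokens per conversation, at list prices published around that time. Treat both the token counts and the prices as assumptions:

```python
# Assumed workload: 5,000 conversations/month, ~2,000 input + 1,000 output
# tokens each. Prices are illustrative $/1M tokens from that era.
PRICES = {                      # (input, output)
    "gemini-1.5-pro":  (1.25,  5.00),
    "claude-4-sonnet": (3.00, 15.00),
    "gpt-4o":          (5.00, 15.00),
}

def monthly_cost(model, convos=5_000, in_tok=2_000, out_tok=1_000):
    p_in, p_out = PRICES[model]
    per_convo = (in_tok * p_in + out_tok * p_out) / 1e6
    return convos * per_convo

for m in PRICES:
    print(f"{m}: ${monthly_cost(m):.2f}/month")
```

This reproduces the ~$38 / ~$105 / ~$125 split, and makes it obvious the gap is pure price-per-token, not workload.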

Part 5: Beyond Text - The Multimodal Battleground

  • Images: It's a tight race. Google's Imagen 3 is cheapest for pure generation at a flat $0.03 per image. OpenAI's DALL-E/GPT-Image offers more quality tiers ($0.01 to $0.17), giving you control. Both are excellent for image analysis. Anthropic isn't in this race yet.

  • Audio: OpenAI's Whisper remains a go-to for affordable, high-quality transcription (~$0.006/minute). Google has a robust, competitively priced, and deeply integrated audio API for speech-to-text and text-to-speech.

  • Video: Google is the undisputed leader here. They are the only one with a publicly priced video generation model (Veo 2 at $0.35/second) and native video analysis in the Gemini API. If your app touches video, you're looking at Google.

Controversial Take: Is Claude Overpriced?

Let's be blunt. Claude Opus 4 costs $75.00 per million output tokens. GPT-4o costs $15.00. Gemini 2.0 Flash costs $0.40. That means Claude's flagship is 5x more expensive than OpenAI's and over 180x more expensive than Google's fast model.

Yes, Claude is excellent for some long-form writing and safety-critical tasks. But is it 5x to 180x better? For most use cases, the answer is a hard no. It feels like luxury car pricing for a slightly better engine, and for many, it's a premium trap.

Final Thoughts: The Golden Age of Cheap AI

Google is playing chess while others play checkers. They are weaponizing price to gain market share, and it's working. They offer the cheapest pricing, the largest context windows, and full multimodal support.

This is likely the cheapest AI will ever be. We're in the "growth at all costs" phase of the market. Once adoption plateaus, expect prices to rise. The single best thing you can do is build a simple abstraction layer in your app so you can swap models easily.
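As a sketch of what such an abstraction layer might look like (the model names, prices, and tiers here are purely illustrative, not a recommendation):

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    price_in: float   # $/1M input tokens
    price_out: float  # $/1M output tokens
    tier: str         # "sprinter", "workhorse", or "flagship"

# Illustrative catalog -- swap in whatever providers you actually use.
CATALOG = [
    Model("gemini-flash", 0.10,  0.40, "sprinter"),
    Model("gpt-4o",       2.50, 10.00, "workhorse"),
    Model("claude-opus", 15.00, 75.00, "flagship"),
]

def route(task_tier: str) -> Model:
    """Pick the cheapest model that meets the task's required tier."""
    order = {"sprinter": 0, "workhorse": 1, "flagship": 2}
    candidates = [m for m in CATALOG if order[m.tier] >= order[task_tier]]
    return min(candidates, key=lambda m: m.price_out)

print(route("sprinter").name)   # cheapest qualifying model
print(route("flagship").name)   # only the flagship qualifies
```

The point of the layer is that swapping providers, or repricing the catalog when the "growth at all costs" phase ends, touches one file instead of your whole codebase.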

The future isn't about one AI to rule them all. It's about using the right tool for the right job.

Now, go build something amazing while it's this cheap.

What are your go-to models? Have you found any clever cost-saving tricks?

r/OpenAI on Reddit: Looking at OpenAI's Model Lineup and Pricing Strategy
March 20, 2025 -

Well, I've been studying OpenAI's business moves lately. They seem to be shifting away from their open-source roots and focusing more on pleasing investors than regular users.

Looking at this pricing table, we can see their current model lineup:

  • o1-pro: A beefed-up version of o1 with more compute power

  • GPT-4.5: Their "largest and most capable GPT model"

  • o1: Their high-intelligence reasoning model

The pricing structure really stands out:

  • o1-pro output tokens cost a whopping $600 per million

  • GPT-4.5 is $150 per million output tokens

  • o1 is relatively cheaper at $60 per million output tokens

Honestly, that price gap between models is pretty striking. The thing is, input tokens are expensive too - $150 per million for o1-pro compared to just $15 for the base o1 model.

So, comparing this to competitors:

  • Deepseek-r1 charges only around $2.50 for similar output

  • The qwq-32b model scores better on benchmarks and runs on regular computers

The context window sizes are interesting too:

  • Both o1 models offer 200,000 token windows

  • GPT-4.5 has a smaller 128,000 token window

  • All support reasoning tokens, but have different speed ratings

Basically, OpenAI is using a clear market segmentation strategy here. They're creating distinct tiers with significant price jumps between each level.

Anyway, this approach makes more sense when you see it laid out - they're not just charging high prices across the board. They're offering options at different price points, though even their "budget" o1 model is pricier than many alternatives.

So I'm curious - do you think this tiered pricing strategy will work in the long run? Or will more affordable competitors eventually capture more of the market?

r/LocalLLaMA on Reddit: Cost comparisons between OpenAi, Mistral, Claude and Gemini
January 9, 2024 -

Update: This is now out of date. Please use a dedicated cost calculator like llmpricecheck.com or https://docsbot.ai/tools/gpt-openai-api-pricing-calculator

I compiled a breakdown of cost/performance in a google sheet and there were a couple of things that struck me:

  1. Mistral-medium is really impressive and sits perfectly sandwiched between GPT-3.5 and GPT-4. In my (limited) experience it's a great choice for anyone that isn't able to get consistency or quality out of GPT-3.5.

  2. Why would anyone choose Claude? No but seriously, what is their competitive advantage? Safety?

  3. GPT-3.5 vs Gemini Pro seems very close, but in my (limited) experience GPT-3.5 does perform better in practice. I'd be curious what other people's experiences are.

Google sheet is here, please let me know if I bungled any of the numbers.

  • All prices are normalized to USD/1M tokens. (EUR -> USD conversion value as of today)

  • I only used the LMSYS chatbot arena benchmark because all the rest have seemingly been gamed already.

  • The sheet includes an example estimate calculator; just copy the sheet and override the values to try it.

Edit: Added Mistral-7B-OpenOrca, Mixtral-8x7B-Instruct-v0.1 and Llama-2-70b-chat-hf running on Anyscale.

r/ChatGPT on Reddit: OpenAI's pricing insanity: GPT-4.5 costs 15x more than 4o while DeepSeek & Google race ahead
March 21, 2025 -

Looks like we're about to add another item to Masayoshi Son's list of SoftBank funding failures. OpenAI just released the next version of their flagship LLM, and the pricing is absolutely mind-boggling.

GPT-4.5 vs GPT-4o:

  • Performance: Barely any meaningful improvement

  • Price: 15x more expensive than GPT-4o

  • Benchmark position: Still behind DeepSeek R1 and qwq32B

But wait, it gets worse. The new o1-Pro API costs a staggering $600 per million tokens - that's 300x the price of DeepSeek R1, which is already confirmed to be a 671B parameter model.

What exactly is Sam Altman thinking? Two years have passed since the original GPT-4 release, and what do we have to show for it?

All GPT-4.5 feels like is just a bigger, slightly smarter version of the same 2023 model architecture - certainly nothing that justifies a 15x price hike. We're supposed to be witnessing next-gen model improvements continuing the race to AGI, not just throwing more parameters at the same approach and jacking up prices.

After the original GPT-4 team left OpenAI, it seems they've accomplished little in actually improving the core model. Meanwhile:

  • Google is making serious progress with Gemini 2.0 Flash

  • DeepSeek is delivering better performance at a fraction of the cost

  • Claude continues to excel in many areas

Is OpenAI's strategy just "throw more computing at the problem and see what happens"? What's next? Ban DeepSeek? Raise $600B? Build nuclear plants to power even bigger models?

Don't be shocked when o3/GPT-5 costs $10k per API call and still lags behind Claude 4 in most benchmarks. Yes, OpenAI leads in some coding benchmarks, but many of us are using Claude for agent coding anyway.

TL;DR: OpenAI's new models cost 15-300x more than competitors with minimal performance improvements. The company that once led the AI revolution now seems to be burning investor money while competitors innovate more efficiently.

r/singularity on Reddit: With the insane prices of recent flagship models like GPT-4.5 and O1-Pro, is OpenAI trying to limit DeepSeek's use of its API for training?
February 1, 2025 -

Look at the insane API price that OpenAI has put out: $600 for 1 million tokens?? No way, this price is never realistic for models like o1 and GPT-4.5, whose benchmark scores aren't that much better. It's 40 times the price of Claude 3.7 Sonnet just to rank slightly lower and lose?

OpenAI is deliberately doing this, killing two birds with one stone. These two models are primarily intended to serve the chat function on ChatGPT.com, so they're both increasing the value of the $200 ChatGPT Pro subscription and preventing DeepSeek or any other company from cloning or retraining based on o1, avoiding the mistake they made when DeepSeek launched R1, which was almost on par with o1 at a training cost 100 times cheaper.

And to any OpenAI fanboys who still believe this is a realistic price: it's impossible. OpenAI still offers the $200 Pro subscription while allowing unlimited use of o1 Pro, supposedly worth $600 per 1 million tokens. No way. If OpenAI's cost to serve o1 Pro were really that high, even $200/day for ChatGPT Pro still wouldn't be realistic for unlimited o1 Pro usage.

Either OpenAI is trying to hide and wait for DeepSeek R2 before releasing their secret models (like GPT-5 and full o3), and since they still have to release something in the meantime they're playing tricks with DeepSeek to avoid a repeat of what happened with R1, or OpenAI is genuinely falling behind in the competition.

r/homeassistant on Reddit: Those of you using OpenAI as your LLM, how much is it costing you each month?
March 13, 2025 -

EDIT: The answer appears to be "sign up to platform.openai.com instead of ChatGPT, because then you only get charged for the tokens you use, and not the $20/month ChatGPT charge"

Thanks to everyone who answered, I'm up and running, I'll feedback if it starts costing too much!

EDIT 2: Apparently google is too hard for a lot of people, so here's a FAQ for all of those who hijacked this for something else:

  1. Just read the docs on the OpenAI integration, it's all there, no hardware required unless you want to talk to it in which case you'll need one of the hardware voice assistants.

  2. I'm using it to make my smart home more intelligent - there are loads of examples on Youtube of what people are doing, I want to use OpenAI to do the same thing, so I followed the tutorials on there and got it working

Yes, this is blunt, yes, I think people should share knowledge, but I'm also not going to do your homework for you.

================

I don't have the money or the interest to spend on running a local LLM, so I want to run hosted.

I've noticed the OpenAI API is billed "per million tokens" rather than ChatGPT which is billed at $20USD/month, so I'm starting to work out how much it will cost me to run OpenAI as the backend for my HA setup.

Please note that I am only interested in hearing from people who are already running OpenAI with HA - if you're not doing this, I'm sure your project is awesome and if this doesn't work then I'll definitely be interested in what I should use instead, but right now I need this specific question answered.

Thanks in advance for your time!

r/OpenAI on Reddit: "Vision" model price comparison tool
April 21, 2024 -

I was having trouble figuring out which "vision" model was the most cost effective, since they all calculate pricing slightly differently. OpenAI does that weird 512x512 tile calculation, while Claude converts resolution to tokens with a formula.

I had GPT4 whip up a very quick tool to do the comparison and it turned out great. I thought I'd just leave it here for others to use.

https://ansonlai.github.io/AI-Model-Price-Comparison/
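The two counting schemes the post mentions can be sketched in code. These formulas follow the providers' documentation from that era (OpenAI's 85-token base plus 170 tokens per 512px tile after downscaling; Anthropic's tokens ≈ width × height / 750) and may have changed since, so treat them as illustrative:

```python
import math

def openai_image_tokens(width, height, detail="high"):
    """OpenAI-style 512px tiling, as documented for the GPT-4 vision era."""
    if detail == "low":
        return 85
    # Scale to fit within 2048x2048, then shortest side to 768.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

def claude_image_tokens(width, height):
    """Anthropic's published approximation: tokens ~= (w * h) / 750."""
    return (width * height) / 750

print(openai_image_tokens(1024, 1024))  # 765
print(claude_image_tokens(1024, 1024))  # ~1398
```

Multiply each token count by the model's input price and you get the per-image comparison the linked tool automates.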

r/OpenAI on Reddit: API prices 🥴😩 | computer Use | file search | web search
March 11, 2025 - Computer use is pretty reasonable at $3/$12 for input/output tokens. File search is a bit more expensive at $2500, but probably only if you have a huge number of files and tons of searches.
r/OpenAI on Reddit: Openai API pricing
April 25, 2024 -

Hello, I have been developing a side project that uses OpenAI's latest GPT-4o API for its vision capabilities.

I am trying to do a cost analysis. My API requests are pretty consistent, at around 34k input and 2k output tokens, however the charges I'm seeing vary a lot.

I should be paying about 10.3 cents per request, but it ranges between 13-20 cents per request.

What am I doing wrong here? Thanks.
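One possible explanation (an assumption, not a diagnosis): for vision requests, image tokens are billed on top of the text tokens, and their count depends on each image's resolution and detail setting, so the billed input can exceed the ~34k the app counts. A quick checker, with GPT-4o rates of $2.50/M input and $10/M output assumed for illustration:

```python
def request_cost(in_tokens, out_tokens, price_in, price_out):
    """Cost in dollars for one request; prices are $/1M tokens."""
    return (in_tokens * price_in + out_tokens * price_out) / 1e6

# Assumed rates -- check the current pricing page for real numbers.
base = request_cost(34_000, 2_000, 2.50, 10.00)
print(f"text only: {base * 100:.1f} cents")   # ~10.5 cents

# Each high-detail image adds hundreds to thousands of input tokens
# depending on its size; a few images per request can push the
# effective input well past 34k and into the 13-20 cent range.
with_images = request_cost(34_000 + 5_000, 2_000, 2.50, 10.00)
print(f"with image tokens: {with_images * 100:.1f} cents")
```

Comparing the billed token counts in the usage dashboard against the app's own counts would confirm or rule this out.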

r/OpenAI on Reddit: ChatGPT API Pricing Comparison
July 25, 2022 - Anyone that doesn’t already have their own LLM will be integrating this into their services for support bots and anything else people can hook it up to. ... GPT-4.5 has an API price of $75/1M input and $150/1M output.
r/OpenAI on Reddit: OpenAI is reportedly considering high-priced subscriptions up to $2,000 per month for next-gen AI models
July 1, 2024 - The price does not cover the training but it does cover the inference. They are not even running any big models at this moment with GPT-4o. ... Your company won't exist at all anymore. If a tech company like Google or OpenAI genuinely develops an AGI that can replace all human labour and is smart enough to achieve any task on its own, that also replaces the need for any other company to exist.
r/OpenAI on Reddit: Can someone explain to me why the price of ChatGPT+ in Europe is the most expensive in the world, while most features are closed
July 22, 2025 -

I've just looked at the prices of ChatGPT+ around the world, and it's quite disturbing: Europe is quite simply the most expensive region for the subscription, at around €23 to €25 per month, VAT included. However, many features are blocked for us; I am thinking in particular of options that are inaccessible for one reason or another.

In comparison:

  • Türkiye: ~€12

  • Brazil: ~€15

  • United States: $20 without VAT

  • Nigeria: ~€6 (!)

And in the United Arab Emirates? ChatGPT Plus is… free for residents, via a local partnership.

I understand that there are adjustments depending on local taxation, but why charge more for a service... which offers less? 🤷‍♂️

r/OpenAI on Reddit: It seems that OpenAI’s inference costs easily eclipsed its revenues.
November 12, 2025 -

Exclusive: Here's How Much OpenAI Spends On Inference and Its Revenue Share With Microsoft

According to the documents viewed by this newsletter, OpenAI spent $5.02 billion on inference alone with Microsoft Azure in the first half of calendar year 2025 (CY2025).

This is a pattern that has continued through the end of September. By that point in CY2025 — three months later — OpenAI had spent $8.67 billion on inference.

OpenAI’s inference costs have risen consistently over the last 18 months, too. For example, OpenAI spent $3.76 billion on inference in CY2024, meaning that OpenAI has already doubled its inference costs in CY2025 through September.

Based on its reported revenues of $3.7 billion in CY2024 and $4.3 billion in revenue for the first half of CY2025, it seems that OpenAI’s inference costs easily eclipsed its revenues.

Yet, as mentioned previously, I am also able to shed light on OpenAI’s revenues, as these documents also reveal the amounts that Microsoft takes as part of its 20% revenue share with OpenAI.

Concerningly, extrapolating OpenAI’s revenues from this revenue share does not produce numbers that match those previously reported.

According to the documents, Microsoft received $493.8 million in revenue share payments in CY2024 from OpenAI — implying revenues for CY2024 of at least $2.469 billion, or around $1.23 billion less than the $3.7 billion that has been previously reported.

Similarly, for the first half of CY2025, Microsoft received $454.7 million as part of its revenue share agreement, implying OpenAI’s revenues for that six-month period were at least $2.273 billion, or around $2 billion less than the $4.3 billion previously reported. Through September, Microsoft’s revenue share payments totalled $865.9 million, implying OpenAI’s revenues are at least $4.329 billion.

According to Sam Altman, OpenAI’s revenue is “well more” than $13 billion. I am not sure how to reconcile that statement with the documents I have viewed.

r/OpenAI on Reddit: OpenAI is BACK in the AI race. A side-by-side comparison between DeepSeek R1 and OpenAI o3-mini
December 23, 2024 -

For the entire month of January, I’ve been an OpenAI hater.

I’ve repeatedly and publicly slammed them. I talked extensively about DeepSeek R1, their open-source competitor, and how a small team of Chinese researchers essentially destroyed OpenAI at their own game.

I also talked about Operator, their failed attempt at making a useful “AI agent” that can perform tasks fully autonomously.

However, when Sam Altman declared that they were releasing o3-mini today, I thought it would be another failed attempt at stealing the thunder from actual successful AI companies. I was 110% wrong. O3-mini is BEYOND amazing.

What is O3-mini?

OpenAI’s o3-mini is their new and improved Large Reasoning Model.

Unlike traditional large language models which respond instantly, reasoning models are designed to “think” about the answer before coming up with a solution. And this process used to take forever.

For example, when I integrated DeepSeek R1 into my algorithmic trading platform NexusTrade, I increased all of my timeouts to 30 minutes... for a single question.

Pic: My application code polls for a response for approximately 30 minutes

However, OpenAI did something incredible. Not only did they make a reasoning model that’s cheaper than their previous daily usage model, GPT-4o...

Pic: The cost of GPT-4o vs. OpenAI o3-mini

And not only is it simultaneously more powerful than their previous best model, O1...

Pic: O3 is better at PhD-level science questions than O1-preview, O1, and O1-mini

BUT it’s also lightning fast. Much faster than any reasoning model that I’ve ever used by far.

And, when asked complex questions, it answers them perfectly, even better than o1, DeepSeek’s R1, and any other model I’ve ever used.

So, I thought I'd benchmark it. Let’s compare OpenAI’s o3-mini to the hottest language model of January, DeepSeek R1.

A side-by-side comparison of DeepSeek R1 and OpenAI o3-mini

We’re going to do a side-by-side comparison of these two models for one complex reasoning task: generating a complex, syntactically-valid SQL query.

We’re going to compare these models on the basis of:

  • Accuracy: did the model generate the correct response?

  • Latency: how long did the model take to generate its response?

  • Cost: approximately, which model cost more to generate the response?

The first two categories are pretty self-explanatory. Here’s how we’ll compare the cost.

We know that DeepSeek R1 costs $0.75/M input tokens and $2.4/M output tokens.

Pic: The cost of R1 from OpenRouter

In comparison, OpenAI’s o3-mini is $1.10/M input tokens and $4.40/M output tokens.

Pic: The cost of O3-mini from OpenAI

Thus, o3-mini is approximately 2x more expensive per request.

However, if the model generates an inaccurate query, there is automatic retry logic within the application layer.

Thus, to compute the costs, we’re going to see how many times the model retries, count the number of requests that are sent, and create an estimated cost metric. The baseline cost for a single R1 request (no retries) will be c; a single o3-mini request costs 2c, because it’s twice as expensive.

Now, let’s get started!

Using LLMs to generate a complex, syntactically-valid SQL query

We’re going to use an LLM to generate syntactically-valid SQL queries.

This task is extremely useful for real-world LLM applications. By converting plain English into a database query, we change our interface from buttons and mouse-clicks into something we can all understand – language.

How it works is:

  1. We take the user’s request and convert it to a database query

  2. We execute the query against the database

  3. We take the user’s request, the model’s response, and the results from the query, and ask an LLM to “grade” the response

  4. If the “grade” is above a certain threshold, we show the answer to the user. Otherwise, we throw an error and automatically retry.
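The four steps above can be sketched as a retry loop. Here `generate_sql`, `run_query`, and `grade` are hypothetical stand-ins for the LLM calls and database access described in the post:

```python
def answer_question(question, generate_sql, run_query, grade,
                    threshold=0.8, max_retries=5):
    """Steps 1-4: generate a query, execute it, grade the answer,
    and retry automatically when the grade falls below the threshold."""
    error = None
    for attempt in range(1, max_retries + 2):   # first try + retries
        sql = generate_sql(question, previous_error=error)
        try:
            rows = run_query(sql)
        except Exception as exc:                # e.g. invalid SQL
            error = str(exc)
            continue
        score = grade(question, sql, rows)
        if score >= threshold:
            # The attempt count doubles as the cost multiplier
            # (attempts * c for R1, attempts * 2c for o3-mini).
            return rows, attempt
        error = f"grade too low: {score}"
    raise RuntimeError("gave up after retries")
```

Counting `attempt` is exactly what feeds the c-based cost metric used in the scoring below.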

Let’s start with R1

For this task, I’ll start with R1. I’ll ask R1 to show me strong dividend stocks. Here’s the request:

Show me large-cap stocks with:

  • Dividend yield >3%

  • 5 year dividend growth >5%

  • Debt/Equity <0.5

I asked the model to do this two separate times. In both tests, the model either timed out or didn’t find any stocks.

Pic: The query generated from R1

Just from manual inspection, we see that:

  • It is using total liabilities, (not debt) for the ratio

  • It’s attempting to query for the full year earnings, instead of using the latest quarter

  • It’s using an average dividend yield for a trailing twelve month dividend figure

Finally, I had to check the db logs directly to see the amount of time elapsed.

Pic: Screenshots of the chat logs in the database

These logs show that the model finally gave up after 41 minutes! That is insane! And obviously not suitable for real-time financial analysis.

Thus, for R1, the final score is:

  • Accuracy: it didn’t generate a correct response = 0

  • Cost: with 5 retry attempts, it costs 5c + 1c = 6c

  • Latency: 41 minutes

It’s not looking good for R1...

Now, let’s repeat this test with OpenAI’s new O3-mini model.

Next is O3

We’re going to ask the same exact question to O3-mini.

Unlike R1, the difference in speed was night and day.

I asked the question at 6:26PM and received a response 2 minutes and 24 seconds later.

Pic: The timestamp in the logs from start to end

This includes 1 retry attempt, one request to evaluate the query, and one request to summarize the results.

In the end, I got the following response.

Pic: The response from the model

We got a list of stocks that conform to our query. Stocks like Conoco, CME Group, EOG Resources, and DiamondBack Energy have seen massive dividend growth, have a very low debt-to-equity, and a large market cap.

If we click the “info” icon at the bottom of the message, we can also inspect the query.

Pic: The query generated from O3-mini

From manual inspection, we know that this query conforms to our request. Thus, for our final grade:

  • Accuracy: it generated a correct response = 1

  • Cost: 1 retry attempt + 1 evaluation query + 1 summarization query = 3c * 2 (because it’s twice as expensive) = 6c

  • Latency: 2 minutes, 24 seconds

For this one example, we can see that o3-mini is better than R1 in every way. It’s an order of magnitude faster, it costs the same, and it generated an accurate query to a complex financial analysis question.

To be able to do all of this for a price less than last year's daily-usage model is absolutely mind-blowing.

Concluding Thoughts

After DeepSeek released R1, I admit that I gave OpenAI a lot of flak. From being extremely, unaffordably expensive to completely botching Operator, and releasing a slow, unusable toy masquerading as an AI agent, OpenAI has been taking many Ls in the month of January.

They made up for ALL of this with O3-mini.

This model put them back in the AI race at a staggering first place. O3-mini is lightning fast, extremely accurate, and cost effective. Like R1, I’ve integrated it for all users of my AI-Powered trading platform NexusTrade.

This release shows the exponential progress we’re making with AI. As time goes on, these models will continue to get better and better for a fraction of the cost.

And I’m extremely excited to see where this goes.

This analysis was performed with my free platform NexusTrade. With NexusTrade, you can perform comprehensive financial analysis and deploy algorithmic trading strategies with the click of a button.

Sign up today and see the difference O3 makes when it comes to making better investing decisions.

Pic: Perform financial research and deploy algorithmic trading strategies

Top answer: …never left.

Another highly-upvoted reply:
Hmm, that is an interesting figure that people aren't considering when comparing numbers, so it turns out we are still comparing apples to oranges. Right now people would look at $1.10/million input + $4.40/million output for o3-mini, compare it with DeepSeek R1's $0.14 and $2.19, and conclude R1 is significantly cheaper (the comparison given in AI Explained's video). Not only are the numbers different from other providers like your OpenRouter numbers (because IIRC they have different max context lengths), but IIRC $0.14 was DeepSeek V3's time-limited discount price.

And THEN factor in: what if it takes R1 (or o1) a million tokens to get the correct answer, but o3-mini takes 200k tokens? Comparing price per token made sense when we were talking about regular base models like 4o, Sonnet, DeepSeek V3, Llama 3, etc., because the number of tokens output would be similar across all models, but that is no longer true for reasoning models.

I could charge $0.01 per million output tokens and take 10 million tokens to get to the correct answer. Or I could charge $0.10 per million tokens and take 1 million tokens. Or I could charge $1 per million tokens and take only 100k tokens. All three would actually cost the exact same, but at first glance the $0.01 model would appear cheaper than the $1 model even though it's not. There is currently a lack of a standard for comparing model costs.
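The commenter's point is easy to make concrete: what matters is the sticker price multiplied by the tokens a model actually burns to reach an answer. Using the comment's own hypothetical numbers:

```python
def cost_per_answer(price_per_m_tokens, tokens_to_answer):
    """Effective cost of one answer: price times tokens actually consumed."""
    return price_per_m_tokens * tokens_to_answer / 1e6

# The comment's three hypothetical reasoning models:
print(cost_per_answer(0.01, 10_000_000))  # cheap per token, very verbose
print(cost_per_answer(0.10, 1_000_000))
print(cost_per_answer(1.00, 100_000))     # pricey per token, concise
# All three come out to about $0.10 per answer.
```

For reasoning models, benchmarking cost per *solved task* rather than cost per token is the only fair comparison.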
r/OpenAI on Reddit: Realtime API is still too expensive, how do you stay profitable?
July 5, 2025 -

I'm trying to build a voice agent for a B2C product, and I never realized how expensive it is. I get that it's easy to be profitable with B2B agents, since you reduce payroll(s), but I don't see how this could be profitable for B2C.

Do you charge per usage or just price it very expensive?