🌐
OpenAI
platform.openai.com › docs › guides › latest-model
Using GPT-5.2 | OpenAI API
2 weeks ago - When generating code with GPT-5.2, medium and high verbosity levels yield longer, more structured code with inline explanations, while low verbosity produces shorter, more concise code with minimal commentary. ...

curl --request POST \
  --url https://api.openai.com/v1/responses \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-type: application/json' \
  --data '{
    "model": "gpt-5",
    "input": "What is the answer to the ultimate question of life, the universe, and everything?",
    "text": { "verbosity": "low" }
  }'
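For reference, the same verbosity setting can be sent from the OpenAI Python SDK. A minimal sketch, assuming the SDK's Responses API surface (client.responses.create) and the text.verbosity field shown in the curl example above:

# Minimal sketch, assuming the OpenAI Python SDK's Responses API and the
# text.verbosity parameter from the curl example above.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5",
    input="What is the answer to the ultimate question of life, the universe, and everything?",
    text={"verbosity": "low"},  # "low" | "medium" | "high"
)
print(response.output_text)  # convenience accessor for the concatenated text output
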
🌐
OpenRouter
openrouter.ai › openai › gpt-5-mini
GPT-5 Mini - API, Providers, Stats | OpenRouter
August 7, 2025 - GPT-5 Mini is the successor to OpenAI's o4-mini model. ... Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.
Discussions

GPT-5-mini is very slow compared to GPT-4.1-mini. What is the upgrade path?
You're comparing a non-reasoning model to a reasoning model; naturally the model that reasons is going to take more time to compute, right? Either use minimal reasoning for the fastest responses, low for a somewhat better response while still being fast, or simply keep using 4.1-mini if it's working fine for you. No need to switch if you don't have a reason to. If you can't tell whether the answers are better, check that first before worrying about upgrading. More on reddit.com
🌐 r/OpenAI
24
17
August 29, 2025
There is no way to use gpt-5 through API without giving away biometric information.
Welcome to the safety hell era of civilization. It's going to get way worse. More on reddit.com
🌐 r/singularity
10
22
August 9, 2025
Why is NOBODY talking about just how amazing GPT-5-mini is??
✅ u/TheReaIIronMan , your post has been approved by the community! Thanks for contributing to r/ChatGPTPro — we look forward to the discussion. More on reddit.com
🌐 r/ChatGPTPro
64
16
June 16, 2025
GPT-4.5 has an API price of $75/1M input and $150/1M output. ChatGPT Plus users are going to get 5 queries per month with this level of pricing.
This is the kind of pricing you'd offer for something you didn't really want people using. More on reddit.com
🌐 r/OpenAI
294
928
February 28, 2025
🌐
OpenAI
openai.com › index › introducing-gpt-5
Introducing GPT-5 | OpenAI
GPT‑5 is the new default in ChatGPT, replacing GPT‑4o, OpenAI o3, OpenAI o4-mini, GPT‑4.1, and GPT‑4.5 for signed-in users. Just open ChatGPT and type your question; GPT‑5 handles the rest, applying reasoning automatically when the ...
🌐
AI/ML API
docs.aimlapi.com › api-references › text-models-llm › openai › gpt-5-mini
gpt-5-mini | AI/ML API Documentation
1 month ago -

import requests
import json  # for getting a structured output with indentation

response = requests.post(
    "https://api.aimlapi.com/v1/chat/completions",
    headers={
        # Insert your AIML API Key instead of <YOUR_AIMLAPI_KEY>:
        "Authorization": "Bearer <YOUR_AIMLAPI_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-5-mini-2025-08-07",
        "messages": [
            {
                "role": "user",
                "content": "Hello",  # insert your prompt here, instead of Hello
            }
        ],
    },
)
data = response.json()
print(json.dumps(data, indent=2, ensure_ascii=False))
🌐
Replicate
replicate.com › openai › gpt-5-mini
OpenAI GPT-5-mini
August 7, 2025 - From older models: Use gpt-5 for o3 and gpt-4.1 tasks. Start with medium or minimal reasoning depending on complexity. From Chat Completions: Switch to the Responses API to support reasoning carryover (CoT).
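A minimal sketch of that migration in the OpenAI Python SDK, assuming the Responses API's previous_response_id parameter is what carries reasoning (CoT) between turns:

# Sketch of moving from Chat Completions to the Responses API, assuming
# previous_response_id is used to carry reasoning state across turns.
from openai import OpenAI

client = OpenAI()

first = client.responses.create(
    model="gpt-5-mini",
    reasoning={"effort": "medium"},  # start with medium or minimal, per the note above
    input="Summarize the trade-offs between gpt-5 and gpt-5-mini.",
)

# Follow-up turn: pass the previous response id instead of resending history,
# so the model's reasoning from the first turn carries over.
follow_up = client.responses.create(
    model="gpt-5-mini",
    previous_response_id=first.id,
    input="Now give a one-line recommendation.",
)
print(follow_up.output_text)
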
🌐
CometAPI
cometapi.com › gpt-5-mini-api
GPT-5 mini API - CometAPI - All AI Models in One API
August 8, 2025 - Endpoint: https://api.cometapi.com/v1/chat/completions · Model Parameter: "gpt-5-mini" / "gpt-5-mini-2025-08-07"
🌐
GitHub
github.blog › home › changelogs › openai gpt-5 and gpt-5 mini are now generally available in github copilot
OpenAI GPT-5 and GPT-5 mini are now generally available in GitHub Copilot - GitHub Changelog
September 9, 2025 - GPT-5 mini is available to all GitHub Copilot plans, including Copilot Free, while GPT-5 is available only to paid Copilot plans.
🌐
OpenRouter
openrouter.ai › openai › gpt-5-mini › api
OpenAI: GPT-5 Mini – Run with an API
August 7, 2025 - GPT-5 Mini is the successor to OpenAI's o4-mini model. ... OpenRouter provides an OpenAI-compatible completion API to 400+ models & providers that you can call directly, or using the OpenAI SDK.
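A minimal sketch of the "using the OpenAI SDK" path, assuming OpenRouter's standard base URL and the openai/gpt-5-mini model slug from the listing above:

# Sketch: call GPT-5 Mini through OpenRouter's OpenAI-compatible endpoint.
# The base_url and model slug are assumptions based on the listing above.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<YOUR_OPENROUTER_KEY>",
)

completion = client.chat.completions.create(
    model="openai/gpt-5-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(completion.choices[0].message.content)
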
🌐
OpenAI
platform.openai.com › docs › models › gpt-5-mini
GPT-5 mini Model | OpenAI API
GPT-5 mini is a faster, more cost-efficient version of GPT-5. It's great for well-defined tasks and precise prompts. Learn more in our GPT-5 usage guide. ... Pricing is based on the number of tokens used, or other metrics based on the model type.
🌐
Reddit
reddit.com › r/openai › gpt-5-mini is very slow compared to gpt-4.1-mini. what is the upgrade path?
r/OpenAI on Reddit: GPT-5-mini is very slow compared to GPT-4.1-mini. What is the upgrade path?
August 29, 2025 -

Hey all, excuse me if I am being dense here, but I am genuinely confused about what is the obvious upgrade path for gpt-4.1-mini now that the GPT-5 family is out. I'll just say at the outset that the caveat to this whole post is that I am not evaluating answer quality at all, just the latency.

I run a small, free service that returns AI answers to Canadian tax questions: TaxGPT.ca. It's a little hobby project for me, and it gets a reasonable amount of usage.

I use gpt-4.1-mini to answer most questions. I have a RAG pipeline, which means I feed it documents from our tax agency. Users ask questions, I augment them with some tax info, and then I send all of it to OpenAI and get a response. When I hit my API directly, it returns answers in around 8 seconds. Not exactly lightning fast, but it seems like a nice balance of accuracy vs speed.

Now that GPT-5* is out, I figured the obvious upgrade would be to switch from gpt-4.1-mini to gpt-5-mini. But when trying it out as a drop-in replacement, I am finding the response times are much slower, like ~13 seconds. The answers might be better (hard to tell), but they are definitely slower (easy to tell).

I spun up a little demo app to record response times for different API calls for a simple conversation with a system message and 3 'turns'.

Since my app is currently using the completions API route, the easiest change for me is just to change the model name. But since gpt-5-mini is a reasoning model, if you use the newer API route, you can dial the 'effort' variable up or down. So I recorded response times for both the old and new API routes, including all levels of reasoning effort.

High-level results are this:

Completions API (legacy)

  • 4.1-mini average response time: 1.1 s

  • 5-mini average response time: 7.7 s

Responses API (includes reasoning "effort")

  • 4.1-mini average response time: 1.2 s

  • 5-mini average response time, minimal effort: 2.2 s

  • 5-mini average response time, low effort: 3.8 s

  • 5-mini average response time, medium effort (default): 7.9 s

  • 5-mini average response time, high effort: 25.7 s

It leaves me confused about the intended upgrade path here for people using the last mini model.

It seems crazy that the default call to 5-mini takes almost 7 times longer in this simple example when using the older API (admittedly, the proportional increase is less dramatic in real-world usage, but that's probably down to the rest of my answering pipeline).

Is the idea that gpt-5-mini with "low" or even "minimal" reasoning is the better bet here?

I understand that everything depends on context (ha), so you should tune for your use case, but the most straightforward approach would be to just change the model name, which makes the latency jump by a huge margin.

If you boil it down to Steve Jobs' question, "Which one do I tell my friends to buy?", which one am I supposed to use?

I almost feel like the answer right now is "don't upgrade."

Full results

Each run includes this 4-message conversation.

- system: You have a Canadian accent. Respond in 1 sentence.
- user: "what is the population of quebec city
- assistant: Quebec City, eh? I think it's about 550000 or so bud.
- user: what should i do there?

Completions API route (legacy)

| Model | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | avg time (ms) |
|---|---|---|---|---|---|---|
| 4.1 mini | 974 | 1739 | 823 | 866 | 971 | 1075 |
| 5 mini | 9790 | 7826 | 5917 | 7433 | 7581 | 7709 |

Responses API route (includes reasoning "effort")

| Model | Effort | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | avg time (ms) |
|---|---|---|---|---|---|---|---|
| 4.1 mini | N/A | 1656 | 1351 | 800 | 1196 | 1157 | 1232 |
| 5 mini | minimal | 2439 | 2093 | 2437 | 1910 | 1876 | 2151 |
| 5 mini | low | 4287 | 3066 | 3922 | 3147 | 4746 | 3834 |
| 5 mini | medium | 7083 | 9144 | 6952 | 5245 | 10844 | 7854 |
| 5 mini | high | 20988 | 42812 | 18342 | 23225 | 22931 | 25660 |
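
A rough sketch of how a comparison like this could be timed with the OpenAI Python SDK, assuming the Responses API's reasoning.effort parameter discussed in the post; the prompt and loop are illustrative, not the poster's actual script.

# Rough timing sketch for comparing reasoning-effort levels on gpt-5-mini,
# assuming the OpenAI Python SDK's Responses API. Illustrative only.
import time
from openai import OpenAI

client = OpenAI()
PROMPT = "What should I do in Quebec City? Respond in 1 sentence."

for effort in ["minimal", "low", "medium", "high"]:
    start = time.perf_counter()
    client.responses.create(
        model="gpt-5-mini",
        reasoning={"effort": effort},
        input=PROMPT,
    )
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"gpt-5-mini effort={effort}: {elapsed_ms:.0f} ms")
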
🌐
Analytics Vidhya
analyticsvidhya.com › home › how to access gpt-5 via api?
How to Access GPT-5 via API?
August 30, 2025 - First, set up your API credentials, for example by exporting OPENAI_API_KEY as an environment variable. Then install, or upgrade, the OpenAI SDK to use GPT-5. From there, you can call the GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano) like any other model through the API.
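A minimal sketch of that setup with the OpenAI Python SDK; the shell steps are noted in comments and the prompt is illustrative:

# Setup sketch: `pip install --upgrade openai`, export OPENAI_API_KEY in your
# shell, then call the GPT-5 models like any other model through the API.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

for model in ["gpt-5", "gpt-5-mini", "gpt-5-nano"]:
    response = client.responses.create(model=model, input="Say hello in one word.")
    print(model, "->", response.output_text)
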
🌐
OpenAI
platform.openai.com › docs › models › gpt-5
GPT-5 Model | OpenAI API
November 13, 2025 - Batch API price: Input $1.25 · Cached input $0.125 · Output $10.00. Quick comparison (input): GPT-5 $1.25 · GPT-5 mini $0.25 · GPT-5 nano $0.05. Modalities: Text (input and output) · Image (input only).
🌐
Benjamin Crozat
benjamincrozat.com › gpt-5-api
GPT‑5: my API quick start guide
September 29, 2025 - Use mini when cost matters and answers can be shorter. Use nano for the lowest latency. Remember the limits: 272,000 input tokens plus up to 128,000 output tokens, for 400,000 tokens total.
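A small sketch of staying inside those limits, assuming the Responses API's max_output_tokens parameter; the specific cap is illustrative:

# Sketch: pick a GPT-5 variant and cap the output so input plus output stay
# within the 272k-input / 128k-output window mentioned above.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5-mini",       # or "gpt-5-nano" for the lowest latency
    input="Summarize the GPT-5 context limits in one sentence.",
    max_output_tokens=1024,   # illustrative cap, well under the 128k maximum
)
print(response.output_text)
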
🌐
Apidog
apidog.com › blog › gpt-5-api
How to use GPT 5 API ?
August 8, 2025 - Leverage Variants: Choose gpt-5-mini ($0.25/1M input tokens, $2/1M output tokens) or gpt-5-nano ($0.05/1M input tokens, $0.40/1M output tokens) for cost-sensitive tasks. The full gpt-5 model costs $1.25/1M input tokens and $10/1M output tokens. Monitor Token Usage: Track usage in responses to stay within budget. Refer to OpenAI’s pricing page for details. Test with Apidog: Run small-scale tests to optimize prompts before scaling.
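A sketch of that token-usage monitoring, assuming the OpenAI Python SDK's Chat Completions usage fields and the gpt-5-mini prices quoted above:

# Sketch: read token usage from a response and estimate cost with the
# gpt-5-mini prices quoted above ($0.25/1M input, $2/1M output).
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-5-mini",
    messages=[{"role": "user", "content": "Give me one tip for cutting token spend."}],
)

usage = completion.usage
cost = usage.prompt_tokens * 0.25 / 1e6 + usage.completion_tokens * 2.00 / 1e6
print(f"input={usage.prompt_tokens} output={usage.completion_tokens} est. cost=${cost:.6f}")
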
🌐
OpenAI
platform.openai.com › docs › models
Models | OpenAI API
2 weeks ago - Version of GPT-5.2 that produces smarter and more precise responses.
🌐
OpenAI
platform.openai.com › docs › models › gpt-5-pro
GPT-5 pro Model | OpenAI API
October 6, 2025 - GPT-5 pro is available in the Responses API only to enable support for multi-turn model interactions before responding to API requests, and other advanced API features in the future. Since GPT-5 pro is designed to tackle tough problems, some requests may take several minutes to finish.
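Since those requests can run for minutes, a minimal sketch that simply raises the SDK client's timeout; the model name follows the page title and the 900-second value is an arbitrary illustration:

# Sketch: GPT-5 pro is Responses-API-only and can take several minutes per
# request, so give the client a generous timeout. 900 s is illustrative.
from openai import OpenAI

client = OpenAI(timeout=900)  # seconds

response = client.responses.create(
    model="gpt-5-pro",
    input="Work through a hard scheduling problem step by step.",
)
print(response.output_text)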