openai whisper speech to text

curl --request POST \
  --url https://api.openai.com/v1/audio/translations \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/german.mp3 \
  --form model=whisper-1

In this case, the input audio was German and the output text looks like:

OpenAI

openai.com › index › whisper

Introducing Whisper | OpenAI

About a third of Whisper’s audio dataset is non-English, and it is alternately given the task of transcribing in the original language or translating to English. We find this approach is particularly effective at learning speech to text translation and outperforms the supervised SOTA on CoVoST2 to English translation zero-shot.

Videos

m.youtube.com

OpenAI API Python Whisper TTS: Tutorial | Audio ...

m.youtube.com

Build a voice assistant with OpenAI Whisper and TTS (text to ...

06:12

YouTube

How to Transcribe Audio to Text on Your Own PC with OpenAI Whisper ...

July 17, 2024

06:08

YouTube

Transcribe Audio Files for Free Using OpenAI Whisper - YouTube

Master OpenAI's Text-to-Speech & Speech-to-Text: Ultimate Tutorial ...

github.com › openai › whisper

GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
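As a rough illustration of the three tasks named in this snippet, the sketch below assumes the open-source `openai-whisper` Python package and its `load_model` and `transcribe` calls. The helper name `describe_audio` and the file name `german.mp3` are hypothetical. The model is passed in as an argument so the helper can also be tried with a stand-in object.

```python
def describe_audio(model, path):
    """Run recognition, then English translation, on one audio file.

    `model` is expected to behave like the object returned by
    whisper.load_model(); `describe_audio` is a hypothetical helper.
    """
    # task="transcribe" keeps the spoken language; the result dict also
    # reports the language Whisper detected.
    original = model.transcribe(path, task="transcribe")
    # task="translate" asks for English output instead.
    english = model.transcribe(path, task="translate")
    return {
        "language": original["language"],
        "text": original["text"].strip(),
        "english": english["text"].strip(),
    }


def main():
    # Imported lazily: assumes `pip install -U openai-whisper` and ffmpeg.
    import whisper

    model = whisper.load_model("base")  # assumption: the "base" checkpoint
    print(describe_audio(model, "german.mp3"))  # hypothetical file name
```

Calling `main()` downloads the checkpoint on first use, so it is left uncalled here.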

Starred by 92.1K users

Forked by 11.5K users

Languages Python

reddit.com › r/openai › ways to use whisper for speech-to-text

r/OpenAI on Reddit: Ways to use Whisper for speech-to-text

February 5, 2024

Hello all! I've been using a great speech-to-text feature on the OpenAI website. I go to this link, click on a green microphone icon, and then upload audio files from my computer. It works really well for converting speech to text. But recently, I saw a message saying that the current method I use is legacy and suggesting I use a new method at this other link. The problem is that uploading audio isn't available in the chat Playground. Thus, I'm worried that soon I won't be able to upload audio the way I do now.

Does anyone know another method to convert speech to text that doesn't use complete mode on Playground? Thanks for any help!

Top answer

1 of 4

I'm late to the game. But if the Whisper tutorial is too complicated for you (like it was for me) I found a dead-simple way to transcribe a file. This is after hours fighting with ChatGPT and Co-pilot. Just drop it into the dictate function in Word, and it'll spit out a time-stamped, speaker-recognizing transcript. https://support.microsoft.com/en-us/office/transcribe-your-recordings-7fc2efec-245e-45f0-b053-2a97531ecf57

2 of 4

I want to use this speech to text function on chatGPT, cause I'm visually impaired. Is this method safe for writers?

freeCodeCamp

freecodecamp.org › news › how-to-turn-audio-to-text-using-openai-whisper

How to Turn Audio to Text using OpenAI Whisper

March 5, 2024 - Do you know what OpenAI Whisper is? It’s the latest AI model from OpenAI that helps you to automatically convert speech to text. Transforming audio into text is now simpler and more accurate, thanks to OpenAI’s Whisper.

Microsoft Learn

learn.microsoft.com › en-us › azure › ai-foundry › openai › whisper-quickstart

Convert speech to text with Azure OpenAI in Microsoft Foundry Models - Azure OpenAI | Microsoft Learn

Learn how to use the Azure OpenAI Whisper model for speech to text conversion.

Hugging Face

huggingface.co › openai › whisper-large-v3

openai/whisper-large-v3 · Hugging Face

Whisper is a Transformer based encoder-decoder model, also referred to as a sequence-to-sequence model. There are two flavours of Whisper model: English-only and multilingual. The English-only models were trained on the task of English speech recognition. The multilingual models were trained simultaneously on multilingual speech recognition and speech translation.

Microsoft Learn

learn.microsoft.com › en-us › azure › ai-services › speech-service › whisper-overview

The Whisper model from OpenAI - Foundry Tools | Microsoft Learn

In this article, you learn about the Whisper model from OpenAI that you can use for speech to text and speech translation.

Whisperui

whisperui.com

Affordable Speech to Text powered by OpenAI Whisper | WhisperUI

You can transform audio files into text and SRT files by using OpenAI Whisper Speech to Text.
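How WhisperUI builds its SRT files is not stated here. As a minimal sketch, the code below assumes segments shaped like the `segments` list returned by the open-source Whisper `transcribe` call (dicts with `start`, `end`, and `text`). The function names are hypothetical.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn a list of {"start", "end", "text"} dicts into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: counter, time range, caption text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file should give subtitles that most video players accept.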

Microsoft Learn

learn.microsoft.com › en-us › azure › ai-services › openai › whisper-quickstart

Speech to text with the Azure OpenAI Whisper model

This quickstart explains how to use the Azure OpenAI Whisper model for speech to text conversion.

Replicate

replicate.com › openai › whisper

Whisper AI by Open AI - Run with an API on Replicate

openai/whisper · Convert speech in audio to text · Public · 142.7M runs · This model costs approximately $0.0061 to run on Replicate, or 163 runs per $1, but this varies depending on your inputs.
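The "163 runs per $1" figure follows from the quoted per-run cost, rounded down to whole runs. A small sketch of that arithmetic, with hypothetical helper names and the listing's approximate price as the only input:

```python
import math

COST_PER_RUN_USD = 0.0061  # approximate figure quoted in the listing


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole runs that fit in one dollar at the given per-run cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(runs, cost_per_run=COST_PER_RUN_USD):
    """Rough total in USD; the real cost varies with the inputs."""
    return runs * cost_per_run
```

With these assumptions, `runs_per_dollar()` gives 163, and a thousand runs come to roughly $6.10.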

GitHub

github.com › openai › whisper › discussions › 608

Really Real Time Speech To Text · openai/whisper · Discussion #608

Hi, all. I recommend Whisper-Streaming for really real-time speech-to-text. There's a self-adaptive latency policy -- based on the actual complexity of the source.

Author openai

OpenAI

platform.openai.com › docs › api-reference › audio

Audio | OpenAI API Reference

The translated text. ...

curl https://api.openai.com/v1/audio/translations \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/path/to/file/german.m4a" \
  -F model="whisper-1"

from openai import OpenAI

client = OpenAI()
audio_file = open("speech.mp3", "rb")
transcript = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
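The snippet above covers the translations endpoint. A transcription call with the same Python SDK might look like the following sketch. The helper name `transcribe_file` is hypothetical, and the client is passed in so that the helper does not create one itself.

```python
def transcribe_file(client, path, model="whisper-1"):
    """Return the transcript text for the audio file at `path`.

    `client` is expected to be an openai.OpenAI instance;
    `transcribe_file` is a hypothetical helper name.
    """
    # The context manager closes the file handle even if the request fails.
    with open(path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
        )
    return transcript.text


def main():
    # Imported lazily: assumes `pip install openai` and OPENAI_API_KEY set.
    from openai import OpenAI

    print(transcribe_file(OpenAI(), "speech.mp3"))
```

Unlike the translations endpoint, which returns English, this call returns text in the language that was spoken.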

Hugging Face

huggingface.co › spaces › openai › whisper › discussions › 76

openai/whisper · Can Whisper be used for real-time speech to text?

I had the same problem and so I’ve ... https://github.com/alesaccoia/VoiceStreamAI ...

OpenAI

openai.com › index › introducing-our-next-generation-audio-models

Introducing next-generation audio models in the API | OpenAI

With these new audio models, developers ... text-to-speech voices—all within the API. ... We’re introducing new gpt-4o-transcribe and gpt-4o-mini-transcribe models with improvements to word error rate and better language recognition and accuracy, compared to the original Whisper ...
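Word error rate is commonly computed as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below shows that calculation under simplifying assumptions: text is only lowercased and split on whitespace, which is probably cruder than the normalization used in published benchmarks. The function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For example, a hypothesis that drops one word from a six-word reference scores 1/6, and an exact match scores 0.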

YouTube

youtube.com › watch

How to Use OpenAI's Whisper for Perfect Transcriptions (Speech to Text) - YouTube

08:14

In this step-by-step tutorial, I show you how to use OpenAI's Whisper AI to get incredibly accurate transcriptions from any audio or video file, completely f...

Published October 8, 2025

Towards Data Science

towardsdatascience.com › home › latest › use openai whisper for automated transcriptions

Use OpenAI Whisper for Automated Transcriptions | Towards Data Science

June 26, 2025 - Models that can transcribe (speech -> text), synthesize speech (text -> speech), and also handle speech-to-speech, where you have a whole conversation with a language model, with audio going both in and out. The architecture and training pipeline for OpenAI’s Whisper model.

DataCamp

datacamp.com › tutorial › converting-speech-to-text-with-the-openAI-whisper-API

Speech to Text Made Easy with the OpenAI Whisper API | DataCamp

April 20, 2023 - Transcription is a process of converting spoken language into text. In the past, it was done manually, and now we have AI-powered tools like Whisper that can accurately understand spoken language.

OpenAI

platform.openai.com › docs › guides › speech-to-text › quickstart

Speech to text

Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.

Gladia

gladia.io › blog › what-is-openai-whisper

Gladia - What is OpenAI Whisper?

In terms of features, it provides speech-to-text transcription and translation to English, without additional audio intelligence features like speaker diarization, summarization, or others.