Converting a word from the script of one language to that of another (to get the pronunciation) is called transliteration, and it is not supported by the Cloud Translate API yet.
However, a feature request has been filed for it. You can vote for this feature by clicking "+1" and the "STAR" mark to receive updates on it.
I was also looking for the same answer. I haven't found a way to get the pronunciation with the Google Translate API yet. However, I found a way to get it through the Python googletrans library. The code is as follows.
You must use googletrans version 3.1.0a0 to avoid errors.
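# pip install googletrans==3.1.0a0
from googletrans import Translator
translator = Translator()
SENTENCE = "안녕하세요. 반갑습니다."
# Detect the source language, then "translate" into that same language
# so the result carries the romanized pronunciation of the original text.
LANGUAGE_CODE = translator.detect(SENTENCE).lang
k = translator.translate(SENTENCE, dest=LANGUAGE_CODE)
print(k)
print(k.text)
print(k.pronunciation)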
Output:
Translated(src=ko, dest=ko, text=안녕하세요. 반갑습니다., pronunciation=annyeonghaseyo. bangabseubnida., extra_data="{'translat...")
안녕하세요. 반갑습니다.
annyeonghaseyo. bangabseubnida.
I never see anyone talking about this but it's so easy. Google Translate allows you to record your voice and it guesses what you are saying. Perhaps if you change services like Siri or Alexa to your target language it would also work.
I was having trouble pronouncing the Russian word for hello, Здравствуйте. The Latin transliteration of this is zdravstvuyte. Note that this word looks like pure nonsense, to me anyway. But by using Google Translate I was forced to pronounce it accurately for the program to recognize the word. Then I could play back their recording of a native speaker saying the word and practice it. All you have to do is Google "translate Russian to English" and this pops up.
Of course it's not as good as having a native speaker listen to your speaking but it's very convenient.
Since this question was asked, it's gotten much harder to "scrape" MP3s from Google Translate, but Google has (finally) set up a TTS API. Interestingly, it is billed per input character, with the first 1 or 4 million input characters per month being free (depending on whether you use WaveNet or old-school voices, respectively).
Nowadays to do this using gcloud on the command line (versus building this into an app) you would do roughly as follows (I'm paraphrasing the TTS quick start). You need base64, curl, gcloud, and jq for this walkthrough.
- Create a project on the GCP console, or run something like
    gcloud projects create example-throwaway-tts
- Enable billing for the project. Do this even if you don't intend to exceed the freebie quota.
- Use the GCP console to enable the TTS API for the project you just set up.
- Use the console again, this time to make a new service account.
  - Use any old name
  - Don't give it a role. You'll get a warning. This is okay.
  - Select key type JSON if it isn't already selected
  - Click Create
  - Hold onto the JSON file that your browser downloads
- Set an environment variable to point at that file, e.g.
    export GOOGLE_APPLICATION_CREDENTIALS="$HOME/Downloads/service-account-file.json"
- Get the appropriate access token:
  - Tell gcloud to use that new project:
      gcloud config set project example-throwaway-tts
  - Set a variable
      TTS_ACCESS_TOKEN=$(gcloud auth application-default print-access-token)
- Put together a JSON request. I'll give an example below. For this example we'll call it request.json

Lastly, run the following
curl \
  -H "Authorization: Bearer $TTS_ACCESS_TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data @request.json \
  "https://texttospeech.googleapis.com/v1/text:synthesize" \
  | jq -r '.audioContent' \
  | base64 --decode > very_simple_example.mp3
What this does is to
- authenticate using the default access token for the project you set up
- set the content type to JSON (so that jq can extract the payload)
- use request.json as the data to send using curl's --data flag (the @ prefix makes curl read the file; with --data-raw the literal string @request.json would be sent instead)
- extract the value of audioContent from the response (jq -r prints it without the surrounding quotes)
- base64 decode that content
- save the whole mess as an MP3
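If you would rather not depend on jq and base64, the last stages of that pipeline can be sketched in Python. This is only a rough sketch: extract_audio is a name I made up, and fake_response is a stand-in so the snippet runs offline. The only thing it assumes about the real response is what the pipeline above already relies on, namely that the JSON has an audioContent field holding base64 text.

```python
import base64
import json


def extract_audio(response_text: str) -> bytes:
    """Return the raw audio bytes from a text:synthesize JSON response."""
    payload = json.loads(response_text)
    # audioContent carries the audio as a base64-encoded string.
    return base64.b64decode(payload["audioContent"])


# Stand-in response (hypothetical) so the sketch runs without calling the API.
fake_response = json.dumps(
    {"audioContent": base64.b64encode(b"ID3 demo").decode("ascii")}
)
audio = extract_audio(fake_response)
print(len(audio), "bytes decoded")
```

With a real response saved to a file, you would read that file, pass its text to extract_audio, and write the returned bytes to an .mp3 file opened in binary mode.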
Contents of request.json follow. You can see where to insert your desired text, adjust the voice or change output formats via audioConfig:
{
  "input": {
    "text": "very simple example"
  },
  "voice": {
    "languageCode": "en-gb",
    "name": "en-GB-Standard-A",
    "ssmlGender": "FEMALE"
  },
  "audioConfig": {
    "audioEncoding": "MP3"
  }
}
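If your text contains quotes or non-ASCII characters, hand-editing the JSON gets error-prone, so it may be easier to generate it. Here is a minimal sketch, assuming the same fields as the request above; build_request and its parameter names are my own invention, not part of any Google library.

```python
import json


def build_request(text, language_code="en-gb", voice_name="en-GB-Standard-A",
                  gender="FEMALE", encoding="MP3"):
    """Build a request body with the same fields as the request.json above."""
    return {
        "input": {"text": text},
        "voice": {
            "languageCode": language_code,
            "name": voice_name,
            "ssmlGender": gender,
        },
        "audioConfig": {"audioEncoding": encoding},
    }


# json.dumps handles quoting and escaping; ensure_ascii=False keeps
# non-Latin text readable in the output.
request_json = json.dumps(build_request("very simple example"),
                          indent=2, ensure_ascii=False)
print(request_json)
```

Redirect the output to request.json (or write request_json to that file with UTF-8 encoding) and use it with the curl command as before.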
Original Answer
As Hugolpz alludes, if you know the word or phrase you want (via a previous Translate API call), you can get MP3s from a URL like http://translate.google.com/translate_tts?ie=UTF-8&q=Bonjour&tl=fr
Note that &tl=fr ensures that you get French instead of the default English.
You will need to rate-limit yourself, but if you're looking for a small number of words or phrases you should be fine.
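For phrases with spaces or accented characters the query string has to be URL-encoded. A small sketch of building such a URL follows; tts_url is a hypothetical helper name, the endpoint and parameters are just the ones from the URL above, and I make no promise that the endpoint still answers requests.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above; it may no longer be reachable.
TTS_ENDPOINT = "http://translate.google.com/translate_tts"


def tts_url(text: str, lang: str) -> str:
    """Build a translate_tts URL for the given text and target language."""
    # urlencode percent-encodes the text as UTF-8, matching ie=UTF-8.
    query = urlencode({"ie": "UTF-8", "q": text, "tl": lang})
    return f"{TTS_ENDPOINT}?{query}"


print(tts_url("Bonjour", "fr"))
print(tts_url("ça va", "fr"))
```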
Similar functionality is provided by the Speech Synthesis API (under development). Third-party libraries are already there, such as ResponsiveVoice.JS.
Right click the page and select "Inspect Element", then go to the network tab. Now, refresh the page with the network panel still open. Wait until nothing is showing up there anymore. While waiting, make sure not to get your mouse near the Listen button. Once nothing is showing up in the network panel, hover and click the listen button. As soon as you hover the listen button, an entry will appear that says "batchexecute". Find this entry. It should be above entries that say log?format=json&hasfast=….
Click on that and then, on the right, select the "Response" tab. There should be a long string of seemingly random characters that runs far off the screen to the right.
Select just that text and copy it. The easiest way to do this is to scroll all the way to the right first and then click and hold to the right of the ending quotation mark, then move your mouse up to the line above, then move your mouse down to reach the starting quotation mark, holding the mouse the whole time.
Go to the Console tab, type v= and paste the copied text, then press Enter. Then paste the following into the console and press Enter:
{
const a = document.createElement("a");
a.href = "data:audio/mp3;base64,"+JSON.parse(v)[0];
a.download = "file.mp3";
a.click();
}
The mp3 file will download.
- Google the word whose pronunciation you want to download by entering the query "How to pronounce word".
- Right-click the page and click View page source.
- Search for mp3.
- Click the mp3 link.
- Click the 3 dots and click Download.
How do translating services know when a word is a name? I sorta get it between Latin languages where some names might not have direct translations, but for languages like Mandarin Chinese where the name's characters have literal meanings, how does the translator know not to translate it directly? I feel like it would be super hard to code the translator to recognize all possible names, since like in Chinese there are so many different characters that can be in a name.