AI translator recognises any speech and converts it to English

A tool to understand everyone. AI giant OpenAI has created a new tool to translate any speech into English. Called Whisper, this AI translator could be the future of allowing everyone to understand each other.

OpenAI’s Whisper AI translator

Trained on around 680,000 hours of audio and transcripts, Whisper is an advanced AI translation service. Currently, the program is able to recognise 98 different languages and translate them into English.

OpenAI describes the tool as an “encoder-decoder transformer”. In broad terms, the encoder processes the audio and the decoder generates the text, taking the surrounding words into account rather than detecting each word in isolation. As so many languages rely on context for certain phrases, this is integral for accurate translation.

In an explanation of the software, reported by Ars Technica, OpenAI sets out how the technology works:

“Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder. A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.”
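To make the first and last steps of that description more concrete, here is a minimal Python sketch of the 30-second chunking and of the kind of special-token prefix that tells the model which task to perform. It is an illustration, not OpenAI's code: the 16 kHz sample rate, the zero-padding of the final chunk, the function names and the exact token spellings are all assumptions made for this example.

```python
# Illustrative sketch only. The 30-second chunk length comes from OpenAI's
# description; the sample rate, padding and token spellings are assumptions.
SAMPLE_RATE = 16_000                      # assumed samples per second
CHUNK_SECONDS = 30                        # chunk length from the quoted description
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(samples):
    """Split a sequence of audio samples into fixed 30-second chunks.

    The final chunk is padded with zeros (silence) so that every chunk
    has exactly CHUNK_SAMPLES samples.
    """
    chunks = []
    for start in range(0, len(samples), CHUNK_SAMPLES):
        chunk = list(samples[start:start + CHUNK_SAMPLES])
        chunk.extend([0.0] * (CHUNK_SAMPLES - len(chunk)))
        chunks.append(chunk)
    return chunks


def task_prompt(language, task, timestamps=False):
    """Build a hypothetical special-token prefix for the decoder.

    `language` is the spoken language code (for example "fr") and `task`
    is either "transcribe" or "translate" (to English).
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unknown task: {task}")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 70)    # 70 seconds of silence
    chunks = split_into_chunks(audio)
    print(len(chunks), "chunks of", len(chunks[0]), "samples each")
    print(task_prompt("fr", "translate"))
```

In the real model, each chunk would then be converted into a log-Mel spectrogram before reaching the encoder. That step is left out here to keep the sketch short.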

It’s worth noting that the AI translator does not work in real time… yet. In its early state, the technology can only process pre-recorded audio. Hopefully, in the future, it will be able to translate ongoing conversations.

According to Ars, Whisper is much better than other AI-powered translation services. However, it does take a while for the software to work on mid-spec hardware. If you’re planning on using the tech frequently, make sure you have a beefy PC.

A new genre of AI

Translation through artificial intelligence is becoming a popular use for the technology. While the idea of an AI translator is still in its infancy, more and more companies are attempting to create their own versions of this tech.

For example, Facebook parent company Meta is working on a much more ambitious version of this tech. Dubbed the Universal Translator, after the device in Star Trek, it is meant to provide real-time translation from every language into every other language.

For Meta, this tech is a powerful tool for its virtual world, aka the Metaverse. As thousands of people from all over the world are expected to talk there using actual speech, the Universal Translator is designed to translate everything that is said, in real time, for every avatar.

However, Meta’s tool does seem to be far too advanced for the current generation of technology, even with its supercomputer. Will it ever be released in its intended form? Only time will tell.

Currently, OpenAI’s translation software is available for free. If you wish to try out the software on your PC, you can download it from GitHub.
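For readers comfortable with Python, trying it out might look something like the sketch below. It assumes the open-source package has been installed (for example with pip install openai-whisper) and that ffmpeg is available on the system. The helper name translate_to_english and the choice of the small "base" model are ours, picked for illustration, so treat this as a starting point rather than official usage.

```python
from pathlib import Path


def translate_to_english(audio_path, model_name="base"):
    """Translate the speech in a pre-recorded audio file into English text.

    Hypothetical helper for illustration: it wraps the whisper package's
    load_model() and transcribe() calls. Larger models tend to be more
    accurate but run more slowly on modest hardware.
    """
    path = Path(audio_path)
    if not path.is_file():
        # Fail early, before spending time loading the model.
        raise FileNotFoundError(f"No audio file found at {path}")

    import whisper  # third-party package, imported only when needed

    model = whisper.load_model(model_name)
    # task="translate" asks for English output rather than a transcript
    # in the original language.
    result = model.transcribe(str(path), task="translate")
    return result["text"].strip()
```

Calling translate_to_english("interview.mp3"), where interview.mp3 stands in for any recording you have to hand, should return the English text as a string. As noted above, expect a wait on mid-spec hardware.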

