- OpenAI launched three new artificial intelligence (AI) models
- They are intended for real-time voice tasks: reasoning, translation, and transcription
- Each is designed to be integrated into developers’ AI applications
If you are a regular ChatGPT user, you may know that you don’t need to interact with the artificial intelligence (AI) chatbot only through text: it can also talk to you and respond to your voice requests. Today, ChatGPT creator OpenAI announced three new voice models that it says will “unlock a new class of voice applications for developers.”
Each AI voice model is designed for a different purpose, including in-depth reasoning, translation, and transcription. If you’re looking for a voice model along these lines, it might be worth a shot.
According to OpenAI, the new models include the following:
- “GPT‑Realtime‑2, our first voice model with GPT‑5 class reasoning capable of handling more difficult requests and moving the conversation forward naturally.
- “GPT‑Realtime‑Translate, a new live translation model that translates speech from over 70 input languages to 13 output languages while following the pace of the speaker.
- “GPT-Realtime-Whisper, a new streaming speech-to-text model that transcribes speech live while the speaker speaks.”
OpenAI’s news post explains that the company has seen developers use AI voice models in three distinct ways: by asking the AI to perform a task; by asking the AI to explain a situation (such as a travel delay) to the user; and by having conversations in the user’s local language.
These are the use cases that OpenAI is trying to address with its new voice models. Each is designed for developers to use in their own applications, and all three are available as part of OpenAI’s Realtime API. GPT-Realtime-2 will cost $32 per one million input tokens and $64 per one million output tokens. GPT-Realtime-Translate costs $0.034 per minute, while GPT-Realtime-Whisper costs $0.017 per minute.
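For rough budgeting, the quoted prices are easy to turn into an estimate. The sketch below is in Python and uses only the per-token and per-minute rates stated above; the function names and the example session sizes are my own, not anything OpenAI publishes.

```python
# Rough cost estimator based on the pricing quoted by OpenAI:
# GPT-Realtime-2: $32 per 1M input tokens, $64 per 1M output tokens.
# GPT-Realtime-Translate: $0.034/min; GPT-Realtime-Whisper: $0.017/min.

def realtime2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a GPT-Realtime-2 session."""
    return input_tokens / 1_000_000 * 32 + output_tokens / 1_000_000 * 64

def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimated USD cost for the per-minute models."""
    return minutes * rate_per_minute

# Example: 500k input and 100k output tokens on GPT-Realtime-2,
# plus a 30-minute live translation on GPT-Realtime-Translate.
print(round(realtime2_cost(500_000, 100_000), 2))  # 22.4
print(round(per_minute_cost(30, 0.034), 2))        # 1.02
```

As the numbers suggest, the reasoning model is priced by usage volume while the translation and transcription models are priced purely by wall-clock minutes, so a long but quiet translation session costs the same as a busy one.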
If you’re looking for an AI model that can reason deeply and adapt to conversation flows, OpenAI says the new GPT-Realtime-2 option is for you. Developers can use it to check multiple sources at once, adjust tone based on user input, tap into more advanced levels of reasoning, and analyze specialized terms (such as proper nouns and phrases used in healthcare and manufacturing).
Translation apps, on the other hand, can use GPT-Realtime-Translate for real-time speech translation. Users will be able to speak their own language and have it translated and transcribed without delay. The model works with over 70 input languages and 13 output languages.
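OpenAI hasn’t published the session schema for GPT-Realtime-Translate here, so the following is only an illustrative sketch of the kind of JSON configuration message the Realtime API exchanges over its WebSocket connection. The `input_language` and `output_language` fields are my own assumptions for illustration, not documented parameters.

```python
import json

def build_translate_session(model: str, source_lang: str, target_lang: str) -> str:
    """Build a hypothetical Realtime API session-update payload for live
    translation. The language fields are illustrative assumptions, not
    documented API parameters."""
    payload = {
        "type": "session.update",
        "session": {
            "model": model,
            "input_language": source_lang,   # one of 70+ input languages
            "output_language": target_lang,  # one of 13 output languages
        },
    }
    return json.dumps(payload)

# Example: a Portuguese speaker being translated into English.
msg = build_translate_session("GPT-Realtime-Translate", "pt", "en")
print(msg)
```

In practice a developer would send a message like this over the Realtime API’s WebSocket and then stream microphone audio; consult OpenAI’s Realtime API documentation for the actual event names and fields.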
And if you want audio transcribed quickly and accurately, there is GPT-Realtime-Whisper. This model is useful for creating captions, meeting notes, and summaries as conversations are happening, OpenAI says, meaning “live products can feel faster, more responsive, and more natural.”
If you want to try out one of the new models, they are available on OpenAI’s Playground site. And if you’re using Codex, OpenAI has created a prompt that will directly add GPT-Realtime-2 to the agent coding platform.