OpenAI launches new voice intelligence features in its API

Technology | 07 May 2026, 22:24 | 2 min read

OpenAI has added new voice intelligence capabilities to its API, including real-time conversation, translation, and transcription models. The company says the tools can support customer service, education, media, and creator platforms.

OpenAI said Thursday that its API now includes a number of new voice intelligence features designed to help developers create apps that can talk, transcribe, and translate conversations with users.

GPT‑Realtime‑2

GPT‑Realtime‑2 is a new voice model built to hold realistic spoken conversations with users. Unlike its predecessor, GPT‑Realtime‑1.5, the new model is built with GPT‑5‑class reasoning, which OpenAI says is designed to handle more complicated user requests.

GPT‑Realtime‑Translate

OpenAI is also launching GPT‑Realtime‑Translate, a model designed to provide real-time translation that can “keep pace” with the user in conversation. The feature supports more than 70 input languages, meaning those it can comprehend, and 13 output languages, in which it can deliver the translation.

GPT‑Realtime‑Whisper

The company has introduced a new transcription capability called GPT‑Realtime‑Whisper. The model offers live speech-to-text functionality, capturing transcriptions as interactions occur.

“Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.

Use Cases and Applications

Companies seeking to expand customer service capabilities are an obvious target for the new tools. However, OpenAI says the features can also support a wide range of applications, including education, media, events, and creator platforms.

Safety and Guardrails

OpenAI acknowledged that, while the tools may be useful for enterprises, they could also be misused. The company said it has built guardrails to prevent abuse such as spam, fraud, or other forms of online harm. According to OpenAI, certain triggers are embedded in the system so that “conversations can be halted if they are detected as violating our harmful content guidelines.”

Availability and Pricing

All of the new voice models are included in OpenAI’s Realtime API. GPT‑Realtime‑Translate and GPT‑Realtime‑Whisper are billed by the minute, while GPT‑Realtime‑2 is billed based on token consumption.
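
For developers budgeting across the two billing schemes, the difference can be sketched in a few lines of Python. This is only a rough illustration: the `Rates` class, both function names, and every price below are hypothetical placeholders, not OpenAI's figures, and the sketch assumes per-minute usage is pro-rated by the second and that a single flat rate applies to all tokens, neither of which the announcement confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices; replace with figures from the provider's pricing page."""
    per_minute: float          # for minute-billed models (Translate, Whisper)
    per_million_tokens: float  # for token-billed models (GPT-Realtime-2)


def minute_billed_cost(seconds: float, rates: Rates) -> float:
    """Cost of a session billed by audio duration (assumes pro-rating by the second)."""
    return (seconds / 60.0) * rates.per_minute


def token_billed_cost(tokens: int, rates: Rates) -> float:
    """Cost of a session billed by tokens consumed (assumes one flat token rate)."""
    return (tokens / 1_000_000) * rates.per_million_tokens


# Placeholder numbers chosen only to make the arithmetic easy to follow.
example = Rates(per_minute=0.5, per_million_tokens=10.0)
print(minute_billed_cost(90, example))      # 1.5 minutes at 0.5 per minute
print(token_billed_cost(250_000, example))  # a quarter of a million tokens at 10.0 per million
```

The practical point is that a minute-billed session costs the same however much is said in it, while a token-billed session scales with how much the model hears and produces.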