- Google's Gemini AI assistant now supports audio file uploads.
- The AI will transcribe, summarize and extract key information from the recordings.
- The feature turns up to 10 minutes of voice memos, meetings, lectures and interviews into searchable documents.
Google Gemini has just learned to listen and understand what it hears. You can now upload audio files to the AI assistant on the web or through the mobile apps and get transcriptions, summaries and key details.
For anyone who has ever let a voice memo rot on their phone or dreaded the task of reviewing a recording of a meeting, this update could be the AI equivalent of hiring a personal note-taker.
That said, it can only handle 10 minutes of audio at a time, so long meetings are not yet an option. You can upload audio files directly by selecting audio in the usual file upload options. What sets this apart from the live voice features of Gemini Live is that it is not only about talking to the AI in real time.
Gemini Live is useful for occasional commands, but this is more about getting the AI to process audio as it does other formats. Notably, audio file upload was apparently the feature users requested most, according to Josh Woodward, Google's vice president for Gemini.
AUDIO AI
✅ Papercut fixed: You can now upload any file to @GeminiApp. Including the No. 1 request: audio files are now supported! pic.twitter.com/4te3xwlc6w (September 8, 2025)
I tested it by uploading a few sketches from old comedy albums and a phone conversation with a friend. The AI successfully transcribed every spoken word in each case, with a few small errors involving names. It was also good at pulling out key points and action items for a to-do list.
Google’s audio upload feature shows how AI tools are evolving to match the way we record information in audio journals and voice memos. Turning those recordings into something usable has generally meant relying on external transcription software. Gemini’s new feature collapses that process into a single step.
What makes the addition particularly fitting is how it complements other recent Gemini improvements. Google has already integrated Gemini into other applications, begun testing a card-based visual interface and considerably expanded Gemini's personalization options. The ability to process audio continues this trend.
The audio option is not unique to Gemini among AI assistants, but it can at least match what ChatGPT can do thanks to its Whisper transcription model. In fact, in my tests, I preferred Google’s offering.
Anthropic’s Claude also handles audio in certain developer tools, and Perplexity can extract data from YouTube videos. But Gemini’s implementation is more focused on everyday use cases.
And the output is not just a dumb transcript. You can ask Gemini to simplify the language, pull out specific speakers' comments, generate questions based on the content or create a study guide from a class discussion. Of course, the 10-minute limit puts something of a damper on making it part of daily life. Free-tier users also face daily usage limits.
Google has not published an official pricing breakdown for high-volume audio processing, but it counts against Gemini's regular quota, so anyone planning to feed it a dozen hours of legal depositions should pace themselves.