I wish to request live captions as well.
I am deaf and rely on live captioning for videos that do not have native captions.
I hope this feature is implemented soon. It’s going to be useful to a lot of people.
Typing this in October 2025. It’s outrageous how such a simple feature that has been in basic Chrome for OVER 4 YEARS (since March 2021) is STILL missing from Brave. I love Brave, but I am seriously considering switching back to Chrome if they refuse to enable live captions or create an equivalent feature. I am currently using the Windows 11 live captions feature instead, but it is much more visually obstructive and clunky than the one I remember using in Chrome. If any Brave employees are reading this, please enable this darn feature!
It’s actually NOT open-source (i.e. Brave would have to develop it on its own).
@Mattches Are there any plans on developing this feature?
Umm, kind of. There has been an open GitHub issue for it since 2021:
The issue was set to P4 (Priority type 4) back in May 2024.
If you’re not familiar, I’ll give the GitHub phrasing and then my version of it:
GitHub phrasing:
- P1 = A very extremely bad problem. We might push a hotfix for it.
- P2 = A bad problem. We might uplift this to the next planned release.
- P3 = The next thing for us to work on. It’ll ride the trains.
- P4 = Planned work. We expect to get to it “soon”.
- P5 = Not scheduled. Don’t anticipate work on this any time soon.
My translation:
- P1 = Basically needs an emergency fix. Give no delay and put all resources to resolving.
- P2 = Bad problem that should be fixed, but not as rushed. Just try to have a solution in the next release.
- P3 = We plan to get it done and have the capabilities, but it’s a bit low priority. Unless something just “clicks” this is basically the type of project the devs work on in their “spare time.”
- P4 = Something devs would like to do but either lack the current capabilities to see it to fruition or it’s just going to be an ultimate side project. It’s on their list and eventually may be done, but it could be years out…if ever.
- P5 = Issues and requests that they aren’t denying, but the chances of them ever touching it are slim unless everything else is done or something just “falls in place” that allows it to be done, such as something introduced upstream.
NOTE
For Brave, “soon” often means a year or more. Sometimes quicker, but I’ve just learned to see it as 6-12 months at a minimum and then get pleasantly surprised if it appears earlier.
@Clock and others. Also just to revisit part of the restriction. If you check https://issues.chromium.org/issues/40068794 you’ll see people reported Live Captions as not being available in Chromium. The devs there stated:
Live Caption requires an API key and thus only works on Chrome and not the standard standalone Chromium without API keys.
From there they turned it into a wontfix, meaning they had no intent to improve on it.
And as I was looking at browsers like Vivaldi as well, it seems they don’t have it yet. Why? Well, this reply from one of the people from Vivaldi says it’s because of the API:
The Live Caption system requires support for Speech Recognition, which probably (I haven’t investigated) requires subscribing to a Google Cloud API which very likely requires payment per second of use.
This means Brave would have to subscribe and connect to Google’s API, assuming they are making it available to browsers outside of Chrome, or they would have to build it out themselves. It’s not a quick and simple thing where they can just flip a switch to turn it on.
How does this require an API key or a payment per second of use? Chrome Live Caption is an offline on-device feature requiring a one-time download of the captioning engine.
From Chrome’s blog post announcing live captions:
These captions in Chrome are created on-device, which allows the
captions to appear as the content plays without ever having to leave
your computer. Live Caption also works offline, so you can even caption
audio and video files saved on your hard drive when you play them in
Chrome.
And The Verge article explaining how to use it:
When you toggle them on, Chrome will quickly download some speech
recognition files, and then captions should appear the next time your
browser plays audio where people are talking.
Google Research article explaining how it works on Android that might be applicable to Chrome on desktop: https://research.google/blog/on-device-captioning-with-live-caption/
There is a flag called Multilingual Live Caption (#enable-live-caption-multilang) that is enabled by default but doesn’t seem to do anything. There is no result when searching for “caption” in Brave settings (brave://settings/?search=caption).
It requires components to be downloaded, which they don’t provide without the API key.
Yet one of the Chromium developers closed that issue I linked and explicitly said it requires the API key. It’s not like I pulled that out of my butt but was showing an exact official post explaining it.
Live Caption requires an API key and thus only works on Chrome and not the standard standalone Chromium without API keys.
-
Question: Is it possible for normal person to add API key or other way to make Chromium can work with Live-caption?
-
Answer: Unfortunately not. Due to the proprietary nature of the speech recognition library, it isn’t available for public use.
It works on Chrome because Google owns and controls the API. Any other Chromium browser that wants to use it has to pay, if they can even get access to it to begin with.
If you prefer an AI answer, I can let Google’s Gemini answer for you as well:
The Live Caption feature is generally not available in the open-source Chromium browser for third-party developers, even though the flags and related code may exist.
Why Live Caption is Exclusive to Google Chrome
The primary reason Live Caption is exclusive to Google Chrome (and sometimes other officially supported Chromium-based browsers like Microsoft Edge) is its dependency on proprietary Google services and components, specifically for speech recognition.
- Proprietary Speech Recognition: The Live Caption feature utilizes an on-device speech recognition service called SODA (Speech On-Device Always). While the code for integrating with this service may exist in the Chromium codebase (which is open source), the actual language models and necessary backend components (the “API” or core functionality) are proprietary and are typically downloaded from Google’s servers only by official Google products like Chrome.
- Licensing and Service Constraints: Features that rely on Google’s closed-source services, such as Live Caption or translation, are often restricted to Google’s own products due to licensing agreements and the controlled distribution of their proprietary technology.
Developer Access for Other Chromium Browsers
A developer making a different Chromium-based browser would not be able to easily use Google’s Live Caption API/service for the reasons above.
- Flags Do Not Grant Access: The flags remain in the Chromium code because they govern the interface and functionality built into the browser, but they do not automatically enable access to the required proprietary SODA files and Google’s backend infrastructure. Attempting to enable the flags in a non-Google build of Chromium often results in the feature not working.
- The Solution is Independent Implementation: To offer a live caption feature, a developer of a third-party Chromium browser would need to implement their own solution using a different speech-to-text engine. This would involve:
- Integrating with a public third-party speech recognition API (e.g., from another vendor).
- Developing their own on-device model and service to mimic the functionality of SODA.
Then followed up about the API:
That’s an excellent follow-up question that gets to the heart of the distinction between open-source code and proprietary services.
The short answer is no, the exact Live Caption technology used by Google Chrome (SODA) is not available as a publicly purchasable API for third-party developers. However, Google does offer two commercial solutions that achieve real-time speech-to-text functionality, which a developer could use to build a similar feature in a Chromium-based browser:
1. Google Cloud Speech-to-Text API (Public & Paid)
This is the primary way a developer can use Google’s advanced speech recognition technology, and it can be used to create real-time captioning.
- Model: This is a cloud-based solution (your audio is sent to Google’s servers for processing), not the on-device (offline) solution used by Chrome’s Live Caption feature (SODA).
- Functionality: It offers streaming recognition for real-time transcription, which is what you would need for live captioning.
- Cost: It is a paid service based on usage (measured per number of seconds of audio processed).
Where to find more information:
- Google Cloud Speech-to-Text API: https://cloud.google.com/speech-to-text
2. Speech-to-Text On Device (Private & Paid)
While the Live Caption feature in Chrome uses a proprietary, internal Google service called SODA (Speech On-Device Always), Google offers a commercial product with similar capabilities:
- Model: This solution is designed for use on a private device or infrastructure, which is similar to how Chrome processes its Live Captions. This allows for offline recognition and enhanced privacy since data does not leave the device.
- Availability: This feature is generally available to enterprise and business customers through a private channel. It is not a public API that you can simply sign up for online. Developers usually have to contact Google Cloud Sales to get access and integration details.
In summary, if a developer wants to add live captioning to their Chromium browser:
- They cannot use the original, free, on-device Chrome SODA service.
- They can use the Google Cloud Speech-to-Text API to build a paid, cloud-based solution.
- They may be able to gain access to the Speech-to-Text On Device solution by contacting Google Cloud Sales, if their product and business case qualify.
@Saoiray Thank you for pointing out the reasoning. I was misled by the mention of “Google Cloud API” and “payment per second of use” that you quoted from the person from Vivaldi.
Since the engine is the proprietary download, but the integration code may already be included in the Chromium codebase, would it be possible for someone to download Google Chrome and activate and download Live Caption in Chrome, then have Brave access Chrome’s downloaded engine on disk to provide Live Caption within Brave?
For everyone else here, I would also like to mention that Windows 11, macOS 13+ on Apple silicon, and iPhone 11 and later all support offline, on-device live captions, including live captions for microphone audio input.
There are also options for Android, Linux, and Python:
- Android
- Live Caption app for Linux (guide)
- Vosk for Python (all platforms)
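
For anyone curious what the Vosk route looks like, here is a rough sketch of offline captioning of a WAV file. It is only an illustration and not how Chrome or Brave do it. It assumes you have installed the `vosk` package and downloaded a model folder yourself; the names `caption_wav` and `extract_text` are made up for this example, and the file is assumed to be 16-bit mono PCM.

```python
import json
import wave


def extract_text(result_json: str) -> str:
    """Pull the recognized text out of one Vosk result string (JSON)."""
    payload = json.loads(result_json)
    # Final results use the "text" key; partial results use "partial".
    return payload.get("text") or payload.get("partial") or ""


def caption_wav(wav_path: str, model_dir: str):
    """Yield caption lines for a 16-bit mono PCM WAV file, fully offline.

    Hypothetical helper: wav_path and model_dir are whatever you have on disk.
    """
    # Imported lazily so extract_text() works even without Vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wav:
        recognizer = KaldiRecognizer(Model(model_dir), wav.getframerate())
        while True:
            chunk = wav.readframes(4000)
            if not chunk:
                break
            # AcceptWaveform returns a truthy value once an utterance is complete.
            if recognizer.AcceptWaveform(chunk):
                line = extract_text(recognizer.Result())
                if line:
                    yield line
        # Flush whatever audio is still buffered at the end of the file.
        tail = extract_text(recognizer.FinalResult())
        if tail:
            yield tail
```

You would then loop over `caption_wav("talk.wav", "path/to/model")` and print each line (both paths are placeholders). Captioning live system or microphone audio needs an extra audio-capture step that this sketch leaves out.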

