July 23, 2024
Video startup Captions launches AI Lipdub with Gen Z slang

Captions, the two-year-old AI video startup founded by Gaurav Misra, former head of design engineering at Snap and a former software development engineer (II) at Microsoft, is not resting on its laurels.

Fresh off a $25 million funding round from top VCs announced earlier this summer, the company is launching a new, dedicated AI dubbing app called Lipdub, which automatically translates and dubs any prerecorded video with spoken audio into 28 languages, using artificial intelligence to match the speaker’s lip movements to the words of the translated language.

The app is iOS-only to start, but it is free to download and does not require an existing Captions account. As with Captions, users can take the videos they edit with it and publish them to other popular platforms, such as YouTube, TikTok, and Instagram Reels.

Translating spoken words into other languages — and Gen Z slang

Lipdub also seeks a competitive edge in an increasingly crowded field of AI dubbing and audio translation services: the ability to translate using “dialects and even vocabulary, with options like Gen Z and Texas slang.”


“AI Dubbing’s success inspired us to push this technology to the next level and introduce synced lip movement to the mix,” said Misra in a statement provided to VentureBeat by a company spokesperson. “Since then we’ve been focused on making the technology widely available, leading us to create and launch Lipdub as its own, separate app.”

Presently, Captions counts more than 100,000 daily users and upwards of 5 million “creators” who have tried its products, including its iOS video creation and editing app, and its website where creators can upload and compress video, use its AI Eye Contact feature to automatically correct videos where the speaker wasn’t looking at the camera, and automatically add AI-generated subtitles and captions to videos (as the company’s name would suggest).

Other features within the Captions iOS app include AI Trim, which automatically removes “filler” words such as “uhs” and “umms,” and AI Enhance Speech, which removes background noise.

Among those using the app are Disney-owned sports network ESPN and its commentator Omar Raja, “Mr. Wonderful” of Shark Tank fame, Twitch co-founder Justin Kan, and the influencer Unnecessary Inventions.

A raucously competitive space

The news comes mere hours after rival ElevenLabs, founded by former Google and Palantir employees, announced its own prerecorded video AI Dubbing feature with support for 20 spoken languages. ElevenLabs is similarly well-funded, having raised a $19 million Series A over the summer. It has also partnered and integrated with other startups, such as Delphi, an AI personality cloning and personalized chatbot service.

But ElevenLabs is not the only well-funded competitor to Captions in the AI Dubbing and video/audio editing space.

Meta Platforms recently launched SeamlessM4T, an open-source multilingual foundational model that can understand nearly 100 languages from speech or text and generate translations into either or both in real time.

Meanwhile, other startups including MURF.AI, Play.ht, and WellSaid Labs offer “voice cloning” or completely synthetic AI voice generation that can be integrated with video, though for many of these, the emphasis is far more on audio, and they do not offer the breadth of video editing tools that Captions does.
