Meta (formerly Facebook) is introducing its first artificial intelligence offering since the AI generator industry exploded in late 2022.
The brand’s text-to-audio generator, called Voicebox, is expected to be the voice equivalent of ChatGPT, which processes text prompts into detailed written results, and Dall-E, which develops realistic artwork. Voicebox, in turn, will be able to take text prompts and produce audio clips, according to Engadget.
Meta trained the new generator on over “50,000 hours of unfiltered audio,” including public domain speech and transcripts in English, French, Spanish, German, Polish, and Portuguese. As a result, Voicebox is prepared to produce conversational-sounding speech in a variety of languages. Meta also claims its model has a one percent error rate degradation in comparison to other models.
According to Meta researchers, the model was trained by having it predict blocks of speech within a transcript instead of having to develop a body of work from scratch. The tool can also edit audio clips to remove unwanted noise or misspoken words, much as software such as Adobe Photoshop edits still images.
Meta stated it doesn’t currently plan to release the Voicebox app or source code to the public due to “the potential risks of misuse.” This is understandable, as the Federal Bureau of Investigation (FBI) recently issued a warning about the increasing use of deepfake content in crimes, including extortion, blackmail, and harassment.
The company has released audio samples with its research paper introducing the app. It also detailed potential future plans to aid “patients with vocal cord damage, in-game NPCs, and digital assistants.”
Meta is in an interesting position of trying to keep up with the current industry trends. Despite having several models of its Meta Quest VR headsets, it appears the company is no longer moving forward with its plans to develop its metaverse concept in favor of more AI innovation. Meanwhile, Apple recently introduced its first Vision Pro headset and is investing in virtual reality. Currently, Apple hasn’t showcased any major interest in AI.