Mozilla updates its open source voice data library

Mozilla’s Common Voice project released a large voice dataset last week. Common Voice is an open source collection of transcribed recordings and metadata that designers of voice apps and voice-enabled devices can use.

The Common Voice project now holds 5.5 million clips totaling 7,226 hours across 54 languages. Mozilla wants Common Voice users to integrate the data with its DeepSpeech toolkit of voice and text models.

Volunteers upload recorded clips of themselves speaking to the Common Voice project. The clips and their transcribed sentences are then collected in a voice database under the CC0 license, which allows developers to use them free of charge and without copyright restrictions.

Common Voice aims to fill gaps left by common voice tech apps, which are often criticized for not being trained on diverse datasets representing a range of accents, inflections, and languages. Alongside the Common Voice update, Mozilla also improved DeepSpeech’s recognition speed.

Source: opensource.com
