• Why Common Voice?

  • <p>Common Voice is a publicly available voice dataset, powered by the voices of volunteer contributors around the world. People who want to build voice applications can use the dataset to train machine learning models.</p> <p>At present, most voice datasets are owned by companies, which stifles innovation. Voice datasets also underrepresent non-English speakers, people of colour, disabled people, women and LGBTQIA+ people. This means that voice-enabled technology doesn't work at all for many languages, and where it does work, it may not perform equally well for everyone. We want to change that by mobilising people everywhere to share their voice.</p>

  • How does Common Voice work?

  • We're crowdsourcing an open-source dataset of voices. Donate your voice, validate the accuracy of other people's clips, make the dataset better for everyone.

  • Someone asks for a language to be added.

  • Website Localization

  • The website text is translated into that language.

  • Sentence Collection

  • Sentences are collected for people to read aloud.

  • New Language Launch

  • We launch the Common Voice site in this language.

  • Voice Contribution

  • People come and contribute their voices.

  • Voice Validation

  • Other people validate those voice clips.

  • Dataset Release

  • We release the dataset every 3 months.

  • Want to stay in touch with Common Voice?

  • Speak

  • Contributors record voice clips by reading from a bank of donated sentences.

  • Listen-Queue

  • Voice clips are entered into a submission queue that readies them for listening.

  • Listen

  • Users validate the accuracy of donated clips, checking that the speaker read the sentence correctly.

  • Is the clip valid?

  • A voice clip is marked "valid" when a user gives it a Yes vote.

  • 2 Yes votes

  • To make it into the Common Voice dataset, a voice clip must be validated by two separate users.

  • 2 No votes

  • When a user rejects a voice clip it returns to the Queue. If rejected a second time, the voice clip is moved to the Clip Graveyard.
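
The voting rule described above can be sketched in a few lines of Python. This is only an illustration, not Common Voice's actual implementation: the `Clip` class, the `vote` and `status` names, and the status strings are hypothetical. The text does not say how mixed votes (one Yes, one No) are resolved, so the sketch assumes the clip simply stays in the queue until one side reaches two votes from separate users.

```python
from dataclasses import dataclass, field

# From the text: two Yes votes accept a clip, two No votes reject it.
VOTES_NEEDED = 2


@dataclass
class Clip:
    """Hypothetical model of a recorded clip and its validation votes."""
    sentence: str
    # voter id -> True (Yes) or False (No); keyed by voter so that the
    # two votes must come from separate users.
    votes: dict = field(default_factory=dict)

    def vote(self, voter_id: str, is_valid: bool) -> None:
        # Assumption: a repeat vote by the same user overwrites the earlier one.
        self.votes[voter_id] = is_valid

    def status(self) -> str:
        yes = sum(1 for v in self.votes.values() if v)
        no = len(self.votes) - yes
        if yes >= VOTES_NEEDED:
            return "dataset"      # validated by two separate users
        if no >= VOTES_NEEDED:
            return "graveyard"    # rejected a second time
        return "queue"            # still waiting for more votes


clip = Clip("The quick brown fox.")
clip.vote("user-a", True)
clip.vote("user-b", True)
print(clip.status())  # prints "dataset"
```

Keying votes by voter means a single user voting twice cannot push a clip over the threshold on their own, which matches the "two separate users" requirement.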

  • Common Voice Dataset

  • The Common Voice Dataset contains hundreds of thousands of voice samples that help developers build voice recognition tools.

  • Clip Graveyard

  • The Clip Graveyard consists of voice clips that didn't make it into the Common Voice dataset. Just like the dataset, the Clip Graveyard is available for download.

  • We would like to thank the following people and organizations for their help with the project:

  • Get involved

  • Want to help make Common Voice even better? Great! Get in touch via email or <discourseLink>Discourse</discourseLink> forums, submit site issues via <githubLink>GitHub</githubLink>, or join the <matrixLink>Matrix</matrixLink> community chat.

  • How do I stay in touch?

  • Sign up

  • <emailFragment>Sign up</emailFragment> to our mailing list to learn how you can take part in campaigns, events and co-design features on Common Voice.

  • You can meet others in the Mozilla language communities by joining <discourseLink>Discourse</discourseLink> for topical conversations, or <matrixLink>Matrix</matrixLink> for quick advice.

  • Why ?

  • How ?

  • Partners

  • Get involved

  • How does Common Voice work?

  • Learn how to take part

  • What is a language on Common Voice?

  • There are lots of ways to think about language. For the purposes of speech recognition models, Common Voice suggests focussing on mutual intelligibility: can speakers of this language mostly understand one another if they try to?

  • We want speech models to be better at understanding a diverse range of speakers. For this to happen, a voice dataset must represent lots of different people.

  • Some languages have enormous variation in grammar, vocabulary and pronunciation. For this reason, we are <ctaLink>introducing Variants</ctaLink> in 2022. This gives communities a way to distinguish their languages within the larger dataset.

  • Common Voice is Mozilla's initiative to help teach machines how real people speak.

    Su Common nakeuh Inisiatif Mozilla jibantu peurunoe meusen kiban cara ureueng geupeugah haba.

  • Mozilla Common Voice is an initiative to help teach machines how real people speak.

  • Speak up, contribute here!

    Peugah haba, tuléh hinoe!

  • Voice is natural, voice is human. That's why we're fascinated with creating usable voice technology for our machines. But to create voice systems, an extremely large amount of voice data is required.

  • Most of the data used by large companies isn't available to the majority of people. We think that stifles innovation. So we've launched Project Common Voice, a project to help make voice recognition open to everyone.

  • Now you can donate your voice to help us build an open-source voice database that anyone can use to make innovative apps for devices and the web. Read a sentence to help machines learn how real people speak. Check the work of other contributors to improve the quality. It's that simple!

  • Voice is natural, voice is human. That's why we're excited about creating usable voice technology for our machines. But to create voice systems, developers need an extremely large amount of voice data.

  • Most of the data used by large companies isn't available to the majority of people. We think that stifles innovation. So we've launched Common Voice, a project to help make voice recognition open and accessible to everyone.

  • Read More

    Beuet Le Lom

  • Help us validate sentences!

    Neutulông kamoe peusahèh kalimat!

  • Press play, listen & tell us: did they accurately speak the sentence below?

  • Looks like there aren't any clips to listen to in this language. Help us fill the queue by recording some now.

  • Press { shortcut-play-toggle } to toggle play mode

  • Recording voice clips is an integral part of building our open dataset; some would say it's the fun part too.

  • Clips recorded

    Klip teureukam

  • Validating donated clips is equally important to the Common Voice mission. Take a listen and help us create quality open source voice data.

  • Clips validated

  • Hours Recorded

  • Hours Validated

  • Voices Online Now

  • Today's Progress

  • Help us get to { $goal }

  • Have you read our Terms?

  • Ready to donate your voice?

  • All

  • Today

    Uroë Nyoë

  • { $count }wk

  • { $count }mo

  • { $count }y

  • Help us build a high quality, publicly open dataset

  • Sign up for an account

  • sign up for email updates

  • Sign up for Common Voice newsletters, goal reminders and progress updates

  • Benefits

  • Make your submitted data as rich as possible by providing some anonymous demographic data. We de-identify all demographic data before making it public.

  • Profile information improves the audio data used to train accurate speech recognition.

  • Keep track of your progress and metrics across multiple languages.

  • See how your progress compares to other contributors all over the world.

  • View your progress against personal and project goals.

  • Optionally join our email list for updates and new information about the project.

  • What's Public?

  • We will not make your email public.

  • The number of recordings and which languages you contribute to will be public.

  • You can choose to make your username public or anonymous.

  • Optionally submitted demographic data (e.g. age, gender, language, and accent) will never be made public on your profile, and will not be linked to your account in the dataset. Individual audio clips will be associated with demographic data for the purpose of more accurate analysis - for example, a researcher might want to target a training model to a specific demographic segment.

  • Your username and email will not be associated with the published data.

  • Welcome { $company } staff!

  • You can help build a diverse, open-source dataset by creating a Common Voice profile and contributing your voice.

  • Log In / Sign Up with { $company } email

  • Having a profile is not required to contribute, though it is helpful; see why below.

  • Let's Get Started

  • Welcome to Common Voice

  • Interested in learning more and contributing to the project?

  • Common Voice is the world's largest publicly available, multi-language voice dataset.

  • Thanks to contributions from over 259k people in over 50 languages, this data is being used to train speech-enabled applications to better respond to the human voice.

  • Next

  • Back

  • Browse Languages

  • 2019 End-of-Year Release

  • Voice Dataset, Ready for Download

  • Account

  • Having an account is not required to contribute, though it is helpful.

  • To the right we outline the benefits and clarify what information we make public. Use the links below to get started with a Common Voice account on your own device.

  • Enter email to send a sign up link

  • Send sign up link

  • Ready to add your voice or lend your ear?

  • Now that you know a little bit more about Common Voice, why not try it out? Click on the microphone icon to start reading sentences aloud. <br/><br/>If you prefer to review other people's voice contributions, click on the play icon. You'll help confirm that recordings match the sentences written on screen.

  • Ready to contribute?

  • Personal dashboards keep you up-to-date with individual and community progress.

  • For every voice clip donated, and every audio clip validated, your account dashboards are updated to reflect your latest progress in each language you contribute to. Yes, you can contribute to more than one!<br/><br/> Use dashboards to track your stats, see how you're doing alongside others in the community, and set daily or weekly contribution goals.
