Alsatian (gsw) · Common Voice

Why Common Voice?
Common Voice is a publicly available voice dataset, powered by the voices of volunteer contributors around the world. People who want to build voice applications can use the dataset to train machine learning models. At present, most voice datasets are owned by companies, which stifles innovation. Voice datasets also underrepresent non-English speakers, people of colour, disabled people, women and LGBTQIA+ people. This means that voice-enabled technology doesn’t work at all for many languages, and where it does work, it may not perform equally well for everyone. We want to change that by mobilising people everywhere to share their voice.
How does Common Voice work?
We’re crowdsourcing an open-source dataset of voices. Donate your voice, validate the accuracy of other people’s clips, and make the dataset better for everyone.
Someone asks for a language to be added.
Website Localization
The website text is translated into that language.
Sentence Collection
Sentences are collected for people to read aloud.
New Language Launch
We launch the Common Voice site in this language.
Voice Contribution
People come and contribute their voices.
Voice Validation
Other people validate those voice clips.
Dataset Release
We release the dataset every 3 months.
Want to stay in touch with Common Voice?
Speak
Contributors record voice clips by reading from a bank of donated sentences.
Listen-Queue
Voice clips are entered into a submission queue that readies them for listening.
Listen
Users validate the accuracy of donated clips, checking that the speaker read the sentence correctly.
Is the clip valid?
A voice clip is marked "valid" when a user gives it a Yes vote.
≥ 2 Yes votes
To make it into the Common Voice dataset, a voice clip must be validated by two separate users.
≥ 2 No votes
When a user rejects a voice clip, it returns to the Queue. If it is rejected a second time, the voice clip is moved to the Clip Graveyard.
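The two-vote rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Common Voice's actual implementation: the `Clip` class, the `clip_status` function, and the status names are hypothetical, and it assumes a clip leaves the queue as soon as it collects two votes of the same kind.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical record of the votes a voice clip has received."""
    yes_votes: int = 0
    no_votes: int = 0


def clip_status(clip: Clip) -> str:
    """Return where a clip stands under the two-vote rule.

    Assumption: voting stops once either count reaches two,
    so both counts are never two or more at the same time.
    """
    if clip.yes_votes >= 2:
        return "dataset"    # validated by two separate users
    if clip.no_votes >= 2:
        return "graveyard"  # rejected twice
    return "queue"          # still waiting for more votes


# Example: walk a few clips through the rule.
for clip in (Clip(yes_votes=2), Clip(yes_votes=1, no_votes=1), Clip(no_votes=2)):
    print(clip, "->", clip_status(clip))
```

A clip with one Yes vote and one No vote stays in the queue under this sketch, because neither threshold has been reached yet.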
Common Voice Dataset
The Common Voice Dataset contains hundreds of thousands of voice samples that help developers build voice recognition tools.
Clip Graveyard
The Clip Graveyard consists of voice clips that didn't make it into the Common Voice dataset. Just like the dataset, the Clip Graveyard is available for download.
We would like to thank the following people and organizations for their help with the project:
Get involved
Want to help make Common Voice even better? Great! Get in touch via email or <discourseLink>Discourse</discourseLink> forums, submit site issues via <githubLink>GitHub</githubLink>, or join the <matrixLink>Matrix</matrixLink> community chat.
How do I stay in touch?
Sign up
<emailFragment>Sign up</emailFragment> to our mailing list to learn how you can take part in campaigns, events and co-design features on Common Voice.
You can meet others in the Mozilla language communities by joining <discourseLink>Discourse</discourseLink> for topical conversations, or <matrixLink>Matrix</matrixLink> for quick advice.
Why?
How?
Partners
Get involved
How does Common Voice work?
Learn how to take part
What is a language on Common Voice?
There are lots of ways to think about language. For the purposes of speech recognition models, Common Voice suggests focussing on ‘mutual intelligibility’, or ‘can speakers of this language mostly understand one another if they try to?’
We want speech models to be better at understanding a diverse range of speakers. For this to happen, a voice dataset must represent lots of different people.
Some languages have enormous variation in grammar, vocabulary and pronunciation. For this reason, we are <ctaLink>introducing ‘Variants’</ctaLink> in 2022. This gives communities a way to distinguish their languages within the larger dataset.
Common Voice is Mozilla's initiative to help teach machines how real people speak.
Mozilla Common Voice is an initiative to help teach machines how real people speak.
Speak up, contribute here!
Voice is natural, voice is human. That’s why we’re fascinated with creating usable voice technology for our machines. But to create voice systems, an extremely large amount of voice data is required.
Most of the data used by large companies isn’t available to the majority of people. We think that stifles innovation. So we’ve launched Project Common Voice, a project to help make voice recognition open to everyone.
Now you can donate your voice to help us build an open-source voice database that anyone can use to make innovative apps for devices and the web. Read a sentence to help machines learn how real people speak. Check the work of other contributors to improve the quality. It’s that simple!
Voice is natural, voice is human. That’s why we’re excited about creating usable voice technology for our machines. But to create voice systems, developers need an extremely large amount of voice data.
Most of the data used by large companies isn’t available to the majority of people. We think that stifles innovation. So we’ve launched Common Voice, a project to help make voice recognition open and accessible to everyone.
Read More
Help us validate sentences!
Press play, listen & tell us: did they accurately speak the sentence below?
Looks like there aren't any clips to listen to in this language. Help us fill the queue by recording some now.
Press { shortcut-play-toggle } to toggle play mode
Recording voice clips is an integral part of building our open dataset; some would say it's the fun part too.
Clips recorded
Validating donated clips is equally important to the Common Voice mission. Take a listen and help us create quality open source voice data.
Clips validated
Hours Recorded
Hours Validated
Voices Online Now
Today's Progress
Help us get to { $goal }
Have you read our Terms?
Ready to donate your voice?
All
Today
{ $count }wk
{ $count }mo
{ $count }y
Help us build a high-quality, publicly open dataset
Sign up for an account
sign up for email updates
Sign up for Common Voice newsletters, goal reminders and progress updates
Benefits
Make your submitted data as rich as possible by providing some anonymous demographic data. We de-identify all demographic data before making it public.
Profile information enriches the audio data used to train speech recognition systems, which helps improve their accuracy.
Keep track of your progress and metrics across multiple languages.
See how your progress compares to other contributors all over the world.
View your progress against personal and project goals.
Optionally join our email list for updates and new information about the project.
What's Public?
We will not make your email public.
The number of recordings and which languages you contribute to will be public.
You can choose to make your username public or anonymous.
Optionally submitted demographic data (e.g. age, gender, language, and accent) will never be made public on your profile, and will not be linked to your account in the dataset. Individual audio clips will be associated with demographic data for the purpose of more accurate analysis. For example, a researcher might want to target a training model to a specific demographic segment.
Your username and email will not be associated with the published data.
Welcome { $company } staff!
You can help build a diverse, open-source dataset by creating a Common Voice profile and contributing your voice.
Log In / Sign Up with { $company } email
Having a profile is not required to contribute, though it is helpful; see why below.
Let's Get Started
Welcome to Common Voice
Interested in learning more and contributing to the project?
Common Voice is the world’s largest publicly available, multi-language voice dataset.
Thanks to contributions from over 259k people in over 50 languages, this data is being used to train speech-enabled applications to better respond to the human voice.
Next
Back
Browse Languages
2019 End-of-Year Release
Voice Dataset, Ready for Download
Account
Having an account is not required to contribute, though it is helpful.
To the right we outline the benefits and clarify what information we make public. Use the links below to get started with a Common Voice account on your own device.
Enter email to send a sign up link
Send sign up link
Ready to add your voice or lend your ear?
Now that you know a little bit more about Common Voice, why not try it out? Click on the microphone icon to start reading sentences aloud. If you prefer to review other people's voice contributions, click on the play icon. You’ll help confirm that recordings match the sentences written on screen.
Ready to contribute?
Personal dashboards keep you up-to-date with individual and community progress.
For every voice clip donated, and every audio clip validated, your account dashboards are updated to reflect your latest progress in each language you contribute to. Yes, you can contribute to more than one! Use dashboards to track your stats, see how you're doing alongside others in the community, and set daily or weekly contribution goals.
