Format | WAV |
License | CC BY 4.0 |
Domain | Audio |
Number of Records | Approximately 65,000 WAV files |
Data Split | Train - 51,094 audio clips, Validation - 6,798 audio clips, Test - 6,835 audio clips |
Size | 1.49 GB |
Data Origin | The audio clips were originally collected by Google, and recorded by volunteers in uncontrolled locations around the world. |
Dataset Version | Version 1 – March 17, 2020 |
Dataset Coverage | Core words, auxiliary words, background noise |
Business Use Case |
|
Feature | Description |
---|---|
Audio clip folders (duration: one second) | 30 audio clip folders (20 core words, 10 auxiliary words). Each folder is named after the word spoken in its clips, and each file name contains the ID of the speaker. For example, the file path `happy/3cfc6b3a_nohash_2.wav` indicates that the word spoken was "happy", the speaker's ID was "3cfc6b3a", and this is the third utterance (indicated by `2`) of that word by this speaker in the data set. The first utterance is indicated by `0` at the end of the file name. The `nohash` section ensures that all utterances by a single speaker are sorted into the same training partition, which keeps very similar repetitions from giving unrealistically optimistic evaluation scores. |
Background noise audio clip folder | The `_background_noise_` folder contains a set of longer audio clips that are either recordings or mathematical simulations of noise. For more details, see `_background_noise_/README.md`. |
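
The file naming convention described in the table can be unpacked programmatically. The following is a minimal sketch in Python, assuming every clip path has the form `<word>/<speaker_id>_nohash_<index>.wav`. The names `ClipInfo` and `parse_clip_path` are hypothetical helpers introduced for illustration and are not part of the dataset.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple


class ClipInfo(NamedTuple):
    """Fields recovered from a clip path (hypothetical helper type)."""
    word: str
    speaker_id: str
    utterance_index: int


# Assumed file name pattern: <speaker_id>_nohash_<utterance_index>
_NAME_RE = re.compile(r"^(?P<speaker>[^_]+)_nohash_(?P<index>\d+)$")


def parse_clip_path(path: str) -> ClipInfo:
    """Split a path such as 'happy/3cfc6b3a_nohash_2.wav' into its parts."""
    p = PurePosixPath(path)
    match = _NAME_RE.match(p.stem)  # stem drops the '.wav' extension
    if match is None:
        raise ValueError(f"unexpected clip name: {path!r}")
    return ClipInfo(
        word=p.parent.name,                   # folder name is the spoken word
        speaker_id=match["speaker"],          # ID of the participant
        utterance_index=int(match["index"]),  # 0 marks the first utterance
    )


info = parse_clip_path("happy/3cfc6b3a_nohash_2.wav")
print(info)
# ClipInfo(word='happy', speaker_id='3cfc6b3a', utterance_index=2)
```

Grouping clips by `speaker_id` before assigning them to a split is one way to check that no speaker appears in more than one partition. Files that do not follow the pattern, such as those in `_background_noise_`, raise a `ValueError` in this sketch and would need separate handling.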