The SNIPS dataset is a ~54-hour wake-word corpus covering 3300 speakers. The wake word is "Hey Snips", pronounced with no pause between the two words. The corpus contains a large variety of English accents and recording environments. Negative samples were recorded under the same conditions as the wake-word utterances.

To download the dataset, follow the instructions from its provider, Snips (Paris, France).

The recipe is in v1/

The E2E LF-MMI recipe does not require any prior alignments for training LF-MMI, which leaves the alignment more flexible during training. It can optionally be followed by a regular LF-MMI training stage to further improve performance.
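For a rough sense of the corpus scale quoted above, the short Python sketch below works out the average amount of audio per speaker. It assumes the ~54 hours covers all recordings (wake-word and negative samples alike), which this README does not state explicitly; the function name is hypothetical and is not part of the recipe.

```python
# Figures quoted in this README; the duration is approximate ("~54-hour").
TOTAL_HOURS = 54
NUM_SPEAKERS = 3300


def avg_seconds_per_speaker(total_hours: float, num_speakers: int) -> float:
    """Return the mean amount of audio per speaker, in seconds.

    Assumption: total_hours spans every recording in the corpus,
    positives and negatives together.
    """
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return total_hours * 3600 / num_speakers


if __name__ == "__main__":
    seconds = avg_seconds_per_speaker(TOTAL_HOURS, NUM_SPEAKERS)
    print(f"about {seconds:.0f} seconds of audio per speaker on average")
```

Under that assumption each speaker contributes roughly one minute of audio on average, so the corpus is broad across speakers and shallow per speaker.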