About farsdat:
Available as ELRA corpus ELRA-S0112, farsdat is the Persian counterpart of TIMIT.

Description from the ELRA catalog (http://catalog.elra.info/product_info.php?products_id=18):
"The farsdat corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of Persian automatic speech recognition systems. farsdat contains broadband recordings of 304 speakers of ten major dialects of Iranian Farsi language, each reading ten phonetically rich sentences. The farsdat corpus includes time-aligned orthographic, phonetic and word transcriptions as well as a 16-bit, 22050Hz speech waveform file for each utterance."

s5: the currently recommended recipe.