How to set up the BABEL database multilingual training environment
===================================================================

a) Preparation: you need to make sure the BABEL data and the F4DE scoring
   software are set up as they are at JHU, or change this setup accordingly.
   This will probably be hard and will involve some trial and error. Some
   relevant pathnames can be found in conf/lang/* and ./path.sh.
   This step is the same as step (a) in the normal BABEL setup (egs/babel/s5d).

b) Prepare the data and alignments for the languages in the multilingual setup.

   i) Create empty directories exp/language-name, data/language-name and
      conf/language-name, e.g.

        mkdir exp/101-cantonese conf/101-cantonese data/101-cantonese

      language-name should be the name used in the config file in conf/lang.

   ii) Prepare the data and the tri5 alignments (read egs/babel/s5d/README.txt
       for more details).

   iii) In data/lang-name, conf/lang-name and exp/lang-name, make soft links
        to the corresponding data, conf and exp directories for all
        languages, e.g.

        (
          cd data/101-cantonese
          ln -s /path-to-101-cantonese-data-dir/train .
          ln -s /path-to-101-cantonese-data-dir/lang .
        )

        Link the appropriate language-specific config file to lang.conf in
        each conf directory, e.g.

        (
          cd conf/101-cantonese
          ln -s /path-to-101-cantonese-config lang.conf
          # e.g. ln -s ../lang/101-cantonese-limitedLP.official.conf lang.conf
        )

        (
          cd exp/101-cantonese
          ln -s /path-to-101-cantonese-exp-dir/tri5 .
        )

        (A scripted version of this step is sketched at the end of this file.)

   iv) Create local.conf and define the configuration for multilingual
       training, e.g. in s5/local.conf:

        cat >local.conf <<EOF
        use_pitch=true
        use_ivector=true
        # lda-mllt transform used to train the global i-vector extractor
        lda_mllt_lang=101-cantonese
        # lang_list is a space-separated list of the languages used for
        # multilingual training
        lang_list=(101-cantonese 102-assamese 103-bengali)
        # lang2weight is a comma-separated list of weights, one per language,
        # used to scale each example's output w.r.t. its input language
        # during training.
        lang2weight="0.3,0.3,0.4"
        # The list of languages used for decoding.
        decode_lang_list=(101-cantonese)
        EOF

       (A snippet for sanity-checking lang2weight against lang_list is
       sketched at the end of this file.)


Running the multilingual training script
========================================

a) You can run the following script to train a multilingual TDNN model using
   the xent objective:

     local/nnet3/run_tdnn_multilingual.sh

   This script does the following steps:

   i) Generates 3 speed-perturbed versions of the training data, their
      high-resolution 40-dim MFCC (+pitch) features, and tri5_ali{_sp}.
   ii) Creates pooled training data from all training languages and trains a
       global i-vector extractor on the pooled data.
   iii) Generates a separate egs dir in exp/lang-name/nnet3/egs for each
        language in lang_list.
   iv) Creates the multilingual egs dir and trains the multilingual model.
   v) Generates decoding results for the languages in decode_lang_list.

b) You can run the following script to train a multilingual model with a
   bottleneck layer of dimension 'bnf_dim', generate bottleneck features for
   'lang-name' in data/lang-name/train{_sp}_bnf, and train a SAT model on top
   of the MFCC+BNF features (exp/lang-name/tri6):

     local/nnet3/run_multilingual_bnf.sh --bnf-dim bnf_dim lang-name

   You can also use an already-trained multilingual model (by default the
   component used to extract the bottleneck features is tdnn_bn.renorm):

     local/nnet3/run_multilingual_bnf.sh \
       --multilingual-dir exp/nnet3/tdnn_multi_bnf lang-name
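
Helper snippets
===============

The per-language directory setup from step (b) above can be scripted. The
following is a minimal sketch, not part of the recipe: /export/data/$lang and
/export/exp/$lang are placeholder paths (substitute the actual locations of
your BABEL data and exp directories), and it assumes conf/lang contains a
config named $lang-limitedLP.official.conf for each language:

  for lang in 101-cantonese 102-assamese 103-bengali; do
    # Step (b) i): create the empty per-language directories.
    mkdir -p data/$lang conf/$lang exp/$lang
    # Step (b) iii): soft-link data, config and tri5 alignments.
    ln -s /export/data/$lang/train data/$lang/train   # placeholder path
    ln -s /export/data/$lang/lang  data/$lang/lang    # placeholder path
    ln -s ../lang/$lang-limitedLP.official.conf conf/$lang/lang.conf
    ln -s /export/exp/$lang/tri5  exp/$lang/tri5      # placeholder path
  done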
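
Since lang2weight must contain exactly one weight per entry in lang_list, a
quick check after writing local.conf can catch a mismatch early. This is an
illustrative snippet, not part of the recipe (run it under bash, since
lang_list is a bash array):

  # Compare the lang_list array length with the number of comma-separated
  # weights in lang2weight; abort if they disagree.
  . ./local.conf
  num_langs=${#lang_list[@]}
  num_weights=$(echo "$lang2weight" | awk -F, '{print NF}')
  if [ "$num_langs" -ne "$num_weights" ]; then
    echo "local.conf: $num_langs languages in lang_list but $num_weights" \
         "weights in lang2weight" >&2
    exit 1
  fi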