Data required for system development (on top of the data for testing described
in ../README.txt) consists of Fisher, past NIST SREs, and Switchboard Cellular.
You can probably get by with just one part of Fisher.

                    Speech      Transcripts (see note)
  Fisher part 1     LDC2004S13  LDC2004T19
  Fisher part 2     LDC2005S13  LDC2005T19
  SRE 2004 Test     LDC2006S44
  SRE 2005 Test     LDC2011S04
  SWBD Cellular 1   LDC2001S13
  SWBD Cellular 2   LDC2004S07

Note: The distributions with the transcripts are not really needed for the
transcripts themselves, but because that is where the speaker information
resides (so we know which recordings are from the same speaker). This is needed
for PLDA estimation. However, bear in mind that Fisher is not believed to be
very good for things like PLDA estimation. In newer recipes such as
../../sre10/v1 we use past SRE data for PLDA estimation.
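
The note above says the speaker information is what tells us which recordings
come from the same speaker. A minimal sketch of that grouping step follows,
assuming the labels have already been written as Kaldi-style utt2spk lines
("<recording-id> <speaker-id>"). The function name and the IDs are hypothetical
and for illustration only; this is not the recipe's own data-preparation code.

```python
from collections import defaultdict


def group_by_speaker(utt2spk_lines):
    """Map each speaker ID to the sorted list of its recording IDs."""
    spk2utt = defaultdict(list)
    for line in utt2spk_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        utt, spk = parts
        spk2utt[spk].append(utt)
    return {spk: sorted(utts) for spk, utts in spk2utt.items()}


# Hypothetical IDs, not taken from any LDC corpus.
example = [
    "spk001-rec_a spk001",
    "spk001-rec_b spk001",
    "spk002-rec_c spk002",
]

groups = group_by_speaker(example)

# Keep only speakers with more than one recording, since a single
# recording gives no within-speaker variation to estimate from.
multi_session = {s: u for s, u in groups.items() if len(u) > 1}
print(multi_session)
```

Within Kaldi itself the same inversion is normally done with
utils/utt2spk_to_spk2utt.pl; the sketch only illustrates what the speaker
labels are used for.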