About
This model was created to support experiments evaluating phonetic transcription with the Buckeye and TIMIT corpora as part of https://github.com/ginic/multipa. It is a version of excalibur12/wav2vec2-large-lv60_phoneme-timit_english_timit-4k that was further fine-tuned on a subset of the Buckeye corpus. For details about specific model parameters, see config.json in this repository or the training scripts in the scripts/fine_tuning_experiments folder of the GitHub repository.
Experiment Details
These experiments take a wav2vec 2.0 model originally fine-tuned on TIMIT (excalibur12/wav2vec2-large-lv60_phoneme-timit_english_timit-4k) and further fine-tune it on the Buckeye corpus. The random seed used to select training data is varied while an even 50/50 gender split is maintained, in order to measure the significance of training data selection.
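The seeded, gender-balanced selection described above can be sketched as follows. This is a minimal illustration, not the repository's actual selection logic (which lives in the fine-tuning scripts); the speaker-tuple format and function name are assumptions.

```python
import random

def select_training_subset(speakers, n_per_gender, train_seed):
    """Select a gender-balanced training subset.

    speakers: list of (speaker_id, gender) tuples, gender in {"f", "m"}.
    Changing train_seed changes which speakers are chosen, while the
    50/50 gender split is always preserved.
    """
    rng = random.Random(train_seed)  # seeded RNG makes each run reproducible
    female = [s for s in speakers if s[1] == "f"]
    male = [s for s in speakers if s[1] == "m"]
    return rng.sample(female, n_per_gender) + rng.sample(male, n_per_gender)

# Hypothetical speaker pool: 20 female and 20 male speakers.
speakers = [(f"S{i:02d}", "f" if i % 2 else "m") for i in range(40)]
subset = select_training_subset(speakers, n_per_gender=5, train_seed=91)
# The subset always contains exactly 5 female and 5 male speakers.
```

Because the RNG is seeded, re-running with the same `--train_seed` reproduces the same subset, which is what makes comparisons across seeds meaningful.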
Goals:
- Determine how additional fine-tuning on different corpora affects performance on the test sets of both corpora
- Establish whether data variation with the same gender makeup is statistically significant in changing performance on the test set
Params to vary:
- training data seed (--train_seed)
- batch size: [64, 32], indicated at the end of the model name following "bs"
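As a small illustration of the naming convention, a run's checkpoint name can be composed from its varied parameters; the base name and the `seed` component here are hypothetical, only the trailing "bs" + batch size is stated above.

```python
def model_name(base, train_seed, batch_size):
    """Compose a run name whose suffix encodes the batch size after 'bs'."""
    return f"{base}_seed-{train_seed}_bs{batch_size}"

name = model_name("wav2vec2-large-buckeye", train_seed=91, batch_size=64)
# name ends with "bs64", marking a run trained with batch size 64
```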