About
This model was created to support experiments evaluating phonetic transcription with the Buckeye and TIMIT corpora as part of https://github.com/ginic/multipa. It is a version of excalibur12/wav2vec2-large-lv60_phoneme-timit_english_timit-4k that was further fine-tuned on a subset of the Buckeye corpus. For details about specific model parameters, see config.json in this repository or the training scripts in the scripts/fine_tuning_experiments folder of the GitHub repository.
Experiment Details
These experiments take a wav2vec 2.0 model originally fine-tuned on TIMIT (excalibur12/wav2vec2-large-lv60_phoneme-timit_english_timit-4k) and further fine-tune it on the Buckeye corpus. The random seed used to select training data is varied while an even 50/50 gender split is maintained, in order to measure the significance of training data selection.
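The seeded, gender-balanced selection described above can be sketched as follows. This is a minimal illustration, not the repository's actual selection logic (which lives in the fine-tuning scripts); the speaker-tuple format and function name are assumptions.

```python
import random

def select_training_subset(speakers, n_per_gender, train_seed):
    """Select a gender-balanced training subset.

    speakers: list of (speaker_id, gender) tuples, gender in {"f", "m"}.
    Changing train_seed changes which speakers are chosen, while the
    50/50 gender split is always preserved.
    """
    rng = random.Random(train_seed)  # seeded RNG makes each run reproducible
    female = [s for s in speakers if s[1] == "f"]
    male = [s for s in speakers if s[1] == "m"]
    return rng.sample(female, n_per_gender) + rng.sample(male, n_per_gender)

# Hypothetical speaker pool: 20 female and 20 male speakers.
speakers = [(f"S{i:02d}", "f" if i % 2 else "m") for i in range(40)]
subset = select_training_subset(speakers, n_per_gender=5, train_seed=91)
# The subset always contains exactly 5 female and 5 male speakers.
```

Because the RNG is seeded, re-running with the same `--train_seed` reproduces the same subset, which is what makes comparisons across seeds meaningful.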
Goals:
- Determine how additional fine-tuning on different corpora affects performance on the test sets of both corpora
- Establish whether data variation with the same gender makeup is statistically significant in changing performance on the test set
Params to vary:
- training data seed (--train_seed)
- batch size: [64, 32], indicated at the end of the model name following "bs"
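As a small illustration of the naming convention, a run's checkpoint name can be composed from its varied parameters; the base name and the `seed` component here are hypothetical, only the trailing "bs" + batch size is stated above.

```python
def model_name(base, train_seed, batch_size):
    """Compose a run name whose suffix encodes the batch size after 'bs'."""
    return f"{base}_seed-{train_seed}_bs{batch_size}"

name = model_name("wav2vec2-large-buckeye", train_seed=91, batch_size=64)
# name ends with "bs64", marking a run trained with batch size 64
```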