---
license: mit
language:
- en
pipeline_tag: automatic-speech-recognition
---

# About

This model was created to support experiments evaluating phonetic transcription with the Buckeye corpus as part of https://github.com/ginic/multipa.
It is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus.
For details about specific model parameters, please see the config.json here or the training scripts in the scripts/buckeye_experiments folder of the GitHub repository.

# Experiment Details

The entire train split of the Buckeye corpus was used to train this model. The only data excluded are samples in the train split that are too short (< 0.1 seconds) or too long (> 12 seconds) to be used to train the model.

Goals:
- Include the largest amount of training data possible.
- Allow evaluation on a different corpus (e.g. TIMIT, Speech Accent Archive) to test generalization to other dialects and language varieties.
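
The duration filter described under Experiment Details can be illustrated with a minimal sketch. This is not the code from the training scripts: the function names, the 16 kHz sampling rate, and the choice to treat both bounds as inclusive are assumptions made for illustration. Only the 0.1 second and 12 second thresholds come from the description above.

```python
from typing import Iterable, List

# Thresholds stated in this model card.
MIN_DURATION_S = 0.1
MAX_DURATION_S = 12.0

# Assumption: audio is sampled at 16 kHz, the rate wav2vec2 XLSR models expect.
DEFAULT_SAMPLING_RATE = 16_000


def duration_seconds(num_samples: int, sampling_rate: int = DEFAULT_SAMPLING_RATE) -> float:
    """Convert a waveform length in samples to a duration in seconds."""
    return num_samples / sampling_rate


def keep_sample(num_samples: int, sampling_rate: int = DEFAULT_SAMPLING_RATE) -> bool:
    """Return True if the clip is neither too short nor too long.

    Assumption: both bounds are inclusive; the model card only says that
    clips shorter than 0.1 s or longer than 12 s are excluded.
    """
    duration = duration_seconds(num_samples, sampling_rate)
    return MIN_DURATION_S <= duration <= MAX_DURATION_S


def filter_by_duration(
    lengths: Iterable[int], sampling_rate: int = DEFAULT_SAMPLING_RATE
) -> List[int]:
    """Keep only the waveform lengths whose duration falls inside the bounds."""
    return [n for n in lengths if keep_sample(n, sampling_rate)]


if __name__ == "__main__":
    # Hypothetical waveform lengths in samples: 0.05 s, 0.1 s, 1 s, 12 s, just over 12 s.
    example_lengths = [800, 1_600, 16_000, 192_000, 192_001]
    print(filter_by_duration(example_lengths))
```

If the data is held in a Hugging Face `datasets.Dataset`, a predicate like `keep_sample` could be passed to `Dataset.filter`, with the waveform length read from whichever audio column the dataset actually uses.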