---
license: mit
language:
- en
pipeline_tag: automatic-speech-recognition
---
# About 
This model was created to support experiments for evaluating phonetic transcription 
with the Buckeye corpus as part of https://github.com/ginic/multipa. 
This is a version of facebook/wav2vec2-large-xlsr-53 fine-tuned on a specific subset of the Buckeye corpus.
For details about specific model parameters, see this model's config.json or 
the training scripts in the scripts/buckeye_experiments folder of the GitHub repository. 

# Experiment Details
The entire train split of the Buckeye corpus was used to train this model. 
The only data excluded are samples in the train split that are too short (< 0.1 seconds) or too long (> 12 seconds) to be used to train the model.
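
The duration filter described above can be illustrated with a minimal sketch. This is not the project's actual preprocessing code (see the training scripts in the GitHub repository for that); the function names, the constants, and the 16 kHz sampling rate are assumptions made for illustration only.

```python
# Illustrative sketch only: names and the 16 kHz rate are assumptions,
# not taken from the multipa training scripts.
MIN_DURATION_S = 0.1   # samples shorter than this are excluded
MAX_DURATION_S = 12.0  # samples longer than this are excluded


def duration_seconds(num_samples: int, sampling_rate: int) -> float:
    """Return the length of an audio clip in seconds."""
    return num_samples / sampling_rate


def keep_sample(num_samples: int, sampling_rate: int = 16_000) -> bool:
    """Return True if the clip's duration falls inside the allowed range."""
    duration = duration_seconds(num_samples, sampling_rate)
    return MIN_DURATION_S <= duration <= MAX_DURATION_S


# Hypothetical clip lengths, given as number of audio samples at 16 kHz.
clip_lengths = [1_000, 16_000, 192_000, 208_000]
kept = [n for n in clip_lengths if keep_sample(n)]
```

Here the 1,000-sample clip (about 0.06 seconds) and the 208,000-sample clip (13 seconds) are dropped, while the 1-second and 12-second clips are kept.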

Goals: 
- Include the largest amount of training data possible. 
- Allow evaluation on a different corpus (e.g. TIMIT, Speech Accent Archive) to test generalization to other dialects and language varieties.