MoulSot v0.3: Automatic Speech Recognition for Moroccan Darija

Model Description

MoulSot v0.3 is an Automatic Speech Recognition (ASR) model developed by Atlasia for Moroccan Darija, designed to handle real-world speech with strong robustness to code-switching (Darija ↔ French ↔ Arabic).

The model is trained on a large, curated dataset and aims to provide high-quality transcription of conversational, media, and user-generated audio in Moroccan Darija.


Key Features

  • 🔊 1,500 hours of curated Darija audio
  • 🏆 80 hours of gold-standard transcriptions
  • ⚡ Native support for code-switching
  • 🎙️ Designed for real-world noisy and conversational speech

Training Data

Overview

The MoulSot dataset is built from a large-scale data pipeline focused on collecting, filtering, and curating Moroccan Darija audio.

  • Total audio: ~1,500 hours

  • High-quality labeled subset: 80 hours

  • Speech characteristics:

    • Spontaneous and conversational
    • Multi-domain (media, informal speech, etc.)
    • Code-switching between Darija, English, French, and Modern Standard Arabic
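
One rough way to gauge how much code-switching a transcript contains is to count how often adjacent tokens change writing script. The sketch below is an illustration only and is not part of the MoulSot pipeline. The function names are hypothetical, and it assumes French and English words appear in Latin script while Darija and Modern Standard Arabic appear in Arabic script, which does not hold for Darija written in Latin characters.

```python
import unicodedata


def token_script(token: str) -> str:
    """Classify a token by the script of its first alphabetic character."""
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("ARABIC"):
                return "arabic"
            if name.startswith("LATIN"):
                return "latin"
            return "other"
    return "other"  # digits, punctuation, empty tokens


def count_script_switches(text: str) -> int:
    """Count transitions between Arabic-script and Latin-script tokens."""
    scripts = [s for s in map(token_script, text.split()) if s != "other"]
    return sum(1 for a, b in zip(scripts, scripts[1:]) if a != b)


if __name__ == "__main__":
    # Hypothetical mixed Darija/French sentence.
    print(count_script_switches("بغيت نمشي l'aéroport دابا"))
```

A script change is only a proxy for a language change, so this count is best read as a coarse signal when comparing transcripts, not as a measurement of code-switching itself.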

Data Pipeline

The full data processing pipeline is open-sourced: 👉 https://github.com/atlasia-ma/MoulSot

It includes:

  • Audio collection and aggregation
  • Cleaning and normalization
  • Segmentation
  • Annotation workflows
  • Quality filtering
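
To make the segmentation and quality-filtering stages concrete, here is a minimal sketch under stated assumptions. It does not reproduce the code in the repository above: the fixed-length windows, the RMS energy threshold, and every name and default value are hypothetical stand-ins for whatever the actual pipeline uses.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    rms: float      # root-mean-square amplitude of the segment


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude; 0.0 for an empty chunk."""
    if not samples:
        return 0.0
    return (sum(x * x for x in samples) / len(samples)) ** 0.5


def segment_fixed(samples: Sequence[float], sample_rate: int,
                  window_s: float = 10.0) -> List[Segment]:
    """Cut a waveform into consecutive windows of at most window_s seconds."""
    step = int(window_s * sample_rate)
    segments = []
    for start in range(0, len(samples), step):
        chunk = samples[start:start + step]
        segments.append(Segment(start / sample_rate,
                                (start + len(chunk)) / sample_rate,
                                rms(chunk)))
    return segments


def quality_filter(segments: List[Segment], min_duration_s: float = 1.0,
                   min_rms: float = 0.01) -> List[Segment]:
    """Keep segments that are long enough and not near-silent."""
    return [s for s in segments
            if (s.end_s - s.start_s) >= min_duration_s and s.rms >= min_rms]
```

In practice, segmentation for ASR data is often driven by voice activity detection instead of fixed windows; the fixed-window version is used here only because it keeps the example dependency-free.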

Training Procedure

Full technical details are being released progressively. At a high level, training includes:

  • Supervised training on high-quality labeled data (80h)
  • Optimization for multilingual/code-switched contexts
  • Iterative refinement across versions

More details will be available in the technical blog.


Intended Use

Primary Use Cases

  • Transcription of Moroccan Darija audio
  • Voice interfaces and assistants
  • Media and content indexing
  • Call center and conversational AI
  • Speech analytics in a Moroccan context

Out-of-Scope Use

  • Critical decision-making systems without human validation
  • Languages outside the Darija/English/French/Arabic code-switching context
  • Highly specialized domains without adaptation

Performance

MoulSot v0.3 is positioned as a state-of-the-art Darija ASR system, with strong qualitative performance in:

  • Conversational fluency
  • Handling mixed-language speech
  • Robustness to accents and informal usage

Benchmark Comparison

Results (↓ lower is better):

  • MoulSot v0.1: WER 66.78 / CER 21.30
  • MoulSot v0.3: WER 38.99 / CER 12.58
  • ISMA: WER 39.20 / CER 13.47

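Compared with v0.1, MoulSot v0.3 cuts WER from 66.78 to 38.99 and CER from 21.30 to 12.58, a relative reduction of roughly 41% on both metrics. It also edges out ISMA by 0.21 WER points and 0.89 CER points, which places it at the state of the art among the systems compared here.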
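
For readers unfamiliar with the metrics, the sketch below shows the standard definitions of WER and CER (edit distance divided by reference length) and recomputes the relative reductions from the reported scores. It assumes the scores above are percentages, and it applies no text normalization; the benchmark's exact normalization and scoring code are not described here, so this is not a reproduction of the evaluation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, counting spaces as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(old: float, new: float) -> float:
    """Relative error reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


if __name__ == "__main__":
    # (WER, CER) pairs as reported in the benchmark comparison.
    v01, v03 = (66.78, 21.30), (38.99, 12.58)
    print(f"WER reduction: {relative_reduction(v01[0], v03[0]):.1f}%")
    print(f"CER reduction: {relative_reduction(v01[1], v03[1]):.1f}%")
```

Because WER depends heavily on tokenization and normalization choices (diacritics, spelling variants, Latin versus Arabic script), scores computed with a different normalization are not directly comparable to the figures above.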

Leaderboard Ranking

MoulSot v0.3 ranks #1 on the Moroccan Darija ASR leaderboard:

👉 https://huggingface.co/spaces/abdeljalilELmajjodi/moroccan_darija_asr_leaderboard

Key takeaways:

  • 🥇 Ranked 1st place among evaluated Darija ASR models
  • 📊 Confirms strong real-world performance beyond internal benchmarks
  • 🔁 Validates improvements over previous versions and competing systems

Quantitative benchmarks and detailed evaluations will be expanded in the technical report.


Limitations

  • Performance may degrade on:

    • Extremely noisy audio
    • Rare dialectal variations
    • Domain-specific jargon
  • Code-switching handling is strong but not perfect

  • Limited publicly available evaluation benchmarks for Darija


Ethical Considerations

  • The dataset is curated with attention to quality and representativeness, but may still contain biases.

  • Users should:

    • Validate outputs in critical applications
    • Be mindful of potential transcription inaccuracies
    • Avoid misuse in surveillance or harmful contexts

Usage

Install (Python)

uv pip install qwen_asr

Inference (PyTorch / CUDA)

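from qwen_asr import Qwen3ASRModel

# Load the checkpoint in bfloat16 on the GPU.
model = Qwen3ASRModel.from_pretrained(
    "atlasia/moulsot.v0.3",
    dtype="bfloat16",
    device_map="cuda",
)

# Transcribe a local audio file, passing "Arabic" as the language setting.
result = model.transcribe(audio="your_audio.wav", language="Arabic")
print(result.text)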

MLX (Apple Silicon)

Install:

uv pip install mlx-audio

Run:

from mlx_audio.stt.utils import load

# Load the checkpoint through mlx-audio.
model = load("atlasia/moulsot.v0.3")
audio_path = "your_audio.wav"

# Run speech-to-text and keep only the transcript text.
transcription = model.generate(audio_path).text
print(transcription)
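
Batch transcription

For more than a handful of files, it can help to wrap either backend in a plain function that maps an audio path to text and loop over a folder. The helper below is a hypothetical convenience sketch, not part of `qwen_asr` or `mlx-audio`; the name `transcribe_folder`, the extension list, and the return shape are all assumptions.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

# Assumed set of audio extensions; adjust to what the chosen backend accepts.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def transcribe_folder(
    folder: str,
    transcribe_fn: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Transcribe every audio file in a folder.

    Returns (transcripts, failures), both keyed by file name. A failure on
    one file is recorded and does not stop the rest of the batch.
    """
    transcripts: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # skip non-audio files
        try:
            transcripts[path.name] = transcribe_fn(str(path))
        except Exception as exc:  # keep going; report the error afterwards
            failures[path.name] = str(exc)
    return transcripts, failures
```

With the MLX snippet above, the callable could be `lambda p: model.generate(p).text`. Keeping the backend behind a single function also makes it easy to route the collected transcripts to human review before they are used in critical applications.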

Citation

@misc{moulsot2026,
  title={MoulSot v0.3: The New Champ of Darija ASR},
  author={Atlasia},
  year={2026},
  url={https://huggingface.co/atlasia/moulsot.v0.3}
}

Resources & Roadmap

Coming soon:

  • Full technical blog
  • Dataset release
  • Training details
  • Future model versions

Stay tuned via: 👉 https://atlasia.ma


Acknowledgements

Developed by Atlasia as part of ongoing efforts to advance AI for Moroccan languages and dialects.

License

Apache 2.0, same as the base model.

Model size: 2B params (Safetensors)
Tensor type: BF16