MoulSot v0.3 — Automatic Speech Recognition for Moroccan Darija

Model Description

MoulSot v0.3 is an Automatic Speech Recognition (ASR) model developed by Atlasia for Moroccan Darija, designed to handle real-world speech with strong robustness to code-switching (Darija ↔ French ↔ English ↔ Arabic).

The model is trained on the curated MoulSot dataset (about 1,500 hours of audio, 80 of them with gold-standard transcriptions) and aims to provide high-quality transcription of conversational, media, and user-generated audio in Moroccan Darija.

Model type: End-to-end ASR
Language(s): Moroccan Darija (primary), with code-switching (French, English, Arabic)
Version: v0.3
Repository: https://huggingface.co/atlasia/moulsot.v0.3
Data pipeline: https://github.com/atlasia-ma/MoulSot

Key Features

🔊 1,500 hours of curated Darija audio
🏆 80 hours of gold-standard transcriptions
⚡ Native support for code-switching
🎙️ Designed for real-world noisy and conversational speech

Training Data

Overview

The MoulSot dataset is built from a large-scale data pipeline focused on collecting, filtering, and curating Moroccan Darija audio.

Total audio: ~1,500 hours
High-quality labeled subset: 80 hours
Speech characteristics:
- Spontaneous and conversational
- Multi-domain (media, informal speech, etc.)
- Code-switching between Darija, English, French, and Modern Standard Arabic
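
To get a rough sense of how much code-switching a transcript contains, one option is to tag each token by writing system and count the transitions. The sketch below is only an illustration and is not part of the MoulSot pipeline; the names `token_script` and `script_switches` are hypothetical. Script is a weak proxy for language, since Darija itself is written in both Arabic and Latin script.

```python
import re

# Arabic-script letters: core letters plus the extended letters used for
# sounds such as "g", "p", and "v" in Darija spelling.
ARABIC = re.compile(r"[\u0621-\u064A\u0671-\u06D3]")
# Basic and accented Latin letters (the multiplication and division signs are excluded).
LATIN = re.compile(r"[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]")


def token_script(token: str) -> str:
    """Label a token 'arabic', 'latin', 'mixed', or 'other' from its characters."""
    has_arabic = bool(ARABIC.search(token))
    has_latin = bool(LATIN.search(token))
    if has_arabic and has_latin:
        return "mixed"
    if has_arabic:
        return "arabic"
    if has_latin:
        return "latin"
    return "other"


def script_switches(text: str) -> int:
    """Count changes between Arabic-script and Latin-script tokens.

    Tokens labelled 'mixed' or 'other' (digits, punctuation) are ignored.
    """
    scripts = [s for s in map(token_script, text.split()) if s in ("arabic", "latin")]
    return sum(1 for prev, cur in zip(scripts, scripts[1:]) if prev != cur)
```

A transcript with a high switch count per token is a reasonable candidate for a code-switching test subset, but a language-identification step would be needed to separate Darija in Latin script from French or English.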

Data Pipeline

The full data processing pipeline is open-sourced: 👉 https://github.com/atlasia-ma/MoulSot

It includes:

Audio collection and aggregation
Cleaning and normalization
Segmentation
Annotation workflows
Quality filtering
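
As an illustration of what the cleaning, normalization, and quality-filtering stages can look like, here is a minimal sketch. It is not taken from the MoulSot repository: the function names and every threshold (`min_s`, `max_s`, `max_chars_per_s`) are assumptions chosen for the example.

```python
import re
import unicodedata

# Arabic short-vowel marks (tashkeel), superscript alef, and tatweel (kashida).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")
_WHITESPACE = re.compile(r"\s+")


def normalize_transcript(text: str) -> str:
    """Hypothetical normalization: NFKC, strip diacritics and tatweel, lowercase, squeeze spaces."""
    text = unicodedata.normalize("NFKC", text)
    text = _DIACRITICS.sub("", text)
    text = text.lower()  # no effect on Arabic script, which has no letter case
    return _WHITESPACE.sub(" ", text).strip()


def keep_segment(
    duration_s: float,
    transcript: str,
    min_s: float = 1.0,
    max_s: float = 30.0,
    max_chars_per_s: float = 25.0,
) -> bool:
    """Hypothetical filter: reject segments that are too short, too long, empty, or implausibly dense."""
    text = normalize_transcript(transcript)
    if not (min_s <= duration_s <= max_s) or not text:
        return False
    # A very high character rate usually signals a misaligned transcript.
    return len(text) / duration_s <= max_chars_per_s
```

Applying the same normalization to references and model outputs before scoring keeps spelling conventions such as optional diacritics from being counted as recognition errors.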

Training Procedure

Full technical details will be released progressively. For now, the documented training steps are:

Supervised training on high-quality labeled data (80h)
Optimization for multilingual/code-switched contexts
Iterative refinement across versions

More details will be available in the technical blog.

Intended Use

Primary Use Cases

Transcription of Moroccan Darija audio
Voice interfaces and assistants
Media and content indexing
Call center and conversational AI
Speech analytics in a Moroccan context

Out-of-Scope Use

Critical decision-making systems without human validation
Languages outside the Darija/English/French/Arabic code-switching context
Highly specialized domains without adaptation

Performance

MoulSot v0.3 is positioned as a state-of-the-art Darija ASR system, with strong qualitative performance in:

Conversational fluency
Handling mixed-language speech
Robustness to accents and informal usage

Benchmark Comparison

Results (WER and CER in %; ↓ lower is better):

MoulSot v0.1 — WER: 66.78 / CER: 21.30
MoulSot v0.3 — WER: 38.99 / CER: 12.58
ISMA — WER: 39.20 / CER: 13.47

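MoulSot v0.3 cuts WER from 66.78 to 38.99 and CER from 21.30 to 12.58 compared with v0.1, relative reductions of about 42% and 41%. It also has the lowest error rates of the three systems listed, edging out ISMA by 0.21 WER points and 0.89 CER points.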
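
For readers who want to run a comparison of this kind on their own data, the sketch below shows how WER and CER are conventionally computed from a reference and a hypothesis transcript using Levenshtein edit distance. It is a minimal illustration, not the evaluation code behind the numbers above. The function names (`edit_distance`, `wer`, `cer`) are made up for this example, and this card does not specify what text normalization was applied before scoring.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                cur[j - 1] + 1,      # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

Corpus-level scores are usually obtained by summing edits over all utterances and dividing by the total reference length, which can differ from averaging per-utterance rates.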

Leaderboard Ranking

MoulSot v0.3 ranks #1 on the Moroccan Darija ASR leaderboard:

👉 https://huggingface.co/spaces/abdeljalilELmajjodi/moroccan_darija_asr_leaderboard

Key takeaways:

🥇 Ranked 1st among evaluated Darija ASR models
📊 Confirms strong real-world performance beyond internal benchmarks
🔍 Validates improvements over previous versions and competing systems

Quantitative benchmarks and detailed evaluations will be expanded in the technical report.

Limitations

Performance may degrade on:
- Extremely noisy audio
- Rare dialectal variations
- Domain-specific jargon
Code-switching handling is strong but not perfect
Limited publicly available evaluation benchmarks for Darija

Ethical Considerations

The dataset is curated with attention to quality and representativeness, but may still contain biases.
Users should:
- Validate outputs in critical applications
- Be mindful of potential transcription inaccuracies
- Avoid misuse in surveillance or harmful contexts

Usage

Install (Python)

uv pip install qwen_asr

Inference (PyTorch / CUDA)

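import torch
from qwen_asr import Qwen3ASRModel

# Load the fine-tuned checkpoint in bfloat16 on the GPU
model = Qwen3ASRModel.from_pretrained(
    "atlasia/moulsot.v0.3",
    dtype=torch.bfloat16,
    device_map="cuda",
)

# Per the upstream Qwen3-ASR examples, transcribe() returns a list of results,
# one per input audio
results = model.transcribe(audio="your_audio.wav", language="Arabic")
print(results[0].text)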

MLX (Apple Silicon)

Install:

uv pip install mlx-audio

Run:

from mlx_audio.stt.utils import load

model = load("atlasia/moulsot.v0.3")
audio_path = "your_audio.wav"

transcription = model.generate(audio_path).text
print(transcription)
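
Since the card documents two runtimes, a script that has to run on both kinds of machine needs some way to choose between them. The helper below is one hedged way to do that. `pick_backend` and `detect_backend` are hypothetical names, and the `"cpu"` fallback is an assumption: this card only documents the CUDA and MLX paths.

```python
import importlib.util
import platform


def pick_backend(system: str, machine: str, has_mlx: bool, has_cuda: bool) -> str:
    """Return 'mlx', 'cuda', or 'cpu' for the given platform facts (hypothetical helper)."""
    # Prefer MLX on Apple Silicon when mlx-audio is installed.
    if system == "Darwin" and machine == "arm64" and has_mlx:
        return "mlx"
    if has_cuda:
        return "cuda"
    return "cpu"  # assumption: CPU inference is not documented in this card


def detect_backend() -> str:
    """Probe the current machine and delegate the decision to pick_backend."""
    has_mlx = importlib.util.find_spec("mlx_audio") is not None
    has_cuda = False
    if importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the helper also works without PyTorch

        has_cuda = torch.cuda.is_available()
    return pick_backend(platform.system(), platform.machine(), has_mlx, has_cuda)
```

Keeping the decision in a pure function makes it easy to test without the target hardware; the caller then branches to the PyTorch or MLX snippet above based on the returned string.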

Citation

@misc{moulsot2026,
  title={MoulSot v0.3: The New Champ of Darija ASR},
  author={Atlasia},
  year={2026},
  url={https://huggingface.co/atlasia/moulsot.v0.3}
}

Resources & Roadmap

Coming soon:

Full technical blog
Dataset release
Training details
Future model versions

Stay tuned via: 👉 https://atlasia.ma

Acknowledgements

Developed by Atlasia as part of ongoing efforts to advance AI for Moroccan languages and dialects.

License

Apache 2.0 — same as the base model.

Model size: 2B params
Tensor type: BF16
Base model: Qwen/Qwen3-ASR-1.7B