# Llama-3.2-3B-Instruct-Mongolian
A LoRA fine-tuned version of meta-llama/Llama-3.2-3B-Instruct for Mongolian language instruction-following and chat.
## Model Description
This model adapts Llama 3.2 3B Instruct to understand and generate fluent Mongolian text. It was fine-tuned using LoRA (Low-Rank Adaptation) on the saillab/alpaca-mongolian-cleaned dataset containing ~41,600 Mongolian instruction-following examples.
The base model struggles with Mongolian, producing garbled or incoherent text. After fine-tuning, the model generates fluent, coherent Mongolian responses across a wide range of topics.
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL = "meta-llama/Llama-3.2-3B-Instruct"
ADAPTER = "munkhbayar-batkhuu/Llama-3.2-3B-Instruct-Mongolian"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

messages = [
    {"role": "system", "content": "You are a helpful assistant that responds in Mongolian."},
    {"role": "user", "content": "Монгол улсын нийслэл хаана байдаг вэ?"},
]
# add_generation_prompt=True appends the assistant header so the model
# answers instead of continuing the user turn
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=256, temperature=0.7, top_p=0.9, do_sample=True
    )

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
# Output: Монгол улсын нийслэл нь Улаанбаатар хот юм.
```
## Training Details
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Llama-3.2-3B-Instruct |
| Method | LoRA (PEFT) |
| Dataset | saillab/alpaca-mongolian-cleaned (~41,600 examples) |
| Train/Eval Split | 39,520 / 2,081 (95/5, seed=42) |
| LoRA Rank | 32 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable Parameters | 48.6M (1.49% of 3.26B) |
| Epochs | 3 |
| Batch Size | 4 (x4 gradient accumulation = effective 16) |
| Learning Rate | 2e-4 (cosine schedule) |
| Warmup | 5% of steps |
| Precision | float16 |
| Max Sequence Length | 512 |
| Training Steps | 7,410 |
| Training Time | ~130.3 hours |
| Final Train Loss | 0.671 |
| Final Eval Loss | 0.628 |
| Token Accuracy | 83% |
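The headline numbers in the table hang together arithmetically. Below is a small sanity-check sketch: the 48.6M trainable-parameter figure follows from the LoRA rank and target modules, and the step count follows from the dataset size, effective batch, and epochs. The Llama 3.2 3B dimensions used here (hidden size, KV projection width, MLP width, layer count) are assumptions taken from the public architecture, not values stated in this card.

```python
# Assumed Llama 3.2 3B architecture dimensions (not stated in this card):
HIDDEN = 3072        # hidden_size
KV_DIM = 1024        # num_key_value_heads (8) * head_dim (128)
INTERMEDIATE = 8192  # MLP intermediate_size
LAYERS = 28          # num_hidden_layers
RANK = 32            # LoRA rank, from the table above

def lora_params(d_in: int, d_out: int, r: int = RANK) -> int:
    """A LoRA adapter adds two low-rank matrices: A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

# One adapter pair per target module, per layer
per_layer = (
    lora_params(HIDDEN, HIDDEN)          # q_proj
    + lora_params(HIDDEN, KV_DIM)        # k_proj
    + lora_params(HIDDEN, KV_DIM)        # v_proj
    + lora_params(HIDDEN, HIDDEN)        # o_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # gate_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # up_proj
    + lora_params(INTERMEDIATE, HIDDEN)  # down_proj
)
total = per_layer * LAYERS
print(f"{total / 1e6:.1f}M trainable parameters")  # 48.6M

# Step count: 39,520 train examples / effective batch 16, over 3 epochs
steps = (39_520 // 16) * 3
print(steps)  # 7410
```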
## Chat Template
Training used the Llama 3.2 chat template with system prompt: "You are a helpful assistant that responds in Mongolian."
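For reference, the sketch below approximates the token layout that `tokenizer.apply_chat_template` produces for the Llama 3.x format. It is an illustration only, not the authoritative template (Llama 3.2's real template may also inject a date line into the system block); in practice, always use the tokenizer's own template as in the Quick Start.

```python
# Approximate Llama 3.x chat layout: each turn is wrapped in header tokens
# and terminated with <|eot_id|>. Illustration only; use the tokenizer's
# built-in template for real inference.
SYSTEM = "You are a helpful assistant that responds in Mongolian."

def render(messages):
    out = "<|begin_of_text|>"
    for m in messages:
        out += f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
    # Generation prompt: cue the model to respond as the assistant
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

prompt = render([
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Монгол улсын нийслэл хаана байдаг вэ?"},
])
print(prompt)
```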
## Benchmark Results

### MM-Eval (Mongolian Multi-task Evaluation)
MM-Eval (arXiv:2411.09492) is a hierarchical benchmark for evaluating LLMs on Mongolian language tasks across 1,840 items.
| Category | Items | Base Model | Fine-tuned | Delta |
|---|---|---|---|---|
| Syntax | 569 (MCQ) | 26.89% | 35.33% | +8.44% |
| Semantics | 677 (MCQ) | 27.47% | 37.37% | +9.90% |
| Knowledge | 344 (MCQ) | 32.85% | 67.44% | +34.59% |
| Reasoning | 250 (numeric) | 3.20% | 0.80% | -2.40% |
### Perplexity (on eval split, 2,081 samples)
| Model | Perplexity | Avg Loss |
|---|---|---|
| Base | 18.31 | 2.9075 |
| Fine-tuned | 1.99 | 0.6881 |
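The two columns of the table are consistent with each other: perplexity is just the exponential of the average cross-entropy loss, so either column can be recovered from the other.

```python
import math

# Perplexity = exp(average cross-entropy loss), using the loss values
# from the table above
for name, loss in [("base", 2.9075), ("fine-tuned", 0.6881)]:
    print(name, round(math.exp(loss), 2))
# base 18.31
# fine-tuned 1.99
```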
### BLEU / ROUGE-L (200 eval samples)
| Metric | Base Model | Fine-tuned | Improvement |
|---|---|---|---|
| BLEU-1 | 0.1373 | 0.3245 | +136% |
| BLEU-2 | 0.0427 | 0.1281 | +200% |
| ROUGE-L | 0.0760 | 0.1874 | +147% |
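The "Improvement" column is the relative percentage change of the fine-tuned score over the base score, rounded to the nearest percent:

```python
# Relative improvement = (fine-tuned / base - 1) * 100, per metric,
# using the scores from the table above
scores = {
    "BLEU-1": (0.1373, 0.3245),
    "BLEU-2": (0.0427, 0.1281),
    "ROUGE-L": (0.0760, 0.1874),
}
for metric, (base, tuned) in scores.items():
    print(metric, f"+{round((tuned / base - 1) * 100)}%")
# BLEU-1 +136%
# BLEU-2 +200%
# ROUGE-L +147%
```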
## Example Outputs
Prompt: "Монгол улсын нийслэл хаана байдаг вэ?" (Where is the capital of Mongolia?)
| Model | Response |
|---|---|
| Base | Улсын нийслэл хаан байдаг байна. Улсын нийслэл нь Улаанбаatar... (garbled) |
| Fine-tuned | Монгол улсын нийслэл нь Улаанбаатар хот юм. (The capital of Mongolia is Ulaanbaatar.) |
Prompt: "Хүүхдэд зориулж богино үлгэр бичнэ үү." (Write a short story for children.)
| Model | Response |
|---|---|
| Base | Хүүхдэд зориулж богино үлгэр бичихээс бээрхээр эрхэмтгээр өрөмддэг... (incoherent) |
| Fine-tuned | Эрт урьдын цагт өтгөн ногоон ойд Жак хэмээх бяцхан хөгшин байжээ... (Long ago in a thick green forest lived an old man named Jack...) |
Prompt: "Монгол хоолны жор бичнэ үү." (Write a Mongolian food recipe.)
| Model | Response |
|---|---|
| Base | Холны жор хэлээр байна. Хэрээрт хэрээрт бүртгэлтэй... (incoherent) |
| Fine-tuned | Миний сонгосон жор: Гоймонгийн амтат чинжүү. Орц: 12 ширхэг улаан чинжүү, 1/4 аяга цагаан элсэн чихэр... (Recipe with ingredients list) |
## Limitations
- Domain: Trained on general instruction-following data; may not perform well on specialized domains (medical, legal, technical)
- Math/Reasoning: Mathematical reasoning did not improve (slightly declined on MM-Eval reasoning)
- Hallucination: Like all LLMs, may generate plausible but factually incorrect information
- Sequence Length: Trained with max 512 tokens; may degrade on longer inputs
- Model Size: at 3B parameters, this is a small model; larger models would likely achieve better results
## Framework Versions
- PEFT: 0.18.1
- TRL: 0.27.2
- Transformers: 5.1.0
- PyTorch: 2.10.0+cu128
- Datasets: 4.5.0
## Citation
If you use this model, please cite:
```bibtex
@misc{llama32-3b-mongolian-2026,
  title={Llama-3.2-3B-Instruct-Mongolian},
  author={Munkhbayar Batkhuu},
  year={2026},
  url={https://huggingface.co/munkhbayar-batkhuu/Llama-3.2-3B-Instruct-Mongolian}
}
```