NCERT Tutor (Classes 6–8)
Model Summary
NCERT Tutor (Classes 6–8) is an instruction-tuned language model designed to answer questions strictly based on NCERT textbooks for Classes 6, 7, and 8.
It is optimized for school-level Maths and Science explanations using simple, student-friendly language aligned with NCERT standards.
The model is suitable for:
- NCERT-based doubt solving
- Concept explanations
- Educational chatbots
- Classroom assistance tools
Model Details
Model Description
- Developed by: Priyansh Agarwal
- Model type: Instruction-tuned causal language model
- Language(s): English
- Domain: School education (NCERT – India)
- Fine-tuned from: a base causal language model, adapted with LoRA fine-tuning
- License: Apache 2.0 (inherits base model license)
The model was fine-tuned on a custom-generated dataset consisting of NCERT-style question–answer pairs derived from official NCERT textbook content.
Intended Uses
Direct Use
- Ask questions based on NCERT Class 6–8 Maths & Science
- Generate textbook-style explanations
- Build an NCERT tutor chatbot
- Educational demos and learning tools
Example prompt:
What is photosynthesis? (According to NCERT Class 7 Science)
Downstream Use
Further fine-tuning for:
- Class 9–10 NCERT
- Subject-specific tutors
- Regional language extensions
Integration into:
- Streamlit apps
- Mobile education apps
- School LMS platforms
Out-of-Scope Use
This model is NOT suitable for:
- Competitive exam preparation (JEE, NEET, Olympiads)
- Medical, legal, or professional advice
- Content outside the NCERT syllabus
- Creative writing or open-ended reasoning
Dataset Information
Training Data
- Source: NCERT textbook content (Classes 6–8)
- Subjects: Mathematics, Science
- Format: Instruction–response pairs
- Style: NCERT-aligned, factual, syllabus-bound answers
Example data format:
```json
{
  "instruction": "What is the perimeter of a rectangle? (According to NCERT Class 6 Maths)",
  "input": "",
  "output": "The perimeter of a rectangle is the sum of the lengths of all its sides."
}
```
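Records in this shape are typically flattened into a single training string before supervised fine-tuning. A minimal sketch, assuming a simple Alpaca-style template (the exact prompt template used during training is not documented in this card):

```python
def format_example(record: dict) -> str:
    """Flatten an instruction/input/output record into one training string.

    The section headers below are illustrative; the actual template used
    during fine-tuning may differ.
    """
    prompt = f"### Instruction:\n{record['instruction']}\n"
    if record.get("input"):  # many records leave "input" empty
        prompt += f"### Input:\n{record['input']}\n"
    prompt += f"### Response:\n{record['output']}"
    return prompt

example = {
    "instruction": "What is the perimeter of a rectangle? (According to NCERT Class 6 Maths)",
    "input": "",
    "output": "The perimeter of a rectangle is the sum of the lengths of all its sides.",
}
print(format_example(example))
```

Because the `input` field is empty here, the formatted string contains only the instruction and response sections.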
⚠️ The dataset was manually cleaned to remove:
- Multi-question outputs
- Hallucinated content
- Non-NCERT explanations
- Incomplete answers
Training Details
Training Procedure
- Fine-tuning method: LoRA (Low-Rank Adaptation)
- Training framework: Hugging Face transformers + trl
- Training type: Supervised Fine-Tuning (SFT)
Training Hyperparameters
- Precision: fp16
- Epochs: 3
- Optimizer: AdamW
- Loss: Cross-entropy (causal LM)
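The hyperparameters above map onto a peft/trl setup roughly like the following sketch. The LoRA rank, alpha, target modules, and learning rate are illustrative assumptions, not values recorded in this card:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Illustrative LoRA settings -- rank/alpha/target modules were not published
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Matches the documented hyperparameters: fp16, 3 epochs, AdamW
training_args = TrainingArguments(
    output_dir="ncert-tutor-6-8",
    num_train_epochs=3,
    fp16=True,
    optim="adamw_torch",
    learning_rate=2e-4,  # assumed; not recorded in this card
)

# With a base model and formatted dataset in hand, training would look like:
# from trl import SFTTrainer
# trainer = SFTTrainer(model=base_model, args=training_args,
#                      train_dataset=dataset, peft_config=lora_config)
# trainer.train()
```

The trainer construction is commented out because the `SFTTrainer` signature varies across trl versions; consult the trl documentation for the version you install.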
Training Metrics
- Final training loss: ~1.8
- Hardware: NVIDIA GPU (Google Colab)
- Training time: ~45 minutes
Evaluation
Evaluation Method
Manual qualitative evaluation; answers were checked for:
- NCERT alignment
- Factual correctness
- Language simplicity
- Hallucination control
Results Summary
✅ Accurate NCERT-style answers
✅ Simple, student-friendly explanations
⚠️ Occasionally verbose for very short questions
⚠️ Performance depends on how well prompts specify class & subject
Bias, Risks, and Limitations
Limitations
- Limited to Classes 6–8
- English-only
- Does not cite textbook page numbers
- Can produce extra text if the prompt format is unclear
Recommendations
Users should:
- Explicitly mention the Class and Subject in prompts
- Avoid asking questions outside the NCERT syllabus
- Use it as a learning aid, not a replacement for textbooks
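Since the model keys on class and subject, a small helper can make well-formed prompts the default. The function name and template below are illustrative, not part of the model's API:

```python
def build_prompt(question: str, grade: int, subject: str) -> str:
    """Append the class/subject qualifier the model expects.

    `grade` must be 6, 7, or 8; `subject` is e.g. "Maths" or "Science".
    """
    if grade not in (6, 7, 8):
        raise ValueError("This model covers NCERT Classes 6-8 only")
    return f"{question} (According to NCERT Class {grade} {subject})"

print(build_prompt("What is photosynthesis?", 7, "Science"))
# -> What is photosynthesis? (According to NCERT Class 7 Science)
```

Rejecting out-of-range grades up front avoids silently sending the model questions outside its training coverage.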
How to Use the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model (replace YOUR_USERNAME with the actual repo owner)
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/NCERT-Tutor-6-8")
model = AutoModelForCausalLM.from_pretrained(
    "YOUR_USERNAME/NCERT-Tutor-6-8",
    device_map="auto",  # place layers automatically on available devices
)

# Always state the class and subject in the prompt for best results
prompt = "Explain photosynthesis. (According to NCERT Class 7 Science)"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Environmental Impact
- Hardware: NVIDIA GPU (Cloud)
- Cloud Provider: Google Colab
- Compute Region: Unspecified
- Estimated Carbon Emissions: Low (short-duration fine-tuning)
Model Card Authors
Priyansh Agarwal
Contact
For feedback, improvements, or collaboration:
- GitHub / Hugging Face: Add your profile links here
- Email: Optional
Acknowledgements
- NCERT textbooks
- Hugging Face 🤗 ecosystem
- TRL (Transformer Reinforcement Learning) library