🌾 Agricultural Advisory LLM: Llama 3.1 8B (Pakistan)

A LoRA fine-tuned version of Meta-Llama-3.1-8B-Instruct specialized for Pakistani crop farming advisory. The model answers general crop questions and interprets field sensor data (NDVI, EVI, NDWI, temperature, humidity) to provide concise, actionable farm advisories.


Model Details

  • Base model: meta-llama/Meta-Llama-3.1-8B-Instruct (4-bit quantized via Unsloth)
  • Fine-tuning method: LoRA (rank 16, alpha 16, RSLoRA enabled)
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Trainable parameters: ~0.53% of total
  • Hardware: NVIDIA Tesla T4 (14.56 GB VRAM)
  • Training time: ~36 minutes (2203s)
  • Peak VRAM: 14.28 GB
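
The trainable-parameter share can be sanity-checked by hand. The sketch below assumes the publicly documented Llama 3.1 8B shapes (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128) and a base parameter count of about 8.03 billion; none of these figures come from this card. It also compares the RSLoRA scaling factor, alpha / sqrt(rank), with the standard LoRA factor, alpha / rank.

```python
import math

# Assumed Llama 3.1 8B shapes, taken from the public model config (not from this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128            # 8 key/value heads x head dimension 128
NUM_LAYERS = 32
BASE_PARAMS = 8_030_261_248  # assumed total parameter count of the base model

RANK = 16
ALPHA = 16

# (in_features, out_features) of each LoRA target module in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank: int) -> int:
    """Each adapted module adds A (rank x in) and B (out x rank) matrices."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


trainable = lora_param_count(RANK)
share_pct = 100 * trainable / BASE_PARAMS

standard_scale = ALPHA / RANK            # classic LoRA scaling
rslora_scale = ALPHA / math.sqrt(RANK)   # rank-stabilized LoRA scaling

print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of base parameters:  {share_pct:.2f}%")
print(f"LoRA scale: {standard_scale}, RSLoRA scale: {rslora_scale}")
```

Under these assumptions the adapters come to roughly half a percent of the base model, in line with the figure reported above; the exact percentage depends on which parameters are counted in the denominator.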

Training Details

Dataset

  • General Q&A: Synthetic agricultural advisories covering crops, topics, and questions relevant to Pakistani farming conditions
  • Farm-specific: Sensor-based advisories using field readings (NDVI, EVI, SAVI, MSAVI, NDWI, GNDVI, temperature, humidity, etc.)
  • Total examples: 4,730 mixed and shuffled records
  • Packed examples: ~335 per epoch (via Unsloth sequence packing)

Hyperparameters

| Parameter | Value |
| --- | --- |
| Epochs | 2 |
| Learning rate | 1e-4 |
| LR scheduler | Cosine |
| Warmup ratio | 0.1 |
| Batch size (per device) | 2 |
| Gradient accumulation steps | 4 |
| Effective batch size | 8 |
| Weight decay | 0.05 |
| Max grad norm | 0.3 |
| Optimizer | AdamW 8-bit |
| Precision | bf16 |
| Max sequence length | 2048 |
| Packing | Enabled |
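
As a rough illustration of how these settings interact, the sketch below derives the effective batch size, an approximate optimizer step count, and a warmup-plus-cosine learning-rate curve. It assumes a single GPU, about 335 packed examples per epoch, and a cosine decay to zero after linear warmup; the trainer's own rounding may differ by a step, and `lr_at` is a hypothetical helper, not part of any library.

```python
import math

PACKED_EXAMPLES = 335   # approximate packed examples per epoch (Dataset section)
PER_DEVICE_BATCH = 2
GRAD_ACCUM = 4
EPOCHS = 2
BASE_LR = 1e-4
WARMUP_RATIO = 0.1

# One optimizer step consumes per-device batch x accumulation steps examples.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
steps_per_epoch = math.ceil(PACKED_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero (assumed schedule shape)."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch size: {effective_batch}")
print(f"approx. optimizer steps: {total_steps} ({steps_per_epoch} per epoch)")
print(f"approx. warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>3}: lr = {lr_at(step):.2e}")
```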

Training Loss

| Step | Loss |
| --- | --- |
| 5 | 3.4473 |
| 10 | 1.6792 |
| 15 | 0.8774 |
| 20 | 0.7404 |
| 25 | 0.6289 |
| 30 | 0.6021 |
| 35 | 0.6064 |
| 40 | 0.5738 |
| 45 | 0.5489 |

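Training loss dropped sharply over the first 15 steps and leveled off near 0.55 by step 45, the last logged step. This is training loss only, so it shows that the model fit the training data, not that it generalizes; confirming the absence of overfitting would require a held-out evaluation set.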
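
The shape of the curve is easier to see as per-interval changes. The following sketch uses only the logged values from the table above; `interval_changes` is a hypothetical helper written for this illustration.

```python
# Logged training loss as (step, loss) pairs, copied from the table above.
loss_log = [
    (5, 3.4473), (10, 1.6792), (15, 0.8774),
    (20, 0.7404), (25, 0.6289), (30, 0.6021),
    (35, 0.6064), (40, 0.5738), (45, 0.5489),
]


def interval_changes(log):
    """Return (end_step, loss change) for each pair of consecutive log entries."""
    return [(s2, l2 - l1) for (_, l1), (s2, l2) in zip(log, log[1:])]


changes = interval_changes(loss_log)
for step, delta in changes:
    print(f"up to step {step:>2}: {delta:+.4f}")

# Fraction of the total logged drop that happened by step 15.
first_loss = loss_log[0][1]
last_loss = loss_log[-1][1]
loss_at_15 = dict(loss_log)[15]
early_share = (first_loss - loss_at_15) / (first_loss - last_loss)
print(f"share of total drop reached by step 15: {early_share:.1%}")
```

Most of the logged reduction happens by step 15; after that the changes are small and one interval (30 to 35) shows a slight increase, which is ordinary step-to-step noise.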


Usage

from unsloth import FastLanguageModel
import torch

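# Load the fine-tuned model in 4-bit; dtype=None lets Unsloth pick a precision for the GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name     = "your-hf-username/finetuned-Llama-3.1-8B-Instruct",
    max_seq_length = 2048,
    dtype          = None,
    load_in_4bit   = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's inference mode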

SYSTEM_PROMPT = (
    "You are an expert agricultural advisor specializing in Pakistani crop farming. "
    "You can answer general crop questions and also interpret field sensor data "
    "(NDVI, EVI, NDWI, temperature, humidity, etc.) to provide precise farm advisories. "
    "Answer accurately and concisely based on official recommendations and best practices. "
    "Keep answers under 3 sentences. Do not include citations, URLs, or markdown headers. "
    "Answer directly and stop."
)

def ask(crop, question, topic="General"):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": f"[Crop: {crop} | Topic: {topic}]\n{question}"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to("cuda")

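    with torch.no_grad():
        out = model.generate(
            input_ids=inputs, max_new_tokens=150,
            use_cache=True,
            # temperature and top_p only take effect when sampling is enabled.
            do_sample=True, temperature=0.7, top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True).strip()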

print(ask("Maize", "How much seed is required per acre?"))
# → 50-60 kg per acre for good stand at 40-45 thousand plants per acre.

Farm Sensor Advisory

def ask_farm(crop, stage, sensors: dict):
    sensor_str = "\n".join(f"{k}: {v}" for k, v in sensors.items())
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": (
            f"[Crop: {crop} | Stage: {stage}]\n"
            f"Field sensor readings:\n{sensor_str}\n\n"
            f"Provide a detailed farm advisory based on these readings."
        )},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to("cuda")

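    with torch.no_grad():
        out = model.generate(
            input_ids=inputs, max_new_tokens=200,
            use_cache=True,
            # temperature and top_p only take effect when sampling is enabled.
            do_sample=True, temperature=0.7, top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True).strip()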

print(ask_farm("Cotton", "Boll Formation", {"NDVI": 0.38, "temperature_c": 34, "relative_humidity": 55}))
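
Because the advisory is only as good as the readings it is given, it may help to sanity-check the sensor dictionary before building the prompt. The sketch below is one possible pre-check, not part of the model or its training pipeline. The key names follow the `ask_farm` example above; `check_sensors` and the `BOUNDS` table are hypothetical. The [-1, 1] range follows from how normalized-difference indices are defined, while the temperature limits are an assumption chosen purely for illustration.

```python
# Plausibility bounds per reading. Index and humidity ranges follow from their
# definitions; the temperature range is an illustrative assumption.
BOUNDS = {
    "NDVI": (-1.0, 1.0),
    "NDWI": (-1.0, 1.0),
    "GNDVI": (-1.0, 1.0),
    "relative_humidity": (0.0, 100.0),
    "temperature_c": (-10.0, 60.0),
}


def check_sensors(sensors: dict) -> list:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, value in sensors.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
            continue
        if name in BOUNDS:
            low, high = BOUNDS[name]
            # NaN fails both comparisons, so it is flagged here as well.
            if not low <= value <= high:
                problems.append(f"{name}: {value} is outside [{low}, {high}]")
    return problems


readings = {"NDVI": 0.38, "temperature_c": 34, "relative_humidity": 55}
problems = check_sensors(readings)
if problems:
    print("Fix these readings before requesting an advisory:")
    for problem in problems:
        print(" -", problem)
else:
    print("Readings look plausible.")
```

Readings with keys not listed in `BOUNDS` are passed through unchecked as long as they are numeric, so the table can be extended for other indices as needed.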

Sample Outputs

| Crop | Question | Answer |
| --- | --- | --- |
| Maize | Which varieties are high-yielding in Pakistan? | SH-32, PKV-1, Khyber-2002, and Pehlu-6 are high-yielding, disease-resistant hybrids widely grown across Pakistan. |
| Maize | How much seed per acre? | 50-60 kg per acre for good stand at 40-45 thousand plants per acre. |
| Maize | NDVI 0.42 - is that healthy? | Moderate stress. Look for uniformity - a hotspot indicates disease or pest issue. |
| Cotton | Pesticide for whitefly? | Use neonicotinoid seed treatments or foliar applications of imidacloprid or acetamiprid. Practice good sanitation and remove weeds that harbor nymphs. |

Comparison with Qwen2.5-7B Fine-tune

| Metric | Llama 3.1 8B | Qwen 2.5 7B |
| --- | --- | --- |
| Final step loss | 0.55 | 0.63 |
| Training time | 36 min | 31 min |
| Peak VRAM | 14.28 GB | 12.02 GB |
| Epochs | 2 | 1 |
| Answer style | Concise + actionable | Concise + technical |

Limitations

  • Trained on synthetic data; real-world agronomic validation is recommended before deployment
  • Pakistan-specific; recommendations may not transfer to other regions
  • Sensor advisory accuracy depends on data quality and crop stage alignment
  • VRAM usage is near the T4 ceiling; do not increase batch size without gradient checkpointing
  • Not a substitute for consultation with local agricultural extension services

Authors

Developed for the AgroBot-Research project.
