PK-Genesis MedGemma 4B: Pharmaceutical AI
PK-Genesis is a domain-specific pharmaceutical AI built by PharmKulen, fine-tuned from Google's MedGemma 4B IT using QLoRA on 70,000+ pharmacy-specific training records.
This is Run 2 of 10 in the curriculum training pipeline. Each successive run is intended to improve the model.
What It Does
PK-Genesis is designed to assist pharmacists and patients with:
- Drug Information: dosage, indications, contraindications, side effects
- Drug Interactions: checking safety of medication combinations
- Prescription Understanding: explaining prescriptions in simple terms
- Patient Counseling: medication guidance in plain language
- Clinical Reasoning: step-by-step analysis of pharmaceutical scenarios
- Safety Awareness: recognizing emergencies and scope limitations
- Multilingual Support: English, Chinese (中文), Khmer (ខ្មែរ), French, Russian, Korean
Training Details
Base Model
- Model: MedGemma 4B IT (via unsloth/medgemma-4b-it)
- Architecture: Gemma 3, 4B parameters, multimodal (vision + text)
- Quantization: 4-bit NF4 (QLoRA)
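For a rough sense of why 4-bit loading matters, the weight footprint can be estimated from the parameter count alone. The sketch below assumes "4B" means exactly 4 billion parameters and ignores quantization constants, activations, the KV cache, and any layers kept in higher precision, so real memory use will be higher. The function name `weight_footprint_gb` is illustrative only.

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion parameters

nf4_gb = weight_footprint_gb(NUM_PARAMS, 4)    # 4-bit NF4 weights
bf16_gb = weight_footprint_gb(NUM_PARAMS, 16)  # 16-bit baseline for comparison

print(f"NF4: ~{nf4_gb:.1f} GB, bf16: ~{bf16_gb:.1f} GB")
```

Under these assumptions the quantized weights take about a quarter of the space of a 16-bit copy.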
Fine-Tuning Method
- Method: QLoRA (Quantized Low-Rank Adaptation)
- Framework: Unsloth FastVisionModel + HuggingFace PEFT + TRL SFTTrainer
- LoRA Rank: 32
- LoRA Alpha: 64
- Target Modules: All linear layers (attention + MLP)
- Trainable Parameters: ~65.5M (out of 4B total)
- Sequence Length: 1024 tokens
- Optimizer: AdamW 8-bit
- Gradient Checkpointing: Enabled
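To make the rank and alpha settings concrete: in standard LoRA, each adapted linear layer with `d_in` inputs and `d_out` outputs gains two low-rank matrices, A (rank × d_in) and B (d_out × rank), and the update B·A is scaled by alpha / rank. The sketch below works through that arithmetic. The 2560 × 2560 layer shape is a hypothetical example, not a dimension taken from the model, and the helper names are illustrative.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


RANK, ALPHA = 32, 64   # values listed in the fine-tuning settings
D_IN = D_OUT = 2560    # hypothetical layer shape, for illustration only

added = lora_param_count(D_IN, D_OUT, RANK)
scale = lora_scaling(ALPHA, RANK)
print(f"{added:,} trainable parameters for this layer, scaling = {scale}")
```

Summing this count over every adapted layer gives the total trainable parameter figure; with alpha set to twice the rank, the scaling factor is 2.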
Curriculum Training (10 Runs)
PK-Genesis uses curriculum learning. Training data is organized from general to specialized, and the learning rate decreases across runs:
| Run | Data | Records | LR | Status |
|---|---|---|---|---|
| 1 | EN core pharmacy (part 1) | ~9.6K | 2e-4 | Completed |
| 2 | EN core pharmacy (part 2) | ~9.6K | 2e-4 | Completed |
| 3 | EN extended (FDA/WHO/RxNorm, part 1) | ~9.2K | 1.5e-4 | Pending |
| 4 | EN extended (part 2) | ~9.2K | 1.5e-4 | Pending |
| 5 | EN extended (part 3) | ~9.2K | 1.5e-4 | Pending |
| 6 | EN extended (part 4) | ~9.2K | 1.5e-4 | Pending |
| 7 | Multilingual ZH/FR/RU/KO (part 1) | ~7.8K | 1e-4 | Pending |
| 8 | Multilingual (part 2) | ~7.8K | 1e-4 | Pending |
| 9 | Multilingual (part 3) | ~7.8K | 1e-4 | Pending |
| 10 | Khmer + Identity + Safety | ~5.5K | 5e-5 | Pending |
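The schedule in the table can be written down as plain data, which makes it easy to check properties such as the total record count or that the learning rate never increases. This is a minimal sketch: the names `Run`, `SCHEDULE`, and `learning_rate_for` are illustrative and not taken from the actual training code, and the record counts are the approximate figures from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    number: int
    data: str
    records: int          # approximate count from the table
    learning_rate: float


SCHEDULE = [
    Run(1, "EN core pharmacy (part 1)", 9_600, 2e-4),
    Run(2, "EN core pharmacy (part 2)", 9_600, 2e-4),
    Run(3, "EN extended (FDA/WHO/RxNorm, part 1)", 9_200, 1.5e-4),
    Run(4, "EN extended (part 2)", 9_200, 1.5e-4),
    Run(5, "EN extended (part 3)", 9_200, 1.5e-4),
    Run(6, "EN extended (part 4)", 9_200, 1.5e-4),
    Run(7, "Multilingual ZH/FR/RU/KO (part 1)", 7_800, 1e-4),
    Run(8, "Multilingual (part 2)", 7_800, 1e-4),
    Run(9, "Multilingual (part 3)", 7_800, 1e-4),
    Run(10, "Khmer + Identity + Safety", 5_500, 5e-5),
]


def learning_rate_for(run_number: int) -> float:
    """Look up the learning rate scheduled for a given run."""
    for run in SCHEDULE:
        if run.number == run_number:
            return run.learning_rate
    raise KeyError(f"no run numbered {run_number}")


total_records = sum(run.records for run in SCHEDULE)

# Check that the learning rate never increases from one run to the next.
lr_non_increasing = all(
    earlier.learning_rate >= later.learning_rate
    for earlier, later in zip(SCHEDULE, SCHEDULE[1:])
)

print(total_records, lr_non_increasing, learning_rate_for(7))
```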
Anti-forgetting: 20% English core data replayed in batches 2-4 to prevent catastrophic forgetting.
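One way to implement this kind of replay is sketched below. The card does not say how the 20% is measured, so the sketch assumes that replayed English core records should make up 20% of each mixed training set. The function `mix_with_replay` and the placeholder records are hypothetical.

```python
import random


def mix_with_replay(new_records, core_records, replay_fraction=0.2, seed=0):
    """Return new_records plus sampled core records, shuffled.

    Assumption: replay_fraction is the share of the *mixed* set that
    comes from replayed core data, so k / (n + k) = replay_fraction.
    """
    if not 0 <= replay_fraction < 1:
        raise ValueError("replay_fraction must be in [0, 1)")
    n = len(new_records)
    k = round(n * replay_fraction / (1 - replay_fraction))
    k = min(k, len(core_records))  # cannot replay more than is available
    rng = random.Random(seed)      # fixed seed keeps the mix reproducible
    mixed = list(new_records) + rng.sample(list(core_records), k)
    rng.shuffle(mixed)
    return mixed


# Placeholder records standing in for real training examples.
new = [f"extended-{i}" for i in range(8)]
core = [f"core-{i}" for i in range(50)]

mixed = mix_with_replay(new, core)
replayed = [r for r in mixed if r.startswith("core-")]
print(len(mixed), len(replayed))
```

With 8 new records, 2 core records are added, so replay accounts for 2 of the 10 records in the mix.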
Training Data (70,000+ Records)
| Source | Records | Description |
|---|---|---|
| EN Core Pharmacy | ~19.2K | Drug monographs, interactions, dosing, counseling |
| OpenFDA | ~29.9K | FDA adverse events, drug labels, recalls |
| WHO Essential Medicines | 553 | WHO model list with clinical guidance |
| RxNorm | 1,309 | Drug nomenclature and relationships |
| Chain-of-Thought Reasoning | ~2K | Step-by-step clinical reasoning scenarios |
| Safety & Disclaimers | ~500 | Refusal patterns, emergency recognition, scope awareness |
| Identity | ~1.5K | PK-Genesis identity and personality |
| Multilingual (ZH/FR/RU/KO) | ~23.3K | Translated pharmacy knowledge |
| Khmer Pharmacy | ~5K | Cambodia-specific pharmaceutical data |
All training data is in chat/messages format compatible with the Gemma 3 chat template.
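As an illustration of what a record in this format might look like, the sketch below builds one example and checks its structure. The field names (`messages`, `role`, `content`) and the `user`/`assistant` roles follow the common Hugging Face chat convention and are assumed here; the card does not show the dataset's actual schema. The record content is a placeholder, and `is_valid_chat_record` is a hypothetical helper that only covers simple alternating turns.

```python
import json

# Hypothetical record; the assistant text is a placeholder, not real model output.
record = {
    "messages": [
        {"role": "user", "content": "What is metformin used for?"},
        {"role": "assistant", "content": "(placeholder answer text)"},
    ]
}


def is_valid_chat_record(rec: dict) -> bool:
    """Check for alternating user/assistant turns that end with the assistant."""
    messages = rec.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    expected = "user"
    for message in messages:
        if not isinstance(message, dict):
            return False
        if message.get("role") != expected:
            return False
        if not isinstance(message.get("content"), str):
            return False
        expected = "assistant" if expected == "user" else "user"
    return messages[-1]["role"] == "assistant"


# One JSON object per line (JSONL) is a common on-disk layout for such data.
line = json.dumps(record, ensure_ascii=False)
print(is_valid_chat_record(json.loads(line)))
```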
Usage
Requirements
pip install transformers peft torch accelerate bitsandbytes
Loading the Model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch
# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/medgemma-4b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/medgemma-4b-it")
# Load PK-Genesis adapter
model = PeftModel.from_pretrained(base_model, "pharmkulen/pk-genesis-medgemma-run02")
model.eval()
Chat Example
messages = [
    {"role": "user", "content": "What are the common side effects of metformin?"}
]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True,
    return_tensors="pt", return_dict=True
).to(model.device)
# Gemma 3 requires token_type_ids
inputs["token_type_ids"] = torch.zeros_like(inputs["input_ids"])
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
    )
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
With Unsloth (Faster Inference)
from unsloth import FastVisionModel
from peft import PeftModel

# Load the 4-bit base model, then attach the PK-Genesis adapter
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/medgemma-4b-it",
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "pharmkulen/pk-genesis-medgemma-run02")
FastVisionModel.for_inference(model)
Intended Use
Primary Use Cases
- Pharmacy management systems (drug info lookup, interaction checking)
- Patient-facing medication counseling chatbots
- Pharmacist decision support tools
- Medical education and training aids
- Multilingual pharmacy assistance in Southeast Asia
Out of Scope
- Not for clinical diagnosis: PK-Genesis is a pharmacy assistant, not a diagnostic tool
- Not a replacement for healthcare professionals: always consult qualified pharmacists/doctors
- Not validated for life-critical decisions: do not rely on this model for emergency medical decisions
Limitations & Safety
- This model may produce incorrect or outdated drug information
- Always verify critical medical information with official sources (FDA, WHO, local formularies)
- The model will attempt to refuse diagnostic requests and redirect to professionals
- Khmer language performance is still limited at Run 2 (expected to improve in Runs 7-10)
- Vision capabilities (medicine label reading) are not yet fine-tuned
All Checkpoints
| Run | HuggingFace Repository |
|---|---|
| Run 1 | pharmkulen/pk-genesis-medgemma-run01 |
| Run 2 | pharmkulen/pk-genesis-medgemma-run02 (this) |
| Run 3 | pharmkulen/pk-genesis-medgemma-run03 |
| Run 4-10 | Coming soon |
About PharmKulen
PharmKulen is an AI-powered pharmacy management and medicine search platform serving 120+ pharmacies across Cambodia. We help patients find medicines at nearby pharmacies with real-time availability in 6 languages, and provide pharmacy owners with digital tools for inventory, sales, and AI-assisted operations.
Contact: contact@pharmkulen.com
Website: pharmkulen.com
Citation
@misc{pk-genesis-medgemma-2026,
title={PK-Genesis: Domain-Specific Pharmaceutical AI Fine-Tuned from MedGemma 4B},
author={Salakhitdinov, Khidayotullo},
year={2026},
publisher={HuggingFace},
url={https://huggingface.co/pharmkulen/pk-genesis-medgemma-run02}
}