Update model card: merge full GGUF card + lalatendu card into unified README
**phi3-sysadmin-lalatendu** is a domain-specialized model based on [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct), fine-tuned using **QLoRA (SFT with LoRA)** via [Unsloth](https://github.com/unslothai/unsloth) for Linux system administration and DevOps tasks. This repository provides the **GGUF (Q4_K_M)** quantized model, ready for local inference via [Ollama](https://ollama.com).

- **Developed by:** [Lalatendu Keshari Swain](https://lalatendu.info)
- **Model type:** Causal Language Model (GGUF quantized)
- **Language(s):** English
- **License:** MIT
- **Base model:** [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) (3.8B parameters)
- **Fine-tuning method:** QLoRA (Supervised Fine-Tuning with LoRA)
- **Quantization:** q4_k_m (4-bit, ~2.3 GB)

> **Disclaimer:** This model is provided for educational and productivity purposes only. We take no responsibility for the accuracy or completeness of its outputs. Commands and configurations suggested by this model should always be verified by a qualified system administrator before being applied to any production system. Use it at your own risk.

---

## Model Sources

- **LoRA Adapter:** [lalatendu/phi3-sysadmin-lora](https://huggingface.co/lalatendu/phi3-sysadmin-lora)
- **GitHub:** [github.com/lalatenduswain](https://github.com/lalatenduswain)
- **Blog:** [blog.lalatendu.info](https://blog.lalatendu.info)

---

## Training Process

This model was trained using a single-stage SFT process:

### Step 1: SFT (Supervised Fine-Tuning)

- **Dataset:** 1,026 curated sysadmin and DevOps Q&A examples in ChatML JSONL format
- **Format:** `system` / `user` / `assistant` turns
- **Topics:** Linux administration, AWS, Docker, Kubernetes, Terraform, Ansible, Nginx, databases, networking, security, monitoring, backup
- **Objective:** To specialize the Phi-3 Mini model in answering practical server management and troubleshooting questions accurately and concisely.

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Base model quantization | 4-bit (bnb-4bit) |
| LoRA rank (r) | 64 |
| LoRA alpha | 128 |
| LoRA target modules | Attention and MLP layers |
| Trainable parameters | ~119M (5.62% of total) |
| Epochs | 3–5 |
| Batch size | 8 |
| Learning rate | 2e-4 |
| Optimizer | AdamW (8-bit) |
| Warmup steps | 5 |
| Weight decay | 0.01 |
| LR scheduler | Linear |
| Training time | ~6 minutes |
| GPU | NVIDIA T4 (Google Colab free tier) |
| Final training loss | ~0.5–0.8 |

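For orientation, the optimizer step count implied by the table can be estimated with a quick calculation (a sketch; it assumes the listed batch size of 8 is the effective batch size, with no gradient accumulation on top):

```python
import math

# Estimate optimizer steps from the card's numbers.
# Assumption (not stated in the card): batch size 8 is the effective
# batch size, i.e. no additional gradient accumulation.
dataset_size = 1026
batch_size = 8

steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps_3_epochs = 3 * steps_per_epoch
total_steps_5_epochs = 5 * steps_per_epoch

print(steps_per_epoch)        # 129
print(total_steps_3_epochs)   # 387
print(total_steps_5_epochs)   # 645
```

A few hundred optimizer steps is consistent with the ~6-minute wall-clock training time on a T4.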
### GGUF Export

- **Quantization method:** q4_k_m via llama.cpp
- **File size:** ~2.3 GB
- **Export tool:** Unsloth's built-in GGUF exporter

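As a sanity check on the listed file size, a back-of-envelope estimate (the ~4.85 effective bits per weight for q4_k_m, covering 4-bit weights plus k-quant scales and metadata, is an assumed figure, not a value from this card):

```python
# Rough GGUF file-size estimate for a 3.8B-parameter model at q4_k_m.
# The 4.85 bits/weight figure is an assumed effective rate (4-bit weights
# plus quantization scales/metadata), not a value from this card.
params = 3.8e9
bits_per_weight = 4.85

size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB")
```

This lands at roughly 2.3 GB, matching the listed file size.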
---

## How to Get Started

### Option 1: Ollama (Recommended)

```bash
# 1. Install Ollama (if not already installed)
curl -fsSL https://ollama.com/install.sh | sh

# 2. Download phi3-sysadmin-Q4_K_M.gguf and Modelfile from this repo

# 3. Create the model
ollama create phi3-sysadmin -f Modelfile

# 4. Run interactively
ollama run phi3-sysadmin
```

**Example queries:**

```bash
ollama run phi3-sysadmin "How do I find what's consuming disk space?"
ollama run phi3-sysadmin "How do I set up Nginx reverse proxy with SSL?"
ollama run phi3-sysadmin "How do I troubleshoot high CPU usage?"
ollama run phi3-sysadmin "How do I create a Kubernetes deployment?"
```

**API usage:**

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "phi3-sysadmin",
  "prompt": "How do I check which process is using port 8080?",
  "stream": false
}'
```
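The same endpoint can be called from Python using only the standard library (a sketch mirroring the curl call above; it assumes a local Ollama server on the default port, and `build_payload`/`ask` are illustrative helper names, not part of any API):

```python
import json
import urllib.request

def build_payload(prompt, model="phi3-sysadmin", stream=False):
    """JSON body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})

def ask(prompt, host="http://localhost:11434"):
    """Send a non-streaming generate request to a local Ollama server."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=build_payload(prompt).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": false, Ollama returns a single JSON object whose
        # "response" field holds the full completion.
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
# print(ask("How do I check which process is using port 8080?"))
```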

```python
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Modelfile Contents

```
FROM ./phi3-sysadmin-Q4_K_M.gguf

TEMPLATE """<|system|>
{{ .System }}<|end|>
<|user|>
{{ .Prompt }}<|end|>
<|assistant|>
{{ .Response }}<|end|>
"""

SYSTEM """You are phi3-sysadmin, a fine-tuned AI assistant created by Lalatendu Keshari Swain. Provide clear, practical answers for server management and troubleshooting."""

PARAMETER stop <|end|>
PARAMETER stop <|user|>
PARAMETER stop <|assistant|>
PARAMETER stop <|endoftext|>
PARAMETER temperature 0.7
PARAMETER top_p 0.9
```

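For clients that bypass Ollama's templating (for example, llama.cpp used directly), the TEMPLATE above can be reproduced in plain Python (a sketch; `render_phi3_prompt` is a hypothetical helper, not part of any library):

```python
def render_phi3_prompt(system: str, user: str, response: str = "") -> str:
    """Mirror the Phi-3 chat template used in the Modelfile above.

    For generation, leave `response` empty: the model continues
    from the <|assistant|> tag and stops at <|end|>.
    """
    return (
        f"<|system|>\n{system}<|end|>\n"
        f"<|user|>\n{user}<|end|>\n"
        f"<|assistant|>\n{response}"
    )

prompt = render_phi3_prompt(
    "You are phi3-sysadmin, a helpful sysadmin assistant.",
    "How do I list open ports?",
)
print(prompt)
```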
---

## Use Cases

| Supported | Not Supported |
|-----------|--------------|
| Linux administration (disk, CPU, memory, processes, users, filesystems, systemd) | General-purpose conversation or creative writing |
| Cloud platforms (AWS, Azure, GCP) | Medical, legal, or financial advice |
| Containers (Docker, Kubernetes, Podman, Docker Swarm) | Non-English language tasks |
| CI/CD (Jenkins, GitHub Actions, ArgoCD) | Real-time data or internet access |
| IaC (Terraform, Ansible, Packer) | Unauthorized penetration testing or malicious use |
| Web servers (Nginx, Apache, Varnish) | |
| Databases (MySQL, PostgreSQL, Redis, MongoDB, Elasticsearch) | |
| Networking (DNS, firewalls, load balancing, VPN, TCP/IP, MTU) | |
| Security (SSL/TLS, SELinux, AppArmor, mTLS, vulnerability scanning) | |
| Monitoring (Prometheus, Grafana, Zabbix, node_exporter, ELK) | |
| Backup (BorgBackup, Restic, snapshots, disaster recovery) | |
| Bash/Shell scripting assistance | |

---

## Bias, Risks, and Limitations

- **Small model (3.8B):** May occasionally hallucinate or produce inaccurate commands. Always verify before running on production servers.
- **Training data scope:** 1,026 examples cover common sysadmin topics. Niche or cutting-edge tooling may not be well represented.
- **English only:** All responses are in English.
- **No real-time access:** Cannot check current documentation, package versions, or live system state.
- **Outdated information:** Package names, versions, and best practices evolve; cross-reference with official docs.

**Recommendations:**

- Always verify commands before running on production systems
- Cross-reference with official documentation for critical configurations
- Use as a learning aid and quick reference, not as the sole authority
- Do not use for security-critical decisions without expert verification

---

## Training Data

The model was fine-tuned on 1,026 curated sysadmin Q&A pairs covering:

- Linux administration (disk, CPU, memory, processes, users, filesystems)
- Cloud platforms (AWS EC2, S3, VPC, IAM, RDS, CloudWatch, Lambda, EKS)
- Security (SSL/TLS, SELinux, AppArmor, vulnerability scanning)
- Monitoring (Prometheus, Grafana, Zabbix, ELK)
- Backup (BorgBackup, Restic, snapshots)
- Model identity, creator information, and boundary/refusal examples

---

## Evaluation

- **Testing:** Manual evaluation with diverse sysadmin questions
- **Training loss:** Final loss of ~0.5–0.8
- **Qualitative assessment:** Responses checked for accuracy, practicality, and completeness

**Results:**

- Provides accurate, practical answers for common sysadmin and DevOps tasks
- Correctly identifies itself as phi3-sysadmin created by Lalatendu Keshari Swain
- Appropriately refuses off-topic, harmful, and out-of-scope requests
- Handles variations in question phrasing well

---

## Environmental Impact

| Item | Value |
|------|-------|
| Hardware | NVIDIA T4 GPU (16GB VRAM) |
| Training duration | ~6 minutes (~0.1 hours) |
| Cloud provider | Google Colab (free tier) |
| Compute region | Variable (Google Colab assigned) |
| Estimated CO₂ | ~0.01 kg CO₂eq |

---

## Technical Specifications

- **Architecture:** Phi-3 Mini transformer decoder-only (3.8B parameters)
- **Objective:** Causal language modeling, fine-tuned for the sysadmin domain
- **Context length:** 4096 tokens
- **Chat format:** Phi-3 template with `<|system|>`, `<|user|>`, `<|assistant|>`, `<|end|>` tokens
- **Inference runtime:** Ollama (minimum 4GB RAM)
- **Inference speed (CPU):** ~10–20 tokens/sec
- **Inference speed (GPU):** ~40–80 tokens/sec

**Software stack:**

- Training: Unsloth + Hugging Face Transformers + PEFT 0.18.1 + PyTorch 2.x
- Quantization: Unsloth GGUF exporter (llama.cpp based, q4_k_m)
- Inference: Ollama

---

## Files in This Repository

| File | Size | Description |
|------|------|-------------|
| `phi3-sysadmin-Q4_K_M.gguf` | ~2.3 GB | Quantized GGUF model for Ollama / llama.cpp |
| `Modelfile` | ~0.4 KB | Ollama model configuration |
| `phi3_finetune.ipynb` | ~60 KB | Full QLoRA training notebook (Google Colab) |

---

## Related Repositories

- **LoRA Adapter + Training Data:** [lalatendu/phi3-sysadmin-lora](https://huggingface.co/lalatendu/phi3-sysadmin-lora)
- **Base Model:** [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct)

---

## Citation

```bibtex
}
```

**APA:**

Swain, L. K. (2026). *phi3-sysadmin-lalatendu: A Fine-tuned Phi-3 Mini GGUF Model for Linux System Administration*. HuggingFace. https://huggingface.co/lalatendu/phi3-sysadmin-lalatendu

---

## Model Card Authors

[Lalatendu Keshari Swain](https://lalatendu.info)

---

## Contact

| Channel | Link |