LoRA adapter weights for Purple Squirrel R1 Multichain, fine-tuned on 58 conference sessions from Wrapped Events covering cross-chain protocols, DeFi infrastructure, and Web3 technology.
Use these adapters to apply the multichain fine-tuning to the base model yourself, or continue training with your own data.
| Property | Value |
|---|---|
| Base Model | DeepSeek-R1-Distill-Llama-8B (4-bit) |
| Method | LoRA (Low-Rank Adaptation) |
| Rank | 8 |
| Scale | 20.0 |
| Dropout | 0.0 |
| LoRA Layers | 4 |
| Trainable Params | 2.621M / 8,030M (0.033%) |
| Framework | MLX-LM 0.29.1 |
| Adapter Size | ~10 MB |
| Hardware | Apple M-series (16GB RAM) |
| Peak Memory | 6.184 GB |
```yaml
framework: mlx-lm 0.29.1
method: LoRA
lora_layers: 4
lora_rank: 8
learning_rate: 1e-5
batch_size: 1
iterations: 200
max_seq_length: 1024
grad_checkpoint: true
save_every: 100
seed: 42
```
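The configuration above implies a modest token budget. As a rough sketch (an upper bound, since it assumes every training sequence fills `max_seq_length`):

```python
# Back-of-envelope token budget for this run, using the config values above.
# Upper bound only: assumes every sequence is padded/filled to max_seq_length.
batch_size = 1
iterations = 200
max_seq_length = 1024

max_tokens_seen = batch_size * iterations * max_seq_length
print(f"Upper bound on training tokens: {max_tokens_seen:,}")  # 204,800
```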
| Iteration | Train Loss | Val Loss | Improvement |
|---|---|---|---|
| 0 | — | 3.799 | baseline |
| 50 | 3.202 | 3.241 | -14.7% |
| 100 | 3.056 | 3.126 | -17.7% |
| 150 | 3.140 | 3.098 | -18.5% |
| 200 | 3.083 | 3.091 | -18.6% |
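The Improvement column is the percent change in validation loss relative to the iteration-0 baseline of 3.799; it can be recomputed directly from the table:

```python
# Recompute the Improvement column: percent change in validation loss
# relative to the iteration-0 baseline (values from the table above).
baseline = 3.799
val_losses = {50: 3.241, 100: 3.126, 150: 3.098, 200: 3.091}

for step, loss in val_losses.items():
    improvement = (loss - baseline) / baseline * 100
    print(f"iter {step}: {improvement:.1f}%")  # -14.7, -17.7, -18.5, -18.6
```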
```
├── adapters.safetensors             # Final adapter weights (iteration 200)
├── adapter_config.json              # Training config & hyperparameters
└── checkpoints/
    ├── 0000100_adapters.safetensors # Checkpoint at iteration 100
    └── 0000200_adapters.safetensors # Checkpoint at iteration 200
```
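The checkpoint filenames encode the training iteration as a zero-padded prefix. A minimal sketch of recovering it (the helper name is illustrative, not part of MLX-LM):

```python
# Hypothetical helper: parse the iteration number from a checkpoint
# filename such as '0000100_adapters.safetensors'.
def checkpoint_iteration(filename: str) -> int:
    """Return the zero-padded iteration prefix as an integer."""
    return int(filename.split("_")[0])

for name in ["0000100_adapters.safetensors", "0000200_adapters.safetensors"]:
    print(name, "->", checkpoint_iteration(name))  # 100, 200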
```python
from mlx_lm import load, generate

# Load the 4-bit base model with the LoRA adapters applied
model, tokenizer = load(
    "mlx-community/DeepSeek-R1-Distill-Llama-8B-4bit",
    adapter_path="purplesquirrelnetworks/purple-squirrel-r1-multichain-lora",
)

messages = [
    {"role": "system", "content": "You are a multichain ecosystem expert."},
    {"role": "user", "content": "How does Wormhole enable cross-chain messaging?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```
```bash
mlx_lm.lora \
  --model mlx-community/DeepSeek-R1-Distill-Llama-8B-4bit \
  --resume-adapter-file purplesquirrelnetworks/purple-squirrel-r1-multichain-lora/adapters.safetensors \
  --data /path/to/your/data \
  --iters 100
```
Protocols covered: Wormhole, LayerZero, ZetaChain, Compose Network, Aptos, Monad, NEAR, Polygon, Stacks, Aurora, Pyth, 1inch, Beefy, Relay, Pipe Network, DoubleZero, BitcoinOS.
Topics: cross-chain messaging, L1/L2 ecosystems, DeFi infrastructure, onchain AI agents, RWA tokenization, account abstraction, sustainable yield.
| Resource | Link |
|---|---|
| Full Fused Model | purple-squirrel-r1-multichain |
| Training Data | multichain-day-training |
| Base Model (R1) | purple-squirrel-r1 |
| GGUF Version | purple-squirrel-r1-gguf |
| AIDP Neural Cloud Paper | aidp-neural-cloud-paper |
| Full Collection | Purple Squirrel AI |
```bibtex
@misc{purplesquirrel-r1-multichain-lora-2025,
  title={Purple Squirrel R1 Multichain LoRA Adapters},
  author={Karsten, Matthew},
  year={2025},
  publisher={Purple Squirrel Media},
  howpublished={\url{https://huggingface.co/purplesquirrelnetworks/purple-squirrel-r1-multichain-lora}},
  note={MLX LoRA adapters for DeepSeek-R1-Distill-Llama-8B, fine-tuned on Wrapped Events multichain conference data}
}
```
License: MIT

Base model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B (4-bit quantized)