lujangusface committed on
Commit 29cdd29 · verified · 1 Parent(s): 2fcf070

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -115,7 +115,9 @@ EAGLE3 trains a single-layer draft head that predicts the next token using hidde
 
 *The released model was fine-tuned for 6 additional epochs on 20K regenerated samples from the target model. The fine-tuned accuracy is expected to be equal or higher than these base values.*
 
-### Inference Benchmarks (B=1, temp=0, FP8, TP=4)
+### Inference Benchmarks (B=1, temp=0, TP=4)
+
+**With draft_tokens=8 (best B=1 config)**:
 
 | Dataset | Baseline (tok/s) | EAGLE3 (tok/s) | Speedup |
 |---------|-----------------|----------------|---------|
@@ -124,7 +126,18 @@ EAGLE3 trains a single-layer draft head that predicts the next token using hidde
 | SWEBench-Verified | 109.6 | 191.8 | **1.75x** |
 | Aider | 109.9 | 186.8 | **1.70x** |
 
-*Config: steps=3, topk=4, draft_tokens=8. All datasets at temp=0 on 8x H200 (TP=4).*
+*Config: steps=3, topk=4, draft_tokens=8. 8x H200 (TP=4).*
+
+**With draft_tokens=6 (standard config, verified 2026-04-12)**:
+
+| Dataset | Baseline (tok/s) | EAGLE3 (tok/s) | Speedup |
+|---------|-----------------|----------------|---------|
+| HumanEval | 109.6 | 158.0 | **1.44x** |
+| Terminal-Bench | 108.9 | 150.2 | **1.38x** |
+| MT-Bench | 109.0 | 143.6 | **1.32x** |
+| SWEBench-Verified | 109.1 | 116.5 | **1.07x** |
+
+*Config: steps=3, topk=4, draft_tokens=6. 4x H200 (TP=4). Server-side Prometheus metrics.*
 
 ## Model Architecture
 
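
The Speedup column in the tables above appears to be EAGLE3 throughput divided by baseline throughput, rounded to two decimals. A minimal Python sketch of that check follows; it is not part of the commit, the `speedup` helper and the dictionary layout are illustrative assumptions, and only the rows visible in this diff are included.

```python
# Throughput pairs (baseline tok/s, EAGLE3 tok/s) copied from the rows
# visible in the diff. The grouping keys are hypothetical labels.
RESULTS = {
    "draft_tokens=8": {
        "SWEBench-Verified": (109.6, 191.8),
        "Aider": (109.9, 186.8),
    },
    "draft_tokens=6": {
        "HumanEval": (109.6, 158.0),
        "Terminal-Bench": (108.9, 150.2),
        "MT-Bench": (109.0, 143.6),
        "SWEBench-Verified": (109.1, 116.5),
    },
}


def speedup(baseline_tps: float, eagle3_tps: float) -> float:
    """Return the throughput ratio EAGLE3 / baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return eagle3_tps / baseline_tps


def speedup_table(results: dict) -> dict:
    """Map each (config, dataset) pair to its speedup formatted like the README."""
    table = {}
    for config, rows in results.items():
        for dataset, (baseline_tps, eagle3_tps) in rows.items():
            table[(config, dataset)] = f"{speedup(baseline_tps, eagle3_tps):.2f}x"
    return table


if __name__ == "__main__":
    for (config, dataset), value in speedup_table(RESULTS).items():
        print(f"{config:>15} | {dataset:<18} | {value}")
```

Under this reading, the ratios recomputed from the listed throughputs match the Speedup values reported in both tables.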