alexmarques committed
Commit 0316f72 · verified · 1 parent: 35c3e29

Update README.md

Files changed (1)
  1. README.md +31 -0
README.md CHANGED
@@ -159,6 +159,37 @@ This version of the lm-evaluation-harness includes versions of MMLU, ARC-Challen
  <td><strong>Recovery</strong>
  </td>
  </tr>
+ <tr>
+ <td>MMLU (CoT, 0-shot)
+ </td>
+ <td>88.11
+ </td>
+ <td>88.23
+ </td>
+ <td>100.1%
+ </td>
+ </tr>
+ <tr>
+ <td>ARC Challenge (0-shot)
+ </td>
+ <td>94.97
+ </td>
+ <td>94.88
+ </td>
+ <td>99.9%
+ </td>
+ </tr>
+ <tr>
+ <tr>
+ <td>GSM-8K (CoT, 8-shot, strict-match)
+ </td>
+ <td>96.44
+ </td>
+ <td>96.13
+ </td>
+ <td>99.7%
+ </td>
+ </tr>
  <tr>
  <td>GSM-8K (CoT, 8-shot, strict-match)
  </td>
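
The Recovery values in the added rows are consistent with the second score expressed as a percentage of the first. The sketch below reproduces them under that assumption. The column headers fall outside this hunk, so treating the first score as the baseline and the second as the evaluated model is a guess, and the `recovery` helper is a hypothetical name, not part of the README or of lm-evaluation-harness.

```python
def recovery(baseline: float, candidate: float) -> float:
    """Return the candidate score as a percentage of the baseline score.

    Assumption: "Recovery" in the table means candidate / baseline * 100.
    """
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    return 100.0 * candidate / baseline


# (benchmark, first score column, second score column), copied from the added rows.
rows = [
    ("MMLU (CoT, 0-shot)", 88.11, 88.23),
    ("ARC Challenge (0-shot)", 94.97, 94.88),
    ("GSM-8K (CoT, 8-shot, strict-match)", 96.44, 96.13),
]

# Format to one decimal place, matching the precision used in the table.
formatted = [f"{recovery(first, second):.1f}%" for _, first, second in rows]

for (name, _, _), value in zip(rows, formatted):
    print(f"{name}: {value}")
```

Rounded to one decimal place, this yields 100.1%, 99.9%, and 99.7%, which match the values committed in the table.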