Add converted tokenizer (no trust_remote_code needed)
#12, opened by ArthurZ (HF Staff)
Tokenizer Conversion
This PR adds a converted tokenizer that works without trust_remote_code=True.
Conversion Details
- Converted using: python scripts/convert_tokenizer.py moonshotai/Kimi-Dev-72B --push-to-hub
- Original tokenizer type: Qwen2Tokenizer
- Converted tokenizer type: Qwen2Tokenizer
Validation Results
- Tested on XNLI dataset (500 samples)
- All samples match 1-1
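A parity check of this kind can be sketched as follows. This is a minimal illustration, not the script that produced the results above: `find_mismatches` and the `Encoder` alias are hypothetical names, and the sketch assumes each tokenizer is exposed as a callable that maps a string to a list of token ids (for example, the `encode` method of a loaded tokenizer).

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical alias: any callable that turns text into a list of token ids.
Encoder = Callable[[str], List[int]]


def find_mismatches(
    encode_ref: Encoder,
    encode_new: Encoder,
    texts: Iterable[str],
) -> List[Tuple[int, str]]:
    """Return (index, text) for every text the two encoders tokenize differently."""
    mismatches = []
    for i, text in enumerate(texts):
        # Compare the full id sequences; any difference counts as a mismatch.
        if list(encode_ref(text)) != list(encode_new(text)):
            mismatches.append((i, text))
    return mismatches


# Assumed usage with transformers (not executed here):
#   ref = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
#   new = AutoTokenizer.from_pretrained(repo_id)
#   assert find_mismatches(ref.encode, new.encode, sample_texts) == []
```

An empty result corresponds to the "all samples match 1-1" outcome reported above. Returning the mismatching texts rather than a boolean makes it easier to inspect which inputs diverge if the conversion is not exact.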
Usage
from transformers import AutoTokenizer
# Now works without trust_remote_code=True
tokenizer = AutoTokenizer.from_pretrained("moonshotai/Kimi-Dev-72B")
Converted with transformers tokenizer conversion script