Libraries

How to use ibivibiv/chimera-120b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ibivibiv/chimera-120b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ibivibiv/chimera-120b")
model = AutoModelForCausalLM.from_pretrained("ibivibiv/chimera-120b", device_map="auto")

vLLM

How to use ibivibiv/chimera-120b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ibivibiv/chimera-120b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibivibiv/chimera-120b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
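
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is reachable at `http://localhost:8000`; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the first generated text (needs a running server)."""
    request = build_completion_request(base_url, model, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call, commented out because it requires the server to be up:
# print(complete("http://localhost:8000", "ibivibiv/chimera-120b", "Once upon a time,"))
```

Since the SGLang server also exposes an OpenAI-compatible completions route, pointing `base_url` at `http://localhost:30000` should work the same way.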

Use Docker

docker model run hf.co/ibivibiv/chimera-120b

SGLang

How to use ibivibiv/chimera-120b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ibivibiv/chimera-120b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibivibiv/chimera-120b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ibivibiv/chimera-120b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibivibiv/chimera-120b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Chimera 120b

An auto-regressive causal LM created by combining three fine-tuned models into one via passthrough merging of slices in a stacked order. This model is a sliced passthrough merge of Sheep Duck, Xwin, and Synthia. I wanted to make Sheep Duck part of Giant Macaroni, but the Macaroni and Sheep Duck models don't line up (I think because of the rotary embedding). Honestly, even without any fine-tuning yet, I think this model might be slightly better at my logic and reasoning tests. I'll try to update more here as I go and release more versions as I fine-tune these further.
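
To make the idea of a stacked passthrough merge concrete, here is a toy sketch. It only tracks which layer of which source model ends up at each position in the merged stack. The model labels (`model_a`, `model_b`, `model_c`) and every layer range are hypothetical placeholders, not the actual Chimera recipe, and real merges are done with mergekit rather than code like this.

```python
def stack_slices(slices):
    """Concatenate layer ranges in order; layers are copied unchanged (passthrough)."""
    stacked = []
    for model_name, start, end in slices:
        # `end` is exclusive, like Python's range()
        stacked.extend((model_name, layer) for layer in range(start, end))
    return stacked


# Hypothetical slice plan: overlapping ranges from three source models,
# stacked one after another. These numbers are made up for illustration.
example_slices = [
    ("model_a", 0, 16),
    ("model_b", 8, 24),
    ("model_c", 16, 32),
    ("model_b", 24, 40),
]

merged = stack_slices(example_slices)
print(len(merged))             # total depth of the merged stack
print(merged[15], merged[16])  # boundary between the first two slices
```

Because the slices overlap, the merged stack ends up deeper than any single source model, which is how a merge of smaller models can reach a larger parameter count.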

Prompting Format

Both Vicuna and Alpaca will work, but because the final layers belong primarily to Xwin, Vicuna is expected to work best.
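
As a rough illustration of the two formats, the sketch below builds a single-turn prompt in each style. The system and header strings are the commonly circulated Vicuna v1.1 and Alpaca defaults and are an assumption here; this card does not specify exact wording, and the function names are illustrative.

```python
# Commonly used default preambles (assumed, not taken from this model card).
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
ALPACA_HEADER = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def vicuna_prompt(user_message, system=VICUNA_SYSTEM):
    """Single-turn Vicuna-style prompt ending where the model should answer."""
    return f"{system} USER: {user_message} ASSISTANT:"


def alpaca_prompt(instruction, header=ALPACA_HEADER):
    """Single-turn Alpaca-style prompt without an input section."""
    return f"{header}\n\n### Instruction:\n{instruction}\n\n### Response:\n"


print(vicuna_prompt("Name three prime numbers."))
print(alpaca_prompt("Name three prime numbers."))
```

Either string can be passed as the `prompt` field of a completions request.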

Benchmarks

Coming soon.

Acknowledgements

@chargoddard - for mergekit.

@migtissera - for Tess-XL, which inspired me to believe that open models can compete on logic tasks with the big commercial models.

@alpindale - for Goliath-120B, which started this crazy endeavor for us all.

@nsfwthrowitaway69 - for sharing the merge config for Venus-120B and getting me off the starting block with some questions on mergekit and tokenizers.

Keep it open and keep sharing, everyone! With Mixtral and the MoE changes to mergekit, coupled with these larger merged models, I think the sky is the limit for us all. I can only imagine what would happen if we took a group of these 120B models, fine-tuned each of them a bit, and applied the Mixtral MoE merge method to them. I would also point out that if a clever VC came along and funded that work, the people you need are right here on Hugging Face, and all they need is the equipment to do it on.

Safetensors

Model size: 120B params

Tensor type: F32
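
A back-of-the-envelope estimate of what these figures mean for memory: the sketch below multiplies the parameter count by the bytes per parameter for a few storage types. It treats "120B" as exactly 120e9 parameters, which is a rounding assumption, and it counts weights only (no KV cache, activations, or framework overhead). The helper name is illustrative.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2, "INT8": 1}


def weight_footprint_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("F32", "BF16", "INT8"):
    print(dtype, weight_footprint_gb(120e9, dtype), "GB")
```

At F32 this comes to roughly 480 GB for the weights alone, which is why loading in a 16-bit type or using a quantized variant is usually the practical route for a model of this size.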

Model tree for ibivibiv/chimera-120b

Quantizations: 2 models

Collection including ibivibiv/chimera-120b

Experimental Models: these are either first attempts at a specific goal or just a shot in the dark to try something new (11 items, updated Feb 29, 2024).