jaepil committed on Apr 21

Commit dc5983c (verified) · 1 Parent(s): 37ef779

Upload tokenization_cognica_poe.py with huggingface_hub

Files changed (1)

tokenization_cognica_poe.py +212 -0

tokenization_cognica_poe.py ADDED

@@ -0,0 +1,212 @@

+"""Cognica-PoE tokenizer: HuggingFace `PreTrainedTokenizer` wrapper around the
+nanochat tiktoken BPE. Loaded by `AutoTokenizer.from_pretrained(...,
+trust_remote_code=True)`.
+The underlying encoding is pickled as `tokenizer.pkl` (a `tiktoken.Encoding`
+object). Special tokens are assigned to their tiktoken ids so the HF tokenize-
+around-specials flow and raw tiktoken encoding produce identical id sequences.
+"""
+import os
+import pickle
+from typing import Dict, List, Optional, Tuple
+import tiktoken
+from transformers import PreTrainedTokenizer
+from transformers.tokenization_utils import AddedToken
+SPECIAL_TOKENS = [
+    "<|bos|>",
+    "<|user_start|>",
+    "<|user_end|>",
+    "<|assistant_start|>",
+    "<|assistant_end|>",
+    "<|python_start|>",
+    "<|python_end|>",
+    "<|output_start|>",
+    "<|output_end|>",
+]
+class CognicaPoETokenizer(PreTrainedTokenizer):
+    """BPE tokenizer backed by a pickled `tiktoken.Encoding` (nanochat format)."""
+    vocab_files_names = {"vocab_file": "tokenizer.pkl"}
+    model_input_names = ["input_ids", "attention_mask"]
+    def __init__(
+        self,
+        vocab_file: str,
+        bos_token: str = "<|bos|>",
+        eos_token: str = "<|bos|>",
+        pad_token: Optional[str] = None,
+        unk_token: Optional[str] = None,
+        **kwargs,
+    ):
+        if not os.path.exists(vocab_file):
+            raise ValueError(
+                f"tokenizer.pkl not found at {vocab_file}. "
+                "Make sure it was downloaded from the model repo."
+            )
+        with open(vocab_file, "rb") as f:
+            enc = pickle.load(f)
+        if not isinstance(enc, tiktoken.Encoding):
+            raise TypeError(
+                f"Expected tiktoken.Encoding in {vocab_file}, got {type(enc).__name__}"
+            )
+        self.enc = enc
+        self._vocab_file = vocab_file
+        # Respect a pre-built added_tokens_decoder from tokenizer_config.json;
+        # otherwise synthesize one from the tiktoken special-tokens set.
+        added_decoder = kwargs.pop("added_tokens_decoder", None)
+        if not added_decoder:
+            added_decoder = {}
+            for tok in SPECIAL_TOKENS:
+                try:
+                    tid = enc.encode_single_token(tok)
+                except (KeyError, ValueError):
+                    continue
+                added_decoder[tid] = AddedToken(
+                    tok,
+                    lstrip=False,
+                    rstrip=False,
+                    single_word=False,
+                    normalized=False,
+                    special=True,
+                )
+        super().__init__(
+            bos_token=bos_token,
+            eos_token=eos_token,
+            pad_token=pad_token,
+            unk_token=unk_token,
+            added_tokens_decoder=added_decoder,
+            **kwargs,
+        )
+    @property
+    def vocab_size(self) -> int:
+        return self.enc.n_vocab
+    def get_vocab(self) -> Dict[str, int]:
+        vocab: Dict[str, int] = {}
+        for tid in range(self.enc.n_vocab):
+            try:
+                raw = self.enc.decode_single_token_bytes(tid)
+                token = raw.decode("utf-8", errors="replace")
+            except Exception:
+                token = f"<id_{tid}>"
+            vocab[token] = tid
+        for tok in SPECIAL_TOKENS:
+            try:
+                vocab[tok] = self.enc.encode_single_token(tok)
+            except (KeyError, ValueError):
+                pass
+        vocab.update(self.added_tokens_encoder)
+        return vocab
+    def _tokenize(self, text: str, **kwargs) -> List[str]:
+        # Base class splits around special tokens and calls _tokenize on the
+        # non-special chunks. We use tiktoken's ordinary encoder (which does
+        # not recognize specials), and return ids as strings.
+        ids = self.enc.encode_ordinary(text)
+        return [str(i) for i in ids]
+    def _convert_token_to_id(self, token: str) -> int:
+        try:
+            return int(token)
+        except ValueError:
+            try:
+                return self.enc.encode_single_token(token)
+            except (KeyError, ValueError) as e:
+                raise ValueError(f"Unknown token: {token!r}") from e
+    def _convert_id_to_token(self, index: int) -> str:
+        try:
+            raw = self.enc.decode_single_token_bytes(index)
+            return raw.decode("utf-8", errors="replace")
+        except Exception:
+            return str(index)
+    def convert_tokens_to_string(self, tokens: List[str]) -> str:
+        # The token list here is a mix of: (a) stringified integer ids from
+        # `_tokenize`, (b) UTF-8 text chunks from `_convert_id_to_token`, and
+        # (c) raw special-token literals from the added-tokens splitter. We
+        # can't disambiguate (a) from (b) by `int(tok)` alone — a token whose
+        # decoded UTF-8 text happens to be numeric (e.g. "17") would otherwise
+        # be mis-cast to the *id* 17 and decode to byte 0x11. Resolve each
+        # entry carefully and fall back to a literal UTF-8 re-encode.
+        ids: List[int] = []
+        for tok in tokens:
+            if tok in self.added_tokens_encoder:
+                ids.append(self.added_tokens_encoder[tok])
+                continue
+            try:
+                tid = int(tok)
+            except ValueError:
+                tid = None
+            if tid is not None and 0 <= tid < self.enc.n_vocab:
+                raw = self.enc.decode_single_token_bytes(tid)
+                if raw.decode("utf-8", errors="replace") == tok:
+                    ids.append(tid)
+                    continue
+            # Treat as a literal text fragment from `_convert_id_to_token`.
+            ids.extend(self.enc.encode_ordinary(tok))
+        return self.enc.decode(ids)
+    def _decode(
+        self,
+        token_ids,
+        skip_special_tokens: bool = False,
+        clean_up_tokenization_spaces: Optional[bool] = None,
+        spaces_between_special_tokens: bool = True,
+        **kwargs,
+    ) -> str:
+        # Bypass HF's token-string round-trip: decoding ids through
+        # `_convert_id_to_token` -> `convert_tokens_to_string` loses byte
+        # boundaries for multi-byte UTF-8 tokens and is fragile around
+        # numeric-looking tokens. Go directly through tiktoken.
+        if isinstance(token_ids, int):
+            token_ids = [token_ids]
+        elif hasattr(token_ids, "tolist"):
+            token_ids = token_ids.tolist()
+        token_ids = [int(t) for t in token_ids]
+        special_ids = set(self.all_special_ids)
+        if skip_special_tokens:
+            return self.enc.decode([t for t in token_ids if t not in special_ids])
+        if not special_ids.intersection(token_ids):
+            return self.enc.decode(token_ids)
+        # Emit specials literally (between non-special byte runs).
+        out_parts: List[str] = []
+        run: List[int] = []
+        id_to_special = {
+            self.added_tokens_encoder[t]: t for t in self.added_tokens_encoder
+        }
+        for tid in token_ids:
+            if tid in special_ids:
+                if run:
+                    out_parts.append(self.enc.decode(run))
+                    run = []
+                out_parts.append(id_to_special.get(tid, f"<|id_{tid}|>"))
+            else:
+                run.append(tid)
+        if run:
+            out_parts.append(self.enc.decode(run))
+        return "".join(out_parts)
+    def save_vocabulary(
+        self,
+        save_directory: str,
+        filename_prefix: Optional[str] = None,
+    ) -> Tuple[str, ...]:
+        os.makedirs(save_directory, exist_ok=True)
+        prefix = f"{filename_prefix}-" if filename_prefix else ""
+        out = os.path.join(save_directory, f"{prefix}tokenizer.pkl")
+        with open(out, "wb") as f:
+            pickle.dump(self.enc, f)
+        return (out,)
+__all__ = ["CognicaPoETokenizer"]