AnerysAI LLM v0.1

An Indonesian-English bilingual language model built on a custom PyTorch transformer architecture.

Model Description

AnerysAI LLM is a bilingual language model supporting both English and Indonesian languages. It features:

  • Chain-of-Thought Reasoning: Step-by-step reasoning capabilities
  • Search-Augmented Generation: Web search integration for enhanced responses
  • Gradient Accumulation Training: Large effective batch sizes for better convergence
  • Custom Architecture: Optimized transformer implementation

Model Details

  • Model Type: Custom Transformer-based Causal Language Model
  • Architecture: Multi-head attention with feed-forward networks (a minimal sketch follows this list)
  • Languages: English and Indonesian (bilingual support)
  • Training Data: High-quality structured data covering AI, technology, and general knowledge
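
For orientation, here is a minimal sketch of one such decoder block. This is not the project's actual implementation; it assumes a standard pre-norm causal block built from the hyperparameters in the configuration table below, and src/model.py may differ in details such as normalization placement or activation function.

import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Illustrative causal self-attention + feed-forward block (assumed design)."""

    def __init__(self, embedding_dim=512, num_heads=8, ffn_dim=3072, dropout_rate=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            embedding_dim, num_heads, dropout=dropout_rate, batch_first=True
        )
        self.ffn = nn.Sequential(
            nn.Linear(embedding_dim, ffn_dim),
            nn.GELU(),
            nn.Linear(ffn_dim, embedding_dim),
        )
        self.norm1 = nn.LayerNorm(embedding_dim)
        self.norm2 = nn.LayerNorm(embedding_dim)
        self.dropout = nn.Dropout(dropout_rate)

    def forward(self, x):
        # Boolean causal mask: True entries are blocked (future positions)
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + self.dropout(attn_out)
        x = x + self.dropout(self.ffn(self.norm2(x)))
        return x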

Installation

pip install torch transformers huggingface-hub
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1
pip install -r requirements.txt

Usage

Load and Use the Model

import json
import torch
from huggingface_hub import hf_hub_download
from src.model import TransformerModel
from src.config import ModelConfig
from src.tokenizer import BaseTokenizer

# Download model files
model_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "final_model.pt")
tokenizer_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "tokenizer.json")
config_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "config.json")

# Load configuration
with open(config_path, 'r') as f:
    config_data = json.load(f)

model_config = config_data["model_config"]

# Create model
config = ModelConfig(
    vocab_size=model_config["vocab_size"],
    max_sequence_length=model_config["max_sequence_length"],
    embedding_dim=model_config["embedding_dim"],
    num_heads=model_config["num_heads"],
    num_layers=model_config["num_layers"],
    ffn_dim=model_config["ffn_dim"],
    dropout_rate=model_config["dropout_rate"],
)

model = TransformerModel(config)
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()

# Load tokenizer
tokenizer = BaseTokenizer(vocab_size=model_config["vocab_size"])
tokenizer.load(tokenizer_path)

# Generate text
prompt = "What is artificial intelligence?"
tokens = tokenizer.encode(prompt, language="en")
# Add generation logic here
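
The generation step is left open above because the model's exact forward signature isn't documented in this card. A minimal greedy-decoding sketch, assuming the model maps a (batch, seq_len) tensor of token IDs to (batch, seq_len, vocab_size) logits and the tokenizer exposes a matching decode method:

# Hypothetical greedy decoding loop; adapt to the real forward() signature.
max_new_tokens = 50
input_ids = torch.tensor([tokens], dtype=torch.long)

with torch.no_grad():
    for _ in range(max_new_tokens):
        logits = model(input_ids)                # assumed shape: (1, seq_len, vocab_size)
        next_id = logits[0, -1].argmax().item()  # most likely next token
        input_ids = torch.cat(
            [input_ids, torch.tensor([[next_id]], dtype=torch.long)], dim=1
        )
        # In practice, also stop at an end-of-sequence token (if defined)
        # and cap seq_len at max_sequence_length (256).

print(tokenizer.decode(input_ids[0].tolist()))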

Interactive Usage

# Clone the repository
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1

# Download model from Hugging Face
python -c "
from huggingface_hub import snapshot_download
snapshot_download(repo_id='Anerysrynz/anerysai-llm-v0.1', local_dir='checkpoints')
"

# Run inference
python inference.py --device cpu --interactive

Model Configuration

Parameter                 Value
Vocabulary Size           1,081
Max Sequence Length       256
Embedding Dimension       512
Number of Heads           8
Number of Layers          12
Feed-forward Dimension    3,072
Dropout Rate              0.1
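
As a rough sanity check against the size quoted under Performance, these hyperparameters imply roughly 50M parameters, assuming a standard decoder stack with learned positional embeddings and an untied output head (the actual TransformerModel may differ):

# Back-of-the-envelope parameter count from the table above.
vocab, seq, d, layers, ffn = 1_081, 256, 512, 12, 3_072

embeddings = vocab * d + seq * d           # token + learned positional embeddings
attention = 4 * (d * d + d)                # Q, K, V, output projections (+ biases)
feedforward = d * ffn + ffn + ffn * d + d  # two linear layers (+ biases)
layer_norms = 2 * 2 * d                    # two LayerNorms per block (scale + shift)
per_layer = attention + feedforward + layer_norms

total = embeddings + layers * per_layer + d * vocab  # last term: untied output head
print(f"{total / 1e6:.1f}M parameters")              # -> 51.7M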

Training Details

  • Physical Batch Size: 8
  • Effective Batch Size: 9,000 (via gradient accumulation; see the sketch after this list)
  • Gradient Accumulation Steps: 1,125
  • Learning Rate: 0.0018 (auto-scaled)
  • Training Approach: Custom implementation with label smoothing
  • Data: Bilingual structured dataset
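
For readers unfamiliar with the technique, gradient accumulation reaches the large effective batch by summing gradients over many small forward/backward passes before each optimizer step. A generic sketch using the numbers above (not the project's actual training loop; the optimizer choice, label-smoothing value, and train_loader are placeholders):

import torch

accumulation_steps = 1_125  # 8 physical x 1,125 steps = 9,000 effective batch size
optimizer = torch.optim.AdamW(model.parameters(), lr=1.8e-3)
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)  # smoothing value assumed

model.train()
optimizer.zero_grad()
for step, (inputs, targets) in enumerate(train_loader):  # batches of 8 sequences
    logits = model(inputs)
    loss = loss_fn(logits.view(-1, logits.size(-1)), targets.view(-1))
    (loss / accumulation_steps).backward()  # scale so accumulated grads average out
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()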

Features

Chain-of-Thought Reasoning

from src.inference import TextGenerator

# Reuses the model and tokenizer loaded in the Usage section
generator = TextGenerator(model, tokenizer)
response = generator.think_and_generate(
    "Why does the sky appear blue?",
    num_steps=3  # number of intermediate reasoning steps
)

Search-Augmented Generation

# Augments the answer with web search results before generating
response = generator.search_and_generate(
    "What is the capital of France?"
)

Performance

  • Model Size: ~50M parameters
  • Training Time: Hardware-dependent
  • Inference Speed: Responsive on both CPU and GPU at this model size
  • Memory Usage: Modest; ~50M parameters fit comfortably on most consumer GPUs

Limitations

  • Requires custom loading code (not standard HF format)
  • At ~50M parameters, capacity is limited compared to larger models
  • May not generalize as well as models trained on massive datasets
  • Requires specific dependencies and setup

Ethical Considerations

This model is intended for research and educational purposes. Users should:

  • Be aware of potential biases in training data
  • Use the model responsibly
  • Not deploy in critical applications without thorough testing
  • Respect copyright and intellectual property laws

Training Data

The model was trained on a curated dataset including:

  • AI and machine learning concepts
  • Computer science fundamentals
  • Technology trends and developments
  • Bilingual Q&A pairs
  • Conversational scenarios

Citation

@misc{anerysai-llm-2026,
  title={AnerysAI LLM v0.1: Bilingual Language Model with Reasoning Capabilities},
  author={AneryzRynz Studios},
  year={2026},
  url={https://huggingface.co/Anerysrynz/anerysai-llm-v0.1}
}

License

MIT License - see the LICENSE file in the original repository for details.

Contact

For questions or issues, please visit the GitHub repository.


This model was created for educational and research purposes. Use responsibly.
