# AnerysAI LLM v0.1
A comprehensive Indonesian-English bilingual language model built with custom PyTorch transformer architecture.
## Model Description
AnerysAI LLM is a bilingual language model supporting both English and Indonesian languages. It features:
- Chain-of-Thought Reasoning: Step-by-step reasoning capabilities
- Search-Augmented Generation: Web search integration for enhanced responses
- Gradient Accumulation Training: Large effective batch sizes for better convergence
- Custom Architecture: Optimized transformer implementation
## Model Details
- Model Type: Custom Transformer-based Causal Language Model
- Architecture: Multi-head attention with feed-forward networks
- Languages: English and Indonesian (bilingual support)
- Training Data: High-quality structured data covering AI, technology, and general knowledge
## Installation

```bash
pip install torch transformers huggingface-hub
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1
pip install -r requirements.txt
```
## Usage

### Load and Use the Model

```python
import json

import torch
from huggingface_hub import hf_hub_download

from src.model import TransformerModel
from src.config import ModelConfig
from src.tokenizer import BaseTokenizer

# Download model files
model_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "final_model.pt")
tokenizer_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "tokenizer.json")
config_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "config.json")

# Load configuration
with open(config_path, "r") as f:
    config_data = json.load(f)
model_config = config_data["model_config"]

# Create model
config = ModelConfig(
    vocab_size=model_config["vocab_size"],
    max_sequence_length=model_config["max_sequence_length"],
    embedding_dim=model_config["embedding_dim"],
    num_heads=model_config["num_heads"],
    num_layers=model_config["num_layers"],
    ffn_dim=model_config["ffn_dim"],
    dropout_rate=model_config["dropout_rate"],
)
model = TransformerModel(config)
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()

# Load tokenizer
tokenizer = BaseTokenizer(vocab_size=model_config["vocab_size"])
tokenizer.load(tokenizer_path)

# Generate text
prompt = "What is artificial intelligence?"
tokens = tokenizer.encode(prompt, language="en")
# Add generation logic here
```
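The generation step left as a placeholder above can be filled in with a simple greedy-decoding loop. The sketch below is illustrative, not code from this repository: it assumes the model is a callable returning logits of shape `(batch, seq_len, vocab_size)`, and uses a random stand-in model so the snippet is self-contained. Check `src/model.py` and `src/inference.py` for the actual forward signature and the project's own generation utilities.

```python
import torch

# Stand-in for TransformerModel: any callable mapping token ids of shape
# (batch, seq_len) to logits of shape (batch, seq_len, vocab_size).
vocab_size = 1081
model = lambda ids: torch.randn(ids.shape[0], ids.shape[1], vocab_size)

def greedy_generate(model, token_ids, max_new_tokens=32, eos_id=None):
    """Append the argmax token one step at a time (greedy decoding)."""
    ids = torch.tensor([token_ids], dtype=torch.long)
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids)
        next_id = int(logits[0, -1].argmax())  # most likely next token
        ids = torch.cat([ids, torch.tensor([[next_id]])], dim=1)
        if eos_id is not None and next_id == eos_id:
            break
    return ids[0].tolist()

out = greedy_generate(model, [1, 2, 3], max_new_tokens=5)
```

Sampling strategies (temperature, top-k, top-p) would replace the `argmax` line; greedy decoding is just the simplest complete loop.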
### Interactive Usage

```bash
# Clone the repository
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1

# Download model from Hugging Face
python -c "
from huggingface_hub import snapshot_download
snapshot_download(repo_id='Anerysrynz/anerysai-llm-v0.1', local_dir='checkpoints')
"

# Run inference
python inference.py --device cpu --interactive
```
## Model Configuration
| Parameter | Value |
|---|---|
| Vocabulary Size | 1,081 |
| Max Sequence Length | 256 |
| Embedding Dimension | 512 |
| Number of Heads | 8 |
| Number of Layers | 12 |
| Feed-forward Dimension | 3,072 |
| Dropout Rate | 0.1 |
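The configuration above is enough for a rough, hand-computed parameter count. The breakdown below is a back-of-the-envelope sketch assuming a standard transformer layout (token and positional embeddings, four attention projections and a two-matrix FFN per layer), ignoring biases, layer norms, and any tied output head; the exact count depends on the implementation in `src/model.py`.

```python
# Approximate parameter count from the configuration table.
vocab_size = 1081
max_seq_len = 256
d_model = 512
n_layers = 12
d_ffn = 3072

embeddings = vocab_size * d_model + max_seq_len * d_model  # token + positional
attention_per_layer = 4 * d_model * d_model   # Q, K, V, and output projections
ffn_per_layer = 2 * d_model * d_ffn           # up- and down-projection
total = embeddings + n_layers * (attention_per_layer + ffn_per_layer)

print(f"~{total / 1e6:.1f}M parameters")  # → ~51.0M parameters
```

This lands close to the ~50M figure quoted under Performance, which suggests the count is dominated by the 12 transformer layers rather than the small 1,081-token vocabulary.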
## Training Details
- Physical Batch Size: 8
- Effective Batch Size: 9,000 (via gradient accumulation)
- Gradient Accumulation Steps: 1,125
- Learning Rate: 0.0018 (auto-scaled)
- Training Approach: Custom implementation with label smoothing
- Data: Bilingual structured dataset
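The effective batch size follows directly from the numbers above: 8 physical samples × 1,125 accumulation steps = 9,000 samples per optimizer update. The loop below is a minimal sketch of that pattern using a stand-in linear model, not the repository's actual training script; the key detail is dividing each loss by the number of accumulation steps so gradients average over the full effective batch.

```python
import torch

physical_batch = 8
accum_steps = 1125
effective_batch = physical_batch * accum_steps  # 9000, as stated above

model = torch.nn.Linear(4, 2)  # stand-in for the real transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=1.8e-3)

optimizer.zero_grad()
for step in range(accum_steps):
    x = torch.randn(physical_batch, 4)
    y = torch.randn(physical_batch, 2)
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale so accumulated gradients equal the mean over 9,000 samples.
    (loss / accum_steps).backward()
optimizer.step()      # one parameter update per effective batch
optimizer.zero_grad()
```

Gradient accumulation trades wall-clock time for memory: it reaches large-batch convergence behavior on hardware that can only fit 8 samples at once.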
## Features

### Chain-of-Thought Reasoning

```python
from src.inference import TextGenerator

generator = TextGenerator(model, tokenizer)
response = generator.think_and_generate(
    "Why does the sky appear blue?",
    num_steps=3,
)
```
### Search-Augmented Generation

```python
response = generator.search_and_generate(
    "What is the capital of France?"
)
```
## Performance

- Model Size: ~50M parameters
- Training Time: varies with hardware
- Inference Speed: practical on CPU at this scale; faster on GPU
- Memory Usage: low; the FP32 weights occupy roughly 200 MB, so the model fits comfortably on consumer GPUs
## Limitations

- Requires custom loading code (not the standard Hugging Face `transformers` format)
- At ~50M parameters, capacity is limited compared to larger models
- May not generalize as well as models trained on massive datasets
- Requires specific dependencies and setup
## Ethical Considerations
This model is intended for research and educational purposes. Users should:
- Be aware of potential biases in training data
- Use the model responsibly
- Not deploy in critical applications without thorough testing
- Respect copyright and intellectual property laws
## Training Data
The model was trained on a curated dataset including:
- AI and machine learning concepts
- Computer science fundamentals
- Technology trends and developments
- Bilingual Q&A pairs
- Conversational scenarios
## Citation

```bibtex
@misc{anerysai-llm-2026,
  title={AnerysAI LLM v0.1: Bilingual Language Model with Reasoning Capabilities},
  author={AneryzRynz Studios},
  year={2026},
  url={https://huggingface.co/Anerysrynz/anerysai-llm-v0.1}
}
```
## License
MIT License - see the LICENSE file in the original repository for details.
## Contact
For questions or issues, please visit the GitHub repository.
This model was created for educational and research purposes. Use responsibly.