---
language:
- en
- id
tags:
- anerysai
- transformer
- causal-lm
- indonesian
- english
- pytorch
- custom-architecture
license: mit
datasets:
- custom
---

# AnerysAI LLM v0.1

A bilingual Indonesian-English causal language model built on a custom PyTorch transformer architecture.

## Model Description

AnerysAI LLM is a bilingual language model that supports English and Indonesian. It features:
- **Chain-of-Thought Reasoning**: Step-by-step reasoning capabilities
- **Search-Augmented Generation**: Web search integration for enhanced responses
- **Gradient Accumulation Training**: Large effective batch sizes for better convergence
- **Custom Architecture**: Optimized transformer implementation

## Model Details

- **Model Type**: Custom Transformer-based Causal Language Model
- **Architecture**: Multi-head attention with feed-forward networks
- **Languages**: English and Indonesian (bilingual support)
- **Training Data**: High-quality structured data covering AI, technology, and general knowledge

## Installation

```bash
pip install torch transformers huggingface-hub
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1
pip install -r requirements.txt
```

## Usage

### Load and Use the Model

```python
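import json

import torch
from huggingface_hub import hf_hub_download

# These modules live in the cloned AnerysAI-LLM-V0.1 repository,
# so run this script from the repository root.
from src.config import ModelConfig
from src.model import TransformerModel
from src.tokenizer import BaseTokenizer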

# Download model files
model_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "final_model.pt")
tokenizer_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "tokenizer.json")
config_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "config.json")

# Load configuration
with open(config_path, 'r') as f:
    config_data = json.load(f)

model_config = config_data["model_config"]

# Create model
config = ModelConfig(
    vocab_size=model_config["vocab_size"],
    max_sequence_length=model_config["max_sequence_length"],
    embedding_dim=model_config["embedding_dim"],
    num_heads=model_config["num_heads"],
    num_layers=model_config["num_layers"],
    ffn_dim=model_config["ffn_dim"],
    dropout_rate=model_config["dropout_rate"],
)

model = TransformerModel(config)
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()

# Load tokenizer
tokenizer = BaseTokenizer(vocab_size=model_config["vocab_size"])
tokenizer.load(tokenizer_path)

# Generate text
prompt = "What is artificial intelligence?"
tokens = tokenizer.encode(prompt, language="en")
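# `tokens` now holds the encoded prompt. To produce a response, pass `model`
# and `tokenizer` to src.inference.TextGenerator (see the Features section).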
```
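
The snippet above stops once the prompt is encoded. One common way to continue is greedy decoding, which repeatedly appends the highest-scoring next token. The sketch below illustrates that loop in plain Python and is not the repository's generation code: `next_token_logits`, `toy_logits`, and `eos_id` are hypothetical names, and how to obtain logits from `TransformerModel` depends on its forward signature, which this card does not document.

```python
from typing import Callable, List, Optional, Sequence


def greedy_decode(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int = 32,
    max_sequence_length: int = 256,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Greedy decoding: repeatedly append the highest-scoring next token.

    `next_token_logits` is a hypothetical adapter around the model. It takes
    the current token ids and returns one score per vocabulary entry.
    """
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # Keep only the most recent tokens so the context fits the model.
        context = ids[-max_sequence_length:]
        logits = next_token_logits(context)
        # Argmax over the vocabulary.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


# Stand-in scorer used only to exercise the loop: it always prefers
# (last token + 1) modulo a 10-token vocabulary.
def toy_logits(context: Sequence[int]) -> List[float]:
    target = (context[-1] + 1) % 10
    return [1.0 if i == target else 0.0 for i in range(10)]


generated = greedy_decode(toy_logits, prompt_ids=[3], max_new_tokens=4)
```

To use this with the real model, replace `toy_logits` with a function that runs the model on the context and returns the scores for the last position. The default `max_sequence_length` of 256 matches the value in the Model Configuration table.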

### Interactive Usage

```bash
# Clone the repository
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1

# Download model from Hugging Face
python -c "
from huggingface_hub import snapshot_download
snapshot_download(repo_id='Anerysrynz/anerysai-llm-v0.1', local_dir='checkpoints')
"

# Run inference
python inference.py --device cpu --interactive
```

## Model Configuration

| Parameter | Value |
|-----------|-------|
| Vocabulary Size | 1,081 |
| Max Sequence Length | 256 |
| Embedding Dimension | 512 |
| Number of Heads | 8 |
| Number of Layers | 12 |
| Feed-forward Dimension | 3,072 |
| Dropout Rate | 0.1 |
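
As a sanity check, the values in this table can be turned into a rough parameter count. The sketch below assumes a standard decoder layout (learned positional embeddings, biased linear layers, two layer norms per block, and an output head that is not tied to the token embedding). None of these details are confirmed by this card, so treat the result as an estimate only. The number of heads does not appear because it does not change the parameter count.

```python
def estimate_parameters(vocab_size, max_len, d_model, num_layers, ffn_dim):
    """Rough parameter count for a standard transformer decoder stack."""
    token_embedding = vocab_size * d_model
    position_embedding = max_len * d_model           # assumed learned positions
    attention = 4 * (d_model * d_model + d_model)    # Q, K, V, output projections
    feed_forward = (d_model * ffn_dim + ffn_dim) + (ffn_dim * d_model + d_model)
    layer_norms = 2 * (2 * d_model)                  # two norms per layer, scale + bias
    per_layer = attention + feed_forward + layer_norms
    final_norm = 2 * d_model
    output_head = d_model * vocab_size + vocab_size  # assumed untied from embeddings
    return (token_embedding + position_embedding
            + num_layers * per_layer + final_norm + output_head)


total = estimate_parameters(
    vocab_size=1081, max_len=256, d_model=512, num_layers=12, ffn_dim=3072
)
print(f"{total / 1e6:.1f}M parameters, about {total * 4 / 1e6:.0f} MB in fp32")
```

Under these assumptions the estimate lands slightly above 50M parameters, in line with the model size given under Performance. Almost all of the weights sit in the 12 transformer layers, because the 1,081-entry vocabulary keeps the embedding tables small.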

## Training Details

- **Physical Batch Size**: 8
- **Effective Batch Size**: 9,000 (8 × 1,125, via gradient accumulation)
- **Gradient Accumulation Steps**: 1,125
- **Learning Rate**: 0.0018 (auto-scaled)
- **Training Approach**: Custom implementation with label smoothing
- **Data**: Bilingual structured dataset
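
Gradient accumulation is what turns a physical batch of 8 into an effective batch of 8 × 1,125 = 9,000. Gradients from many small batches are averaged before a single optimizer step. The toy example below illustrates the idea in plain Python on a one-parameter model. It is a minimal sketch, not the repository's training loop, and `mse_gradient`, the data, and the learning rate are all made up for the illustration.

```python
def mse_gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


# Toy data: 8 examples split into 4 micro-batches of 2.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
micro_batch_size = 2
accumulation_steps = len(data) // micro_batch_size

w = 0.0
accumulated = 0.0
for step in range(accumulation_steps):
    start = step * micro_batch_size
    micro_batch = data[start:start + micro_batch_size]
    # Scale each micro-batch gradient so the sum is an average, not a total.
    accumulated += mse_gradient(w, micro_batch) / accumulation_steps

# The accumulated value matches the gradient of the full 8-example batch.
full_batch = mse_gradient(w, data)

# One optimizer step is taken only after all micro-batches have been seen.
learning_rate = 0.01
w -= learning_rate * accumulated
```

In PyTorch the same pattern is usually written by dividing each micro-batch loss by the number of accumulation steps before calling `backward()`, then calling `optimizer.step()` and `optimizer.zero_grad()` once every `accumulation_steps` iterations.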

## Features

### Chain-of-Thought Reasoning

This example reuses the `model` and `tokenizer` loaded in the Usage section:

```python
from src.inference import TextGenerator

generator = TextGenerator(model, tokenizer)
response = generator.think_and_generate(
    "Why does the sky appear blue?",
    num_steps=3
)
```

### Search-Augmented Generation

This example reuses the `generator` created in the chain-of-thought example:

```python
response = generator.search_and_generate(
    "What is the capital of France?"
)
```

## Performance

- **Model Size**: ~50M parameters
- **Training Time**: Variable (depends on hardware)
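- **Inference Speed**: Runs on both CPU and GPU (select the device with the `--device` flag of `inference.py`)
- **Memory Usage**: Moderate; about 200 MB of weights in fp32 (~50M parameters × 4 bytes), which fits on most GPUs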

## Limitations

- Requires custom loading code (not standard HF format)
- Small by current standards: about 50M parameters, a 1,081-token vocabulary, and a 256-token context window
- May not generalize as well as models trained on massive datasets
- Requires specific dependencies and setup

## Ethical Considerations

This model is intended for research and educational purposes. Users should:
- Be aware of potential biases in training data
- Use the model responsibly
- Not deploy in critical applications without thorough testing
- Respect copyright and intellectual property laws

## Training Data

The model was trained on a curated dataset including:
- AI and machine learning concepts
- Computer science fundamentals
- Technology trends and developments
- Bilingual Q&A pairs
- Conversational scenarios

## Citation

```bibtex
@misc{anerysai-llm-2026,
  title={AnerysAI LLM v0.1: Bilingual Language Model with Reasoning Capabilities},
  author={AneryzRynz Studios},
  year={2026},
  url={https://huggingface.co/Anerysrynz/anerysai-llm-v0.1}
}
```

## License

MIT License - see the LICENSE file in the original repository for details.

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1).

---

*This model was created for educational and research purposes. Use responsibly.*