---
language:
- en
- id
tags:
- anerysai
- transformer
- causal-lm
- indonesian
- english
- pytorch
- custom-architecture
license: mit
datasets:
- custom
---

# AnerysAI LLM v0.1

A bilingual Indonesian-English language model built on a custom PyTorch transformer architecture.

## Model Description

AnerysAI LLM is a bilingual language model supporting English and Indonesian. It features:

- **Chain-of-Thought Reasoning**: Step-by-step reasoning capabilities
- **Search-Augmented Generation**: Web search integration for enhanced responses
- **Gradient Accumulation Training**: Large effective batch sizes for better convergence
- **Custom Architecture**: Optimized transformer implementation

## Model Details

- **Model Type**: Custom Transformer-based Causal Language Model
- **Architecture**: Multi-head attention with feed-forward networks
- **Languages**: English and Indonesian (bilingual support)
- **Training Data**: High-quality structured data covering AI, technology, and general knowledge

## Installation

```bash
pip install torch transformers huggingface-hub
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1
pip install -r requirements.txt
```

## Usage

### Load and Use the Model

```python
# Run from the repository root so that the `src` package is importable.
import json

import torch
from huggingface_hub import hf_hub_download

from src.model import TransformerModel
from src.config import ModelConfig
from src.tokenizer import BaseTokenizer

# Download model files
model_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "final_model.pt")
tokenizer_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "tokenizer.json")
config_path = hf_hub_download("Anerysrynz/anerysai-llm-v0.1", "config.json")

# Load configuration
with open(config_path, "r") as f:
    config_data = json.load(f)
model_config = config_data["model_config"]

# Create model
config = ModelConfig(
    vocab_size=model_config["vocab_size"],
    max_sequence_length=model_config["max_sequence_length"],
    embedding_dim=model_config["embedding_dim"],
    num_heads=model_config["num_heads"],
    num_layers=model_config["num_layers"],
    ffn_dim=model_config["ffn_dim"],
    dropout_rate=model_config["dropout_rate"],
)
model = TransformerModel(config)
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()  # disable dropout for inference

# Load tokenizer
tokenizer = BaseTokenizer(vocab_size=model_config["vocab_size"])
tokenizer.load(tokenizer_path)

# Generate text
prompt = "What is artificial intelligence?"
tokens = tokenizer.encode(prompt, language="en")
# Add generation logic here
```

### Interactive Usage

```bash
# Clone the repository
git clone https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1.git
cd AnerysAI-LLM-V0.1

# Download model from Hugging Face
python -c "
from huggingface_hub import snapshot_download
snapshot_download(repo_id='Anerysrynz/anerysai-llm-v0.1', local_dir='checkpoints')
"

# Run inference
python inference.py --device cpu --interactive
```

## Model Configuration

| Parameter | Value |
|-----------|-------|
| Vocabulary Size | 1,081 |
| Max Sequence Length | 256 |
| Embedding Dimension | 512 |
| Number of Heads | 8 |
| Number of Layers | 12 |
| Feed-forward Dimension | 3,072 |
| Dropout Rate | 0.1 |

## Training Details

- **Physical Batch Size**: 8
- **Effective Batch Size**: 9,000 (via gradient accumulation: 8 × 1,125)
- **Gradient Accumulation Steps**: 1,125
- **Learning Rate**: 0.0018 (auto-scaled)
- **Training Approach**: Custom implementation with label smoothing
- **Data**: Bilingual structured dataset

## Features

### Chain-of-Thought Reasoning

```python
from src.inference import TextGenerator

# `model` and `tokenizer` are the objects loaded in the Usage section
generator = TextGenerator(model, tokenizer)
response = generator.think_and_generate(
    "Why does the sky appear blue?",
    num_steps=3
)
```

### Search-Augmented Generation

```python
# Reuses the `generator` created in the previous example
response = generator.search_and_generate(
    "What is the capital of France?"
)
```

## Performance

- **Model Size**: ~50M parameters
- **Training Time**: Variable (depends on hardware)
- **Inference Speed**: Runs on both CPU and GPU
- **Memory Usage**: Moderate; at ~50M parameters the float32 weights take roughly 200 MB, which fits on most GPUs

## Limitations

- Requires custom loading code (not standard HF format)
- At ~50M parameters and a 1,081-token vocabulary, capacity is limited compared to larger models
- May not generalize as well as models trained on massive datasets
- Requires specific dependencies and setup

## Ethical Considerations

This model is intended for research and educational purposes. Users should:

- Be aware of potential biases in training data
- Use the model responsibly
- Not deploy in critical applications without thorough testing
- Respect copyright and intellectual property laws

## Training Data

The model was trained on a curated dataset including:

- AI and machine learning concepts
- Computer science fundamentals
- Technology trends and developments
- Bilingual Q&A pairs
- Conversational scenarios

## Citation

```bibtex
@misc{anerysai-llm-2026,
  title={AnerysAI LLM v0.1: Bilingual Language Model with Reasoning Capabilities},
  author={AneryzRynz Studios},
  year={2026},
  url={https://huggingface.co/Anerysrynz/anerysai-llm-v0.1}
}
```

## License

MIT License. See the LICENSE file in the original repository for details.

## Contact

For questions or issues, please visit the [GitHub repository](https://github.com/AneryzRynzStudios/AnerysAI-LLM-V0.1).

---

*This model was created for educational and research purposes. Use responsibly.*
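
## Appendix: Parameter Count Estimate

The "~50M parameters" figure can be sanity-checked against the values in the Model Configuration table. The sketch below is only an estimate: it assumes a standard transformer block (four attention projections with biases, a two-layer feed-forward network with biases, two LayerNorms per block), learned positional embeddings, a final LayerNorm, and an output head tied to the token embedding. None of these details are confirmed for `TransformerModel`, and `estimate_parameters` is a hypothetical helper, not part of the repository.

```python
def estimate_parameters(vocab_size, max_sequence_length, embedding_dim,
                        num_layers, ffn_dim, tied_output_head=True):
    """Rough parameter count for an assumed standard transformer layout."""
    d = embedding_dim
    # Q, K, V and output projections, each d x d weights plus d biases
    attention = 4 * (d * d + d)
    # Two linear layers: d -> ffn_dim and ffn_dim -> d, with biases
    feed_forward = (d * ffn_dim + ffn_dim) + (ffn_dim * d + d)
    # Two LayerNorms per block, each with a scale and a shift vector
    layer_norms = 2 * (2 * d)
    per_block = attention + feed_forward + layer_norms

    token_embedding = vocab_size * d
    position_embedding = max_sequence_length * d  # assumed learned positions
    final_norm = 2 * d
    output_head = 0 if tied_output_head else vocab_size * d

    return (num_layers * per_block + token_embedding
            + position_embedding + final_norm + output_head)


# Values from the Model Configuration table
total = estimate_parameters(
    vocab_size=1081,
    max_sequence_length=256,
    embedding_dim=512,
    num_layers=12,
    ffn_dim=3072,
)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# prints "51,109,376 parameters (~51.1M)"
```

The number of attention heads does not appear in the formula because splitting the embedding across heads leaves the projection sizes unchanged. Under these assumptions the 12 transformer blocks account for almost all of the total; the embeddings contribute well under 1M parameters because the vocabulary is small.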
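
## Appendix: Gradient Accumulation Arithmetic

The Training Details section lists a physical batch size of 8 and 1,125 accumulation steps, giving an effective batch size of 8 × 1,125 = 9,000. The sketch below illustrates why this works, using a deliberately tiny one-parameter model in plain Python rather than the project's actual training loop: averaging the gradients of equally sized micro-batches gives the same gradient as one pass over the combined batch. The function names and the toy data are assumptions made for illustration.

```python
import math


def effective_batch_size(physical_batch_size, accumulation_steps):
    """Number of samples that contribute to one optimizer step."""
    return physical_batch_size * accumulation_steps


def mse_gradient(w, batch):
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, data, physical_batch_size):
    """Average micro-batch gradients, as gradient accumulation does.

    Each micro-batch gradient is scaled by 1 / number_of_micro_batches,
    which mirrors dividing the loss by the accumulation step count.
    """
    micro_batches = [
        data[i:i + physical_batch_size]
        for i in range(0, len(data), physical_batch_size)
    ]
    total = 0.0
    for micro_batch in micro_batches:
        total += mse_gradient(w, micro_batch) / len(micro_batches)
    return total


# Figures from the Training Details section
print(effective_batch_size(8, 1125))  # prints 9000

# Toy check: 16 samples split into 4 micro-batches of 4
data = [(float(i), 2.0 * i + 1.0) for i in range(16)]
full = mse_gradient(0.5, data)
accumulated = accumulated_gradient(0.5, data, physical_batch_size=4)
print(math.isclose(full, accumulated))  # prints True
```

The equivalence is exact only when every micro-batch has the same size; a smaller final micro-batch would need to be weighted by its sample count.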
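
## Appendix: Greedy Decoding Sketch

The loading example in the Usage section stops at `# Add generation logic here`. One simple way to fill that gap is greedy decoding: repeatedly pick the highest-scoring next token and append it to the sequence. The sketch below is a minimal, framework-free illustration, not the repository's generation code. Because the forward signature of `TransformerModel` is not documented here, the model is abstracted as a hypothetical callable `next_token_logits` that maps a token list to one score per vocabulary entry; `greedy_generate`, `toy_logits`, and the `eos_token_id` parameter are likewise assumptions.

```python
from typing import Callable, List, Optional, Sequence


def greedy_generate(
    next_token_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_tokens: Sequence[int],
    max_new_tokens: int = 32,
    max_sequence_length: int = 256,
    eos_token_id: Optional[int] = None,
) -> List[int]:
    """Extend prompt_tokens by always choosing the highest-scoring token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # Keep only the most recent tokens that fit in the context window
        context = tokens[-max_sequence_length:]
        logits = next_token_logits(context)
        # Argmax over the vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if eos_token_id is not None and next_id == eos_token_id:
            break
    return tokens


def toy_logits(context):
    """Stand-in model: always prefers (last token + 1) modulo 5."""
    target = (context[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(greedy_generate(toy_logits, [0], max_new_tokens=4))
# prints [0, 1, 2, 3, 4]
print(greedy_generate(toy_logits, [0], max_new_tokens=10, eos_token_id=3))
# prints [0, 1, 2, 3]
```

With the real model, `next_token_logits` would wrap a forward pass under `torch.no_grad()` and return the scores for the last position. The default `max_sequence_length` of 256 matches the Model Configuration table. Greedy decoding is deterministic and can be repetitive; sampling strategies are a common alternative.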