---
license: llama3.2
language:
- en
- code
tags:
- code-generation
- java
- llama
- fine-tuned
- reflection
- meta-learning
pipeline_tag: text-generation
datasets:
- Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1
base_model: meta-llama/Llama-3.2-3B
widget:
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: sets the value of the name property.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
  example_title: "Setter Method"
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: returns true if the string is empty or null.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
  example_title: "Null Check Method"
---

# LLaMA 3.2 3B - Java Code Generation (Reflection)

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B), trained specifically for Java method generation using a novel reflection-based meta-learning approach.
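To give a rough picture of what "reflection-based" training data looks like, the sketch below shows one hypothetical meta example pairing a student attempt with teacher feedback. The field names and Java snippets are illustrative only, not the dataset's actual schema; see the dataset card for the real format.

```python
# Hypothetical structure of one reflection (meta) training example.
# Field names are illustrative; consult the dataset card for the real schema.
meta_example = {
    "task": "returns true if the string is empty or null",
    "student_attempt": "public boolean isEmpty(String s) { return s.length() == 0; }",
    "teacher_solution": "public boolean isEmpty(String s) { return s == null || s.isEmpty(); }",
    "error_analysis": "The attempt dereferences s without a null check, risking a NullPointerException.",
    "learning_insight": "Guard against null before calling methods on a reference.",
}
```

During reflection training, all four annotation fields are serialized into the prompt so the student model sees its mistake alongside the corrected implementation and the reasoning behind the fix.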
## Model Description

- **Base Model**: LLaMA 3.2 3B
- **Training Method**: Reflection-based meta-learning
- **Task**: Java method generation from natural language descriptions
- **Training Data**: 100k examples from the CodeXGLUE dataset with Claude annotations
- **Language**: Java
- **License**: LLaMA 3.2 Community License

## Training Details

### Dataset

Trained on [Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1](https://huggingface.co/datasets/Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1):

- 90,000 SFT examples for standard supervised training
- 10,000 meta-annotated examples with Claude's error analysis and learning insights
- Source: CodeXGLUE text-to-code (Java) dataset

### Reflection-Based Training

This model uses a teacher-student reflection paradigm:

- **Teacher**: Claude 4 Sonnet provides error analysis and guidance
- **Student**: LLaMA 3.2 3B learns from its mistakes through structured reflection
- **Meta examples** include error analysis and learning insights for deeper understanding

### Training Configuration

- **Epochs**: 3
- **Batch Size**: 8 × 6 gradient accumulation steps = 48 effective
- **Learning Rate**: 2e-5
- **Max Length**: 2048 tokens
- **Precision**: float32 (for stability)
- **Optimizer**: AdamW
- **Scheduler**: Cosine with warmup
- **Early Stopping**: Dual tracking (SFT and meta losses)

### Hardware

- **GPU**: NVIDIA A100 80GB
- **Training Time**: ~9 hours
- **Framework**: PyTorch 2.0+ with Transformers

## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "Naholav/llama-3.2-3b-100k-codeXGLUE-reflection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Prepare prompt
task_description = "returns the sum of two integers"
prompt = f"""You are an expert Java programmer.
Generate a complete, working Java method for the given description.

Task: {task_description}

Requirements:
- Write a complete Java method
- Use proper syntax and naming conventions
- Include return statements where needed
- Keep it concise but functional

```java
"""

# Generate code
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=0.2,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```

### Expected Output Format

The model generates Java methods following this pattern:

```java
public int sum(int a, int b) {
    return a + b;
}
```

### Testing on Your Own Data

For local evaluation, you can use:

- **Test dataset from this project**: [100 examples](https://github.com/naholav/sft-vs-reflection-llama3-codexglue/blob/main/create%20meta%20dataset%20and%20test%20dataset/codexglue_test_100_samples.json)
- **Original Microsoft test set**: [2k examples](https://github.com/microsoft/CodeXGLUE/blob/main/Text-Code/text-to-code/dataset/concode/test.json)

**Important**: Remember to clean the natural language descriptions before inference:

```python
def clean_nl(nl_description):
    # Replace CONCODE separator tokens with readable delimiters
    cleaned = nl_description.replace("concode_field_sep", " | ")
    cleaned = cleaned.replace("concode_elem_sep", ", ")
    # Collapse repeated whitespace
    return ' '.join(cleaned.split())
```

## Performance

The model was evaluated during training with:

- Separate tracking of SFT and meta losses
- 5 evaluations per epoch
- Dual early stopping based on both loss types
- Best checkpoint selected based on average validation loss

## Reflection Training Methodology

This model was trained using an approach in which:

1. **Error Recognition**: The model learns to identify common coding mistakes
2. **Pattern Analysis**: It understands method signatures and class structures
3. **Knowledge Gaps**: It recognizes missing OOP concepts
4.
**Improvement Strategy**: It internalizes better coding patterns

Meta examples included structured reflection prompts with:

- Student's incorrect attempt
- Teacher's correct implementation
- Detailed error analysis
- Learning insights and guidance

## Comparison with SFT Model

This is the reflection-based version. For comparison with standard supervised fine-tuning:

- [SFT Model](https://huggingface.co/Naholav/llama-3.2-3b-100k-codeXGLUE-sft)
- [GitHub Repository](https://github.com/naholav/sft-vs-reflection-llama3-codexglue) for implementation details

## Limitations

- Trained specifically for Java method generation
- May not generalize well to full classes or other programming languages
- Best suited for single-method generation tasks
- Context window limited to 2048 tokens

## Ethical Considerations

- The model should not be used to generate malicious code
- Generated code should be reviewed before use in production
- Not suitable for generating code that handles sensitive data without proper review

## Key Differences from SFT Model

- **Training Data**: Uses the same dataset but processes meta examples differently
- **Learning Paradigm**: Teacher-student reflection vs. direct imitation
- **Loss Tracking**: Dual tracking of SFT and meta losses
- **Expected Benefit**: Better understanding of coding patterns and error avoidance

## Acknowledgments

- Meta AI for the LLaMA 3.2 base model
- Microsoft Research for the CodeXGLUE text-to-code (Java) dataset
- Anthropic for Claude 4 Sonnet's error analysis and insights
- Hugging Face for the training infrastructure
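## Worked Example: From Raw Description to Prompt

To make the preprocessing and prompt format concrete, the sketch below combines the `clean_nl` helper from the Usage section with a `build_prompt` function that mirrors the Quick Start template. The `build_prompt` name and the raw CONCODE-style description are illustrative examples introduced here, not part of the dataset.

```python
def clean_nl(nl_description):
    # Replace CONCODE separator tokens with readable delimiters
    cleaned = nl_description.replace("concode_field_sep", " | ")
    cleaned = cleaned.replace("concode_elem_sep", ", ")
    # Collapse repeated whitespace
    return ' '.join(cleaned.split())

def build_prompt(task_description):
    # Mirrors the prompt template shown in the Quick Start section
    return (
        "You are an expert Java programmer.\n"
        "Generate a complete, working Java method for the given description.\n\n"
        f"Task: {task_description}\n\n"
        "Requirements:\n"
        "- Write a complete Java method\n"
        "- Use proper syntax and naming conventions\n"
        "- Include return statements where needed\n"
        "- Keep it concise but functional\n\n"
        "```java\n"
    )

# Illustrative raw description in CONCODE format
raw = "sets the name property concode_field_sep String name concode_elem_sep int id"
prompt = build_prompt(clean_nl(raw))
print(prompt)
```

The resulting `prompt` string can be passed directly to `tokenizer(...)` as in the Quick Start example; the model is expected to continue from the opening ```` ```java ```` fence with a method body.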