---
license: llama3.2
language:
- en
- code
tags:
- code-generation
- java
- llama
- fine-tuned
- reflection
- meta-learning
pipeline_tag: text-generation
datasets:
- Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1
base_model: meta-llama/Llama-3.2-3B
widget:
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: sets the value of the name property.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
  example_title: "Setter Method"
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: returns true if the string is empty or null.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
  example_title: "Null Check Method"
---

# LLaMA 3.2 3B - Java Code Generation (Reflection)

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) specifically trained for Java method generation using a novel reflection-based meta-learning approach.

## Model Description

- **Base Model**: LLaMA 3.2 3B
- **Training Method**: Reflection-based meta-learning
- **Task**: Java method generation from natural language descriptions
- **Training Data**: 100k examples from the CodeXGLUE dataset with Claude annotations
- **Language**: Java
- **License**: LLaMA 3.2 Community License

## Training Details

### Dataset
Trained on [Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1](https://huggingface.co/datasets/Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1):
- 90,000 SFT examples for standard training
- 10,000 meta-annotated examples with Claude's error analysis and learning insights
- Source: CodeXGLUE text-to-code (Java) dataset

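To inspect the data before training or evaluation, it can be loaded directly from the Hub (a minimal sketch; split and column names should be checked against the dataset card):

```python
from datasets import load_dataset

# Load the combined SFT + meta dataset from the Hugging Face Hub.
ds = load_dataset("Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1")

print(ds)  # shows the available splits and column names
first_split = next(iter(ds))
print(ds[first_split][0])  # peek at a single example
```
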
### Reflection-Based Training
This model uses a teacher-student reflection paradigm:
- **Teacher**: Claude 4 Sonnet provides error analysis and guidance
- **Student**: LLaMA 3.2 3B learns from its mistakes through structured reflection
- **Meta examples** include error analysis and learning insights for deeper understanding

### Training Configuration
- **Epochs**: 3
- **Batch Size**: 8 × 6 gradient accumulation steps = 48 effective
- **Learning Rate**: 2e-5
- **Max Length**: 2048 tokens
- **Precision**: float32 (for stability)
- **Optimizer**: AdamW
- **Scheduler**: Cosine with warmup
- **Early Stopping**: Dual tracking of SFT and Meta losses

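For reference, this setup corresponds roughly to the following Hugging Face `TrainingArguments` (a reconstruction from the values above, not the actual training script; the warmup ratio and evaluation cadence are assumptions):

```python
from transformers import TrainingArguments

# Approximate mapping of the reported configuration (transformers >= 4.41).
training_args = TrainingArguments(
    output_dir="llama3.2-java-reflection",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=6,   # 8 x 6 = 48 effective batch size
    learning_rate=2e-5,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,               # assumed; the exact warmup is not reported
    fp16=False,
    bf16=False,                      # float32 for stability, as reported above
    eval_strategy="steps",           # the original run evaluated 5 times per epoch
    load_best_model_at_end=True,     # best checkpoint by average validation loss
)
```
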
### Hardware
- **GPU**: NVIDIA A100 80GB
- **Training Time**: ~9 hours
- **Framework**: PyTorch 2.0+ with Transformers

## Usage

### Installation
```bash
pip install transformers torch
```

### Quick Start
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "Naholav/llama-3.2-3b-100k-codeXGLUE-reflection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

# Prepare prompt
task_description = "returns the sum of two integers"
prompt = f"""You are an expert Java programmer. Generate a complete, working Java method for the given description.

Task: {task_description}

Requirements:
- Write a complete Java method
- Use proper syntax and naming conventions
- Include return statements where needed
- Keep it concise but functional

```java
"""

# Generate code
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=0.2,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id
)

generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```

### Expected Output Format
The model generates Java methods following this pattern:
```java
public int sum(int a, int b) {
    return a + b;
}
```

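Because the prompt ends with an opening Java code fence, the decoded output contains the prompt followed by the generated method. A simple post-processing step (a minimal sketch; adjust to your decoding setup) is to strip the prompt and cut at the closing fence:

```python
def extract_java(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt and keep the code up to the closing fence."""
    completion = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # The model usually closes the code fence it was asked to continue.
    return completion.split("```")[0].strip()

method = extract_java(generated_code, prompt)
print(method)
```
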
### Testing on Your Own Data

For local evaluation, you can use:
- **Test dataset from this project**: [100 examples](https://github.com/naholav/sft-vs-reflection-llama3-codexglue/blob/main/create%20meta%20dataset%20and%20test%20dataset/codexglue_test_100_samples.json)
- **Original Microsoft test set**: [2k examples](https://github.com/microsoft/CodeXGLUE/blob/main/Text-Code/text-to-code/dataset/concode/test.json)

**Important**: Remember to clean the natural language descriptions before inference:
```python
def clean_nl(nl_description):
    # Replace the CONCODE separator tokens with readable delimiters
    cleaned = nl_description.replace("concode_field_sep", " | ")
    cleaned = cleaned.replace("concode_elem_sep", ", ")
    # Collapse repeated whitespace
    return ' '.join(cleaned.split())
```

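For example (the input string below is illustrative of the CONCODE separator format, not taken from the dataset):

```python
raw = "returns the sum of two fields concode_field_sep int a concode_elem_sep int b"
print(clean_nl(raw))
# returns the sum of two fields | int a , int b
```
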
## Performance

The model was evaluated during training with:
- Separate tracking of SFT and Meta losses
- 5 evaluations per epoch
- Dual early stopping based on both loss types
- Best checkpoint selected based on average validation loss

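To make the dual early-stopping criterion concrete, here is a minimal illustration of the idea (a sketch, not the project's actual training loop): training stops only once both losses have stopped improving.

```python
class DualEarlyStopper:
    """Stop only when neither the SFT loss nor the meta loss improves."""

    def __init__(self, patience: int = 3):
        self.patience = patience
        self.best = {"sft": float("inf"), "meta": float("inf")}
        self.stale = {"sft": 0, "meta": 0}

    def step(self, sft_loss: float, meta_loss: float) -> bool:
        for name, value in (("sft", sft_loss), ("meta", meta_loss)):
            if value < self.best[name]:
                self.best[name] = value
                self.stale[name] = 0
            else:
                self.stale[name] += 1
        # Stop once both losses have plateaued past the patience window.
        return all(count >= self.patience for count in self.stale.values())
```
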
## Reflection Training Methodology

This model was trained using a novel approach where:
1. **Error Recognition**: Model learns to identify common coding mistakes
2. **Pattern Analysis**: Understands method signatures and class structures
3. **Knowledge Gaps**: Recognizes missing OOP concepts
4. **Improvement Strategy**: Internalizes better coding patterns

Meta examples included structured reflection prompts with:
- Student's incorrect attempt
- Teacher's correct implementation
- Detailed error analysis
- Learning insights and guidance

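As an illustration only, a reflection prompt can be assembled from those four parts along these lines (the field layout here is hypothetical; the exact format used in training is not reproduced in this card):

```python
def build_reflection_prompt(task, student_code, teacher_code,
                            error_analysis, insight):
    # Hypothetical layout of a meta example; not the verbatim training format.
    return (
        f"Task: {task}\n\n"
        f"Student attempt (incorrect):\n{student_code}\n\n"
        f"Teacher solution:\n{teacher_code}\n\n"
        f"Error analysis:\n{error_analysis}\n\n"
        f"Learning insight:\n{insight}\n"
    )
```
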
## Comparison with SFT Model

This is the reflection-based version. For comparison with standard supervised fine-tuning:
- [SFT Model](https://huggingface.co/Naholav/llama-3.2-3b-100k-codeXGLUE-sft)
- [GitHub Repository](https://github.com/naholav/sft-vs-reflection-llama3-codexglue) for implementation details

## Limitations

- Trained specifically for Java method generation
- May not generalize well to full classes or other programming languages
- Best suited for single-method generation tasks
- Context window limited to 2048 tokens

## Ethical Considerations

- The model should not be used to generate malicious code
- Generated code should be reviewed before use in production
- Not suitable for generating code that handles sensitive data without proper review

## Key Differences from SFT Model

- **Training Data**: Uses the same dataset but processes meta examples differently
- **Learning Paradigm**: Teacher-student reflection vs. direct imitation
- **Loss Tracking**: Dual tracking of SFT and Meta losses
- **Expected Benefit**: Better understanding of coding patterns and error avoidance

## Acknowledgments

- Meta AI for the LLaMA 3.2 base model
- Microsoft Research for the CodeXGLUE text-to-code (Java) dataset
- Anthropic for Claude 4 Sonnet's error analysis and insights
- Hugging Face for the training infrastructure