---
license: llama3.2
language:
- en
- code
tags:
- code-generation
- java
- llama
- fine-tuned
- reflection
- meta-learning
pipeline_tag: text-generation
datasets:
- Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1
base_model: meta-llama/Llama-3.2-3B
widget:
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: sets the value of the name property.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
example_title: "Setter Method"
- text: "You are an expert Java programmer. Generate a complete, working Java method for the given description.\n\nTask: returns true if the string is empty or null.\n\nRequirements:\n- Write a complete Java method\n- Use proper syntax and naming conventions\n- Include return statements where needed\n- Keep it concise but functional\n\n```java\n"
example_title: "Null Check Method"
---
# LLaMA 3.2 3B - Java Code Generation (Reflection)
This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) trained for Java method generation using a reflection-based meta-learning approach.
## Model Description
- **Base Model**: LLaMA 3.2 3B
- **Training Method**: Reflection-based Meta-Learning
- **Task**: Java method generation from natural language descriptions
- **Training Data**: 100k examples from CodeXGLUE dataset with Claude annotations
- **Language**: Java
- **License**: LLaMA 3.2 Community License
## Training Details
### Dataset
Trained on [Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1](https://huggingface.co/datasets/Naholav/llama3.2-java-codegen-90sft-10meta-claude-v1):
- 90,000 SFT examples for standard training
- 10,000 meta-annotated examples with Claude's error analysis and learning insights
- Source: CodeXGLUE text-to-code (Java) dataset
### Reflection-Based Training
This model uses a unique teacher-student reflection paradigm:
- **Teacher**: Claude 4 Sonnet provides error analysis and guidance
- **Student**: LLaMA 3.2 3B learns from its mistakes through structured reflection
- **Meta examples** include error analysis and learning insights for deeper understanding
### Training Configuration
- **Epochs**: 3
- **Batch Size**: 8 × 6 gradient accumulation = 48 effective
- **Learning Rate**: 2e-5
- **Max Length**: 2048 tokens
- **Precision**: float32 (for stability)
- **Optimizer**: AdamW
- **Scheduler**: Cosine with warmup
- **Early Stopping**: Dual tracking (SFT and Meta losses)
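The cosine-with-warmup schedule listed above can be sketched in plain Python. This is an illustrative sketch, not the exact training code; the `warmup_steps` and `total_steps` values are placeholders (the base learning rate of 2e-5 is from the configuration above):

```python
import math

def lr_at_step(step, base_lr=2e-5, warmup_steps=100, total_steps=6250):
    """Cosine learning-rate schedule with linear warmup (illustrative sketch)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to ~0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In practice this corresponds to `get_cosine_schedule_with_warmup` in the `transformers` library.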
### Hardware
- **GPU**: NVIDIA A100 80GB
- **Training Time**: ~9 hours
- **Framework**: PyTorch 2.0+ with Transformers
## Usage
### Installation
```bash
pip install transformers torch
```
### Quick Start
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "Naholav/llama-3.2-3b-100k-codeXGLUE-reflection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
# Prepare prompt
task_description = "returns the sum of two integers"
prompt = f"""You are an expert Java programmer. Generate a complete, working Java method for the given description.
Task: {task_description}
Requirements:
- Write a complete Java method
- Use proper syntax and naming conventions
- Include return statements where needed
- Keep it concise but functional
```java
"""
# Generate code
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=150,
temperature=0.2,
do_sample=True,
top_p=0.95,
pad_token_id=tokenizer.eos_token_id
)
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```
### Expected Output Format
The model generates Java methods following this pattern:
```java
public int sum(int a, int b) {
return a + b;
}
```
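Because the prompt ends with an opening ```` ```java ```` fence, the decoded output contains the full prompt followed by the generated method. A small helper (illustrative, not part of the model's API) can isolate just the code:

```python
def extract_java(generated_text):
    """Return the code after the last ```java fence, stopping at a closing fence if present."""
    _, _, after = generated_text.rpartition("```java")
    # Cut at the next closing fence, if the model emitted one
    code = after.split("```")[0]
    return code.strip()
```

For example, calling `extract_java(generated_code)` on the Quick Start output above yields only the Java method body.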
### Testing on Your Own Data
For local evaluation, you can use:
- **Test dataset from this project**: [100 examples](https://github.com/naholav/sft-vs-reflection-llama3-codexglue/blob/main/create%20meta%20dataset%20and%20test%20dataset/codexglue_test_100_samples.json)
- **Original Microsoft test set**: [2k examples](https://github.com/microsoft/CodeXGLUE/blob/main/Text-Code/text-to-code/dataset/concode/test.json)
**Important**: Remember to clean the natural language descriptions before inference:
```python
def clean_nl(nl_description):
    # ConCode descriptions embed field/element separator tokens;
    # map them to readable delimiters and collapse extra whitespace.
    cleaned = nl_description.replace("concode_field_sep", " | ")
    cleaned = cleaned.replace("concode_elem_sep", ", ")
    return ' '.join(cleaned.split())
```
## Performance
The model was evaluated during training with:
- Separate tracking of SFT and Meta losses
- 5 evaluations per epoch
- Dual early stopping based on both loss types
- Best checkpoint selected based on average validation loss
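The dual early-stopping rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the patience value and the "stop only when both losses have stalled" rule are assumptions for the sketch, not the exact training code:

```python
class DualEarlyStopper:
    """Stop only when BOTH the SFT and meta validation losses have stalled."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = {"sft": float("inf"), "meta": float("inf")}
        self.bad_evals = {"sft": 0, "meta": 0}

    def update(self, sft_loss, meta_loss):
        """Record one evaluation; return True if training should stop."""
        for name, loss in (("sft", sft_loss), ("meta", meta_loss)):
            if loss < self.best[name]:
                self.best[name] = loss
                self.bad_evals[name] = 0
            else:
                self.bad_evals[name] += 1
        # Stop only if every tracked loss has failed to improve for `patience` evals
        return all(n >= self.patience for n in self.bad_evals.values())
```

As long as either loss keeps improving, training continues; the best checkpoint is still chosen by average validation loss as noted above.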
## Reflection Training Methodology
This model was trained using a novel approach where:
1. **Error Recognition**: Model learns to identify common coding mistakes
2. **Pattern Analysis**: Understands method signatures and class structures
3. **Knowledge Gaps**: Recognizes missing OOP concepts
4. **Improvement Strategy**: Internalizes better coding patterns
Meta examples included structured reflection prompts with:
- Student's incorrect attempt
- Teacher's correct implementation
- Detailed error analysis
- Learning insights and guidance
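To make the structure above concrete, here is a hypothetical illustration of how one meta example might be assembled into a reflection prompt. The field names and template are made up for illustration; the actual format is defined by the dataset and the GitHub repository linked below:

```python
def build_reflection_prompt(example):
    """Assemble a reflection-style training prompt from one meta example (hypothetical template)."""
    return (
        f"Task: {example['task']}\n\n"
        f"Student attempt (incorrect):\n```java\n{example['student_attempt']}\n```\n\n"
        f"Error analysis: {example['error_analysis']}\n\n"
        f"Learning insight: {example['insight']}\n\n"
        f"Correct implementation:\n```java\n{example['teacher_solution']}\n```\n"
    )
```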
## Comparison with SFT Model
This is the reflection-based version. For comparison with standard supervised fine-tuning:
- [SFT Model](https://huggingface.co/Naholav/llama-3.2-3b-100k-codeXGLUE-sft)
- [GitHub Repository](https://github.com/naholav/sft-vs-reflection-llama3-codexglue) for implementation details
## Limitations
- Trained specifically for Java method generation
- May not generalize well to full classes or other programming languages
- Best suited for single-method generation tasks
- Context window limited to 2048 tokens
## Ethical Considerations
- The model should not be used to generate malicious code
- Generated code should be reviewed before use in production
- Not suitable for generating code that handles sensitive data without proper review
## Key Differences from SFT Model
- **Training Data**: Uses the same dataset, but processes the meta examples differently
- **Learning Paradigm**: Teacher-student reflection vs direct imitation
- **Loss Tracking**: Dual tracking of SFT and Meta losses
- **Expected Benefit**: Better understanding of coding patterns and error avoidance
## Acknowledgments
- Meta AI for the LLaMA 3.2 base model
- Microsoft Research for the CodeXGLUE text-to-code (Java) dataset
- Anthropic for Claude 4 Sonnet's error analysis and insights
- Hugging Face for the training infrastructure