LLaMA 3.2 3B - Java Code Generation (SFT)
This model is a fine-tuned version of meta-llama/Llama-3.2-3B specifically trained for Java method generation using supervised fine-tuning (SFT).
Model Description
- Base Model: LLaMA 3.2 3B
- Training Method: Supervised Fine-Tuning (SFT)
- Task: Java method generation from natural language descriptions
- Training Data: 100k examples from the CodeXGLUE text-to-code dataset
- Language: Java
- License: LLaMA 3.2 Community License
Training Details
Dataset
Trained on the [CodeXGLUE ConCode training set](https://github.com/microsoft/CodeXGLUE/blob/main/Text-Code/text-to-code/dataset/concode/train.json):
- 90,000 SFT examples for training
- 10,000 validation examples
- Source: CodeXGLUE text-to-code (Java) dataset
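If you want to inspect the raw data locally, the snippet below is a minimal sketch that assumes the ConCode convention of one JSON object per line with "nl" (description) and "code" (Java method) fields; the helper name is illustrative and not part of the original training code.
```python
import json

# Minimal sketch: load a ConCode-style train.json, assuming one JSON object
# per line with "nl" and "code" fields (helper name is illustrative).
def load_concode(path):
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            examples.append({"nl": record["nl"], "code": record["code"]})
    return examples

train_examples = load_concode("train.json")
print(len(train_examples), train_examples[0]["nl"][:80])
```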
Training Configuration
- Epochs: 3
- Batch Size: 8 × 6 gradient accumulation = 48 effective
- Learning Rate: 2e-5
- Max Length: 2048 tokens
- Precision: float32 (for stability)
- Optimizer: AdamW
- Scheduler: Cosine with warmup
- Early Stopping: Patience of 3 evaluations
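For readers who want to reproduce a similar setup, here is a hedged sketch of how these hyperparameters could map onto Hugging Face `TrainingArguments`. It is not the exact script used for this model, and values not listed above (warmup fraction, evaluation cadence, output directory) are assumptions.
```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above; not the original script.
training_args = TrainingArguments(
    output_dir="llama-3.2-3b-concode-sft",  # assumption: directory name
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=6,   # 8 x 6 = 48 effective batch size
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,               # assumption: warmup fraction not stated above
    optim="adamw_torch",
    eval_strategy="steps",           # periodic evaluation, see Performance below
    eval_steps=375,                  # assumption: roughly 1/5 of an epoch
    save_strategy="steps",
    save_steps=375,
    load_best_model_at_end=True,     # best checkpoint selected automatically
    metric_for_best_model="eval_loss",
    # fp16/bf16 left off: training used float32 for stability, as noted above.
    # The 2048-token max length is applied at tokenization time, not here.
)
```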
Hardware
- GPU: NVIDIA A100 80GB
- Training Time: ~8 hours
- Framework: PyTorch 2.0+ with Transformers
Usage
Installation
```bash
pip install transformers torch
```
Quick Start
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "Naholav/llama-3.2-3b-100k-codeXGLUE-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Prepare prompt
task_description = "returns the sum of two integers"
prompt = f"""You are an expert Java programmer. Generate a complete, working Java method for the given description.
Task: {task_description}
Requirements:
- Write a complete Java method
- Use proper syntax and naming conventions
- Include return statements where needed
- Keep it concise but functional
```java
"""

# Generate code
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    temperature=0.2,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
```
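The decoded string includes the prompt, so for evaluation you will usually want to keep only the completion. The helper below is a small post-processing sketch; the function name and the stop-at-fence heuristic are assumptions, not part of the original pipeline.
```python
# Illustrative post-processing: strip the prompt and stop at a closing code
# fence if the model emits one.
def extract_method(full_output: str, prompt: str) -> str:
    completion = full_output[len(prompt):] if full_output.startswith(prompt) else full_output
    return completion.split("```")[0].strip()

print(extract_method(generated_code, prompt))
```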
Expected Output Format
The model generates Java methods following this pattern:
```java
public int sum(int a, int b) {
    return a + b;
}
```
Testing on Your Own Data
For local evaluation, you can use:
- Test dataset from this project: 100 examples
- Original Microsoft test set: 2k examples
Important: Remember to clean the natural language descriptions before inference:
```python
def clean_nl(nl_description):
    """Replace ConCode separator tokens with readable delimiters."""
    cleaned = nl_description.replace("concode_field_sep", " | ")
    cleaned = cleaned.replace("concode_elem_sep", ", ")
    return ' '.join(cleaned.split())
```
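For example, a raw description containing the separator tokens (the input string below is made up for illustration) is cleaned like this:
```python
raw = "Adds two numbers concode_field_sep int a concode_elem_sep int b"
print(clean_nl(raw))
# -> Adds two numbers | int a , int b
```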
Performance
The model was evaluated during training with:
- Validation loss tracked every 1/5 of an epoch
- Early stopping based on validation performance
- Best checkpoint selected automatically
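As a rough illustration of that cadence and the early-stopping setup, the sketch below derives the step interval implied by the numbers above and shows how a patience of 3 evaluations is typically wired up with `transformers`; treat it as an assumption-laden example rather than the original training code.
```python
from transformers import EarlyStoppingCallback

# Back-of-the-envelope cadence implied by the numbers above (illustrative):
steps_per_epoch = 90_000 // 48        # ~1875 optimizer steps per epoch
eval_steps = steps_per_epoch // 5     # ~375 steps, i.e. every 1/5 of an epoch
print(steps_per_epoch, eval_steps)

# Stop after 3 evaluations without improvement; pass this callback to
# Trainer(..., callbacks=[early_stopping]) together with
# load_best_model_at_end=True so the best checkpoint is restored.
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```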
Comparison with Reflection Model
This is the standard SFT version. For comparison with the reflection-based training approach, see:
- Reflection Model
- GitHub Repository for implementation details
Limitations
- Trained specifically for Java method generation
- May not generalize well to full classes or other programming languages
- Best suited for single-method generation tasks
- Trained with a maximum sequence length of 2048 tokens, so longer prompts may degrade output quality
Ethical Considerations
- The model should not be used to generate malicious code
- Generated code should be reviewed before use in production
- Not suitable for generating code that handles sensitive data without proper review
Acknowledgments
- Meta AI for the LLaMA 3.2 base model
- Microsoft Research for the CodeXGLUE text-to-code (Java) dataset
- Hugging Face for the training infrastructure