OLMo Code SFT - 7B Model
This is a LoRA adapter for the allenai/OLMo-2-1124-7B-Instruct model, fine-tuned for Python code generation and instruction following.
Model Details
Model Description
- Developed by: OLMo Code SFT Team
- Model type: LoRA Adapter for Causal Language Model
- Language(s): Python, English
- License: Same as the base model, allenai/OLMo-2-1124-7B-Instruct (Apache 2.0)
- Finetuned from model: allenai/OLMo-2-1124-7B-Instruct
Model Sources
- Base Model: allenai/OLMo-2-1124-7B-Instruct
Uses
Direct Use
This model is designed for Python code generation tasks, including:
- Code completion
- Function generation
- Bug fixing
- Code explanation
- Instruction following
Downstream Use
The adapter can serve as a starting point for further fine-tuning on more specific code-related tasks, as sketched below.
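A minimal sketch of continuing LoRA training from this adapter, assuming the PEFT library and a user-supplied training loop; the training setup itself is not part of this release:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and re-attach the adapter with trainable LoRA weights
base_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
model = PeftModel.from_pretrained(
    base_model,
    "dipikakhullar/olmo-code-sft-7b-lr0.0005",
    is_trainable=True,  # keep the LoRA parameters unfrozen for continued fine-tuning
)

# `model` can now be passed to a standard training loop
# (e.g. transformers.Trainer or trl.SFTTrainer) with your own dataset.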
Out-of-Scope Use
- Not suitable for production code generation without additional safety measures
- Not designed for non-Python programming languages
- Not intended for general text generation outside of code contexts
Bias, Risks, and Limitations
- The model may generate code with security vulnerabilities
- Generated code should be reviewed before execution
- The model may inherit biases from the base model and its training data
How to Get Started with the Model
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "dipikakhullar/olmo-code-sft-7b-lr0.0005")

# Generate code from a natural-language prompt
prompt = "Write a Python function to calculate fibonacci numbers"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)  # cap on newly generated tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
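Since the base model is instruction-tuned, prompts can also be formatted with the tokenizer's chat template. The snippet below is a sketch assuming the base tokenizer ships a chat template (as the OLMo-2 Instruct tokenizer does) and reuses the model and tokenizer loaded above:

# Optional: format the request with the instruct model's chat template
messages = [{"role": "user", "content": "Write a Python function to calculate fibonacci numbers"}]
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
chat_outputs = model.generate(chat_inputs, max_new_tokens=200)
print(tokenizer.decode(chat_outputs[0], skip_special_tokens=True))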
Training Details
Training Data
The model was fine-tuned on Python code data with instruction-response pairs.
Training Procedure
Training Hyperparameters
- Training regime: LoRA fine-tuning
- Learning rate: 0.0005
- LoRA rank: 64
- LoRA alpha: 128
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (see the configuration sketch below)
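For reference, the hyperparameters above correspond to a PEFT configuration along the lines of the sketch below; optimizer, batch size, and schedule details are not part of this card and are omitted:

from peft import LoraConfig, TaskType

# LoRA configuration matching the hyperparameters listed above
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=64,              # LoRA rank
    lora_alpha=128,    # LoRA scaling factor
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)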
Speeds, Sizes, Times
- Model size: 7B
- Training time: Varies by experiment
- Checkpoint size: LoRA adapter only (~2 GB); the adapter can be merged into the base model to produce a standalone checkpoint (see the sketch below)
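Because only the adapter weights are stored, a standalone full-size checkpoint can be produced by merging the LoRA weights into the base model. A minimal sketch (the output directory name is a placeholder):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "dipikakhullar/olmo-code-sft-7b-lr0.0005")

# Fold the LoRA weights into the base model so it can be used without PEFT
merged_model = model.merge_and_unload()
merged_model.save_pretrained("olmo-code-sft-7b-merged")  # placeholder output path

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
tokenizer.save_pretrained("olmo-code-sft-7b-merged")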
Evaluation
The model was evaluated on Python code generation tasks, with a focus on:
- Code quality
- Instruction following
- Python syntax correctness (an illustrative check is sketched below)
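As an illustration of the syntax-correctness criterion, generated snippets can be checked with Python's ast module. This is a generic sketch, not the evaluation harness used for this model:

import ast

def is_valid_python(code: str) -> bool:
    """Return True if the snippet parses as valid Python syntax."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

# Example: screen a generated completion before scoring or executing it
print(is_valid_python("def add(a, b):\n    return a + b"))   # True
print(is_valid_python("def add(a, b) return a + b"))         # False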
Technical Specifications
Model Architecture and Objective
- Architecture: LoRA adapter on top of allenai/OLMo-2-1124-7B-Instruct
- Objective: Causal language modeling for code generation
- Task type: CAUSAL_LM
Compute Infrastructure
- Hardware: GPU cluster
- Software: PEFT, Transformers, PyTorch
Citation
If you use this model, please cite:
@misc{olmo-code-sft-7b,
  author       = {OLMo Code SFT Team},
  title        = {OLMo Code SFT - 7B Model},
  year         = {2024},
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/dipikakhullar/olmo-code-sft-7b-lr0.0005}},
}
Model Card Authors
OLMo Code SFT Team
Model Card Contact
For questions about this model, please open an issue in the repository.