2xQwen2.5-Coder-3B-Cyclops-Aux
This model is a fine-tuned version of Qwen/Qwen2.5-Coder-3B, trained with Multi-LLM Group Relative Policy Optimization (MLGRPO) on the HumanEval dataset. It is one of a pair of collaborating models, hence the "2x" prefix.
"Cyclops" relies heavily on its aux()
function for core implementation, while the main function adds edge case handling and refinements โ just like a cyclops wielding power through its single eye.
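For illustration, a HumanEval-style task might decompose like this. The example below is hypothetical, not actual model output; the function split and names are assumptions chosen to match the pattern described above.

def aux(numbers, threshold):
    # Core logic: does any pair of values lie within the threshold?
    return any(abs(a - b) < threshold
               for i, a in enumerate(numbers)
               for b in numbers[i + 1:])

def has_close_elements(numbers, threshold):
    # Main function: handle the trivial edge case, then defer to aux().
    if len(numbers) < 2:
        return False
    return aux(numbers, threshold)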
Model Details
- Base Model: Qwen/Qwen2.5-Coder-3B
- Training Method: MLGRPO (Multi-LLM Group Relative Policy Optimization)
- Dataset: HumanEval
- Task: Code generation with auxiliary function collaboration
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LovelyBuggies/2xQwen2.5-Coder-3B-Cyclops-Aux")
model = AutoModelForCausalLM.from_pretrained("LovelyBuggies/2xQwen2.5-Coder-3B-Cyclops-Aux")

# Example prompt; the exact prompt format used during training is not documented.
aux_prompt = "Complete the auxiliary function:\n\ndef aux(numbers, threshold):"

# Generate the auxiliary-function completion
inputs = tokenizer(aux_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
aux_completion = tokenizer.decode(outputs[0], skip_special_tokens=True).strip()

# cleanup_code / extract_specific_function are post-processing helpers,
# not part of transformers; a sketch is given below.
cleaned_aux_completion = extract_specific_function(cleanup_code(aux_completion), "aux")
print(cleaned_aux_completion)
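cleanup_code and extract_specific_function are not shipped with the model; they are post-processing helpers from the training code. A minimal sketch of what they might look like, with their exact behavior an assumption:

import ast

def cleanup_code(completion):
    # Assumed behavior: drop anything after a markdown fence or echoed prose.
    return completion.split("```")[0].strip()

def extract_specific_function(code, name):
    # Return the source of the named top-level function, or "" if absent.
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return ""
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            return ast.get_source_segment(code, node)
    return ""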
Training Details
This model was trained as part of a multi-LLM system on the full HumanEval dataset:
- Agent 0 generates auxiliary functions to help solve coding problems
- Agent 1 generates main functions that utilize the auxiliary functions
- Both agents are trained collaboratively using MLGRPO
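At evaluation time the two agents' completions must be stitched into a single program before HumanEval's unit tests can run. A minimal sketch of that step, assuming the standard HumanEval record fields (test, entry_point) and a main_completion produced by the Main model; the actual MLGRPO pipeline may differ:

def build_candidate(aux_code, main_code, test_code, entry_point):
    # Stitch helper + main function + HumanEval's unit tests into one program.
    return "\n\n".join([
        aux_code,
        main_code,
        test_code,                # defines check(candidate)
        f"check({entry_point})",  # run the tests against the main function
    ])

program = build_candidate(cleaned_aux_completion, main_completion,
                          problem["test"], problem["entry_point"])
exec(program, {})  # raises AssertionError on failure

In the GRPO family, the resulting pass/fail outcomes are presumably scored relative to a group of sampled completions to form the policy-gradient reward.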
Agent Role
- This is the Auxiliary Function Generator agent that creates helper functions to assist in solving coding problems.
- It produces the core of each solution and works together with the main-function generator, 2xQwen2.5-Coder-3B-Cyclops-Main, to form a complete solution.
Citation
If you use this model, please cite:
Coming soon.