M3-V2: An Open Source Model for State-of-the-Art Code Generation

M3-V2 is a state-of-the-art causal language model featuring a novel architecture that enables advanced reasoning and self-correction. This model is fully open source under the Apache 2.0 license, making it available for academic, personal, and commercial use.

The model achieves a 98.17% Pass@1 score on the HumanEval benchmark with its default single self-correction pass, placing it among the strongest open-source code generation models available today.


Benchmark Performance

The benchmark results show performance that surpasses many publicly available models.

HumanEval Benchmark Chart

Performance Comparison

| Model | HumanEval Pass@1 Score | Note |
|---|---|---|
| moelanoby/phi3-M3-V2 (this model) | 95.12% / 98.17% / 98.56% | Apache 2.0 license. Scores correspond to 0, 1, and 2 self-correction passes; 1 is the default. |
| GPT-4.5 / "Orion" | ~96.00% | Projected (Late 2025) |
| Gemini 2.5 Pro | ~95.00% | Projected (Late 2025) |
| Claude 4 | ~94.00% | Projected (Late 2025) |
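
For reference, Pass@1 on HumanEval is conventionally estimated with the unbiased pass@k estimator from the original HumanEval paper. Below is a minimal sketch of that estimator; the numbers in the example are illustrative only and are not taken from this model's evaluation.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator: n samples generated per problem, c of which pass the tests.
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Illustrative only: 200 samples with 183 passing gives pass@1 = 0.915.
print(pass_at_k(n=200, c=183, k=1))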

Support the Project

M3-V2 is an open-source project, free for everyone to use. I am passionate about creating powerful and accessible AI tools for the community.

If you find this model helpful in your work, research, or personal projects, please consider supporting its development. Your contribution helps cover training costs, allows me to dedicate more time to improvements, and fuels the creation of new open-source models. Every little bit helps and is greatly appreciated!

Support via PayPal


License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute this model and its source code for any purpose, including commercial applications, subject to the terms of the license. You can find a copy of the license in the repository.

Ethical Considerations

While this model is open source, users are encouraged to use it responsibly. Finetuning the model to generate harmful, illegal, or unethical content is strongly discouraged. I advocate for the use of this technology to build positive and safe applications.

Also, please don't feed this architecture into any image-generation AI models. I care a great deal about supporting real artists, and it would be sad to see their work overtaken by AI art. :/


How to Use

Follow the installation steps below, then use the Python example to load and run the model. :]

Installation

First, ensure you have the necessary libraries installed:

pip install torch transformers accelerate
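
Optionally, the quick sanity check below confirms that the libraries import correctly and whether a GPU is visible. This is just a convenience snippet, not an additional requirement.

import torch
import transformers

# Print library versions and whether a CUDA GPU is available.
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())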

Python Implementation

You can easily integrate the model into your application. You must use trust_remote_code=True for the custom architecture to load correctly from the Hub.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "moelanoby/phi3-M3-V2"

print("Loading tokenizer and model...")
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID, 
    trust_remote_code=True, 
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print("Model loaded successfully.")

# --- Controlling the model's self-correction feature ---
# Default is 1 pass. You can adjust it for different performance profiles.
try:
    target_layer_path = "model.layers.15.mlp.gate_up_proj" 
    custom_layer = model
    for part in target_layer_path.split('.'):
        custom_layer = getattr(custom_layer, part)
        
    # Set the number of self-correction passes (e.g., 0, 1, 2, or 3)
    custom_layer.num_correction_passes = 2 
    print(f"โœ… Number of self-correction passes set to: {custom_layer.num_correction_passes}")
except AttributeError:
    print("โš ๏ธ Could not access the custom layer. The model will run with its default settings.")

# (Example generation code would follow here)
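
As a starting point, here is a minimal generation sketch using the standard transformers generation API. It assumes the tokenizer ships a chat template (as Phi-3-style models typically do); the prompt and generation settings are examples only.

# Minimal generation sketch (assumes the tokenizer provides a chat template).
messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))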

Important Notes

  • Downside: adding more self-correction passes can make the model less coherent or less accurate. Experiment to find the best balance for your use case.
  • Recommendation: use 1, 2, or 3 self-correction passes as needed; 2 passes is the recommended setting for a good balance of performance and coherence (see the comparison sketch below).
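
To see the trade-off directly, the sketch below reuses `tokenizer`, `model`, and `custom_layer` from the loading example above (assuming the custom layer was found) and compares greedy generations at 0, 1, and 2 passes. The prompt is only an example.

# Compare outputs at different numbers of self-correction passes (example prompt).
prompt = "Write a Python function that merges two sorted lists into one sorted list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

for passes in (0, 1, 2):
    custom_layer.num_correction_passes = passes
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(f"--- {passes} self-correction pass(es) ---")
    print(tokenizer.decode(out[0], skip_special_tokens=True))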

Acknowledgements

  • The base of this model utilizes the Phi-3 architecture developed by Microsoft.
  • The benchmark results were obtained using the HumanEval dataset from OpenAI.
  • I thank the open-source community for their continuous contributions to AI research.