Daedalus-1-2B
Daedalus-1-2B is a 2-billion-parameter code reasoning model developed by Noema Research. It is based on DeepCoder-1.5B-Preview and optimized for advanced code generation, debugging, and algorithmic reasoning.
This model represents the entry-level member of the Daedalus series, balancing performance and efficiency for a broad range of software engineering tasks.
Model Overview
Base model: DeepCoder-1.5B-Preview
Architecture: Decoder-only transformer
Parameters: ~2B
Context length: up to 64k tokens
Domain: Code reasoning and generation
Primary applications:
- Code completion and synthesis
- Debugging and error detection
- Algorithm design and explanation
- Educational tools and coding assistants
License: MIT
Key Features
- Instruction tuning for reliable multi-step reasoning and task completion
- Extended context handling, supporting up to 64k tokens (see the debugging sketch after this list)
- Multilingual support, including Python, C++, Java, JavaScript, Go, Rust, and more
- Reinforcement learning optimization for improved code generation accuracy
- Efficient deployment, suitable for both cloud and edge environments
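To make the debugging and extended-context features above concrete, the sketch below packs an entire source file and its error message into a single prompt. The file name example_module.py and the traceback text are placeholders, and the loading code mirrors the Usage section that follows; treat this as an illustrative pattern rather than an official recipe.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "NoemaResearch/Daedalus-1-2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Placeholder inputs: substitute any project file and error output here.
source = open("example_module.py").read()
error = "TypeError: unsupported operand type(s) for +: 'int' and 'str'"

# A long-context prompt that asks the model to locate and fix the bug.
prompt = (
    "The following Python module raises an error.\n\n"
    + source
    + "\n\nError:\n" + error + "\n\n"
    + "Locate the bug and propose a corrected version of the affected function."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.6, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))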
Usage
The model is available in Hugging Face Transformers format. Example:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoemaResearch/Daedalus-1-2B"

# Load the tokenizer and model; device_map="auto" places the weights on the available accelerator.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "Write a Python function to merge two sorted lists into a single sorted list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature/top_p to take effect.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
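Whether the tokenizer ships a chat template is not stated in this card, so the following is a sketch under that assumption; if no template is defined, apply_chat_template raises an error and the plain-prompt pattern above should be used instead. It reuses the tokenizer and model objects loaded above.

messages = [
    {"role": "user", "content": "Explain the time complexity of merge sort and implement it in Python."}
]

# apply_chat_template tokenizes the conversation and appends the assistant-turn marker.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))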
Recommended settings:
- temperature=0.5–0.8
- top_p=0.9–0.95
- Lower temperatures yield more deterministic and concise code completions
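As one concrete way to apply these settings, the values below (a single point within the recommended ranges, chosen here for illustration) can be collected in a Transformers GenerationConfig and reused across calls; model and inputs are the objects from the Usage example above.

from transformers import GenerationConfig

gen_config = GenerationConfig(
    do_sample=True,       # sampling must be enabled for temperature/top_p to take effect
    temperature=0.6,      # within the recommended 0.5–0.8 range
    top_p=0.9,            # within the recommended 0.9–0.95 range
    max_new_tokens=256,
)

outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))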
Evaluation
Daedalus-1-2B demonstrates strong performance in code reasoning tasks, with internal evaluations indicating:
- High accuracy on code completion and synthesis tasks
- Robust debugging capabilities, identifying and suggesting fixes for common errors
- Effective handling of complex algorithmic problems
A full benchmark report will be provided in a future update. For upstream performance details, see the DeepCoder-1.5B-Preview model card.
Limitations
- Reasoning scale: While effective for many tasks, Daedalus-1-2B may not match larger models (e.g., 4B+) on highly complex or open-ended coding problems
- Knowledge breadth: Some specialized or domain-specific knowledge may be limited
- Hallucinations: May generate plausible but incorrect code or explanations
- Prompt sensitivity: Outputs remain dependent on careful prompt formulation
Responsible Use
- Do not rely on Daedalus-1-2B for critical software development tasks without human oversight
- Verify all generated code before deploying it in production environments (a minimal check is sketched after this list)
- Avoid providing personal or sensitive data in prompts
- The model should not be used to generate unsafe, harmful, or disallowed content
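As a minimal, illustrative safeguard rather than an official workflow, generated Python can be syntax-checked and exercised against a few known cases in a separate process before any further use; the hard-coded candidate string below stands in for model output, and a real deployment would run such checks in a proper sandbox.

import ast
import subprocess
import sys
import tempfile
import textwrap

# Stand-in for text produced by the model.
candidate = textwrap.dedent('''
    def merge_sorted(a, b):
        out = []
        i = j = 0
        while i < len(a) and j < len(b):
            if a[i] <= b[j]:
                out.append(a[i]); i += 1
            else:
                out.append(b[j]); j += 1
        return out + a[i:] + b[j:]
''')

# 1. Syntax check: reject output that does not even parse.
ast.parse(candidate)

# 2. Behavioural check: run a few known cases in a subprocess with a timeout.
harness = candidate + textwrap.dedent('''
    assert merge_sorted([1, 3], [2, 4]) == [1, 2, 3, 4]
    assert merge_sorted([], [5]) == [5]
    print("all checks passed")
''')
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(harness)
    path = f.name
result = subprocess.run([sys.executable, path], capture_output=True, text=True, timeout=10)
print(result.stdout or result.stderr)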
Model Variants
- Full precision (safetensors) — research and high-fidelity inference
- bf16 / fp16 — efficient inference on modern accelerators
- Quantized versions (int8 / int4) — deployment in resource-constrained environments
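This card does not state whether pre-quantized checkpoints are published separately, so the sketch below shows one common alternative: loading the main weights in 4-bit on the fly with bitsandbytes through Transformers' BitsAndBytesConfig. It assumes bitsandbytes is installed and a CUDA-capable GPU is available.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "NoemaResearch/Daedalus-1-2B"

# 4-bit NF4 quantization with bfloat16 compute; requires bitsandbytes and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")

int8 loading follows the same pattern with load_in_8bit=True in place of the 4-bit options.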
Citation
If you use this model, please cite both Daedalus-1-2B and the DeepCoder base model:
@misc{noema2025daedalus2b,
  title={Daedalus-1-2B},
  author={Noema Research},
  year={2025},
  howpublished={\url{https://huggingface.co/NoemaResearch/Daedalus-1-2B}}
}
Acknowledgements
Daedalus-1-2B builds upon the DeepCoder family of models. We thank the Agentica team for open-sourcing their models and enabling derivative research.