AdvRahul/Axion-4B

A safety-enhanced version of Qwen3-4B-Instruct, optimized for reliable and responsible AI applications. πŸ›‘οΈ

Axion-4B is a fine-tuned version of the powerful Qwen/Qwen3-4B-Instruct-2507 model. The primary enhancement in this version is its robust safety alignment, making it a more dependable choice for production environments and user-facing applications.

πŸš€ Model Details

  • Model Creator: AdvRahul
  • Base Model: Qwen/Qwen3-4B-Instruct-2507
  • Fine-tuning Focus: Enhanced Safety & Harmlessness via Red-Teaming
  • Architecture: Qwen3
  • Context Length: 262,144 tokens
  • License: Inherits the license of the base model (the Qwen3 series is released under Apache 2.0).

πŸ“ Model Description

Enhanced for Safety

The core purpose of Axion-4B is to provide a safer alternative for developers. The base model underwent extensive red-team testing using advanced protocols to significantly reduce the generation of harmful, biased, or inappropriate content.

Powerful Core Capabilities

While adding a crucial safety layer, Axion-4B retains the exceptional capabilities of its base model, including:

  • Strong Logical Reasoning: Excels at complex problems in math, science, and logic.
  • Advanced Instruction Following: Reliably adheres to user commands and constraints.
  • Multilingual Knowledge: Covers a wide range of languages and cultural contexts.
  • Massive 256K Context Window: Capable of understanding and processing very long documents.
  • Excellent Coding & Tool Use: Proficient in code generation and agentic tasks (see the tool-use sketch below).
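
As a quick illustration of the inherited tool-use support, the tokenizer's chat template accepts a tools argument that injects function schemas into the prompt. The get_weather schema below is hypothetical and is shown only to demonstrate the prompt format, not a tool bundled with the model:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("AdvRahul/Axion-4B")

# Hypothetical tool schema, purely for illustrating the prompt format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Mumbai right now?"}]

# The chat template serializes the tool schemas into the prompt so the
# model can respond with a structured tool call instead of free-form text.
text = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(text)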

πŸ’» Quickstart

You can use this model directly with the transformers library (version 4.51.0 or newer is recommended).

from transformers import AutoModelForCausalLM, AutoTokenizer

# IMPORTANT: Use the model name for this repository
model_name = "AdvRahul/Axion-4B"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input
prompt = "Give me a short introduction to large language models and their safety considerations."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate text
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512 # Limiting for a concise example
)
# Strip the prompt tokens so only the newly generated text is decoded
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

content = tokenizer.decode(output_ids, skip_special_tokens=True)
print("Response:", content)

Optimized Deployment

For high-throughput, production-ready deployment, you can use frameworks like vLLM or SGLang to serve the model via an OpenAI-compatible API.

vLLM:

vllm serve AdvRahul/Axion-4B --max-model-len 262144

SGLang:

python -m sglang.launch_server --model-path AdvRahul/Axion-4B --context-length 262144

Note: If you encounter out-of-memory (OOM) issues, reduce the maximum context length (e.g., --max-model-len 32768).
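
Once the server is up, any OpenAI-compatible client can talk to it. A minimal sketch using the openai Python package, assuming the server listens on localhost:8000 (vLLM's default; SGLang uses a different default port, so adjust base_url to match your launch flags):

from openai import OpenAI

# Local servers typically ignore the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="AdvRahul/Axion-4B",
    messages=[
        {"role": "user", "content": "Give me a short introduction to LLM safety."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)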

⚠️ Ethical Considerations and Limitations

This model was fine-tuned with the explicit goal of improving safety and reducing harmful outputs. However, no AI model is completely immune to risk.

  • No Guarantees: The safety alignment is significantly improved, but it does not guarantee harmless outputs in every scenario.
  • Inherited Biases: The model may still reflect biases present in the vast amount of data used to train its base model.
  • Factual Accuracy: Always fact-check critical information, as the model can generate plausible but incorrect statements.
  • Best Practice: It is strongly recommended that developers implement their own content moderation filters and safety guardrails as part of a comprehensive, defense-in-depth strategy, and thoroughly evaluate the model's performance and safety for their specific use case before deploying to a live audience.
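
As one illustration of where such a guardrail sits in the request flow, here is a deliberately minimal pre-filter wrapped around the tokenizer and model objects loaded in the Quickstart. The blocklist patterns are hypothetical placeholders; a real deployment should use a dedicated moderation model or service instead of a keyword list:

import re

# Hypothetical blocklist patterns -- purely illustrative. Production
# systems should rely on a dedicated moderation model or API.
BLOCKLIST = [r"\bhow to make explosives\b", r"\bsteal credit card\b"]

def is_allowed(user_prompt: str) -> bool:
    return not any(re.search(p, user_prompt, re.IGNORECASE) for p in BLOCKLIST)

def guarded_generate(user_prompt: str) -> str:
    # Screen the input before it reaches the model, then generate with
    # the tokenizer/model objects from the Quickstart above.
    if not is_allowed(user_prompt):
        return "Sorry, I can't help with that request."
    messages = [{"role": "user", "content": user_prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

print(guarded_generate("What are good practices for password storage?"))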