aquif-moe-400m

aquif-moe-400m is our compact Mixture of Experts (MoE) model, with only 400 million active parameters. It delivers strong performance per gigabyte of VRAM, making it a good choice for resource-limited setups.

Model Overview

  • Name: aquif-moe-400m
  • Parameters: 400 million active parameters (1.3 billion total)
  • Context Window: 128,000 tokens
  • Architecture: Mixture of Experts (MoE)
  • Type: General-purpose LLM
  • Hosted on: Ollama, Huggingface
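
The weights are also published on Hugging Face as BF16 safetensors. Below is a minimal loading sketch with the transformers library; it assumes the checkpoint is compatible with AutoModelForCausalLM (a custom MoE architecture may require trust_remote_code=True), so treat it as illustrative rather than an official usage recipe.

```python
# Illustrative sketch only: assumes the Hugging Face checkpoint loads via AutoModelForCausalLM.
# A custom MoE architecture may require trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aquiffoo/aquif-moe-400m"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # weights are published in BF16
    trust_remote_code=True,
)

inputs = tokenizer("Summarize the benefits of MoE models:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```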

Key Features

  • Highly efficient VRAM utilization (77.3 performance points per GB; see VRAM Efficiency below)
  • Expansive 128K token context window for handling long documents
  • Competitive performance despite fewer parameters
  • Optimized for local inference on consumer hardware
  • Ideal for resource-constrained environments
  • Supports high-throughput concurrent sessions

Performance Benchmarks

aquif-moe-400m delivers solid performance across multiple benchmarks, especially for its size:

| Benchmark | aquif-moe (0.4b) | Qwen 2.5 (0.5b) | Gemma 3 (1b) |
|-----------|------------------|-----------------|--------------|
| MMLU      | 26.6             | 45.4            | 26.5         |
| HumanEval | 32.3             | 22.0            | 8.1          |
| GSM8K     | 33.9             | 36.0            | 6.1          |
| Average   | 30.9             | 34.4            | 11.3         |

VRAM Efficiency

aquif-moe-400m excels in VRAM efficiency:

| Model     | Average Performance | VRAM (GB) | Performance per GB of VRAM |
|-----------|---------------------|-----------|----------------------------|
| aquif-moe | 30.9                | 0.4       | 77.3                       |
| Qwen 2.5  | 34.4                | 0.6       | 57.3                       |
| Gemma 3   | 11.3                | 1.0       | 11.3                       |

Use Cases

  • Edge computing and resource-constrained environments
  • Mobile and embedded applications
  • Local development environments
  • Quick prototyping and testing
  • Personal assistants on consumer hardware
  • Enterprise deployment with multiple concurrent sessions
  • Long document analysis and summarization
  • High-throughput production environments

Limitations

  • No thinking mode capability
  • May hallucinate on some topics
  • May struggle with more complex reasoning tasks
  • Not optimized for specialized domains

Getting Started

To run via Ollama:

```
ollama run aquiffoo/aquif-moe-400m
```
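
Once the model has been pulled, it can also be called programmatically through Ollama's local REST API (default port 11434). The sketch below is an assumption-laden example, not official usage: the input file name, prompt, and the enlarged num_ctx value are illustrative, the latter chosen to exercise the long context window.

```python
# Minimal sketch: query a locally running Ollama server for this model.
# Assumes Ollama is serving on the default port and the model has already been pulled.
import json
import urllib.request

payload = {
    "model": "aquiffoo/aquif-moe-400m",
    # "report.txt" is a hypothetical long document to summarize.
    "prompt": "Summarize the following document:\n\n" + open("report.txt").read(),
    "stream": False,
    # Raise the context length for long inputs (the model supports up to 128K tokens).
    "options": {"num_ctx": 32768},
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())

print(result["response"])
```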