|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
- fr |
|
- de |
|
- es |
|
- pt |
|
- it |
|
- ja |
|
- ko |
|
- ru |
|
- zh |
|
- ar |
|
- fa |
|
- id |
|
- ms |
|
- ne |
|
- pl |
|
- ro |
|
- sr |
|
- sv |
|
- tr |
|
- uk |
|
- vi |
|
- hi |
|
- bn |
|
base_model: mistralai/Mistral-Small-3.1-24B-Instruct-2503 |
|
library_name: vllm |
|
inference: false |
|
|
|
--- |
|
|
|
# Aqui-VL 24B Mistral |
|
|
|
Aqui-VL 24B Mistral is a language model based on Mistral Small 3.1, designed to deliver strong performance while remaining accessible on consumer-grade hardware. It is the first open-weights model from Aqui Solutions, the company behind [AquiGPT](https://aquigpt.com.br). With 23.6 billion parameters, it runs efficiently on a single RTX 4090 GPU or a 32GB Mac, putting capable multimodal AI within reach of researchers, developers, and enthusiasts.
|
|
|
## Key Features |
|
|
|
- **Consumer Hardware Compatible**: Runs on a single RTX 4090 or a 32GB Mac
|
- **Multimodal Capabilities**: Text, vision, chart analysis, and document understanding |
|
- **128K Context Window**: Handle long documents and complex conversations |
|
- **Strong Instruction Following**: Significantly improved over base Mistral Small 3.1 |
|
- **Exceptional Code Generation**: 92.5% on HumanEval, ahead of both baselines benchmarked below
|
|
|
## Hardware Requirements |
|
|
|
### Minimum Requirements |
|
- **GPU**: RTX 4090 (24GB VRAM) or equivalent |
|
- **Mac**: 32GB unified memory (Apple Silicon recommended) |
|
- **RAM**: 32GB system memory (for GPU setups) |
|
- **Storage**: 20GB available space (for model and overhead) |
|
|
|
### Recommended Setup |
|
- **GPU**: RTX 4090 with adequate cooling |
|
- **CPU**: Modern multi-core processor |
|
- **RAM**: 64GB+ for optimal performance |
|
- **Storage**: NVMe SSD for faster model loading |
|
|
|
## Performance Benchmarks |
|
|
|
Aqui-VL 24B Mistral demonstrates competitive performance across multiple domains: |
|
|
|
| Benchmark | Aqui-VL 24B Mistral | Mistral Small 3.1 | Llama 3.1 70B | |
|
|-----------|------------------|-------------------|----------------| |
|
| **IFEval** (Instruction Following) | **88.3%** | 82.6% | 87.5% | |
|
| **MMLU** (General Knowledge) | 80.9% | 80.5% | **86.0%** | |
|
| **GPQA** (Science Q&A) | 44.7% | 44.4% | **46.7%** | |
|
| **HumanEval** (Coding) | **92.5%** | 88.9% | 80.5% | |
|
| **MATH** (Mathematics) | 69.3% | **69.5%** | 68.0% | |
|
| **MMMU** (General Vision) | **64.0%** | 62.5% | N/A* | |
|
| **ChartQA** (Chart Analysis) | **87.6%** | 86.2% | N/A* | |
|
| **DocVQA** (Document Analysis) | **94.9%** | 94.1% | N/A* | |
|
| **Average Text Performance** | **75.1%** | 73.2% | 73.7% | |
|
| **Average Vision Performance** | **82.2%** | 80.9% | N/A* | |
|
|
|
*Llama 3.1 70B does not include vision capabilities |
|
|
|
## Model Specifications |
|
|
|
- **Parameters**: 23.6 billion |
|
- **Context Window**: 128,000 tokens |
|
- **Knowledge Cutoff**: December 2023 |
|
- **Architecture**: Mistral (transformer-based, with vision support)
|
- **Languages**: Multilingual, with particularly strong performance in English, French, and Portuguese
|
|
|
## Installation & Usage |
|
|
|
### Quick Start with Transformers |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "aquigpt/aqui-vl-24b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate text (max_new_tokens counts only generated tokens;
# do_sample=True is required for temperature to take effect)
prompt = "Explain quantum computing in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
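
Because the model is multimodal, images can be passed through the processor's chat template. The following is a minimal sketch, assuming the repository ships a processor and chat template compatible with recent `transformers` releases; the image URL is a placeholder:

```python
from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

model_name = "aquigpt/aqui-vl-24b"
processor = AutoProcessor.from_pretrained(model_name)
model = AutoModelForImageTextToText.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# One user turn mixing an image with a text question
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "Summarize the trend shown in this chart."},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.float16)

outputs = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```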
|
|
|
### Using with Ollama |
|
|
|
```bash
# Pull the model
ollama pull aquiffoo/aqui-vl-24b

# Run interactive chat
ollama run aquiffoo/aqui-vl-24b
```
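
Beyond the interactive CLI, Ollama also serves a local HTTP API (port 11434 by default), which is handy for scripting. Here is a minimal sketch using only the Python standard library; the model tag matches the pull command above:

```python
import json
import urllib.request

# Ollama's local generate endpoint; stream=False returns a single JSON object
payload = {
    "model": "aquiffoo/aqui-vl-24b",
    "prompt": "Explain quantum computing in simple terms:",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```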
|
|
|
### Using with llama.cpp |
|
|
|
```bash
# Download quantized model (Q4_K_M, 14.4GB)
wget https://huggingface.co/aquigpt/aqui-vl-24b/resolve/main/aqui-vl-24b-q4_k_m.gguf

# Run with llama.cpp (recent builds name this binary llama-cli instead of main)
./main -m aqui-vl-24b-q4_k_m.gguf -p "Your prompt here" -n 100
```
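
### Using with vLLM

Since vLLM is the intended serving library for this model, a minimal offline-inference sketch follows. Note that the unquantized 23.6B weights need more VRAM than a single RTX 4090 provides, and `max_model_len` below is an illustrative value rather than a requirement:

```python
from vllm import LLM, SamplingParams

# Offline batch inference; for an OpenAI-compatible server, use `vllm serve` instead
llm = LLM(model="aquigpt/aqui-vl-24b", max_model_len=8192)
params = SamplingParams(temperature=0.7, max_tokens=200)

outputs = llm.generate(["Explain quantum computing in simple terms:"], params)
print(outputs[0].outputs[0].text)
```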
|
|
|
## Use Cases |
|
|
|
### Code Generation & Programming |
|
With a 92.5% score on HumanEval, Aqui-VL 24B Mistral excels at:
|
- Writing clean, efficient code in multiple languages |
|
- Debugging and code review |
|
- Algorithm implementation |
|
- Technical documentation |
|
|
|
### Document & Chart Analysis |
|
Strong vision capabilities enable: |
|
- PDF document analysis and Q&A |
|
- Chart and graph interpretation |
|
- Scientific paper comprehension |
|
- Business report analysis |
|
|
|
### General Assistance |
|
- Research and information synthesis |
|
- Creative writing and content generation |
|
- Mathematical problem solving |
|
- Multilingual translation and communication |
|
|
|
## Quantization |
|
|
|
Aqui-VL 24B Mistral is distributed exclusively in Q4_K_M quantization, chosen as the best balance between output quality and hardware compatibility:
|
|
|
- **Format**: Q4_K_M quantization |
|
- **Size**: 14.4GB |
|
- **VRAM Usage**: ~16GB (with overhead) |
|
- **Compatible with**: RTX 4090, 32GB Mac, and similar hardware |
|
- **Performance**: Excellent quality retention with 4-bit quantization |
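
To load the Q4_K_M file from Python rather than the llama.cpp CLI, the `llama-cpp-python` bindings accept the same GGUF. A minimal sketch, assuming the file was downloaded as shown above; `n_ctx` is a conservative placeholder rather than the full 128K window:

```python
from llama_cpp import Llama

# Load the Q4_K_M GGUF; n_gpu_layers=-1 offloads every layer to the GPU
llm = Llama(
    model_path="aqui-vl-24b-q4_k_m.gguf",
    n_ctx=8192,        # raise toward 128K only if memory allows
    n_gpu_layers=-1,
)

out = llm("Explain quantum computing in simple terms:", max_tokens=200, temperature=0.7)
print(out["choices"][0]["text"])
```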
|
|
|
## Fine-tuning & Customization |
|
|
|
Aqui-VL 24B Mistral supports: |
|
- Parameter-efficient fine-tuning (LoRA, QLoRA; see the sketch below)
|
- Full fine-tuning for specialized domains |
|
- Custom tokenizer training |
|
- Multi-modal fine-tuning for specific vision tasks |
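
As a starting point for the LoRA/QLoRA route, here is a minimal sketch using `peft` and `bitsandbytes`. The adapter rank and target module names are illustrative defaults for Mistral-style attention and should be checked against the checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# QLoRA-style setup: 4-bit base weights keep the 23.6B model within a 24GB GPU
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(
    "aquigpt/aqui-vl-24b",
    quantization_config=bnb,
    device_map="auto",
)

# Illustrative adapter settings; tune rank and targets for your task
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```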
|
|
|
## Limitations |
|
|
|
- Knowledge cutoff at December 2023 |
|
- May occasionally produce hallucinations |
|
- Performance varies with quantization level |
|
- Requires significant computational resources for optimal performance |
|
|
|
## License |
|
|
|
This model is released under the [Apache 2.0 License](LICENSE), making it suitable for both research and commercial applications. |
|
|
|
## Support |
|
|
|
For questions and support regarding Aqui-VL 24B Mistral, please visit the [Hugging Face repository](https://huggingface.co/aquigpt/aqui-vl-24b) and use the community discussions section. |
|
|
|
## Acknowledgments |
|
|
|
Built upon the excellent foundation of Mistral Small 3.1 by Mistral AI. Special thanks to the open-source community for tools and datasets that made this model possible. |
|
|
|
--- |
|
|
|
*Copyright 2025 Aqui Solutions. All rights reserved.*