featherless-ai-quants/rinna-qwen2.5-bakeneko-32b-instruct-v2-GGUF

Text Generation

rinna/qwen2.5-bakeneko-32b-instruct-v2 GGUF Quantizations 🚀

GGUF quantizations of rinna/qwen2.5-bakeneko-32b-instruct-v2, ranging from 2-bit to 8-bit, for use with GGUF-compatible runtimes such as llama.cpp

Powered by Featherless AI - run any model you'd like for a small fee.

Available Quantizations 📊

Quantization Type	File	Size
IQ4_XS	rinna-qwen2.5-bakeneko-32b-instruct-v2-IQ4_XS.gguf	17042.26 MB
Q2_K	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q2_K.gguf	11742.69 MB
Q3_K_L	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q3_K_L.gguf	16448.10 MB
Q3_K_M	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q3_K_M.gguf	15196.85 MB
Q3_K_S	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q3_K_S.gguf	13725.60 MB
Q4_K_M	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q4_K_M.gguf	18931.71 MB
Q4_K_S	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q4_K_S.gguf	17914.21 MB
Q5_K_M	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q5_K_M.gguf	22184.52 MB
Q5_K_S	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q5_K_S.gguf	21589.52 MB
Q6_K	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q6_K	25640.64 MB (folder)
Q8_0	rinna-qwen2.5-bakeneko-32b-instruct-v2-Q8_0	33207.79 MB (folder)

⚡ Powered by Featherless AI

Key Features

🔥 Instant Hosting - Deploy any Llama model on Hugging Face instantly
🛠️ Zero Infrastructure - No server setup or maintenance required
📚 Vast Compatibility - Support for 2400+ models and counting
💎 Affordable Pricing - Starting at just $10/month

Links:
Get Started | Documentation | Models

Format: GGUF
Model size: 32.8B params
Architecture: qwen2
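
Combining the 32.8B parameter count with the file sizes in the quantization table gives a rough estimate of the effective bits per weight of each file. The Python sketch below does that arithmetic. It assumes the "MB" figures in the table are mebibytes (1024 × 1024 bytes), which is not stated on this page; under that assumption Q8_0 comes out at about 8.5 bits per weight, which matches the nominal Q8_0 layout. The estimate ignores file metadata and tensors stored at other precisions, so treat the numbers as approximate.

```python
PARAMS = 32.8e9  # parameter count reported for this model

# File sizes copied from the quantization table, in the table's "MB" unit.
SIZES_MB = {
    "IQ4_XS": 17042.26,
    "Q2_K": 11742.69,
    "Q3_K_L": 16448.10,
    "Q3_K_M": 15196.85,
    "Q3_K_S": 13725.60,
    "Q4_K_M": 18931.71,
    "Q4_K_S": 17914.21,
    "Q5_K_M": 22184.52,
    "Q5_K_S": 21589.52,
    "Q6_K": 25640.64,
    "Q8_0": 33207.79,
}

# Assumption: the table's "MB" means MiB (2**20 bytes).
BYTES_PER_MB = 1024 ** 2


def bits_per_weight(size_mb: float, params: float = PARAMS) -> float:
    """Approximate average bits stored per parameter for a file of size_mb."""
    return size_mb * BYTES_PER_MB * 8 / params


if __name__ == "__main__":
    # Print the quantizations from smallest to largest file.
    for name, size in sorted(SIZES_MB.items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {size:10.2f} MB  ~{bits_per_weight(size):.2f} bits/weight")
```

A figure like this can help when choosing a file: the size in the table is approximately the memory needed for the weights alone, before the context cache and runtime overhead are added.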

Model tree for featherless-ai-quants/rinna-qwen2.5-bakeneko-32b-instruct-v2-GGUF

Base model: rinna/qwen2.5-bakeneko-32b-instruct-v2
Quantized (7): this model