jttw-ai-docker-stack-language-models
This is a collection of the models that jttw-ai-docker-stack downloads automatically while setting up the inference server.
3B
Context Window Size: 128,000 tokens
Quant: Q8_0
Size: 3.42 GB
Specializations:
-- Instruction following
-- Complex reasoning
-- Tool use and function calling
-- Multilingual dialogue (supports at least 8 languages)
-- Text generation and summarization
-- Efficient edge and mobile deployment
mradermacher/Phi-3.5-mini-instruct-GGUF
4B
Context Window Size: 128,000 tokens
Quant: Q8_0
Size: 4.06 GB
Specializations:
-- Long-context understanding (summarization, Q&A, information retrieval)
-- Strong reasoning in code, math, and logic
-- Multilingual dialogue and comprehension (supports 20+ languages)
-- Efficient code generation (Python, C++, Rust, Java, TypeScript)
-- High performance in memory- and compute-constrained environments
-- Robust instruction following and safe conversational AI
unsloth/Qwen3-0.6B-GGUF
Text Generation • 0.6B
Context Window Size: 32,768 tokens
Quant: Q8_0
Size: 639 MB
Specializations:
-- Hybrid thinking modes: deep reasoning (thinking mode) and fast responses (non-thinking mode)
-- Multilingual proficiency
-- Strong performance in math, coding, and logical reasoning tasks
-- Reliable instruction following and agent tool integration
-- Optimized for lightweight, efficient deployment in constrained environments
-- Enhanced context awareness for multi-turn dialogue and document-based tasks
unsloth/Qwen3-1.7B-GGUF
Text Generation • 2B
Context Window Size: 32,768 tokens
Quant: Q8_0
Size: 1.83 GB
Specializations:
-- Dual reasoning and fast-response modes
-- Strong math and code generation
-- Multilingual support (100+ languages)
-- Efficient, lightweight deployment
-- Reliable multi-turn dialogue
unsloth/Qwen3-4B-128K-GGUF
Text Generation • 4B
Context Window Size: 131,072 tokens
Quant: Q8_0
Size: 4.28 GB
Specializations:
-- Dual "thinking" and fast-response modes
-- Strong reasoning, math, and code generation
-- Multilingual support (100+ languages)
-- Agent tool integration
-- Creative writing and multi-turn dialogue
mradermacher/Qwen2.5-7B-Instruct-GGUF
8B
Context Window Size: 131,072 tokens
Quant: Q8_0
Size: 8.1 GB
Specializations:
-- Strong reasoning and logical tasks
-- Code generation and math
-- Multilingual support
-- Instruction following
-- Efficient, quantized deployment
unsloth/Qwen3-8B-128K-GGUF
Text Generation • 8B
Context Window Size: 131,072 tokens
Quant: Q8_0
Size: 8.71 GB
Specializations:
-- Dual "thinking" and fast-response modes
-- Strong reasoning, math, and code generation
-- Multilingual support (100+ languages)
-- Agent tool integration
-- Creative writing and multi-turn dialogue
unsloth/Qwen3-14B-128K-GGUF
Text Generation • 15B
Context Window Size: 128,000 tokens
Quant: Q4_K_M
Size: 9 GB
Specializations:
-- Dual modes: advanced reasoning and fast, efficient dialogue
-- Strong instruction following: excels at creative writing, role-play, and multi-turn chat
-- Agentic tool use: precise integration with external tools
-- Multilingual support
-- Long-context: processes up to 128,000 tokens for extended documents and conversations
-- Efficient deployment: optimized for speed and low memory use
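
The notes above can be treated as a small catalog when deciding which model to load. The following is a minimal Python sketch, not part of jttw-ai-docker-stack itself: the `ModelEntry` class, the `CATALOG` list, and the `largest_fitting` helper are hypothetical names introduced here for illustration. The repository names, quant types, file sizes, and context windows are copied from the entries in this collection; the first entry is left out because its repository name is not shown. The sketch assumes that file size is a rough lower bound on the memory a model needs, and real usage will be higher once the context cache is allocated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    """One row of the collection (hypothetical helper type)."""
    repo: str            # Hugging Face repository name as listed above
    quant: str           # quantization type from the note
    size_gb: float       # download size from the note, in GB
    context_tokens: int  # context window size from the note


# Values copied from the notes in this collection.
CATALOG = [
    ModelEntry("mradermacher/Phi-3.5-mini-instruct-GGUF", "Q8_0", 4.06, 128_000),
    ModelEntry("unsloth/Qwen3-0.6B-GGUF", "Q8_0", 0.639, 32_768),
    ModelEntry("unsloth/Qwen3-1.7B-GGUF", "Q8_0", 1.83, 32_768),
    ModelEntry("unsloth/Qwen3-4B-128K-GGUF", "Q8_0", 4.28, 131_072),
    ModelEntry("mradermacher/Qwen2.5-7B-Instruct-GGUF", "Q8_0", 8.1, 131_072),
    ModelEntry("unsloth/Qwen3-8B-128K-GGUF", "Q8_0", 8.71, 131_072),
    ModelEntry("unsloth/Qwen3-14B-128K-GGUF", "Q4_K_M", 9.0, 128_000),
]


def largest_fitting(budget_gb: float, min_context: int = 0) -> Optional[ModelEntry]:
    """Return the largest listed model whose file fits in budget_gb.

    Only file size is compared; this is an assumption of the sketch,
    since runtime memory also depends on the context length in use.
    """
    candidates = [
        m for m in CATALOG
        if m.size_gb <= budget_gb and m.context_tokens >= min_context
    ]
    return max(candidates, key=lambda m: m.size_gb, default=None)


if __name__ == "__main__":
    choice = largest_fitting(5.0, min_context=100_000)
    print(choice.repo if choice else "no model fits")
```

With a 5 GB budget and a minimum context of 100,000 tokens, the two candidates are the Phi-3.5 mini and Qwen3-4B-128K entries, and the helper returns the larger of the two files.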