TimeCapsule Gemma 3n 2.7B Slice (FP16 GGUF)

This model is a 2.7B-parameter sub-model of Gemma 3n, created with the MatFormer (Matryoshka Transformer) architecture's Mix-n-Match slicing approach. It was sliced from the E4B checkpoint using the official E2.69B (layer-level) configuration.

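For intuition, here is a minimal sketch of what layer-level Mix-n-Match slicing does: each transformer layer's feed-forward block is truncated to a per-layer width read from the slicing config. This is a toy illustration under stated assumptions, not the official tooling; the tensor names, dimensions, and ffn_dims values are hypothetical placeholders.

```python
# Toy illustration of layer-level Mix-n-Match FFN slicing.
# NOT the official slicing script: tensor names, sizes, and the
# per-layer config below are hypothetical placeholders.
import torch

HIDDEN = 2048     # toy model width (illustrative)
FULL_FFN = 16384  # toy full-size (E4B-style) FFN width (illustrative)
N_LAYERS = 4      # toy layer count

# Toy FP16 checkpoint standing in for the parent .safetensors weights.
weights = {}
for i in range(N_LAYERS):
    p = f"model.layers.{i}.mlp"
    weights[f"{p}.gate_proj.weight"] = torch.empty(FULL_FFN, HIDDEN, dtype=torch.float16)
    weights[f"{p}.up_proj.weight"]   = torch.empty(FULL_FFN, HIDDEN, dtype=torch.float16)
    weights[f"{p}.down_proj.weight"] = torch.empty(HIDDEN, FULL_FFN, dtype=torch.float16)

# Per-layer FFN widths; the real values come from the official
# E2.69B (layer-level) entry in the slicing-configs dataset.
ffn_dims = [8192, 8192, 16384, 8192]  # illustrative values only

for i, d in enumerate(ffn_dims):
    p = f"model.layers.{i}.mlp"
    # Truncate each projection to the first d hidden units.
    weights[f"{p}.gate_proj.weight"] = weights[f"{p}.gate_proj.weight"][:d, :]
    weights[f"{p}.up_proj.weight"]   = weights[f"{p}.up_proj.weight"][:d, :]
    weights[f"{p}.down_proj.weight"] = weights[f"{p}.down_proj.weight"][:, :d]
```

Because MatFormer trains the smaller feed-forward widths as nested prefixes of the full one, prefix truncation yields a coherent sub-model rather than an arbitrarily pruned one.
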
🧠 Intended Use

  • Primary use: High‑precision inference with Ollama via FP16 GGUF.
  • Best suited for: TimeCapsule‑SLM deep‑research workflows where latency, accuracy, and compute tradeoffs matter.

⚠️ Limitations & Considerations

  • Derived from a larger model: it may not match the full E4B checkpoint on some evaluations.
  • Stored in FP16 precision: it requires hardware with FP16 support, such as a modern GPU (e.g., an A100) or a capable Ollama host.
  • No additional quantization is applied, which preserves accuracy at the cost of a larger memory footprint (see the sizing sketch below).
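
As a back-of-the-envelope sizing check for that memory cost (assuming FP16's 2 bytes per parameter and the 5.56B raw parameter count reported in the GGUF metadata; KV cache and activations come on top at runtime):

```python
params = 5.56e9            # raw parameter count from the GGUF metadata
weight_bytes = params * 2  # FP16 stores 2 bytes per parameter
print(f"~{weight_bytes / 2**30:.1f} GiB of weights")  # ≈ 10.4 GiB
```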

🛠 Creation Details

  • Parent model: google/gemma-3n-E4B-it
  • Slice configuration: Config for E2.69B (layer-level) from the official slicing-configs dataset
  • Converted from .safetensors to FP16 GGUF using llama.cpp’s convert_hf_to_gguf.py (see the sketch after this list)
  • Uploaded to this repository as: tc_mixmatch_f16.gguf
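
A minimal sketch of that conversion step, assuming a local llama.cpp checkout and a placeholder directory sliced-e2.69b holding the sliced .safetensors checkpoint (--outtype and --outfile are standard convert_hf_to_gguf.py options):

```python
import subprocess

# Convert the sliced HF checkpoint to FP16 GGUF with llama.cpp's converter.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py", "sliced-e2.69b",  # placeholder input dir
        "--outtype", "f16",                   # stay in FP16; no quantization
        "--outfile", "tc_mixmatch_f16.gguf",  # file name used in this repo
    ],
    check=True,
)
```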

🧪 Usage Example

ollama run hf.co/bubblspace/Timecapsule2.7B-g3n-mix-match-gguf
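
Once pulled, the model can also be queried programmatically through Ollama's local REST API. A minimal sketch, assuming Ollama is serving on its default port (11434); the prompt is illustrative:

```python
import json
import urllib.request

# Ask the model a question via Ollama's /api/generate endpoint.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "hf.co/bubblspace/Timecapsule2.7B-g3n-mix-match-gguf",
        "prompt": "Summarize the MatFormer architecture in two sentences.",
        "stream": False,  # return a single JSON object instead of a stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```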

📦 Model Metadata

  • Format: GGUF, 16-bit (FP16)
  • Architecture: gemma3n
  • Model size: 5.56B params (raw tensor count reported in the GGUF metadata; the 2.7B figure above refers to the effective MatFormer slice size)
