ngxson and friends

AI & ML interests

Making SOTA quantization while skipping courses

ngxson-and-friends's activity

ngxson

posted an update 3 months ago

A comprehensive matrix showing which format you should use.

Read more on my blog post: https://huggingface.co/blog/ngxson/common-ai-model-formats

| Hardware        | GGUF      | PyTorch                | Safetensors              | ONNX  |
|-----------------|-----------|------------------------|--------------------------|-------|
| CPU             | ✅ (best) | 🟡                      | 🟡                       | ✅    |
| GPU             | ✅        | ✅                      | ✅                       | ✅    |
| Mobile          | ✅        | 🟡 (via executorch)     | ❌                       | ✅    |
| Apple silicon   | ✅        | 🟡                      | ✅ (via MLX framework)   | ✅    |

ngxson

authored a paper 3 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 229

ngxson

posted an update 4 months ago

Fun fact: you can get any DeepSeek-R1-Qwen **abliterated** by using one of these LoRA adapters (GGUF available!)

ngxson/extracted-lora-mergekit-677d5c3eea0b6a7661201846

ngxson

posted an update 4 months ago

Check out my collection of pre-made GGUF LoRA adapters!

This allows you to use both the normal and abliterated versions of popular models like Llama, Qwen, etc., without doubling VRAM usage.

ngxson/gguf_lora_collection
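
As a rough illustration of why an adapter avoids doubling memory: a LoRA adapter stores two low-rank matrices per adapted weight instead of a second full matrix. The sketch below only counts parameters for one hypothetical layer. The 4096x4096 shape and rank 16 are assumptions picked for illustration, not values taken from the collection, and real adapters differ in rank and in which layers they cover.

```python
def full_copy_params(d_out: int, d_in: int) -> int:
    """Parameters needed to store a second full copy of one weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in one LoRA pair: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 16

full = full_copy_params(d_out, d_in)
adapter = lora_params(d_out, d_in, rank)

print(f"second full copy: {full:,} parameters")
print(f"LoRA adapter:     {adapter:,} parameters")
print(f"ratio for this layer: {full // adapter}x")
```

Under these assumptions the adapter is a small fraction of the layer it modifies, so keeping the base model loaded once and toggling the adapter is far cheaper than holding two full models in VRAM.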

ngxson

posted an update 4 months ago

I made this small tool that can be useful for debugging Ollama chat templates: ngxson/ollama_template_test

CC @bartowski you may need this ;-)


Team members 2
