neuralmagic's Collections

Compression Papers

updated Sep 26, 2024

Papers that we're proud to integrate into our libraries

  • Sparse Finetuning for Inference Acceleration of Large Language Models

    Paper • 2310.06927 • Published Oct 10, 2023 • 14

  • SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

    Paper • 2301.00774 • Published Jan 2, 2023 • 3

  • The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

    Paper • 2203.07259 • Published Mar 14, 2022 • 4

  • How Well Do Sparse Imagenet Models Transfer?

    Paper • 2111.13445 • Published Nov 26, 2021 • 1

  • Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

    Paper • 2405.03594 • Published May 6, 2024 • 7