AI & ML interests

LLMs, optimization, compression, sparsification, quantization, pruning, distillation, NLP, CV

neuralmagic's collections

INT4 LLMs for vLLM
Accurate INT4 quantized models by Neural Magic, ready for use with vLLM!
Compression Papers
Papers that we're proud to integrate into our libraries
Sparse Finetuning MPT
Explore our breakthrough in sparse fine-tuning LLMs! Our novel method maintains downstream accuracy even with >70% sparsity.
INT8 LLMs for vLLM
Accurate INT8 quantized models by Neural Magic, ready for use with vLLM!
Sparse Foundational Llama 2 Models
Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras
DeepSparse Sparse LLMs
Useful LLMs for DeepSparse where we've pruned at least 50% of the weights!