Ofir Zafrir

ofirzaf

AI & ML interests

Sparsity, Quantization, Model Compression

Recent Activity

upvoted an article 13 days ago

updated a model about 1 month ago

ofirzaf/hebrew-math-tutor-v1-W4A16-G128

new activity about 1 month ago

OpenVINO/Phi-3-mini-FastDraft-50M-int8-sym-ov:Update README.md


Organizations

upvoted an article 13 days ago

Article

Getting More from Your Test-Time Compute Budget with Portfolio Beam Search

19 days ago


upvoted a paper 2 months ago

Prune Once for All: Sparse Pre-Trained Language Models

Paper • 2111.05754 • Published Nov 10, 2021 • 2

upvoted an article 3 months ago

Article

DeepMath: A lightweight math reasoning Agent with smolagents

Dec 4, 2025


upvoted 2 articles 6 months ago

Article

Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models

Sep 29, 2025


Article

Breaking Language Barriers in Mathematical AI: Introducing Hebrew Math Tutor

Sep 7, 2025


upvoted an article 11 months ago

Article

Introducing HELMET: Holistically Evaluating Long-context Language Models

Apr 16, 2025


upvoted an article 12 months ago

Article

Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques

Mar 24, 2025


upvoted a paper about 1 year ago

SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models

Paper • 2502.09390 • Published Feb 13, 2025 • 16

upvoted a collection about 1 year ago

Speculative Decoding Draft Models

Collection

Collection of OpenVINO optimized efficient draft models for speculative decoding • 5 items • Updated Jan 15 • 10

upvoted an article about 1 year ago

Article

A Chatbot on your Laptop: Phi-2 on Intel Meteor Lake

Mar 20, 2024


upvoted 2 papers over 1 year ago

FastDraft: How to Train Your Draft

Paper • 2411.11055 • Published Nov 17, 2024 • 11

RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation

Paper • 2408.02545 • Published Aug 5, 2024 • 40
