Ankit Sharma

nezubn

https://nezubn.com

nezubn
ankisharma07
nezubn
nezubn.com

AI & ML interests

engineering • systems • ml

Organizations

nezubn's collections (8)

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 41

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13, 2024 • 51
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 111
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

Paper • 2403.20041 • Published Mar 29, 2024 • 34
Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2, 2024 • 46

how to evaluate LLMs

MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries

Paper • 2401.15391 • Published Jan 27, 2024 • 6
Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 91

reinforcement learning

Understanding the performance gap between online and offline alignment algorithms

Paper • 2405.08448 • Published May 14, 2024 • 19
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Paper • 2405.11143 • Published May 20, 2024 • 41

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 129
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8, 2024 • 66

The case for 4-bit precision: k-bit Inference Scaling Laws

Paper • 2212.09720 • Published Dec 19, 2022 • 3

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 103

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Jan 12 • 2.23M • 462
