Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2409.17115

ProX General Models

base models trained on ProX curated data.

gair-prox/FW-ProX-1.7B

Text Generation • Updated Sep 28, 2024 • 8 • 3
gair-prox/RedPJ-ProX-0.7B

Updated Oct 10, 2024 • 13 • 1
gair-prox/ProX-RedPJ-1.7B-25B

Updated Sep 16, 2024 • 6
gair-prox/RedPJ-ProX-0.3B

Updated Oct 10, 2024 • 7 • 2

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15, 2024 • 42
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 78
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 188
MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Paper • 2403.02884 • Published Mar 5, 2024 • 17

about 23 hours ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 13
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 57
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 47

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 23
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 17
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 10
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 12

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs