admarcosai's Collections: Datasets
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Paper
• 2312.06585
• 29
TinyGSM: achieving >80% on GSM8k with small language models
Paper
• 2312.09241
• 39
Paper
• 2309.17425
• 6
jondurbin/gutenberg-dpo-v0.1
• 918 • 536
• 158
garage-bAInd/Open-Platypus
• 24.9k • 5.33k
• 415
open-web-math/open-web-math
• 6.32M • 12.2k
• 329
argilla/ultrafeedback-binarized-preferences-cleaned
• 60.9k • 3.65k
• 160
WhiteRabbitNeo/Code-Functions-Level-Cyber
• 8.44k • 75
• 31
WhiteRabbitNeo/Code-Functions-Level-General
• 8.69k • 33
• 20
selfrag/selfrag_train_data
• 146k • 116
• 75
Locutusque/UltraTextbooks
• 5.52M • 1.67k
• 198
Undi95/ConversationChronicles-sharegpt-SHARDED
• 787k • 68
• 10
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
Paper
• 2402.10176
• 38
togethercomputer/RedPajama-Data-1T
• 1.73M • 2.3k
• 1.14k
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Paper
• 2412.14475
• 57