Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training Paper • 2506.10952 • Published 11 days ago • 23 • 3
Learning Dynamics in Continual Pre-Training for Large Language Models Paper • 2505.07796 • Published May 12 • 19 • 4
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 84