Testerpce's Collections

Data

updated 3 days ago

  • Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

    Paper • 2506.19290 • Published 27 days ago • 50

  • Data Efficacy for Language Model Training

    Paper • 2506.21545 • Published 25 days ago • 10

  • Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

    Paper • 2507.04009 • Published 16 days ago • 31

  • RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs

    Paper • 2507.03253 • Published 17 days ago • 18

  • Scaling Laws for Optimal Data Mixtures

    Paper • 2507.09404 • Published 9 days ago • 32