LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes Paper • 2503.18155 • Published Mar 23
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Paper • 2402.11137 • Published Feb 17, 2024
VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification Paper • 2505.20302 • Published 29 days ago
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models Paper • 2310.18208 • Published Oct 27, 2023
When Do Neural Nets Outperform Boosted Trees on Tabular Data? Paper • 2305.02997 • Published May 4, 2023
Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing Paper • 2407.04180 • Published Jul 4, 2024
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification Paper • 2410.05057 • Published Oct 7, 2024 • 7
PITCH: AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response Paper • 2402.18085 • Published Feb 28, 2024
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published Jan 30 • 20
Hidden in the Noise: Two-Stage Robust Watermarking for Images Paper • 2412.04653 • Published Dec 5, 2024 • 31
Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity Paper • 2406.17720 • Published Jun 25, 2024 • 8
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking Paper • 2409.15268 • Published Sep 23, 2024 • 13