When Do Neural Nets Outperform Boosted Trees on Tabular Data? Paper • 2305.02997 • Published May 4, 2023
MARVIS: Modality Adaptive Reasoning over VISualizations Paper • 2507.01544 • Published 7 days ago • 11
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published 14 days ago • 59
How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions Paper • 2506.16679 • Published 20 days ago • 1
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 42
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Paper • 2402.11137 • Published Feb 17, 2024
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models Paper • 2310.18208 • Published Oct 27, 2023
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations Paper • 2303.09289 • Published Mar 16, 2023 • 2
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge Paper • 2309.11575 • Published Sep 20, 2023
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation Paper • 2305.15296 • Published May 24, 2023 • 1
Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? Paper • 2305.18398 • Published May 28, 2023 • 2
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Paper • 2209.08891 • Published Sep 19, 2022 • 2
The Stable Artist: Steering Semantics in Diffusion Latent Space Paper • 2212.06013 • Published Dec 12, 2022 • 1
LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment Paper • 2406.05113 • Published Jun 7, 2024 • 3
AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation Paper • 2301.08110 • Published Jan 19, 2023 • 1