LSHBloom: Memory-efficient, Extreme-scale Document Deduplication Paper • 2411.04257 • Published Nov 6, 2024
Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets Paper • 2403.15953 • Published Mar 23, 2024