Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective Paper • 2506.19028 • Published 3 days ago • 1
OAgents: An Empirical Study of Building Effective Agents Paper • 2506.15741 • Published 9 days ago • 31
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices Paper • 2506.17538 • Published 6 days ago • 6
Steering Conceptual Bias via Transformer Latent-Subspace Activation Paper • 2506.18887 • Published 3 days ago • 6
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies Paper • 2506.17673 • Published 5 days ago • 6
SoK: Evaluating Jailbreak Guardrails for Large Language Models Paper • 2506.10597 • Published 14 days ago • 3
SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions Paper • 2506.00643 • Published 26 days ago • 5
Synthesizing Conversations from Unlabeled Documents using Automatic Response Segmentation Paper • 2406.03703 • Published Jun 6, 2024 • 2
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey Paper • 2402.17944 • Published Feb 27, 2024 • 1
HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent Paper • 2402.01018 • Published Feb 1, 2024 • 1
DeTiME: Diffusion-Enhanced Topic Modeling using Encoder-decoder based LLM Paper • 2310.15296 • Published Oct 23, 2023 • 3