LZhang
kaitou951
AI & ML interests
None yet
Recent Activity
Updated a collection 3 days ago: Daily Papers
Organizations
None yet
LLM Reasoning
- Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
  Paper • 2402.07754 • Published
- Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
  Paper • 2505.10446 • Published
- A Survey on Latent Reasoning
  Paper • 2507.06203 • Published • 83
- Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
  Paper • 2505.16782 • Published
Daily Papers
- Robust Multimodal Large Language Models Against Modality Conflict
  Paper • 2507.07151 • Published • 5
- One Token to Fool LLM-as-a-Judge
  Paper • 2507.08794 • Published • 30
- Test-Time Scaling with Reflective Generative Model
  Paper • 2507.01951 • Published • 96
- KV Cache Steering for Inducing Reasoning in Small Language Models
  Paper • 2507.08799 • Published • 38
A Survey on LLM-as-a-Judge
- Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges
  Paper • 2406.12624 • Published • 38
- A Survey on LLM-as-a-Judge
  Paper • 2411.15594 • Published
- LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
  Paper • 2412.05579 • Published • 1
- From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
  Paper • 2411.16594 • Published • 41