Towards Codable Watermarking for Injecting Multi-bits Information to LLMs • arXiv:2307.15992 • Published Jul 29, 2023
Well-classified Examples are Underestimated in Classification with Deep Neural Networks • arXiv:2110.06537 • Published Oct 13, 2021
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning • arXiv:2502.18080 • Published Feb 25, 2025
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization • arXiv:2406.11431 • Published Jun 17, 2024
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents • arXiv:2402.11208 • Published Feb 17, 2024
RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models • arXiv:2110.07831 • Published Oct 15, 2021
Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models • arXiv:2103.15543 • Published Mar 29, 2021