How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients Paper • 2504.10766 • Published Apr 2025 • 14 upvotes
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing Paper • 2504.07964 • Published Apr 2025 • 58 upvotes
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? Paper • 2504.06514 • Published Apr 2025 • 33 upvotes
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning Paper • 2504.05520 • Published Apr 2025 • 9 upvotes
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base Paper • 2503.23361 • Published Mar 2025 • 6 upvotes
Difficulty Estimation Math Datasets Collection • We perform difficulty estimation on popular math datasets. • 5 items • Updated Apr 2025 • 1 upvote
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective Paper • 2502.14296 • Published Feb 20, 2025 • 46 upvotes
Detecting and Filtering Unsafe Training Data via Data Attribution Paper • 2502.11411 • Published Feb 17, 2025 • 1 upvote
WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback Paper • 2408.15549 • Published Aug 28, 2024 • 1 upvote
How Susceptible are Large Language Models to Ideological Manipulation? Paper • 2402.11725 • Published Feb 18, 2024 • 1 upvote
Can Language Model Moderators Improve the Health of Online Discourse? Paper • 2311.10781 • Published Nov 16, 2023 • 1 upvote
CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation Paper • 2310.15638 • Published Oct 24, 2023 • 1 upvote
CLIMB: A Benchmark of Clinical Bias in Large Language Models Paper • 2407.05250 • Published Jul 7, 2024 • 2 upvotes
Safer-Instruct: Aligning Language Models with Automated Preference Data Paper • 2311.08685 • Published Nov 15, 2023 • 1 upvote