Wenhao Huang

EZ-hwh

AI & ML interests

None yet

Recent Activity

upvoted 2 papers about 2 months ago

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26 • 44

Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17 • 120

authored a paper 2 months ago

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models

Paper • 2505.02735 • Published May 5 • 32

authored 2 papers 3 months ago

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Paper • 2504.15415 • Published Apr 21 • 22

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

authored a paper 4 months ago

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 68

authored a paper 5 months ago

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

Paper • 2502.16614 • Published Feb 23 • 27

upvoted a paper 5 months ago

AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation

Paper • 2404.12753 • Published Apr 19, 2024 • 44

upvoted a collection 7 months ago

UI Agent

Collection

a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 390 items • Updated 3 days ago • 60

authored a paper 9 months ago

PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment

Paper • 2410.13785 • Published Oct 17, 2024 • 19

liked a dataset 11 months ago

tiiuae/falcon-refinedweb

Viewer • Updated Jun 20, 2023 • 968M • 22.6k • 862

authored a paper about 1 year ago