Dongchan Shin

ShinDC

AI & ML interests

NLP

ShinDC's activity

upvoted a paper 1 day ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published 5 days ago • 19

upvoted 2 papers 5 days ago

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 50

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Paper • 2411.07763 • Published Nov 12, 2024 • 1

authored 4 papers 5 days ago

OpenAgents: An Open Platform for Language Agents in the Wild

Paper • 2310.10634 • Published Oct 16, 2023 • 9

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 50

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Paper • 2411.07763 • Published Nov 12, 2024 • 1

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 14 days ago • 72

upvoted a paper 5 days ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 14 days ago • 72

upvoted a paper about 1 month ago

SafeArena: Evaluating the Safety of Autonomous Web Agents

Paper • 2503.04957 • Published Mar 6 • 19

updated 2 models 5 months ago

ShinDC/llama_finetune_mind2web_1B

Updated Nov 17, 2024 • 2

ShinDC/llama_finetune_mind2web

Updated Nov 9, 2024 • 1

updated 2 models over 1 year ago

ShinDC/distilbert-base-cased-finetuned-imdb

Fill-Mask • Updated Nov 1, 2023 • 11

ShinDC/marian-finetuned-kde4-en-to-fr

Translation • Updated Nov 1, 2023 • 6

updated a dataset over 1 year ago

ShinDC/important_dataset

Viewer • Updated Nov 1, 2023 • 16.8M • 72

updated a model over 1 year ago

ShinDC/bert-finetuned-ner

Token Classification • Updated Nov 1, 2023 • 4

upvoted a paper over 1 year ago

OpenAgents: An Open Platform for Language Agents in the Wild

Paper • 2310.10634 • Published Oct 16, 2023 • 9

updated 4 models over 1 year ago