AI & ML interests
computational linguistics, natural language processing
Datasets used for the OLMo experiments in the paper "Not All Data are Unlearned Equally": https://arxiv.org/abs/2504.05058
Generate challenging synthetic data to evaluate LLMs
https://mcgill-nlp.github.io/weblinx
INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
- McGill-NLP/AfroXLMR-large-76L-Injongo-intent
  Text Classification • 0.6B • Updated • 5
- McGill-NLP/AfroXLMR-large-76L-Injongo-slot
  Token Classification • 0.6B • Updated • 5
- McGill-NLP/gemma-2-9b-it-Injongo-intent
  Text Generation • 9B • Updated • 3
- McGill-NLP/gemma-2-9b-it-Injongo-slot
  Text Generation • 9B • Updated • 3
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
  Paper • 2504.08942 • Published • 27
- McGill-NLP/agent-reward-bench
  Viewer • Updated • 1.41k • 4.08k • 4
- Agent Reward Bench Demo
  💻 Visualize agent interactions with WebArena tasks • 4
- Agent Reward Bench Leaderboard
  🥇 Leaderboard for AgentRewardBench • 1
- McGill-NLP/LLM2Vec-Meta-Llama-31-8B-Instruct-mntp-supervised
  Sentence Similarity • Updated • 34 • 4
- McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised
  Sentence Similarity • Updated • 8.03k • 49
- McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised
  Sentence Similarity • Updated • 284 • 13
- McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-supervised
  Sentence Similarity • Updated • 21 • 3
Repository: https://github.com/McGill-NLP/AURORA
mcgill-nlp.github.io/statcan-dialogue-dataset
- The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
  Paper • 2304.01412 • Published • 2
- McGill-NLP/statcan-dialogue-dataset
  Preview • Updated • 13 • 7
- McGill-NLP/dpr-statcan-conversation_encoder-title
  Feature Extraction • 0.1B • Updated • 6
- McGill-NLP/tapas-statcan-large-conversation_encoder-cell_tokens
  Feature Extraction • Updated • 1
- Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval
  Paper • 2104.08801 • Published • 1
- McGill-NLP/mlquestions
  Updated • 91 • 2
- McGill-NLP/bart-qg-mlquestions-backtraining
  Updated • 15
- McGill-NLP/bart-qg-mlquestions-selftraining
  Updated • 15