Sigrid Jin

sigridjineth

https://sigridjin.medium.com

AI & ML interests

UBC Computer Science / Instruct.KR

Recent Activity

liked a model 10 days ago

RedHatAI/GLM-5.2-speculator.dspark

liked a model 3 months ago

BidirLM/BidirLM-Omni-2.5B-Embedding

commented on an article 7 months ago

We Got Claude to Fine-Tune an Open Source LLM

upvoted an article 7 months ago

Article

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

sionic-ai • Dec 8, 2025 • 60

upvoted an article 10 months ago

Article

PP-OCRv5 on Hugging Face: A Specialized Approach to OCR

baidu • Sep 10, 2025 • 112

upvoted a collection 10 months ago

Inference Free Splade Models

The collection includes Inference Free Splade models that can be loaded thanks to the Sparse Encoder modules of Sentence Transformers • 6 items • Updated Jun 30, 2025 • 5

upvoted a paper 10 months ago

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 74

upvoted a collection about 1 year ago

T5Gemma

32 items • Updated Mar 12 • 86

upvoted an article about 1 year ago

Article

Training and Finetuning Sparse Embedding Models with Sentence Transformers

tomaarsen, arthurbresnu • Jul 1, 2025 • 138

upvoted a collection about 1 year ago

Korean Embedding Models

A collection of high-performance Korean embedding models, including both models I trained myself and other publicly available strong baselines. • 7 items • Updated 11 days ago • 2

upvoted an article about 1 year ago

Article

Context Is Gold to Find the Gold Passage: Evaluating and Training Contextual Document Embeddings

manu • Jun 2, 2025 • 28

upvoted 5 collections about 1 year ago

NanoBEIR 🍺

A collection of smaller versions of BEIR datasets with 50 queries and up to 10K documents each. • 13 items • Updated Sep 11, 2024 • 27

VLM2Vec

The VLM2Vec embedding models. • 11 items • Updated Jul 8, 2025 • 8

VoRA

Everything for the paper "Vision as LoRA". • 8 items • Updated Mar 2 • 7

💜 Kotlin ML Pack

A collection of datasets, fine-tuned models and benchmarks to train your models for perfect Kotlin code generation. • 9 items • Updated Jun 11, 2024 • 27

Mellum

Series of code models by JetBrains • 12 items • Updated Oct 1, 2025 • 50

upvoted a paper about 1 year ago

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29, 2025 • 54

upvoted 2 papers over 1 year ago

Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published Mar 10, 2025 • 48

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published Mar 2, 2025 • 64

upvoted a collection over 1 year ago

GemmaX2

GemmaX2 language models, including pretrained and instruction-tuned models in two sizes: 2B and 9B. • 7 items • Updated Feb 7, 2025 • 24

upvoted a paper over 1 year ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 457

upvoted an article over 1 year ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

NormalUhr • Feb 7, 2025 • 297

upvoted a paper over 1 year ago

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1