Nicolas-BZRD

https://nicolas-bzrd.github.io

AI & ML interests

PhD Student | NLP - LLMs - Adaptation to real-world problems - Optimization

Recent Activity

liked a dataset 21 days ago

nvidia/Llama-Nemotron-VLM-Dataset-v1

upvoted an article 5 days ago

Article

Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications

By

and 5 others •

10 days ago

• 26

upvoted an article 28 days ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

By

and 4 others •

28 days ago

• 72

upvoted an article about 1 month ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

By

and 11 others •

Aug 5

• 490

upvoted a collection about 1 month ago

SauerkrautLM-Multilingual-(Reason)-ColBERT

SauerkrautLM ColBERT is a suite of Late-Interaction retrieval models built with PyLate’s ColBERT architecture and tuned for seven European languages. • 7 items • Updated Aug 3 • 18

upvoted 2 articles about 1 month ago

Article

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

By

and 6 others •

Jun 12

• 133

Article

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

By

and 4 others •

Jul 29

• 170

upvoted 3 collections about 2 months ago

GLiCLass-V3

Models for zero-shot text classification that are up to 50 times faster than Cross-Encoders and show the same or higher accuracy. • 8 items • Updated 26 days ago • 15

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 62

Reward Models

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 6 days ago • 21

upvoted an article about 2 months ago

Article

Introducing ColQwen-Omni: Retrieve in every modality

By

and 4 others •

Jul 17

• 69

upvoted a paper about 2 months ago

Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2 • 42

upvoted 2 articles 2 months ago

Article

FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages

By

and 5 others •

Jul 8

• 30

Article

SmolLM3: smol, multilingual, long-context reasoner

By

and 22 others •

Jul 8

• 647

upvoted a paper 2 months ago

Should We Still Pretrain Encoders with Masked Language Modeling?

Paper • 2507.00994 • Published Jul 1 • 78

upvoted an article 2 months ago

Article

Should We Still Pretrain Encoders with Masked Language Modeling?

By

and 3 others •

Jul 2

• 21

upvoted a collection 3 months ago

MaLA corpus

MaLA Corpus for Massive Language Adaptation of Large Language Models https://mala-lm.github.io • 18 items • Updated Jun 9 • 7

upvoted an article 4 months ago

Article

🥬 LettuceDetect Goes Multilingual: Fine-tuning EuroBERT on Synthetic Translations

By

and 1 other •

May 19

• 9

upvoted 2 collections 4 months ago

Multilingual Hallucination Detection

These are our EuroBERT fine-tunes on our translated RAGTruth datasets. • 13 items • Updated May 18 • 5

Qwen3

84 items • Updated Aug 6 • 1.21k

upvoted a collection 5 months ago

Llama Nemotron

Open, Production-ready Enterprise Models • 11 items • Updated 6 days ago • 68