SomosNLP

non-profit

https://somosnlp.org/

SomosNLP_

somosnlp


AI & ML interests

Democratize Spanish-language NLP and encourage its application to generate social impact 💛

Recent Activity

mariagrandury updated a Space 29 days ago

somosnlp/leaderboard-hackaton-2025

haritzpuerto authored a paper about 2 months ago

Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers

haritzpuerto authored a paper about 2 months ago

C-SEO Bench: Does Conversational SEO Work?


mariagrandury

updated a Space 29 days ago

Leaderboard Retos Hackathon SomosNLP 2025

🏆


blaise-tk

posted an update 2 months ago

Post

3223

A few months ago, I shared that I was building with @deeivihh something like "the Steam for open source apps"...

🚀 Today, I’m excited to announce that Dione is now open source and live in public beta!

Our mission is simple: make it easier to discover, use, and contribute to open source applications.

🔗 GitHub: https://github.com/dioneapp/dioneapp
💬 Join the community: https://discord.gg/JDFJp33vrM

Want to give it a try? I’d love your feedback! 👀

mariagrandury

published a dataset 3 months ago

somosnlp/babylm-es

Updated Jun 19 • 1

dvilasuero

posted an update 3 months ago

Post

2865

Super excited to launch Hugging Face Sheets: Spreadsheets meet AI and unstructured data.

A few months ago, we started imagining new ways to build and transform datasets with the latest open-source models.

Today, I'm thrilled to introduce our first step in this direction.

In a nutshell:

📁 Effortlessly run prompts and models over your data.
🌐 Agentic search for accuracy and real-time information.
🖼️ Familiar, minimalistic interface for interacting with data.
🎯 Human feedback 2.0: Your input directly improves generated data.
💯 Access hundreds of open models and leading inference providers.

Go to this space to try it out!

aisheets/sheets

Leave your questions below, we're just getting started!

3 replies

dianags

authored 2 papers 3 months ago

Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset

Paper • 2503.23899 • Published Mar 31

Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction

Paper • 2303.14342 • Published Mar 25, 2023

mariagrandury

authored 2 papers 4 months ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published Apr 9 • 9

It's the same but not the same: Do LLMs distinguish Spanish varieties?

Paper • 2504.20049 • Published Apr 8

reddrex

in somosnlp/LingComp_QA 4 months ago

How to use the dataset to train my GPT model

#1 opened 4 months ago by luisaarias

lewtun

authored a paper 4 months ago

Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning

Paper • 2504.11354 • Published Apr 15 • 6

blaise-tk

posted an update 4 months ago

Post

4505

Today we launch Dione.

A few months ago it was just a wild idea I shared with @bygimenez; now it's real.

Dione (Beta) is here, the easiest way to discover and install open-source apps, especially AI ones.

Think of it as the Steam of open source. Installing open-source tools is often a mess. Dione fixes that.

Beautiful UI and workflow. Soon multi-platform, multilingual & fully open-source.
Users can even write and share their own installation scripts. This is just the beginning.

🚀 Join our exclusive Beta
→ https://getdione.app/beta/join

2 replies

ouhenio

updated a Space 5 months ago

Mapa Blend-es

🌍

Check out the collective progress of blend-es 😊

lewtun

authored a paper 5 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 199

lewtun

authored a paper 6 months ago

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 47

lewtun

posted an update 6 months ago

Post

3411

Introducing OlympicCoder: a series of open reasoning models that can solve olympiad-level programming problems 🧑‍💻

- 7B open-r1/OlympicCoder-7B
- 32B open-r1/OlympicCoder-32B

We find that OlympicCoder models outperform Claude 3.7 Sonnet, as well as others over 100x larger 💪

Together with the models, we are releasing:

📊 CodeForces-CoTs: a new dataset of code problems from the most popular competitive coding platform, with R1 traces in C++ and Python open-r1/codeforces-cots

🏆 IOI'2024: a new benchmark of VERY hard programming problems where even frontier models struggle to match human performance open-r1/ioi

For links to the models and datasets, check out our latest progress report from Open R1: https://huggingface.co/blog/open-r1/update-3

1 reply

lewtun

posted an update 7 months ago

Post

5442

Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g., for cases with malformed answers that can’t be verified with a rule-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: https://huggingface.co/blog/open-r1/update-2
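
The automated filtering step described in the OpenR1-Math-220k post can be pictured with a minimal sketch. This is not the Open R1 pipeline: `rule_based_match` is a toy stand-in for a rule-based verifier such as Math Verify, `judge` is a hypothetical callable standing in for an LLM judge, and the record fields (`reference`, `answers`) are assumed for illustration. Keeping only the answers judged correct is also an assumption.

```python
from typing import Callable, Dict, List, Optional


def rule_based_match(answer: str, reference: str) -> Optional[bool]:
    """Toy stand-in for a rule-based verifier (assumption, not Math Verify).

    Returns True/False when both strings parse as numbers,
    and None when the answer is malformed and cannot be checked.
    """
    try:
        return float(answer) == float(reference)
    except ValueError:
        return None


def filter_problems(
    problems: List[Dict],
    judge: Callable[[str, str], bool],
) -> List[Dict]:
    """Keep problems that have at least one answer judged correct."""
    kept = []
    for problem in problems:
        correct = []
        for answer in problem["answers"]:
            verdict = rule_based_match(answer, problem["reference"])
            if verdict is None:
                # Fall back to the (hypothetical) judge for malformed answers.
                verdict = judge(answer, problem["reference"])
            if verdict:
                correct.append(answer)
        if correct:
            kept.append({**problem, "answers": correct})
    return kept
```

In this sketch the judge is consulted only when the rule-based check cannot parse an answer, which mirrors the fallback role the post describes for the judge model.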