GEM benchmark
AI & ML interests
We develop infrastructure for the evaluation of generated text.
Recent Activity
GEM's activity
prithivMLmods posted an update about 14 hours ago
Delta-Vector posted an update 3 days ago
Post
533
For anyone that enjoys Magnum models, I just dropped a 12B that is the first (or second?) stepping stone into Magnum V5
Delta-Vector/rei-12b-6795505005c4a94ebdfdeb39
Post
9283
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!
🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.
🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.
🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.
Follow along: https://github.com/huggingface/open-r1
prithivMLmods posted an update 9 days ago
Post
3358
Q'n' Sketches ❤️🔥
🖼️ Adapters:
- Qs : strangerzonehf/Qs-Sketch
- Qd : strangerzonehf/Qd-Sketch
- Qx : strangerzonehf/Qx-Art
- Qc : strangerzonehf/Qc-Sketch
- Bb : strangerzonehf/Bg-Bag
🐍 Collection : strangerzonehf/q-series-sketch-678e3503bf3a661758429717
🔗Page : https://huggingface.co/strangerzonehf
.
.
.
@prithivMLmods 🤗
prithivMLmods posted an update 13 days ago
Post
3061
ChemQwen-vL [ Qwen for Chem Vision ] 🧑🏻🔬
🧪Model : prithivMLmods/ChemQwen-vL
📝ChemQwen-vL is a vision-language model fine-tuned from the Qwen2VL-2B Instruct model. It was trained on chemical compounds represented in the International Chemical Identifier (InChI) format and is optimized for compound identification: given an image of a compound, it generates the InChI and a description. It works as an image-text-to-text model within a multimodal framework. It was fine-tuned using datasets from: https://iupac.org/projects/
📒Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju
Inference results can be exported as a document with the help of the ReportLab library: https://pypi.org/project/reportlab/
🤗: @prithivMLmods
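For readers unfamiliar with InChI, here is a minimal sketch of what such a string looks like and how its layers split apart. The helper name `parse_inchi` is hypothetical and is not part of ChemQwen-vL or of any InChI toolkit; it only splits on `/` and does not validate chemistry.

```python
def parse_inchi(inchi: str) -> dict:
    """Split a standard InChI string into its layers (illustrative only)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    # The first two fields are the version and the chemical formula.
    version, formula, *rest = inchi[len(prefix):].split("/")
    layers = {"version": version, "formula": formula}
    # Each remaining layer starts with a one-letter prefix,
    # e.g. "c" for connections and "h" for hydrogens.
    for layer in rest:
        layers[layer[0]] = layer[1:]
    return layers


# Ethanol as an example input.
ethanol = parse_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol["formula"])  # C2H6O
```

A model trained on this format has to emit every layer correctly, which is a stricter target than producing a compound's common name.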
Post
2150
🤗👤 💻 Speaking of AI agents ...
...Is easier with the right words ;)
My colleagues @meg @evijit @sasha and @giadap just published a wonderful blog post outlining some of the main relevant notions with their signature blend of value-informed and risk-benefits contrasting approach. Go have a read!
https://huggingface.co/blog/ethics-soc-7
Sri-Vigneshwar-DJ posted an update 20 days ago
Post
647
Check out phi-4 from Microsoft, which dropped a day ago... If you ❤️ the Phi series, here is the GGUF: Sri-Vigneshwar-DJ/phi-4-GGUF. phi-4 is a highly efficient 14B open LLM that beats much larger models at math and reasoning - check out evaluations on the Open LLM.
Technical paper - https://arxiv.org/pdf/2412.08905 ; The Data Synthesis approach is interesting
prithivMLmods posted an update 20 days ago
Post
3355
200+ f{🤗} on Stranger Zone! [ https://huggingface.co/strangerzonehf ]
❤️🔥Stranger Zone's MidJourney Mix Model Adapter is trending on the Very Model Page, with over 45,000 downloads. The Super Realism Model Adapter has over 52,000 downloads. Together they remain the top two adapters on Stranger Zone!
strangerzonehf/Flux-Midjourney-Mix2-LoRA, strangerzonehf/Flux-Super-Realism-LoRA
👽Try Demo: prithivMLmods/FLUX-LoRA-DLC
📦Most Recent Adapters to Check Out :
+ Ctoon : strangerzonehf/Ctoon-Plus-Plus
+ Cardboard : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Claude Art : strangerzonehf/Flux-Claude-Art
+ Flat Lay : strangerzonehf/Flux-FlatLay-LoRA
+ Smiley Portrait : strangerzonehf/Flux-Smiley-Portrait-LoRA
🤗Thanks for Community & OPEN SOURCEEE !!
albertvillanova posted an update 23 days ago
Post
1922
Discover all the improvements in the new version of Lighteval: https://huggingface.co/docs/lighteval/
Sri-Vigneshwar-DJ posted an update 23 days ago
Post
2061
Just sharing a thought: I started using DeepSeek V3 a lot, and an idea struck me about agents "orchestrating during inference" on a test-time compute model like DeepSeek V3 or the O1 series.
Agents (Instruction + Function Calls + Memory) execute during inference, and based on their output, a decision is made either to scale up the time spent reasoning or to perform other tasks.
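One possible reading of this idea, as a rough sketch: a controller calls the model with a small reasoning budget, inspects the draft, and either accepts it or retries with a larger budget. Everything here is an assumption for illustration, including the names `orchestrate`, `toy_reason`, and `toy_decide`, and the notion of a numeric "budget". No real DeepSeek or O1 API is used.

```python
from typing import Callable, Sequence, Tuple


def orchestrate(
    question: str,
    reason: Callable[[str, int], str],   # hypothetical model call: (question, budget) -> draft
    decide: Callable[[str], str],        # hypothetical agent decision: "answer" or "think_more"
    budgets: Sequence[int] = (1, 2, 4),
) -> Tuple[str, int]:
    """Escalate the reasoning budget until the agent accepts the draft."""
    if not budgets:
        raise ValueError("at least one budget is required")
    draft, used = "", 0
    for budget in budgets:
        draft = reason(question, budget)
        used = budget
        if decide(draft) == "answer":
            break  # the agent is satisfied; stop spending compute
    return draft, used


# Toy stand-ins so the sketch runs without a model.
def toy_reason(question: str, budget: int) -> str:
    return f"{question} -> " + "step " * budget


def toy_decide(draft: str) -> str:
    return "answer" if draft.count("step") >= 2 else "think_more"


draft, used = orchestrate("2+2?", toy_reason, toy_decide)
print(used)  # 2
```

In a real system the `decide` step could also trigger a function call or a memory lookup instead of more reasoning. The sketch only covers the "scale the time to reason" branch.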
prithivMLmods posted an update 24 days ago
Post
5904
Reasoning SmolLM2 🚀
🎯Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.
🔥Blog : https://huggingface.co/blog/prithivMLmods/smollm2-ft
🔼 Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF
🤠 Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune notebook : prithivMLmods/SmolLM2-CoT-360M
Post
3652
I was initially pretty sceptical about Meta's Coconut paper [1] because the largest perf gains were reported on toy linguistic problems. However, these results on machine translation are pretty impressive!
https://x.com/casper_hansen_/status/1875872309996855343
Together with the recent PRIME method [2] for scaling RL, reasoning for open models is looking pretty exciting for 2025!
[1] Training Large Language Models to Reason in a Continuous Latent Space (2412.06769)
[2] https://huggingface.co/blog/ganqu/prime
Sri-Vigneshwar-DJ posted an update 25 days ago
Post
2340
Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:
1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.
https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
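The chaining, routing, and fallback patterns listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the smolagents API or the code from the linked blog post: the handlers, the keyword classifier, and the names `route` and `chain` are all hypothetical stand-ins for LLM calls.

```python
from typing import Callable, Dict, Sequence


# Hypothetical specialized handlers; a real agent would call an LLM or a tool.
def handle_math(task: str) -> str:
    return f"[math] {task}"


def handle_code(task: str) -> str:
    return f"[code] {task}"


def handle_general(task: str) -> str:
    return f"[general] {task}"


HANDLERS: Dict[str, Callable[[str], str]] = {"math": handle_math, "code": handle_code}


def classify(task: str) -> str:
    """Toy keyword classifier standing in for an LLM-based router."""
    lowered = task.lower()
    if any(word in lowered for word in ("solve", "integral", "equation")):
        return "math"
    if any(word in lowered for word in ("python", "bug", "function")):
        return "code"
    return "unknown"


def route(task: str) -> str:
    """Routing with fallback: unknown labels go to the general handler."""
    handler = HANDLERS.get(classify(task), handle_general)
    return handler(task)


def chain(task: str, steps: Sequence[Callable[[str], str]],
          gate: Callable[[str], bool]) -> str:
    """Prompt chaining: run steps in order and validate after each one."""
    result = task
    for step in steps:
        result = step(result)
        if not gate(result):
            raise ValueError(f"validation gate rejected output of {step.__name__}")
    return result


print(route("Fix this Python bug"))  # [code] Fix this Python bug
print(route("Tell me a joke"))       # [general] Tell me a joke
```

Because the fallback lives in `HANDLERS.get`, a failed classification degrades to a generic answer instead of raising an error.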
prithivMLmods posted an update 29 days ago
Post
3863
Triangulum Catalogued 🔥💫
🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.
+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF
+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF
+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
Krystalan authored a paper about 1 month ago
Post
2251
This paper (HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs (2412.18925)) has a really interesting recipe for inducing o1-like behaviour in Llama models:
* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine where there are lots of aliases)
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes smooth transitions like "hmm, wait" etc. that one sees in o1
* Use the resulting data for SFT & RL
* Use sparse rewards from GPT-4o to guide RL training. They find RL gives an average ~3 point boost across medical benchmarks and SFT on this data already gives a strong improvement.
Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
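The data-construction part of the recipe (sample, verify, reformat) can be sketched roughly as follows. This is an illustration of the loop as described in the post, not the paper's code: `build_sft_example` and the toy sampler, verifier, and reformatter are hypothetical stand-ins for the Llama model and the GPT-4o calls.

```python
from typing import Callable, List, Optional, Tuple

Sampler = Callable[[str, List[str]], Tuple[str, str]]  # (question, earlier CoTs) -> (cot, answer)


def build_sft_example(question: str, reference: str, sample_cot: Sampler,
                      verify: Callable[[str, str], bool],
                      reformat: Callable[[List[str]], str],
                      max_attempts: int = 4) -> Optional[dict]:
    """Sample CoTs until one is verified, then merge all attempts into one stream."""
    attempts: List[str] = []
    for _ in range(max_attempts):
        cot, answer = sample_cot(question, attempts)
        attempts.append(cot)
        if verify(answer, reference):
            return {"question": question,
                    "reasoning": reformat(attempts),
                    "answer": answer}
    return None  # no verified chain within the budget; drop the question


# Toy stand-ins so the sketch runs without any model.
ALIASES = {"acetaminophen": {"acetaminophen", "paracetamol"}}


def toy_verify(answer: str, reference: str) -> bool:
    # Alias-aware check, standing in for the LLM verifier (exact match would fail here).
    return answer.lower() in ALIASES.get(reference, {reference})


def make_toy_sampler(guesses: List[str]) -> Sampler:
    it = iter(guesses)

    def sample(question: str, previous: List[str]) -> Tuple[str, str]:
        guess = next(it)
        return f"Attempt {len(previous) + 1}: consider {guess}.", guess

    return sample


def toy_reformat(cots: List[str]) -> str:
    return " Hmm, wait. ".join(cots)


example = build_sft_example("Which drug is also called paracetamol?", "acetaminophen",
                            make_toy_sampler(["ibuprofen", "Paracetamol"]),
                            toy_verify, toy_reformat)
print(example["reasoning"])
```

The examples collected this way would feed the SFT stage. The RL stage with sparse rewards is not covered by this sketch.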
prithivMLmods posted an update about 1 month ago
fladhak authored a paper about 1 month ago
prithivMLmods posted an update about 1 month ago
Post
2545
Qwen2VL Models: Vision and Language Processing 🍉
📍FT: [ LaTeX OCR, Math Parsing, Text Analogy OCRTest ]
Colab Demo: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
❄️Demo : https://huggingface.co/spaces/prithivMLmods/Qwen2-VL-2B . The demo includes the Qwen2VL 2B Base Model.
🎯The Space documents the content of the input image as standardized plain text. It includes adjustment tools with over 30 font styles, export to PDF and DOCX, text alignment, font size adjustments, and line spacing modifications.
📄PDFs are rendered using the ReportLab library.
🧵Models :
+ prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+ prithivMLmods/Qwen2-VL-Ocrtest-2B-Instruct
+ prithivMLmods/Qwen2-VL-Math-Prase-2B-Instruct
🚀Sample Document :
+ https://drive.google.com/file/d/1Hfqqzq4Xc-3eTjbz-jcQY84V5E1YM71E/view?usp=sharing
📦Collection :
+ prithivMLmods/vision-language-models-67639f790e806e1f9799979f
.
.
.
@prithivMLmods 🤗
prithivMLmods posted an update about 1 month ago
Post
3300
🎄 Here Before - Xmas🎅✨
🧑🏻🎄Models
+ [ Xmas 2D Illustration ] : strangerzonehf/Flux-Xmas-Illustration-LoRA
+ [ Xmas 3D Art ] : strangerzonehf/Flux-Xmas-3D-LoRA
+ [ Xmas Chocolate ] : strangerzonehf/Flux-Xmas-Chocolate-LoRA
+ [ Xmas Isometric Kit ] : strangerzonehf/Flux-Xmas-Isometric-Kit-LoRA
+ [ Xmas Realpix ] : strangerzonehf/Flux-Xmas-Realpix-LoRA
+ [ Xmas Anime ] : strangerzonehf/Flux-Anime-Xmas-LoRA
❄️Collections
+ [ Xmas Art ] : strangerzonehf/christmas-pack-6758b199487adafaddb68f82
+ [ Stranger Zone Collection ] : prithivMLmods/stranger-zone-collections-org-6737118adcf2cb40d66d0c7e
🥶Page
+ [ Stranger Zone ] : https://huggingface.co/strangerzonehf
.
.
.
@prithivMLmods 🤗