Energy-Based Transformers are Scalable Learners and Thinkers • Paper • 2507.02092
WebSailor: Navigating Super-human Reasoning for Web Agent • Paper • 2507.02592
HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation • Paper • 2506.21546
Post: Multimodal OCR with ReportLab? On Colab T4? (Nanonets OCR, Monkey OCR, OCRFlux 3B, Typhoon OCR 3B) .. Yeah, it's possible. I've made dedicated Colab notebooks to experiment with these models (all built on top of Qwen2.5 VL). 🤗🚀
Download the notebooks here:
✦︎ NanonetsOCR: https://colab.research.google.com/drive/1VvA-amvSVxGdWgIsh4_by6KWOtEs_Iqp
✦︎ MonkeyOCR: https://colab.research.google.com/drive/1vPCojbmlXjDFUt06FJ1tjgnj_zWK4mUo
✦︎ OCRFluxOCR: https://colab.research.google.com/drive/1TDoCXzWdF2hxVLbISqW6DjXAzOyI7pzf
✦︎ TyphoonOCR: https://colab.research.google.com/drive/1_59zvLNnn1kvbiSFxzA1WiqhpbW8RKbz
🜲 GitHub: https://github.com/PRITHIVSAKTHIUR/OCR-ReportLab
What does it do?
1. Performs OCR on the input image
2. Generates a DOCX or PDF file containing the input image and the extracted text
To learn more, visit the model card of the respective model. A minimal sketch of those two steps follows below.
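The sketch below loads one of the Qwen2.5-VL-based OCR checkpoints with Transformers, extracts text from an image, and writes the image plus text to a PDF with ReportLab. The model ID, prompt, and file names are assumptions for illustration, not the exact code in the linked notebooks; see the model cards and the GitHub repo for the canonical usage.

```python
# Hypothetical sketch: OCR with a Qwen2.5-VL-based model + PDF output via ReportLab.
# Model ID, prompt, and file names are placeholders, not the notebooks' exact code.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas

MODEL_ID = "nanonets/Nanonets-OCR-s"  # assumed checkpoint; swap for Monkey/OCRFlux/Typhoon OCR

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"  # fits a Colab T4
)

# Step 1: OCR the input image.
image = Image.open("input.png")
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Extract all text from this image."},
    ],
}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=1024)
text = processor.batch_decode(
    generated[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]

# Step 2: write the image and the extracted text into a simple PDF.
pdf = canvas.Canvas("ocr_output.pdf", pagesize=A4)
page_w, page_h = A4
pdf.drawImage("input.png", 40, page_h / 2, width=page_w - 80,
              height=page_h / 2 - 60, preserveAspectRatio=True)
text_obj = pdf.beginText(40, page_h / 2 - 40)
for line in text.splitlines():
    text_obj.textLine(line)
pdf.drawText(text_obj)
pdf.save()
```

The same pattern extends to DOCX output by swapping the ReportLab section for a python-docx document, which is the other export path the post mentions.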
Space: Doc VLMs V2 Localization 🐪 • camel-doc-ocr / vilasr-7b / ocrflux-3b / shotvl-7b
Space: VisionScope-R2 🔍 • behemoth-3b / skycaptioner / spacethinker / spaceom / coreocr