Musika

community

AI & ML interests

Fast Waveform Music Generation

Recent Activity

musika's activity

prithivMLmods 
posted an update 8 days ago
view post
Post
2112
Got access to Google's all-new Gemini Diffusion a state-of-the-art text diffusion model. It delivers the performance of Gemini 2.0 Flash-Lite at 5x the speed, generating over 1000 tokens in a fraction of a second and producing impressive results. Below are some initial outputs generated using the model. ♊🔥

Gemini Diffusion Playground ✦ : https://deepmind.google.com/frontiers/gemini-diffusion

Get Access Here : https://docs.google.com/forms/d/1aLm6J13tAkq4v4qwGR3z35W2qWy7mHiiA0wGEpecooo/viewform?edit_requested=true

🔗 To know more, visit: https://deepmind.google/models/gemini-diffusion/
  • 1 reply
·
prithivMLmods 
posted an update 9 days ago
view post
Post
2195
The more optimized explicit content filters with lightweight 𝙜𝙪𝙖𝙧𝙙 models trained based on siglip2 patch16 512 and vit patch16 224 for illustration and explicit content classification for content moderation in social media, forums, and parental controls for safer browsing environments. this version fixes the issues in the previous release, which lacked sufficient resources. 🚀

⤷ Models :
→ siglip2 mini explicit content : prithivMLmods/siglip2-mini-explicit-content [recommended]
→ vit mini explicit content : prithivMLmods/vit-mini-explicit-content

⤷ Building image safety-guard models : strangerguardhf

⤷ Datasets :
→ nsfw multidomain classification : strangerguardhf/NSFW-MultiDomain-Classification
→ nsfw multidomain classification v2.0 : strangerguardhf/NSFW-MultiDomain-Classification-v2.0

⤷ Collection :
→ Updated Versions [05192025] : prithivMLmods/explicit-content-filters-682aaa4733e378561925ca2b
→ Previous Versions : prithivMLmods/siglip2-content-filters-042025-final-680fe4aa1a9d589bf2c915ff

Find a collections inside the collection.👆

To know more about it, visit the model card of the respective model.
  • 1 reply
·
prithivMLmods 
posted an update 13 days ago
view post
Post
2650
Models for detecting images generated by diffusion models (Flux.1, SDXL, ..) are trained or fine-tuned using image classification models for content moderation. These models use datasets available on the Hub. For identifying AI-generated images or moderating visual content, the recommended model is OpenSDI-Flux.1-SigLIP2.😺🧨

Models : prithivMLmods/OpenSDI-Flux.1-SigLIP2 [Best approach for AI [Diffusion Generated] vs. real image classification] prithivMLmods/OpenSDI-SD2.1-SigLIP2 prithivMLmods/OpenSDI-SD3-SigLIP2 prithivMLmods/OpenSDI-SD1.5-SigLIP2 prithivMLmods/OpenSDI-SDXL-SigLIP2

Datasets : nebula/OpenSDI_test madebyollin/megalith-10m

Collection : prithivMLmods/opensdi-diffusion-generated-image-classification-682488a3a3e5be7083db3383

Find a collections inside the collection.👆

To know more about it, visit the model card of the respective model.
prithivMLmods 
posted an update 14 days ago
view post
Post
1994
Dropping some image classification models for content moderation and classifiers trained with datasets available on the Hub. All are fine-tuned on the siglip2 backbone, (competitions AIOrNot, Imagenette, and Driver-Drowsiness). Models and datasets are listed below:

🤗Models :
AI or Not : prithivMLmods/AIorNot-SigLIP2
Driver Drowsiness Detection : prithivMLmods/DOZE-GUARD-RLDD
Subset 10 ImageNet : prithivMLmods/IMAGENETTE

🥊Datasets :
+ competitions/aiornot
+ akahana/Driver-Drowsiness-Dataset
+ frgfm/imagenette

🔗Collection :
[The previous collection of models is also listed in the same collection, so you can find more models focused on image classification tasks.]

- prithivMLmods/multiclass-image-classification-05142025-68234c8010a9350a4d6739b5

Find a collections inside the collection.🤪👆

To know more about it, visit the model card of the respective model.
prithivMLmods 
posted an update 18 days ago
view post
Post
3504
Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets—along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:

🤗Models & Datasets :

Realistic Gender Classification : prithivMLmods/Realistic-Gender-Classification
prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection : prithivMLmods/Document-Type-Detection
prithivMLmods/Document-Type-Detection
Face Mask Detection : prithivMLmods/Face-Mask-Detection
DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier : prithivMLmods/Alzheimer-Stage-Classifier
SilpaCS/Augmented_alzheimer
Bone Fracture Detection : prithivMLmods/Bone-Fracture-Detection
Hemg/bone-fracture-detection
GiD Land Cover Classification : prithivMLmods/GiD-Land-Cover-Classification
jonathan-roberts1/GID

🤗Collection : prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c

To know more about it, visit the model card of the respective model.
Nymbo 
posted an update 19 days ago
view post
Post
1928
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
  • 1 reply
·
prithivMLmods 
posted an update 22 days ago
view post
Post
3248
Well, here’s the updated version with the 20,000+ entry sampled dataset for Watermark Filter Content Moderation models incl. [Food25, Weather, Watermark, Marathi/Hindi Sign Language Detection], post-trained from the base models: sigLip2 patch16 224 — now with mixed aspect ratios for better performance and reduced misclassification. 🔥

Models :
➮ Watermark-Detection : prithivMLmods/Watermark-Detection-SigLIP2
⌨︎ Watermark Detection & Batch Image Processing Experimentals, Colab Notebook : https://colab.research.google.com/drive/1mlQrSsSjkGimUt0VyRi3SoWMv8OMyvw3?usp=drive_link
➮ Weather-Image-Classification : prithivMLmods/Weather-Image-Classification
➮ TurkishFoods-25 : prithivMLmods/TurkishFoods-25
➮ Marathi-Sign-Language-Detection : prithivMLmods/Marathi-Sign-Language-Detection
➮ Hindi-Sign-Language-Detection : prithivMLmods/Hindi-Sign-Language-Detection

Datasets :
Watermark : qwertyforce/scenery_watermarks
Weather : prithivMLmods/WeatherNet-05-18039
Turkish Foods 25 : yunusserhat/TurkishFoods-25
Marathi Sign Language : VinayHajare/Marathi-Sign-Language
Hindi Sign Language : Vedant3907/Hindi-Sign-Language-Dataset

Collection : prithivMLmods/content-filters-siglip2-vit-68197e3357d4de18fb3b4d2b
prithivMLmods 
posted an update 25 days ago
view post
Post
1161
The new versions of Midjourney Mix adapters have been dropped in stranger zone hf. These adapters excel in studio lighting portraits and painterly styles, trained using the style of strangerzonehf/Flux-Midjourney-Mix2-LoRA. They leverage 24-bit colored synthetic images generated form midjourney v6 to achieve high-quality image reproducibility and support adaptable aspect ratios, using Flux.1 as the base model. 🥳

Models [ ⌗ ]

> Flux-Midjourney-Painterly-LoRA : strangerzonehf/Flux-Midjourney-Painterly-LoRA
> Flux-Midjourney-Studio-LoRA : strangerzonehf/Flux-Midjourney-Studio-LoRA

> Collection : strangerzonehf/midjourney-mix-3-ft-flux1-dev-68165d58a2a08025852d63f3

> Space : prithivMLmods/FLUX-LoRA-DLC2

The best dimensions and inference settings for optimal results are as follows: A resolution of 1280 x 832 with a 3:2 aspect ratio is recommended for the best quality, while 1024 x 1024 with a 1:1 aspect ratio serves as the default option. For inference, the recommended number of steps ranges between 30 and 35 to achieve optimal output.
Nymbo 
posted an update 27 days ago
view post
Post
1777
PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space ~

Both of these themes have been updated to fix some of the long-standing inconsistencies ever since the transition to Gradio v5. Textboxes are no longer bright green and in-line code is readable now! Both themes are now visually identical across versions.

If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.
prithivMLmods 
posted an update 29 days ago
view post
Post
1931
Dropping downstream tasks using newly initialized parameters and weights supports domain-specific image classification post-training, based on the SigLIP-2 models: Patch-16/224, Patch-16/256, and Patch-32/256. For more details, please refer to the respective model cards : 🤗

+ watermark detection : prithivMLmods/Watermark-Detection-SigLIP2
+ resisc45 : prithivMLmods/RESISC45-SigLIP2
+ pacs dg : prithivMLmods/PACS-DG-SigLIP2
+ 3d printed or not : prithivMLmods/3D-Printed-Or-Not-SigLIP2
+ formula or text : prithivMLmods/Formula-Text-Detection

Categorizing Un-Safe Content :
- explicit content patch16 256 : prithivMLmods/siglip2-x256-explicit-content
- explicit content patch32 256 : prithivMLmods/siglip2-x256p32-explicit-content

Collection :
> SigLIP2 Content Filters 042025 Final : https://huggingface.co/collections/prithivMLmods/siglip2-content-filters-04202-final-680fe4aa1a9d589bf2c915ff
> SigLIP2 : google/siglip2-67b5dcef38c175486e240107
> SigLIP2 Multilingual Vision-Language Encoders : https://arxiv.org/pdf/2502.14786
prithivMLmods 
posted an update about 1 month ago
view post
Post
2298
Bringing out style-intermixing adapters for Flux.Dev, including Aura Glow, Fallen Ink Art, Cardboard Paper Arts, Black & White Expressions, and Glitter Gem Touch. For more details, visit the model card of the LoRA. 🥳

╰┈➤Demo : prithivMLmods/FLUX-LoRA-DLC2 & prithivMLmods/FLUX-LoRA-DLC

╰┈➤ Adapters :
+ Aura Glow : strangerzonehf/2DAura-Flux
+ Fallen Ink Art : strangerzonehf/FallenArt-Flux
+ Black & White Expressions : strangerzonehf/BnW-Expressions-Flux
+ Glitter Gem Touch : strangerzonehf/Gem-Touch-LoRA-Flux
+ Cardboard Paper Arts v1 : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Cardboard Paper Arts v2 : strangerzonehf/Cardboard-v2-Flux

╰┈➤ Pages :
- Repository Page : strangerzonehf
- Collection : strangerzonehf/mixer-adp-042025-68095c365d9d1072c8d860be
- Flux Ultimate LoRA Collection : strangerzonehf/Flux-Ultimate-LoRA-Collection
- By prithivMLmods : @prithivMLmods

The best dimensions and inference settings for optimal results are as follows: A resolution of 1280 x 832 with a 3:2 aspect ratio is recommended for the best quality, while 1024 x 1024 with a 1:1 aspect ratio serves as the default option. For inference, the recommended number of steps ranges between 30 and 35 to achieve optimal output.
prithivMLmods 
posted an update about 1 month ago
view post
Post
1283
Dropping the domain-specific downstream image classification content moderation models, including the anime image type classification, GeoSceneNet, indoor-outdoor scene classification, and black-and-white vs. colored image classification models, along with the datasets. 🔥

╰┈➤Models :
+ GeoSceneNet : prithivMLmods/Multilabel-GeoSceneNet
+ IndoorOutdoorNet : prithivMLmods/IndoorOutdoorNet
+ B&W vs Colored : prithivMLmods/BnW-vs-Colored-Detection
+ Anime Image Type : prithivMLmods/Anime-Classification-v1.0
+ Multilabel Portrait : prithivMLmods/Multilabel-Portrait-SigLIP2

╰┈➤Datasets :
- GeoSceneNet : prithivMLmods/Multilabel-GeoSceneNet-16K
- IndoorOutdoorNet : prithivMLmods/IndoorOutdoorNet-20K
- BnW vs Colored : prithivMLmods/BnW-vs-Colored-10K
- Multilabel Portrait : prithivMLmods/Multilabel-Portrait-18K

╰┈➤Collections :
> Multilabel Image Classification Datasets : prithivMLmods/multilabel-image-classification-datasets-6809aa64637f45d4c47fa6ca
> Model Collection : prithivMLmods/siglip2-content-filters-models-v2-68053a958c42ef17a3a3f4d1

Note: The anime scene type dataset is not mentioned in the list because it is private and only accessible to members of the DeepGHS organization.

For raw ZIP files or more information about the datasets, visit: https://www.kaggle.com/prithivsakthiur/datasets
  • 1 reply
·
prithivMLmods 
posted an update about 1 month ago
view post
Post
2901
Dropping an entire collection of Style Intermixing Adapters on StrangerZone HF — including Realism, Anime, Sketch, Texture-Rich 3D Experimentals, Automotive Concept Images, and LoRA models based on Flux.1, SD 3.5 Turbo/Large, Stable Diffusion XL 🎨

╰┈➤Collection :
➜ sketch : strangerzonehf/sketch-fav-675ba869c7ceaec7e652ee1c
➜ sketch2 : strangerzonehf/q-series-sketch-678e3503bf3a661758429717
➜ automotive : strangerzonehf/automotive-3d-675bb31a491d8c264d45d843
➜ texture 3d : strangerzonehf/flux-3dxl-engine-674833c14a001d5b1fdb5139
➜ super 3d : strangerzonehf/super-3d-engine-6743231d69f496df97addd2b
➜ style mix : strangerzonehf/mixer-engine-673582c9c5939d8aa5bf9533
➜ realism : strangerzonehf/realism-engine-67343495b6daf0fbdb904cc1

╰┈➤The Entire Collection :
➜ flux.1 : prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be
➜ flux-ultimate-lora-collection : strangerzonehf/Flux-Ultimate-LoRA-Collection
➜ sd 3.5 large / turbo : prithivMLmods/sd-35-large-lora-671b39d7bc2e7f71a446b163
➜ sdxl : prithivMLmods/sdxl-dev-models-667803a6d5ac75b59110e527

╰┈➤Pages :
➜ page 1: strangerzonehf
➜ page 2: @prithivMLmods
➜ demo : prithivMLmods/FLUX-LoRA-DLC

.🤗
prithivMLmods 
posted an update about 1 month ago
view post
Post
2612
Try out the demo for Multimodal OCR featuring the implementation of models including RolmOCR and Qwen2VL OCR. The use case showcases image-text-to-text conversion and video understanding support for the RolmOCR model ! 🚀

🤗Multimodal OCR Space : prithivMLmods/Multimodal-OCR

📦The models implemented in this Space are:
+ Qwen2VL OCR : prithivMLmods/Qwen2-VL-OCR-2B-Instruct [ or ]
+ Qwen2VL OCR2 : prithivMLmods/Qwen2-VL-OCR2-2B-Instruct
+ RolmOCR : reducto/RolmOCR

Qwen2VL OCR supports only image-text-to-text in the space.
prithivMLmods 
posted an update about 2 months ago
view post
Post
3384
Loaded some domain-specific downstream image classification content moderation models, which is essentially the practice of monitoring and filtering user-generated content on platforms, based on SigLIP-2 Base Patch16 with newly initialized trainable parameters. 🥠

+ Age-Classification-SigLIP2 : prithivMLmods/Age-Classification-SigLIP2
[ Age range classification from 0 to 65+ years ]
+ Facial-Emotion-Detection-SigLIP2 : prithivMLmods/Facial-Emotion-Detection-SigLIP2
[ Designed to classify different facial emotions ]
+ Hand-Gesture-2-Robot : prithivMLmods/Hand-Gesture-2-Robot
[ Human Hand Gesture Classification for Robot Control ]
+ Mature-Content-Detection : prithivMLmods/Mature-Content-Detection
[ Mature [adult] or neutral content categories ]
+ Vit-Mature-Content-Detection : prithivMLmods/Vit-Mature-Content-Detection
[ Mature [adult] or neutral content categories ft. ViT]
+ Human-Action-Recognition : prithivMLmods/Human-Action-Recognition
[ Human actions including clapping, sitting, running, and more ]
+ Mirage-Photo-Classifier : prithivMLmods/Mirage-Photo-Classifier
[ Whether an image is real or AI-generated (fake) ]
+ Food-101-93M : prithivMLmods/Food-101-93M
[ Classify food images into one of 101 popular dishes ]
+ Hand-Gesture-19 : prithivMLmods/Hand-Gesture-19
[ Classify hand gesture images into different categories ]
+ Trash-Net : prithivMLmods/Trash-Net
[ Classification of trash into six distinct categories ]
+ Gender-Classifier-Mini : prithivMLmods/Gender-Classifier-Mini
[ Classify images based on gender [Male / Female] ]

🎡Collections :

+ SigLIP2 Content Filters : https://huggingface.co/collections/prithivMLmods/siglip2-content-filters-models-67f001055ec2bed56ca41f6d
AtAndDev 
posted an update about 2 months ago
view post
Post
2993
Llama 4 is out...
·