Social Post Explorers


Recent Activity

social-post-explorers's activity

MonsterMMORPG 
posted an update 1 day ago
I am doing workflow research for a company, and our Ultimate Image Processing tool is proving extremely helpful. You can auto zoom / crop to a desired aspect ratio using prompts (like "a shoe") via SAM2, which is included in our batch processing app.
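
As a rough illustration of the crop step only, here is a minimal sketch that turns a bounding box into a crop rectangle with a desired aspect ratio. It assumes the box around the prompted object (e.g. the shoe) has already been obtained from a segmentation mask; the function name `crop_to_aspect` and its parameters are hypothetical and are not taken from the app or from SAM2.

```python
def crop_to_aspect(box, image_size, aspect_w, aspect_h):
    """Return (left, top, right, bottom) of a crop with the requested aspect
    ratio, centered on `box` and kept inside the image.

    `box` is (x0, y0, x1, y1) in pixels and is assumed to be non-empty.
    This is an illustrative helper, not the app's actual implementation.
    """
    x0, y0, x1, y1 = box
    img_w, img_h = image_size
    ratio = aspect_w / aspect_h
    box_w, box_h = x1 - x0, y1 - y0

    # Grow the box along one axis until it matches the target ratio.
    if box_w / box_h >= ratio:
        crop_w, crop_h = box_w, box_w / ratio
    else:
        crop_w, crop_h = box_h * ratio, box_h

    # If the crop would be larger than the image, shrink it to fit
    # (in that case part of the object may be cut off).
    scale = min(1.0, img_w / crop_w, img_h / crop_h)
    crop_w, crop_h = crop_w * scale, crop_h * scale

    # Center on the box, then shift the window back inside the image.
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    left = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), img_h - crop_h)

    left, top = round(left), round(top)
    # Clamp the far edges so rounding can never step outside the image.
    right = min(left + round(crop_w), img_w)
    bottom = min(top + round(crop_h), img_h)
    return left, top, right, bottom


# A tall object in a 1920x1080 image, cropped to a square.
print(crop_to_aspect((800, 300, 1000, 700), (1920, 1080), 1, 1))
```

The shorter side of the box is extended rather than the longer side trimmed, so the whole object stays in frame whenever the image is large enough.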

Gradio based App link : https://www.patreon.com/posts/120352012


MonsterMMORPG 
posted an update 4 days ago
Extending a Wan 2.1 generated video: first 14B 720p text-to-video, then automatically using the last frame to generate a continuation with 14B 720p image-to-video. With RIFE at 32 FPS, the result is a 10-second 1280x720 video.

Our app has this fully automated : https://www.patreon.com/posts/123105403

Here is an image showing how it works : https://ibb.co/b582z3R6

Workflow is easy

Use your favorite app to generate initial video.

Get last frame

Give last frame to image to video model - with matching model and resolution

Generate

And merge

Then use MMAudio to add sound

I automated this in my Wan 2.1 app, but it can easily be done with ComfyUI as well. I can extend the video as many times as I want :)
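
The steps above can be sketched as a small planner. This is only an illustration under stated assumptions: it uses `ffmpeg` for the "get last frame" and "merge" steps (the post does not say which tool the app uses), it leaves the image-to-video step as a placeholder entry, and all function and file names are hypothetical. The code only builds the commands; it does not run them.

```python
from pathlib import Path


def last_frame_cmd(video, frame_png):
    # Seek to 1 second before the end and keep overwriting a single image,
    # so the file left on disk is the final frame of the clip.
    return ["ffmpeg", "-y", "-sseof", "-1", "-i", Path(video).as_posix(),
            "-update", "1", Path(frame_png).as_posix()]


def concat_list(clips):
    # Input file for ffmpeg's concat demuxer: one "file '<path>'" line per clip.
    return "".join(f"file '{Path(c).as_posix()}'\n" for c in clips)


def concat_cmd(list_file, output):
    # Stream copy avoids re-encoding; it expects clips that share codec and
    # resolution, which is why the workflow asks for a matching model/resolution.
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", Path(list_file).as_posix(), "-c", "copy", Path(output).as_posix()]


def plan_extension(initial_clip, n_extensions, workdir="work"):
    """Return (steps, concat_list_text) for extending a clip n times."""
    work = Path(workdir)
    clips = [Path(initial_clip)]
    steps = []
    for i in range(1, n_extensions + 1):
        frame = work / f"last_frame_{i}.png"
        next_clip = work / f"clip_{i}.mp4"
        steps.append(("extract", last_frame_cmd(clips[-1], frame)))
        # Placeholder: hand `frame` to whatever image-to-video tool you use.
        steps.append(("i2v", {"image": frame.as_posix(), "output": next_clip.as_posix()}))
        clips.append(next_clip)
    # The list file lives in the current directory because relative entries
    # are resolved relative to the list file's own location.
    steps.append(("merge", concat_cmd("clips.txt", "extended.mp4")))
    return steps, concat_list(clips)


steps, listing = plan_extension("clip0.mp4", 2)
for name, detail in steps:
    print(name, detail)
print(listing)
```

Each extension repeats the same extract-then-generate pair, so extending "as many times as I want" is just a larger `n_extensions`; the sound step with MMAudio would run on the merged file afterwards.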

Here is the initial video

Prompt: Close-up shot of a Roman gladiator, wearing a leather loincloth and armored gloves, standing confidently with a determined expression, holding a sword and shield. The lighting highlights his muscular build and the textures of his worn armor.

Negative Prompt: Overexposure, static, blurred details, subtitles, paintings, pictures, still, overall gray, worst quality, low quality, JPEG compression residue, ugly, mutilated, redundant fingers, poorly painted hands, poorly painted faces, deformed, disfigured, deformed limbs, fused fingers, cluttered background, three legs, a lot of people in the background, upside down

Used Model: WAN 2.1 14B Text-to-Video

Number of Inference Steps: 20

CFG Scale: 6

Sigma Shift: 10

Seed: 224866642

Number of Frames: 81

Denoising Strength: N/A

LoRA Model: None

TeaCache Enabled: True

TeaCache L1 Threshold: 0.15

TeaCache Model ID: Wan2.1-T2V-14B

Precision: BF16

Auto Crop: Enabled

Final Resolution: 1280x720

Generation Duration: 770.66 seconds
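
For reference, the settings above can be captured in a small config object, with some quick arithmetic on top. This is a sketch, not the app's config format: the class and field names are hypothetical, and the 16 FPS base frame rate is an assumption that is not stated in the post (it is chosen because a 2x RIFE interpolation of 16 FPS gives the 32 FPS mentioned).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    # Values copied from the post; field names are illustrative only.
    model: str = "WAN 2.1 14B Text-to-Video"
    steps: int = 20
    cfg_scale: float = 6.0
    sigma_shift: float = 10.0
    seed: int = 224866642
    num_frames: int = 81
    width: int = 1280
    height: int = 720
    duration_s: float = 770.66


ASSUMED_BASE_FPS = 16  # assumption, not given in the post


def seconds_per_step(s):
    # Average wall time per inference step; this includes any loading or
    # decoding overhead, since only the total duration is reported.
    return s.duration_s / s.steps


def clip_seconds(num_frames, fps):
    return num_frames / fps


def interpolated_frames(num_frames, factor=2):
    # Frame interpolation inserts (factor - 1) new frames between each pair.
    return (num_frames - 1) * factor + 1


s = GenerationSettings()
print(f"{seconds_per_step(s):.2f} s per step")
print(f"{clip_seconds(s.num_frames, ASSUMED_BASE_FPS):.2f} s per 81-frame clip")

# Two 81-frame clips joined, then interpolated 2x and played at 32 FPS.
total = interpolated_frames(2 * s.num_frames)
print(f"{total} frames, {total / 32:.2f} s at 32 FPS")
```

Under that assumption, one clip is about 5 seconds, so the text-to-video clip plus one image-to-video extension comes out at roughly the 10 seconds described.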



MonsterMMORPG 
posted an update 5 days ago
MMAudio Full Tutorial - Open Source AI Audio Generator for Videos - Useful for Games and AI Videos

Full tutorial link : https://youtu.be/504f8S4MLTw

GitHub repo : https://github.com/hkchengrex/MMAudio

MMAudio is currently the state-of-the-art (SOTA) open-source, free-to-use AI model for generating sounds from videos, images and text prompts. It is amazing, high quality and extremely useful for generating sound effects for your AI videos, game assets, or any project where you need specific or free sound effects. In this step-by-step tutorial I will show you how to install and use this model on your Windows computer with a 1-click installation and an extremely easy-to-use Gradio App. My app and installer support RTX 5000 series GPUs as well as older GPUs. Moreover, I am sharing scripts for 1-click installation on cloud services such as RunPod and Massed Compute, plus a free Kaggle account notebook. Enjoy.

🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-117990364

🔗 Mandatory Requirements Tutorial⤵️
▶️ https://youtu.be/DrhUHnYfwC0

Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

MMAudio generates synchronized audio given video and/or text inputs. Our key innovation is multimodal joint training which allows training on a wide range of audio-visual and audio-text datasets. Moreover, a synchronization module aligns the generated audio with the video frames.

MonsterMMORPG 
posted an update 7 days ago
Prepared presets for Wan 2.1 for every model and GPU with modelscope / DiffSynth-Studio - Works with maximum speed as long as you are not using more than 2 GB VRAM - Compared BF16 vs FP8 as well

Our app tutorial main : https://youtu.be/hnAhveNy-8s

2nd tutorial : https://youtu.be/ueMrzmbdWBg

Our App : https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-123105403

Also our App now has fully updated presets for every GPU both for BF16 and FP8 precision
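
To give a feel for why BF16 and FP8 call for different presets, here is a back-of-the-envelope estimate of weight memory alone. It is a sketch under assumptions: it uses the nominal 14B and 1.3B parameter counts of the Wan 2.1 models, counts 2 bytes per parameter for BF16 and 1 byte for FP8, and ignores the text encoder, VAE, activations and any offloading the app performs, so real VRAM usage will differ.

```python
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gib(params_billion, precision):
    """Approximate size of the model weights in GiB (weights only)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 2**30


for size in (1.3, 14):
    for precision in ("bf16", "fp8"):
        print(f"{size}B {precision}: {weight_gib(size, precision):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is the main reason an FP8 preset can fit on a GPU where the BF16 one would need to offload more.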
MonsterMMORPG 
posted an update 11 days ago
I just pushed another amazing update to our Wan 2.1 APP. LoRA loading for the 14B Wan 2.1 models was taking over 15 minutes. It is now optimized to take only a few seconds. The APP fully supports the RTX 5000 series and is fully optimized for both VRAM and RAM.

Our APP here : https://www.patreon.com/posts/wan-2-1-ultra-as-123105403

Tutorial 1 : https://youtu.be/hnAhveNy-8s

Tutorial 2 : https://youtu.be/ueMrzmbdWBg

It has also been pushed to the original repo; you can see the pull request here : https://github.com/modelscope/DiffSynth-Studio/pull/442

freddyaboulton 
posted an update 12 days ago
Privacy matters when talking to AI! 🔇

We've just added a microphone mute button to FastRTC in our latest update (v0.0.14). Now you control exactly what your LLM hears.

Plus lots more features in this release! Check them out:
https://github.com/freddyaboulton/fastrtc/releases/tag/0.0.14
MonsterMMORPG 
posted an update 13 days ago
Ultra Advanced Wan 2.1 App Updates & Famous Squish Effect to Generate Squishing Videos Locally : https://youtu.be/ueMrzmbdWBg

Tutorial Link : https://youtu.be/ueMrzmbdWBg

The Squish Effect LoRA has arrived for Wan 2.1. Wan 2.1 is a truly state-of-the-art (SOTA) open-source video generation model that supports Text to Video (T2V), Video to Video (V2V) and Image to Video (I2V). Our ultra advanced 1-Click Gradio application now supports LoRAs, and today I will show you all the new developments in our Wan 2.1 all-in-one video generation Gradio App. We have added so many new features since the original Wan 2.1 step-by-step tutorial, and we continue to improve our App on a daily basis with amazing updates.

If you want to make "Squish it" style AI squish video art locally, for free, forever, our app, the Squish LoRA and Wan 2.1 are all you need. Watch this tutorial to learn everything. Moreover, this tutorial will show you the majority of the newest features we have implemented over 10 days of non-stop work.

Hopefully many more updates coming soon.
Undi95 
posted an update 19 days ago
Hi there!

If you want to create your own thinking model or build a better MistralThinker, I just made public my entire dataset, built with DeepSeek R1, along with the axolotl config.

Axolotl config : Undi95/MistralThinker-v1.1

The dataset : Undi95/R1-RP-ShareGPT3

You can also read about everything I did in these two Discord screenshots from two days ago; I'm a little too lazy to rewrite it all, kek.

Hope you will use them!
MonsterMMORPG 
posted an update 21 days ago
Wan 2.1 AI Video Model: Ultimate Step-by-Step Tutorial for Windows & Affordable Private Cloud Setup : https://youtu.be/hnAhveNy-8s

Please check all screenshots to see latest news and updates after the tutorial video

Alibaba's new open-source Wan 2.1 text-to-video, video-to-video and image-to-video AI is unbelievable. In this tutorial I will show how you can install all publicly released Wan 2.1 models on your Windows PC with a 1-click installation and use them in the easiest possible way. With the Gradio APP I have developed, you will be able to use Wan AI on GPUs with as little as 3.5GB VRAM. Furthermore, for those who want to use powerful private cloud GPUs at the cheapest possible prices, I will show how to 1-click install Wan 2.1 on Massed Compute and on RunPod. Additionally, I will compare the performance of the RTX 3090 TI with the RTX 5090 on all Wan 2.1 models. You will be shocked to see the performance of the RTX 5090. The APP I developed also supports all RTX 5000 series GPUs on Windows natively with a Python VENV. You don't need Linux or WSL.

🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-123105403

🔗 SECourses Official Discord 9500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 SECourses Official Reddit - Stay Subscribed To Learn All The News and More ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 MSI RTX 5090 TRIO FurMark Benchmarking + Overclocking + Noise Testing and Comparing with RTX 3090 TI ⤵️
▶️ https://youtu.be/uV3oqdILOmA

🔗 RTX 5090 Tested Against FLUX DEV, SD 3.5 Large, SD 3.5 Medium, SDXL, SD 1.5, AMD 9950X + RTX 3090 TI ⤵️
▶️ https://youtu.be/jHlGzaDLkto
MonsterMMORPG 
posted an update 26 days ago
Wan 2.1 Ultra Advanced Gradio APP - Works with as little as 4GB VRAM - 1-Click Installers for Windows, RunPod, Massed Compute - Batch Processing - T2V - I2V - V2V

Installer and APP : https://www.patreon.com/posts/123105403

Download from here : https://www.patreon.com/posts/123105403

I have been working 14 hours today to make this APP before sleeping for you guys :)

We have all the features of Wan 2.1 model

Text to Video 1.3B (as low as 3.5 GB VRAM) - Really fast - 480x832px or 832x480px

Video to Video 1.3B (as low as 3.5 GB VRAM) - Really fast - 480x832px or 832x480px

Text to Video 14B (as low as 17 GB VRAM) - may still work with less VRAM, but slower - 720x1280px or 1280x720px

Image to Video 14B (as low as 17 GB VRAM) - may still work with less VRAM, but slower - 720x1280px or 1280x720px

In the images above and below, the first video is animated from the input image with the following prompt:

A hooded wraith stands motionless in a torrential downpour, lightning cracking across the stormy sky behind it. Its face is an impenetrable void of darkness beneath the tattered hood. Rain cascades down its ragged, flowing cloak, which appears to disintegrate into wisps of shadow at the edges. The mysterious figure holds an enormous sword of pure energy, crackling with electric blue lightning that pulses and flows through the blade like liquid electricity. The weapon drags slightly on the wet ground, sending ripples of power across the puddles forming at the figure's feet. Three glowing blue gems embedded in its chest pulse in rhythm with the storm's lightning strikes, each flash illuminating the decaying, ancient fabric of its attire. The rain intensifies around the figure, droplets seemingly slowing as they near the dark entity, while forks of lightning repeatedly illuminate its imposing silhouette. The atmosphere grows heavier with each passing moment as the wraith slowly raises its crackling blade, the blue energy intensifying and casting eerie shadows

freddyaboulton 
posted an update 27 days ago
Getting WebRTC and WebSockets right in Python is very tricky. If you've tried to wrap an LLM in a real-time audio layer, then you know what I'm talking about.

That's where FastRTC comes in! It makes WebRTC and WebSocket streams super easy with minimal code and overhead.

Check out our org: hf.co/fastrtc