Sayak Paul
sayakpaul
AI & ML interests
Diffusion models, representation learning
Recent Activity
commented on a paper about 6 hours ago:
Radial Attention: $O(n \log n)$ Sparse Attention with Energy Decay for Long Video Generation
upvoted a paper about 6 hours ago:
Radial Attention: $O(n \log n)$ Sparse Attention with Energy Decay for Long Video Generation
reacted to burtenshaw's post with ❤️ about 9 hours ago:
Inference for generative AI models looks like a minefield, but there’s a simple protocol for picking the best inference setup:
🌍 95% of users >> If you’re using open (large) models and need fast online inference, use Inference Providers in auto mode and let it choose the best provider for the model (a minimal sketch follows after this post). https://huggingface.co/docs/inference-providers/index
👷 Fine-tuners / bespoke >> If you’ve got a custom setup, use Inference Endpoints to define a configuration on AWS, Azure, or GCP. https://endpoints.huggingface.co/
🦫 Locals >> If you’re trying to stretch everything you can out of a server or local machine, use llama.cpp, Jan, LM Studio, or vLLM (see the vLLM sketch below). https://huggingface.co/settings/local-apps#local-apps
🪟 Browsers >> If you need open models running right here in the browser, use transformers.js. https://github.com/huggingface/transformers.js
Let me know what you’re using, and if you think it’s more complex than this.
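A minimal sketch of the first option, assuming the `huggingface_hub` Python client is installed and an `HF_TOKEN` is set in the environment; the model ID is only an example:

```python
# Option 1 sketch: Inference Providers in auto mode.
# provider="auto" lets Hugging Face route the request to the best
# available provider for the chosen model.
from huggingface_hub import InferenceClient

client = InferenceClient(provider="auto")

response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model ID
    messages=[{"role": "user", "content": "Explain sparse attention in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```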
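And for the local option, a minimal sketch using vLLM’s offline Python API, assuming `vllm` is installed and the GPU can hold the example model:

```python
# Option 3 sketch: local inference with vLLM's offline API.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")  # example model; swap in your own
params = SamplingParams(temperature=0.7, max_tokens=128)

# generate() takes a list of prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["Explain quantization in one sentence."], params)
print(outputs[0].outputs[0].text)
```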