Hugging Face
Ashish Sadhwani
2ashishs
0 followers · 5 following
AI & ML interests
None yet
Recent Activity
liked a model 18 days ago
nanonets/Nanonets-OCR-s
replied to a-r-r-o-w's post 23 days ago
Recently, I've been focusing my learning on the following topics:
- PyTorch internals, specifically the Inductor system (roughly 1 month of experience)
- Triton internals (~8 months)
- CUDA (~3 months)
- Understanding fusion patterns in compilers and how to improve them (~1 month)
- Parallelism strategies for large-scale inference optimization (~6-7 months)

I thought it would be nice to document this somewhere, for no particular reason. Maybe someone will find it useful? It's also because I want to get into the habit of writing but have had no motivation to do so. Maybe writing short, informal posts will help build the habit.

Since I don't have a personal site and don't plan to create one in the near future, I think HF posts are best suited for short, informal documentation of my little discoveries and learnings. If you're interested, strap in!

The first post in this series will be a basic study of PyTorch's float32 matmuls and their Triton implementation (nothing much, just the tutorial available on the website), plus a short dive into TF32 and a TFLOPS comparison on an A100 machine.
upvoted an article about 1 month ago
Turning Home Assistant into an AI Powerhouse: Amy's Guide
Organizations
None yet
2ashishs's activity
liked a model 18 days ago
nanonets/Nanonets-OCR-s
Image-Text-to-Text • 4B • Updated 16 days ago • 266k • 1.34k
liked a model over 1 year ago
tiiuae/falcon-7b-instruct
Text Generation • 7B • Updated Oct 12, 2024 • 139k • 989