Tien Dung
tiendung
13 followers · 114 following
tiendung
AI & ML interests
None yet
Recent Activity
liked a model, 27 days ago
SparseLLM/BlockFFN-3B-SFT
liked a model, about 1 month ago
turboderp/ERNIE-4.5-300B-A47B-PT-exl3
reacted to Jaward's post with 😎, about 1 month ago
I played around with the new RXTX paper (XX^T) and was able to train nanogpt with 4x4 RXTX matmuls in both the attention layer and the optimizer 🤕 It just works (well, I had to add some guardrails) but still saves 5% of memory usage.
The Patch:
- Computes attention scores with 4x4 blockwise RXTX matmuls (no PyTorch dot product).
- Handles arbitrary sequence lengths by padding to the nearest multiple of 4.
- An RXTX variant of Shampoo with params reshaped into 4x4 blocks during each optimizer step.
- Uses 5% fewer ops.
Code: https://github.com/Jaykef/ai-algorithms/blob/main/nanogpt-rxtx.ipynb
Paper: https://arxiv.org/pdf/2505.09814
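To make the padding and 4x4 block partition mentioned in the post concrete, here is a minimal pure-Python sketch. It is not the RXTX algorithm from the paper and not the code in the linked notebook: it only zero-pads a matrix so both dimensions are multiples of 4, splits it into a 4x4 grid of blocks, and computes XX^T block by block while reusing symmetry. The function names `pad_to_multiple` and `blockwise_gram` are hypothetical and chosen for illustration.

```python
def pad_to_multiple(matrix, block=4):
    """Zero-pad a non-empty rectangular matrix (list of lists) so that
    both dimensions become multiples of `block`."""
    rows, cols = len(matrix), len(matrix[0])
    padded_rows = -(-rows // block) * block  # ceiling to a multiple of block
    padded_cols = -(-cols // block) * block
    out = [list(row) + [0.0] * (padded_cols - cols) for row in matrix]
    out += [[0.0] * padded_cols for _ in range(padded_rows - rows)]
    return out


def blockwise_gram(matrix, grid=4):
    """Compute X @ X^T over a grid x grid block partition of X.

    Illustrative assumption: plain blockwise products, NOT the RXTX
    recursion. Only block pairs (bi, bj) with bi <= bj are computed;
    the rest is filled in by symmetry.
    """
    rows = len(matrix)
    x = pad_to_multiple(matrix, grid)
    n, m = len(x), len(x[0])
    bh, bw = n // grid, m // grid  # block height and width
    gram = [[0.0] * n for _ in range(n)]
    for bi in range(grid):
        for bj in range(bi, grid):  # upper-triangular block pairs only
            for r in range(bi * bh, (bi + 1) * bh):
                for c in range(bj * bh, (bj + 1) * bh):
                    total = 0.0
                    for bk in range(grid):  # sum over column blocks
                        for k in range(bk * bw, (bk + 1) * bw):
                            total += x[r][k] * x[c][k]
                    gram[r][c] = total
                    gram[c][r] = total  # mirror into the lower triangle
    # Drop the rows and columns that exist only because of padding.
    return [row[:rows] for row in gram[:rows]]


if __name__ == "__main__":
    print(blockwise_gram([[1, 2, 3], [4, 5, 6]]))
```

Zero padding leaves the product unchanged on the original rows, which is why the padded rows and columns can simply be cropped at the end.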
Organizations
tiendung's datasets (3)
tiendung/cc-vi_truyen-filters • Preview • Updated Oct 3, 2023 • 3
tiendung/cc-vi_domains • Preview • Updated Sep 21, 2023 • 3
tiendung/chai • Viewer • Updated Sep 15, 2023 • 70.8k • 21