Mariusz Kurman

mkurman

AI & ML interests

AI Tech Lead | MD

Recent Activity

liked a dataset 21 days ago
UCSC-VLAA/MedReason
liked a dataset about 1 month ago
FreedomIntelligence/medical-o1-reasoning-SFT
liked a dataset about 1 month ago
ThinkAgents/Function-Calling-with-Chain-of-Thoughts

Organizations

MedIT Solutions, BigScience Biomedical Datasets, SOWA Project

Posts 17

Just released NVAMP Loss!

āœ”ļø modification of the cross-entropy loss function designed specifically for training LLMs.
āœ”ļø twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance.
āœ”ļø more stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss