Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 176
Article How to Build an MCP Server with Gradio By abidlabs and 1 other • Apr 30 • 176
TxGemma Release Collection Collection of open models to accelerate the development of therapeutics. • 5 items • Updated May 30 • 60
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated May 30 • 202
HoneyBee Collection A collection of public multimodal oncology datasets generated using the HoneyBee framework. • 6 items • Updated Aug 15, 2024 • 4
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 386
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 150
Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 662
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31, 2024 • 550
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • Jul 29, 2024 • 347
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 11 items • Updated May 1 • 61
Article Predicting the Effects of Mutations on Protein Function with ESM-2 By AmelieSchreiber • Dec 13, 2023 • 19
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023 • 60
Article A Guide to Designing New Functional Proteins and Improving Protein Function, Stability, and Diversity with Generative AI By AmelieSchreiber • Jul 2, 2024 • 45
Article Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4, 2024 • 79