Burning ray

adarksky

aeryskyB

AI & ML interests

None yet

Organizations

adarksky's activity

updated a model 3 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated about 9 hours ago • 22.6k • 1.9k

New activity in hexgrad/Kokoro-82M 3 days ago

Update kokoro.py

#43 opened 3 days ago by

liked a model 4 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated about 9 hours ago • 22.6k • 1.9k

liked a model 22 days ago

deepseek-ai/Janus-1.3B

Any-to-Any • Updated Nov 14, 2024 • 11.2k • 508

reacted to merve's post with 🔥 about 2 months ago

Post

2665

small but mighty 🔥
You can fine-tune SmolVLM on an L4 with a batch size of 4, and it takes only 16.4 GB of VRAM 🫰🏻 With gradient accumulation, the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, gradient checkpointing with explanations on how they work 💝 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb

liked a model about 2 months ago

Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • Updated 6 days ago • 203k • 1.49k

updated a model 2 months ago

adarksky/pokemon-DDPM

Unconditional Image Generation • Updated Nov 11, 2024 • 5

liked a model 2 months ago

tencent/Tencent-Hunyuan-Large

Text Generation • Updated Nov 24, 2024 • 61 • 545

updated a model 2 months ago

adarksky/bart-base-rel-therapy

Text2Text Generation • Updated Nov 11, 2024 • 7

liked a dataset 2 months ago

ChilleD/SVAMP

Viewer • Updated Jun 5, 2024 • 1k • 749 • 8

liked 4 datasets 3 months ago

deepmind/math_dataset

Updated Jan 18, 2024 • 3.3k • 110

deepmind/code_contests

Viewer • Updated Jun 11, 2023 • 4.04k • 6.78k • 130

Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 7.1k • 1.25k

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 173k • 492

liked a model 3 months ago

Mozilla/Llama-3.2-3B-Instruct-llamafile

Updated 13 days ago • 1.33k • 46

upvoted a collection 4 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated 12 days ago • 293

liked a model 4 months ago

Vchitect/Vchitect-2.0-2B

Text-to-Video • Updated Sep 15, 2024 • 33 • 37

upvoted a paper 4 months ago

Robust Dual Gaussian Splatting for Immersive Human-centric Volumetric Videos

Paper • 2409.08353 • Published Sep 12, 2024 • 11

liked a dataset 5 months ago

huggan/pokemon

Viewer • Updated Apr 1, 2022 • 7.36k • 57 • 20

liked a model 6 months ago

meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25, 2024 • 5.73M • 3.47k