lukaemon (Lucas Shen)

liked a model 2 months ago

unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF

31B • Updated Jul 31 • 70.1k • 242

liked a Space 8 months ago

The Ultra-Scale Playbook 🌌

The ultimate guide to training LLM on large GPU Clusters • 3.28k

liked a Space 10 months ago

Scaling test-time compute 📈

Implement test-time compute scaling for math problems • 585

liked 2 datasets about 1 year ago

stas/openwebtext-10k

Updated Sep 15, 2021 • 783 • 31

NeelNanda/pile-10k

Viewer • Updated Oct 14, 2022 • 10k • 8.41k • 23

liked a model about 1 year ago

google/gemma-scope

Updated Aug 29, 2024 • 183

liked 2 models over 1 year ago

microsoft/Phi-3-mini-128k-instruct

Text Generation • 4B • Updated Mar 2 • 360k • 1.68k

microsoft/Phi-3-mini-4k-instruct

Text Generation • 4B • Updated Sep 20, 2024 • 1.12M • 1.3k

liked a dataset over 1 year ago

HuggingFaceM4/OBELICS

Viewer • Updated Aug 22, 2023 • 276M • 4.6k • 159

liked 3 models over 1 year ago

apple/DFN5B-CLIP-ViT-H-14-378

apple/DFN2B-CLIP-ViT-L-14

1bitLLM/bitnet_b1_58-3B

liked 2 datasets over 1 year ago

kakaobrain/coyo-700m

Viewer • Updated Aug 30, 2022 • 747M • 1.93k • 151

vikhyatk/lnqa

Viewer • Updated Aug 18, 2024 • 303k • 1.24k • 86

liked 2 models over 1 year ago

stabilityai/stablelm-2-zephyr-1_6b

Text Generation • 2B • Updated Jun 3, 2024 • 4.74k • 186

stabilityai/stablelm-2-1_6b

Text Generation • 2B • Updated Jul 10, 2024 • 2.22k • 192

liked 3 datasets over 1 year ago

bigcode/the-stack-v2

Viewer • Updated Apr 23, 2024 • 5.45B • 6.07k • 412

data-is-better-together/10k_prompts_ranked

Viewer • Updated Mar 7, 2024 • 10.3k • 401 • 164

FreedomIntelligence/ALLaVA-4V

Viewer • Updated Jun 8 • 143k • 1.14k • 90

liked a model over 1 year ago

google/gemma-7b-it

Text Generation • 9B • Updated Aug 14, 2024 • 180k • 1.21k
