Troy Baker (jtroybaker)
0 followers · 19 following
AI & ML interests
Predictive Maintenance, Reinforcement Learning, Natural Language Processing
Recent Activity

Reacted to burtenshaw's post with 👍 (about 12 hours ago):
Qwen 3 fine-tuning >> MoE. Updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model. The goal is to make a low-latency, non-thinking model for daily-driver coding, so 3 billion active parameters should be perfect. ✔️ training running ✔️ evals running ⏭️ improve dataset. The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI ), so I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow. https://huggingface.co/burtenshaw/Qwen3-Code-Lite/discussions/1
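The "3 billion parameters active" figure in the post comes from how top-k mixture-of-experts routing works: only a few experts run per token, so the compute-relevant parameter count is far below the 30B total. A minimal back-of-the-envelope sketch — the expert/shared parameter split below is an assumed ballpark, not taken from the model card, though the 128-expert / 8-active routing matches Qwen3-30B-A3B's published configuration:

```python
def active_params(total_expert_params, shared_params, n_experts, n_active):
    """Parameters actually used per token in a top-k MoE:
    all shared weights, plus the routed fraction of expert weights."""
    return shared_params + total_expert_params * n_active / n_experts

# Assumed split for a ~30B MoE: most weight lives in the expert FFNs.
per_token = active_params(
    total_expert_params=28e9,  # assumed: bulk of the 30B sits in experts
    shared_params=1.5e9,       # assumed: attention, embeddings, router
    n_experts=128,             # Qwen3-30B-A3B routes over 128 experts
    n_active=8,                # 8 experts are active per token
)
print(f"~{per_token / 1e9:.2f}B active parameters per token")
```

With these assumed numbers the estimate lands near the advertised ~3B active, which is why the model can serve low-latency coding requests despite its 30B footprint.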
Liked a model (about 22 hours ago): Wan-AI/Wan2.1-VACE-14B
Liked a model (1 day ago): DavidAU/Qwen3-30B-A6B-16-Extreme
jtroybaker's activity
Upvoted a paper (about 1 month ago): YourBench: Easy Custom Evaluation Sets for Everyone · Paper 2504.01833 · Published Apr 2 · 20 upvotes
Upvoted a collection (7 months ago): MIT Talk 31/10 Papers · Collection · 14 items · Updated Oct 28, 2024 · 32 upvotes
Upvoted a collection (10 months ago): Gemma 2 2B Release · Collection: The 2.6B parameter version of Gemma 2 · 6 items · Updated Apr 3 · 79 upvotes