It now includes:
- A live stream of the progress being made on the task (see the included video)
- The following components (a minimal sketch of how they might fit together follows below):
  1. Automatic prompt optimization
  2. An orchestrator that dynamically decides which agent to call, incorporating feedback from a human (human-in-the-loop)
  3. A coding agent to complete the task
  4. A code-reviewing agent that iteratively provides feedback to improve the code generated by the coding agent until it meets the required criteria and is approved
  5. A testing agent that tests the approved code or provides information on how to test it
  6. A documentation agent that produces documentation and a help message for the approved and tested code
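As a rough illustration only, here is a minimal Python sketch of how such a pipeline could be wired together; every name in it (coding_agent, review_agent, testing_agent, documentation_agent, orchestrate) is a hypothetical placeholder, not the project's actual API.

```python
# Minimal sketch of the agent pipeline described above.
# All agent functions are stubs standing in for LLM calls.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Review:
    approved: bool
    feedback: str

def coding_agent(task: str, feedback: str = "") -> str:
    """Stub: would call an LLM to produce code for the task."""
    return f"# code for: {task}\n# incorporating feedback: {feedback}"

def review_agent(code: str) -> Review:
    """Stub: would call an LLM to critique the code against the criteria."""
    return Review(approved=True, feedback="Looks good.")

def testing_agent(code: str) -> str:
    """Stub: would run the approved code's tests or describe how to test it."""
    return "All tests passed (stub)."

def documentation_agent(code: str) -> str:
    """Stub: would generate documentation and a help message."""
    return "Usage: run the script with --help for options (stub)."

def orchestrate(task: str, human_feedback: Callable[[str], str], max_rounds: int = 3) -> dict:
    """Route the task through a coder/reviewer loop, then testing and docs."""
    code, review = "", Review(approved=False, feedback="")
    for _ in range(max_rounds):
        code = coding_agent(task, review.feedback)
        review = review_agent(code)
        if review.approved:
            break
        # Human-in-the-loop: let a person steer the next iteration.
        review.feedback += "\n" + human_feedback(code)
    return {
        "code": code,
        "tests": testing_agent(code),
        "docs": documentation_agent(code),
    }

if __name__ == "__main__":
    result = orchestrate("parse a CSV file", human_feedback=lambda code: "prefer the csv module")
    print(result["docs"])
```

In this sketch the reviewer loop bounds the number of revision rounds, and the human feedback hook is only consulted when the reviewer rejects the code.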
A team from Tsinghua University just released AndroidLab, the first systematic framework to evaluate and train Android mobile agents that works with both text-only and multimodal models.
They show that fine-tuning small open-source models can significantly boost performance, matching that of much bigger closed models like GPT-4o.
The team built:
- A reproducible benchmark with 138 tasks across 9 apps to evaluate mobile agents systematically
- A framework supporting both text-only (via XML) and visual (via marked screenshots) interfaces (see the sketch after this list)
- An instruction dataset of 10.5k operation traces for training mobile agents
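To make the dual-interface idea concrete, here is a minimal sketch of an environment that can expose either a text-only XML view or a marked screenshot to the agent; the class and method names (MobileEnv, observe, step) are hypothetical illustrations, not AndroidLab's actual API.

```python
# Sketch of a dual-mode observation interface for a mobile agent.
# Hypothetical names; not AndroidLab's real classes or methods.
from dataclasses import dataclass

@dataclass
class Observation:
    mode: str            # "xml" or "screenshot"
    payload: bytes | str

class MobileEnv:
    def __init__(self, mode: str = "xml"):
        if mode not in {"xml", "screenshot"}:
            raise ValueError("mode must be 'xml' or 'screenshot'")
        self.mode = mode

    def observe(self) -> Observation:
        if self.mode == "xml":
            # Text-only agents would receive the UI hierarchy as XML,
            # e.g. something like the output of `adb shell uiautomator dump`.
            return Observation("xml", "<hierarchy>...</hierarchy>")
        # Multimodal agents would receive a screenshot with interactive
        # elements visually marked so the model can ground its actions.
        return Observation("screenshot", b"\x89PNG...")

    def step(self, action: str) -> Observation:
        # The agent emits an action such as 'tap(12)' or 'type("hello")';
        # the environment would execute it on the device and return the
        # next observation. Here we just return a fresh observation.
        return self.observe()

# Example text-only loop: fetch the XML view, let an LLM pick an action,
# then step the environment with that action.
env = MobileEnv(mode="xml")
obs = env.observe()
obs = env.step('tap(3)')
```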
Key insights:
- Fine-tuning improves performance by a lot: the open-source Llama-3.1-8B goes from a 2% to a 24% success rate after training, nearly reaching GPT-4o performance despite being much smaller.
- Text-only agents match multimodal ones: XML-based agents achieve performance similar to screenshot-based multimodal agents.