ligeng-dev/tw-data-train_final_v2_nb2_mt8192_replaced_fix-8node-resume Text Generation • 8B • Updated about 10 hours ago
ligeng-dev/tw-data-train_classified-8node-resume Text Generation • 8B • Updated about 10 hours ago
ligeng-dev/tw-data-train_final_replaced_from_classified-fix-format-8node-resume Text Generation • 8B • Updated about 11 hours ago
ligeng-dev/q3-8b-train_final_v2_nb2_mt8192_replaced_fix Text Generation • 8B • Updated 2 days ago • 159
ligeng-dev/Q3-8B-131072-sft-1x-20260331_091938 Text Generation • 8B • Updated 13 days ago • 895
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024 • 52
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Paper • 2409.04429 • Published Sep 6, 2024
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers Paper • 2410.10629 • Published Oct 14, 2024 • 13