Cuiunbo PRO

Cuiunbo

AI & ML interests

Anything

Recent Activity

reacted to merve's post with šŸ¤— about 14 hours ago
Everything that happened this week in open AI, a recap šŸ¤  https://huggingface.co/collections/merve/jan-17-releases-678a673a9de4a4675f215bf5 šŸ‘€ Multimodal - MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB (vision, speech and text!) - VideoChat-Flash-Qwen2.5-2B is new video multimodal models by OpenGVLab that come in sizes 2B & 7B in resolutions 224 & 448 - ByteDance released larger SA2VA that comes in 26B parameters - Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance šŸ’¬ LLMs - MiniMax-Text-01 is a new huge language model (456B passive 45.9B active params) by MiniMaxAI with context length of 4M tokens šŸ¤Æ - Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B - kyutai released Helium-1-Preview-2B is a new small multilingual LM - Wayfarer-12B is a new LLM able to write D&D šŸ§™šŸ»ā€ā™‚ļø - ReaderLM-v2 is a new HTML parsing model by Jina AI - Dria released, Dria-Agent-a-3B, new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder - Unsloth released Phi-4, faster and memory efficient Llama 3.3 šŸ–¼ļø Vision - MatchAnything is a new foundation model for matching - FitDit is a high-fidelity VTON model based on DiT architecture šŸ—£ļø Audio - OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities šŸ“– Retrieval - lightblue released a new reranker based on Qwen2.5 LB-reranker-0.5B-v1.0 that can handle 95+ languages - cde-small-v2 is a new sota small retrieval model by @jxm
reacted to merve's post with ā¤ļø about 14 hours ago
Everything that happened this week in open AI, a recap šŸ¤  https://huggingface.co/collections/merve/jan-17-releases-678a673a9de4a4675f215bf5 šŸ‘€ Multimodal - MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB (vision, speech and text!) - VideoChat-Flash-Qwen2.5-2B is new video multimodal models by OpenGVLab that come in sizes 2B & 7B in resolutions 224 & 448 - ByteDance released larger SA2VA that comes in 26B parameters - Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance šŸ’¬ LLMs - MiniMax-Text-01 is a new huge language model (456B passive 45.9B active params) by MiniMaxAI with context length of 4M tokens šŸ¤Æ - Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B - kyutai released Helium-1-Preview-2B is a new small multilingual LM - Wayfarer-12B is a new LLM able to write D&D šŸ§™šŸ»ā€ā™‚ļø - ReaderLM-v2 is a new HTML parsing model by Jina AI - Dria released, Dria-Agent-a-3B, new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder - Unsloth released Phi-4, faster and memory efficient Llama 3.3 šŸ–¼ļø Vision - MatchAnything is a new foundation model for matching - FitDit is a high-fidelity VTON model based on DiT architecture šŸ—£ļø Audio - OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities šŸ“– Retrieval - lightblue released a new reranker based on Qwen2.5 LB-reranker-0.5B-v1.0 that can handle 95+ languages - cde-small-v2 is a new sota small retrieval model by @jxm
View all activity

Organizations

OpenBMB's profile picture Rhapsody's profile picture

Cuiunbo's activity

reacted to merve's post with šŸ¤—ā¤ļø about 14 hours ago
view post
Post
601
Everything that happened this week in open AI, a recap šŸ¤  merve/jan-17-releases-678a673a9de4a4675f215bf5

šŸ‘€ Multimodal
- MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB
(vision, speech and text!)
- VideoChat-Flash-Qwen2.5-2B is new video multimodal models by OpenGVLab that come in sizes 2B & 7B in resolutions 224 & 448
- ByteDance released larger SA2VA that comes in 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

šŸ’¬ LLMs
- MiniMax-Text-01 is a new huge language model (456B passive 45.9B active params) by MiniMaxAI with context length of 4M tokens šŸ¤Æ
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B is a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D šŸ§™šŸ»ā€ā™‚ļø
- ReaderLM-v2 is a new HTML parsing model by Jina AI

- Dria released, Dria-Agent-a-3B, new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, faster and memory efficient Llama 3.3

šŸ–¼ļø Vision
- MatchAnything is a new foundation model for matching
- FitDit is a high-fidelity VTON model based on DiT architecture

šŸ—£ļø Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

šŸ“– Retrieval
- lightblue released a new reranker based on Qwen2.5 LB-reranker-0.5B-v1.0 that can handle 95+ languages
- cde-small-v2 is a new sota small retrieval model by
@jxm
New activity in openbmb/MiniCPM-o-2_6-gguf about 14 hours ago
reacted to alibabasglab's post with šŸš€ about 14 hours ago
replied to mitkox's post about 21 hours ago
view reply

nice! Looking forward to seeing your work!

reacted to mitkox's post with šŸ‘€šŸš€ about 21 hours ago
view post
Post
1048
Training a model to reason in the continuous latent space based on Meta's Coconut.
If it all works will apply it on the MiniCPM-o SVD-LR.
Endgame is a multimodal, adaptive, and efficient foundational on device AI model.
  • 2 replies
Ā·
replied to hexgrad's post 4 days ago