merve
posted an update 15 days ago
Interesting releases in open AI this week, let's recap 🤠 merve/feb-7-releases-67a5f7d7f172d8bfe0dd66f4

🤖 Robotics
> Pi0, the first open-source foundation vision-language-action model, was released in LeRobot (Apache 2.0)

💬 LLMs
> Groundbreaking: s1 is a simpler approach to test-time scaling. The release comes with the small s1K dataset of 1k question-reasoning trace pairs (from Gemini-Thinking Exp); they fine-tune Qwen2.5-32B-Instruct to get s1-32B, which outperforms o1-preview on math 🤯 s1-32B and s1K are out!
> Adyen released DABstep, a new benchmark for agents doing data analysis, along with its leaderboard demo
> Krutrim released Krutrim-2 instruct, a new 12B model based on NeMo12B, trained and aligned on Indic languages, along with a new multilingual sentence embedding model (based on STSB-XLM-R) and a translation model for Indic languages

👀 Multimodal
> PKU released Align-DS-V, a model aligned across all modalities (image-text-audio) using their new technique called LLF, along with the Align Anything dataset
> OLA-7B is a new any-to-any model by Tencent that can take text, image, video, and audio data with a context window of 32k tokens and output text and speech in English and Chinese
> Krutrim released Chitrarth, a new vision language model for Indic languages and English

🖼️ Vision
> BiRefNet_HR is a new higher resolution BiRefNet for background removal

🗣️ Audio
> kyutai released Hibiki, a real-time speech-to-speech translation model 🤯 It's available for French-English translation
> Krutrim released Dhwani, a new STT model for Indic languages
> They also released a new dataset for STT-TTS

🖼️ Image Generation
> Lumina released Lumina-Image-2.0, a 2B-parameter flow-based DiT for text-to-image generation
> Tencent released Hunyuan3D-2, a 3D asset generation model based on DiT and Hunyuan3D-Paint
> boreal-hl-v1 is a new boring photorealistic image generation LoRA based on Hunyuan