AI & ML interests

Evaluating open LLMs

Recent Activity

AdinaY
posted an update 3 days ago
Klear-46B-A2.5 🔥 a sparse MoE LLM developed by the Kwai-Klear Team at Kuaishou

Kwai-Klear/klear10-68ba61398a0a4eb392ec6ab1

✨ 46B total / 2.5B active - Apache 2.0
✨ Dense-level performance at lower cost
✨ Trained on 22T tokens with progressive curriculum
✨ 64K context length
AdinaY
posted an update 7 days ago
🔥 August highlights from the Chinese AI community

zh-ai-community/august-2025-china-open-source-highlights-68a2de5630f406edaf320e88

✨ Efficiency leads the month
- At scale: optimizing compute use in massive MoE models, e.g. DeepSeek v3.1
- In small models: lightweight & deployable, e.g. MiniCPM V 4.5, Step Audio 2-mini, Intern S1-mini, Ovis2.5-9B, etc.

✨ Reasoning + Agentic wave 🌊 Not just demos, but real product use cases.
- Meituan, DeepSeek: large-scale models tuned for reasoning & tools
- Qwen, GLM, InternLM: multimodal reasoning + agentic interaction
- CodeAgent, Prover, Baichuan-M2-32B: domain-focused (coding, logic, specialized reasoning)

✨ Open source is exploding across all types of companies!!
- Big tech: Tencent, ByteDance, Xiaomi, Kuaishou, Alibaba/Qwen, Skywork, Ant Group
- Startups: DeepSeek (yes, still a startup!), Zhipu, Baichuan, StepFun, OpenBMB
- New entrants: Meituan, RedNote
- Research labs: Shanghai AI Lab (InternLM, OpenGVLab)

✨ Open source was explicitly mentioned in the State Council’s new guidance on deepening the "AI+" strategy.
- Open-source: support communities, encourage contributions (incl. university credits & recognition), foster new application approaches, and build globally impactful ecosystems 👀

💡 The Chinese community didn’t slow down at all in August 🤯 September, the last month before the Golden Week holiday, may bring even more surprises.

Stay Tuned!
AdinaY
posted an update 7 days ago
Hunyuan-MT-7B 🔥 open translation model released by Tencent Hunyuan

tencent/hunyuan-mt-68b42f76d473f82798882597

✨ Supports 33 languages, including 5 ethnic minority languages in China 👀
✨ Includes a translation ensemble model: Chimera-7B
✨ Full pipeline: pretrain > CPT > SFT > enhancement > ensemble refinement > SOTA performance at similar scale
AdinaY
posted an update 7 days ago
From food delivery to frontier AI 🚀 Meituan, the leading lifestyle platform, just dropped its first open SoTA LLM: LongCat-Flash 🔥

meituan-longcat/LongCat-Flash-Chat

✨ 560B total / ~27B active MoE - MIT license
✨ 128k context length + advanced reasoning
✨ ScMoE design: 100+ TPS inference
✨ Stable large-scale training + strong agentic performance
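
The "total / active" figures quoted for the MoE models in this feed can be turned into a rough active-parameter share. The sketch below is illustrative only: the helper name `active_fraction` is hypothetical, the parameter counts are the rounded numbers from the posts (the LongCat-Flash figure is itself marked approximate), and the ratio says nothing about actual latency or cost.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token, given counts in billions."""
    if total_b <= 0 or not (0 < active_b <= total_b):
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


# Rounded figures as quoted in the posts above (billions of parameters).
models = {
    "Klear-46B-A2.5": (46.0, 2.5),
    "LongCat-Flash": (560.0, 27.0),  # "~27B active" in the post
}

for name, (total, active) in models.items():
    share = active_fraction(total, active)
    print(f"{name}: about {share:.1%} of parameters active per token")
```

Both models activate roughly 5% of their weights per token, which is the basis for the "dense-level performance at lower cost" framing.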
AdinaY
posted an update 10 days ago
USO 🎨 Unified customization model released by ByteDance Research

Demo
bytedance-research/USO
Model
bytedance-research/USO
Paper
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning (2508.18966)

✨ Large-scale triplet dataset (content, style, stylized)
✨ Disentangled learning: style alignment + content preservation
✨ Style Reward Learning (SRL) for higher fidelity
✨ USO-Bench: 1st benchmark for style & subject jointly
✨ SOTA results on subject consistency & style similarity
AdinaY
posted an update 10 days ago
Step-Audio 2 🔥 New end-to-end multimodal LLM for audio & speech, released by StepFun

stepfun-ai/step-audio-2-68b003c3a47b273fffaf67a8

✨ Direct raw audio: text & speech, no ASR+LLM+TTS pipeline
✨ High-IQ reasoning: RL + CoT for paralinguistic cues
✨ Multimodal RAG + tool calling
✨ Emotion, timbre, dialect & style control
✨ SOTA on ASR, paralinguistic, speech dialog
AdinaY
posted an update 13 days ago
🇨🇳 China’s State Council just released its “AI+” Action Plan (2025)

<The State Council’s Guidance on Deepened Implementation of the ‘AI+’ Strategy>
zh-ai-community/china-ai-policy-research

✨ Goal: By 2035, AI will deeply empower all sectors, reshape productivity & society

✨ Focus on 6 pillars:
>Science & Tech
>Industry
>Consumption
>Public welfare
>Governance
>Global cooperation

✨ Highlights:
>Models: advance theory, efficient training/inference, evaluation system
>Data: high-quality datasets, IP/copyright reform, new incentives
>Compute: boost chips & clusters, improve national network, promote cloud standardization, and ensure inclusive, efficient, green, secure supply
>Applications: AI-as-a-service, test bases, new standards
>Open-source: support communities, encourage contributions (incl. university credits & recognition), foster new application approaches, and build globally impactful ecosystems 👀
>Talent, policy & safety frameworks to secure sustainable growth
AdinaY
posted an update 13 days ago
MiniCPM-V 4.5 🚀 New MLLM for image, multi-image & video understanding, running even on your phone, released by OpenBMB

openbmb/MiniCPM-V-4_5

✨ SOTA vision language capability
✨ 96× video token compression > high-FPS & long video reasoning
✨ Switchable fast vs deep thinking modes
✨ Strong OCR, document parsing, supports 30+ languages
AdinaY
posted an update 13 days ago
InternVL3.5 🔥 New family of multimodal models by Shanghai AI Lab

OpenGVLab/internvl35-68ac87bd52ebe953485927fb

✨ 1B · 2B · 4B · 8B · 14B · 38B | MoE → 20B-A4B · 30B-A3B · 241B-A28B 📄 Apache 2.0
✨ +16% reasoning performance, 4.05× speedup vs InternVL3
✨ Cascade RL (offline + online): stronger reasoning
✨ ViR: efficient visual token routing
✨ DvD: scalable vision-language deployment
✨ Supports GUI & embodied agency 🤖
AdinaY
posted an update 18 days ago
Seed-OSS 🔥 The latest open LLM from the ByteDance Seed team

ByteDance-Seed/seed-oss-68a609f4201e788db05b5dcd

✨ 36B - Base & Instruct
✨ Apache 2.0
✨ Native 512K long context
✨ Strong reasoning & agentic intelligence
✨ 2 Base versions: with & without synthetic data
AdinaY
posted an update 19 days ago
AdinaY
posted an update 20 days ago
Before my vacation: Qwen releasing.
When I came back: Qwen still releasing
Respect!! 🫡

Meet Qwen Image Edit 🔥 the image editing version of Qwen-Image by @Alibaba_Qwen

Qwen/Qwen-Image-Edit

✨ Apache 2.0
✨ Semantic + Appearance Editing: rotate, restyle, add/remove 🎨
✨ Precise Text Editing → edit CN/EN text, keep style

Create README.md

#33 opened 26 days ago by MonsterDo000
albertvillanova
posted an update 27 days ago
Latest smolagents release supports GPT-5: build agents that think, plan, and act.
⚡ Upgrade now and put GPT-5 to work!
meg
posted an update 27 days ago
albertvillanova
posted an update 28 days ago
🚀 smolagents v1.21.0 is here!
Now with improved safety in the local Python executor: dunder calls are blocked!
⚠️ Still not fully isolated: for untrusted code, use a remote executor instead (Docker, E2B, Wasm).
✨ Many bug fixes: more reliable code.
👉 https://github.com/huggingface/smolagents/releases/tag/v1.21.0
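
To make "dunder calls are blocked" concrete, here is a minimal sketch of the general idea using only Python's standard `ast` module. It is not the smolagents implementation: the function names `find_dunder_access` and `is_allowed` are hypothetical, and the check is a simple static scan written for illustration.

```python
import ast


def find_dunder_access(source: str) -> list[str]:
    """Return dunder names that `source` uses as attributes or bare names."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute):
            name = node.attr
        elif isinstance(node, ast.Name):
            name = node.id
        else:
            continue
        # A dunder looks like __name__: double underscores on both sides.
        if len(name) > 4 and name.startswith("__") and name.endswith("__"):
            found.append(name)
    return found


def is_allowed(source: str) -> bool:
    """Hypothetical policy: reject any snippet that touches a dunder."""
    return not find_dunder_access(source)


print(is_allowed("x = [1, 2, 3]\nprint(len(x))"))  # ordinary code passes
print(is_allowed("().__class__.__bases__"))        # class-hierarchy probing is rejected
print(is_allowed("__import__('os')"))              # dynamic import is rejected
```

A static scan like this misses names assembled at runtime, for example `getattr(obj, "__cla" + "ss__")`, which is one reason in-process filtering is not a substitute for the remote executors recommended in the post.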