CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models Paper • 2506.07463 • Published 2 days ago • 8
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published 30 days ago • 79
Post: Google drops Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems at Flash speeds, and more. Now available in anychat, try it out: https://huggingface.co/spaces/akhaliq/anychat
Post: QwQ-32B-Preview is now available in anychat. It is a reasoning model that is competitive with OpenAI o1-mini and o1-preview. Try it out: https://huggingface.co/spaces/akhaliq/anychat
Post: New model drop in anychat: allenai/Llama-3.1-Tulu-3-8B is now available. Try it here: https://huggingface.co/spaces/akhaliq/anychat
Post: anychat supports ChatGPT, Gemini, Perplexity, Claude, Meta Llama, and Grok, all in one app. Try it out there: https://huggingface.co/spaces/akhaliq/anychat
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies Paper • 2408.06567 • Published Aug 13, 2024 • 2
CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models Paper • 2410.18505 • Published Oct 24, 2024 • 11
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data Paper • 2410.18558 • Published Oct 24, 2024 • 20
TACO: Topics in Algorithmic COde generation dataset Paper • 2312.14852 • Published Dec 22, 2023 • 4
UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science Paper • 2307.09249 • Published Jul 18, 2023
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation Paper • 2401.13560 • Published Jan 24, 2024 • 1
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning Paper • 2401.14011 • Published Jan 25, 2024