Paper • Towards Best Practices for Open Datasets for LLM Training • 2501.08365 • Published 16 days ago • 51
Article • Releasing the largest multilingual open pretraining dataset • By Pclanglais • Nov 13, 2024 • 98
Article • OCR Processing and Text in Image Analysis with Florence-2-base and Qwen2-VL-2B • By PandorAI1995 • Oct 18, 2024 • 14