Frank Sommers PRO

fsommers

AI & ML interests

None yet

Recent Activity

liked a Space 2 days ago

Qwen/Qwen2.5-VL-72B-Instruct

liked a model 2 days ago

Qwen/Qwen2.5-VL-7B-Instruct

updated a collection 2 days ago

Misc papers

Articles

Document Similarity Search with ColPali

Sep 21, 2024

• 49

Organizations

fsommers's activity

upvoted a paper 2 days ago

Question Answering on Patient Medical Records with Private Fine-Tuned LLMs

Paper • 2501.13687 • Published 7 days ago • 7

upvoted a paper 3 days ago

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 97

upvoted an article 5 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

7 days ago

• 90

upvoted a paper 19 days ago

Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model

Paper • 2501.05122 • Published 21 days ago • 18

upvoted 2 papers about 1 month ago

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Paper • 2410.12628 • Published Oct 16, 2024 • 30

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

upvoted a collection about 1 month ago

multilingual vision models

Collection

Some papers I read for understanding vision models and also adding multilingual capabilities to them • 14 items • Updated Dec 11, 2024 • 2

upvoted a paper about 1 month ago

Maya: An Instruction Finetuned Multilingual Multimodal Model

Paper • 2412.07112 • Published Dec 10, 2024 • 27

upvoted 2 papers about 2 months ago

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Paper • 2412.04424 • Published Dec 5, 2024 • 59

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 124

upvoted a collection 2 months ago

PathummaLLM-1.0.0

Collection

Multimodal LLM for Thai. • 3 items • Updated Oct 24, 2024 • 7

upvoted an article 2 months ago

Article

Enjoy the Power of Phi-3 with ONNX Runtime on your device

May 22, 2024

• 25

upvoted an article 3 months ago

Article

Visually Multilingual: Introducing mcdse-2b

Oct 27, 2024

• 38

upvoted 3 papers 3 months ago

A Survey of Small Language Models

Paper • 2410.20011 • Published Oct 25, 2024 • 40

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published Oct 28, 2024 • 30

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21, 2024 • 44

upvoted a paper 4 months ago

From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning

Paper • 2410.06456 • Published Oct 9, 2024 • 36

upvoted 3 articles 4 months ago

Article

Deploying Your FastAPI Applications on Huggingface Via Docker

Dec 11, 2023

• 19

Article

Hosting your Models and Datasets on Hugging Face Spaces using Streamlit

Oct 5, 2021

• 3

Article

Llama can now see and run on your device - welcome Llama 3.2

Sep 25, 2024

• 182