gerald hewes

gerald29

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

internlm/internlm3-8b-instruct

updated a model 4 days ago

gerald29/plants-usda

upvoted a paper 5 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Organizations

None yet

gerald29's activity

upvoted a paper 5 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 8 days ago • 55

upvoted a collection 9 days ago

Sa2VA model zoo

4 items • Updated 3 days ago • 24

upvoted a paper about 1 month ago

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 124

upvoted a collection about 1 month ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 128

upvoted a paper about 1 month ago

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Paper • 2412.04424 • Published Dec 5, 2024 • 59

upvoted a collection about 1 month ago

🤖 Agents

21 items • Updated 18 days ago • 105

upvoted a paper about 2 months ago

CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models

Paper • 2411.18613 • Published Nov 27, 2024 • 50

upvoted 2 papers 2 months ago

A Case Study of Web App Coding with OpenAI Reasoning Models

Paper • 2409.13773 • Published Sep 19, 2024 • 6

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Paper • 2411.02397 • Published Nov 4, 2024 • 23

upvoted a collection 3 months ago

LongVU

7 items • Updated Oct 31, 2024 • 28

upvoted 2 papers 3 months ago

Framer: Interactive Frame Interpolation

Paper • 2410.18978 • Published Oct 24, 2024 • 36

STaR: Bootstrapping Reasoning With Reasoning

Paper • 2203.14465 • Published Mar 28, 2022 • 8

upvoted 3 papers 4 months ago

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3, 2024 • 36

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 40

Were RNNs All We Needed?

Paper • 2410.01201 • Published Oct 2, 2024 • 51

upvoted a collection 4 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated 12 days ago • 293

upvoted a paper 4 months ago

Improve Mathematical Reasoning in Language Models by Automated Process Supervision

Paper • 2406.06592 • Published Jun 5, 2024 • 27

upvoted a collection 4 months ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated 3 days ago • 101

upvoted a paper 5 months ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 98

upvoted a paper 6 months ago

SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization

Paper • 2407.14257 • Published Jul 19, 2024 • 5