metadata
task_categories:
  - visual-question-answering
language:
  - en
tags:
  - remyx
  - SpatialReasoning
  - spatial-reasoning
  - test-time-compute
  - thinking
  - reasoning
  - multimodal
  - vlm
  - vision-language
  - distance-estimation
  - quantitative-spatial-reasoning
pretty_name: SpaceOm
license: apache-2.0

SpaceOm (Coming Soon)


Model Overview

OpenAI plans to release a SOTA text-in, text-out LLM with toggleable reasoning, so the most performant Vision-Language Model (VLM) will likely be built on that LLM backbone.

Meanwhile, updated reasoning-synthesis methods are in the works, including improved localization and captioning using "Describe Anything" as well as step-by-step instructions.

Check out SpaceThinker for more on the cutting edge of quantitative spatial reasoning.