arxiv:2503.22262

Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion

Published on Mar 28

Abstract

With the rapid proliferation of 3D devices and the shortage of 3D content, stereo conversion is attracting increasing attention. Recent works introduce pretrained Diffusion Models (DMs) into this task. However, due to the scarcity of large-scale training data and comprehensive benchmarks, the optimal methodologies for employing DMs in stereo conversion and the accurate evaluation of stereo effects remain largely unexplored. In this work, we introduce the Mono2Stereo dataset, which provides high-quality training data and a benchmark to support in-depth exploration of stereo conversion. With this dataset, we conduct an empirical study that yields two primary findings. 1) The differences between the left and right views are subtle, yet existing metrics consider all pixels and fail to concentrate on the regions critical to stereo effects. 2) Mainstream methods adopt either one-stage left-to-right generation or a warp-and-inpaint pipeline, which suffer from degraded stereo effect and image distortion, respectively. Based on these findings, we introduce a new evaluation metric, Stereo Intersection-over-Union, which prioritizes disparity and achieves a high correlation with human judgments on stereo effect. Moreover, we propose a strong baseline model that harmonizes stereo effect and image quality and notably surpasses current mainstream methods. Our code and data will be open-sourced to promote further research in stereo conversion. Our models are available at mono2stereo-bench.github.io.
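
To make the "warp" half of a warp-and-inpaint pipeline concrete, here is a minimal sketch of forward-warping a left view into a right view with a per-pixel disparity map. It is a generic illustration, not the paper's implementation: the function name `forward_warp` is hypothetical, the disparity map is assumed to come from some external depth or disparity estimator, disparities are rounded to whole pixels, and the inpainting step that would fill the returned holes is not shown.

```python
import numpy as np

def forward_warp(left, disparity):
    """Forward-warp a left view to a right view (hypothetical helper).

    left:      (H, W) or (H, W, C) image array.
    disparity: (H, W) non-negative disparities in pixels (assumed given).
    Returns (right, holes), where holes marks pixels nothing was warped to.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # In a rectified pair, content shifts left in the right view: x_r = x_l - d.
    tx = xs - np.rint(disparity).astype(np.int64)
    valid = (tx >= 0) & (tx < w)
    ys_v, xs_v, tx_v, d_v = ys[valid], xs[valid], tx[valid], disparity[valid]

    # Z-buffer: per target pixel, remember the largest (nearest) disparity.
    zbuf = np.full((h, w), -np.inf)
    np.maximum.at(zbuf, (ys_v, tx_v), d_v)

    # Only the nearest source pixel is allowed to write to each target.
    win = d_v == zbuf[ys_v, tx_v]
    right = np.zeros_like(left)
    right[ys_v[win], tx_v[win]] = left[ys_v[win], xs_v[win]]

    # Targets never written are disocclusions that an inpainter must fill.
    holes = np.isinf(zbuf)
    return right, holes
```

The hole mask is where image distortion tends to originate in such pipelines, since those pixels have no source in the left view and must be synthesized.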

Community

Key Contributions:

• We construct Mono2Stereo, a large-scale benchmark designed for high-quality stereo conversion. This benchmark encompasses three key dimensions to facilitate a comprehensive evaluation of such methods.

• We introduce Stereo Intersection-over-Union (SIoU), a novel and pioneering evaluation metric designed to assess the prominence of stereoscopic effects in stereo pairs. This metric effectively complements existing evaluation metrics for a thorough assessment.

• Through extensive experiments, we establish a strong baseline model for stereo conversion. Benefiting from dual conditioning and Edge Consistency loss, our model achieves both compelling image quality and convincing stereo effects.
