OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Abstract
OneIG-Bench is a comprehensive benchmark framework for evaluating text-to-image models across multiple dimensions, including reasoning, text rendering, and diversity.
Text-to-image (T2I) models have garnered significant attention for generating high-quality images aligned with text prompts. However, rapid advances in T2I models have exposed limitations in early benchmarks, which lack comprehensive evaluation of, for example, reasoning, text rendering, and style. Notably, recent state-of-the-art models, with their rich knowledge-modeling capabilities, show promising results on image generation problems that require strong reasoning ability, yet existing evaluation systems have not adequately addressed this frontier. To systematically address these gaps, we introduce OneIG-Bench, a meticulously designed, comprehensive benchmark framework for fine-grained evaluation of T2I models across multiple dimensions, including prompt-image alignment, text rendering precision, reasoning-generated content, stylization, and diversity. By structuring the evaluation, this benchmark enables in-depth analysis of model performance, helping researchers and practitioners pinpoint strengths and bottlenecks across the full image generation pipeline. Specifically, OneIG-Bench supports flexible evaluation by allowing users to focus on a particular evaluation subset: instead of generating images for the entire set of prompts, users can generate images only for the prompts associated with the selected dimension and complete the corresponding evaluation accordingly. Our codebase and dataset are now publicly available to facilitate reproducible evaluation studies and cross-model comparisons within the T2I research community.
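As a rough illustration of this subset-based workflow, the sketch below filters a prompt file down to a single evaluation dimension before generation and scoring. The file name, the `prompt`/`dimension` field names, and both placeholder functions are assumptions made for illustration, not OneIG-Bench's actual API; consult the released codebase for the real interface.

```python
import json

# Minimal sketch of the subset-evaluation workflow described in the abstract.
# The JSONL file name, the "prompt"/"dimension" fields, and both placeholder
# functions below are illustrative assumptions, not OneIG-Bench's actual API.

def load_prompts(path: str, dimension: str) -> list[dict]:
    """Load only the prompt records tagged with the chosen evaluation dimension."""
    with open(path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    return [r for r in records if r.get("dimension") == dimension]

def generate_image(prompt: str):
    """Placeholder: call your own T2I model here and return the generated image."""
    return None

def score_dimension(images: list, prompts: list[str], dimension: str) -> float:
    """Placeholder: run the benchmark's scorer for this dimension
    (e.g., OCR-based accuracy for text rendering) and return an aggregate score."""
    return 0.0

if __name__ == "__main__":
    # Evaluate only a hypothetical "text_rendering" subset rather than
    # generating images for the entire prompt set.
    subset = load_prompts("oneig_prompts.jsonl", dimension="text_rendering")
    prompts = [r["prompt"] for r in subset]
    images = [generate_image(p) for p in prompts]
    score = score_dimension(images, prompts, "text_rendering")
    print(f"text_rendering score: {score:.3f}")
```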
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API.
- TIIF-Bench: How Does Your T2I Model Follow Your Instructions? (2025)
- MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models (2025)
- DetailMaster: Can Your Text-to-Image Model Handle Long Prompts? (2025)
- STRICT: Stress Test of Rendering Images Containing Text (2025)
- Align Beyond Prompts: Evaluating World Knowledge Alignment in Text-to-Image Generation (2025)
- Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation (2025)
- OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks (2025)