Xinyi Bai

wwwbxy123

AI & ML interests

LLM, GenAI, Speech, SVC

Recent Activity

authored 2 papers 14 days ago

ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations

Paper • 2502.10999 • Published Feb 16

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7 • 60

upvoted a paper 14 days ago

MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark

Paper • 2409.18216 • Published Sep 26, 2024 • 1

authored 3 papers 9 months ago

Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm

Paper • 2409.07226 • Published Sep 11, 2024 • 1

Towards Rationality in Language and Multimodal Agents: A Survey

Paper • 2406.00252 • Published Jun 1, 2024

MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark

Paper • 2409.18216 • Published Sep 26, 2024 • 1

upvoted 3 papers over 1 year ago

Bio-Inspired Night Image Enhancement Based on Contrast Enhancement and Denoising

Paper • 2307.05447 • Published Jul 11, 2023 • 2

Faithful Persona-based Conversational Dataset Generation with Large Language Models

Paper • 2312.10007 • Published Dec 15, 2023 • 9

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2

Paper • 2401.17619 • Published Jan 31, 2024 • 1

authored 3 papers over 1 year ago

Bio-Inspired Night Image Enhancement Based on Contrast Enhancement and Denoising

Paper • 2307.05447 • Published Jul 11, 2023 • 2

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2

Paper • 2401.17619 • Published Jan 31, 2024 • 1

Faithful Persona-based Conversational Dataset Generation with Large Language Models

Paper • 2312.10007 • Published Dec 15, 2023 • 9