Improved Long-Form Speech Recognition by Jointly Modeling the Primary and Non-primary Speakers Paper • 2312.11123 • Published Dec 18, 2023
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation Paper • 2201.03713 • Published Jan 11, 2022
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models Paper • 2401.03506 • Published Jan 7, 2024
SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System Paper • 2104.02125 • Published Apr 5, 2021
Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech Paper • 2202.12163 • Published Feb 24, 2022
Open ASR Leaderboard • Request evaluation results for a speech model