You can also browse my Google Scholar profile.

2025

4 publications

Can Speech LLMs think while listening?

Ian Shih, Desh Raj, Chunyang Wu, Wei Zhou, SK Bong, Yashesh Gaur, Jay Mahadeokar, Ozlem Kalinli, Mike Seltzer
Submitted to ICLR 2026

Faster Speech-LLaMA inference with multi-token prediction

Desh Raj, Gil Keren, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli
IEEE ICASSP 2025

M-BEST-RQ: A multi-channel speech foundation model for smart glasses

Yufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junteng Jia, Gil Keren, Egor Lakomkin, Yiteng Huang, Jacob Donley, Jay Mahadeokar, Ozlem Kalinli
IEEE ICASSP 2025

Speech-N-LlaMA: Improving Speech LLMs with multi-pass training

Amit Kumar Singh Yadav, Gil Keren, Desh Raj, Wei Zhou, Junteng Jia, Ke Li, Ying Xu, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli
IEEE ICASSP 2025

2024

5 publications

ConEC: Earnings Call Dataset with Real-world Contexts for Benchmarking Contextual Speech Recognition

Ruizhe Huang, Mahsa Yarmohammadi, Jan Trmal, Jing Liu, Desh Raj, Leibny Paola Garcia, Alexei V Ivanov, Patrick Ehlen, Mingzhi Yu, Dan Povey, Sanjeev Khudanpur
LREC 2024

Listening to multi-talker conversations: Modular and end-to-end perspectives

Desh Raj
PhD Thesis, Johns Hopkins University

On speaker attribution with SURT

Desh Raj, Matthew Wiesner, Matthew Maciejewski, Paola Garcia, Daniel Povey, Sanjeev Khudanpur
Speaker Odyssey 2024

Updated corpora and benchmarks for long-form speech recognition

Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Miguel Jetté
IEEE ICASSP 2024

Training Early-Exit Architectures for Automatic Speech Recognition: Fine-Tuning Pre-Trained Models or Training from Scratch

George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Alessio Brutti
IEEE ICASSP 2024 Workshop on Self-supervision in Audio, Speech, and Beyond (SASB)

2023

6 publications

Learning from flawed data: Weakly supervised automatic speech recognition

Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola Garcia Perera, Daniel Povey, Sanjeev Khudanpur
IEEE ASRU 2023

SURT 2.0: Advances in transducer-based multi-talker speech recognition

Desh Raj, Daniel Povey, Sanjeev Khudanpur
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP)

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur
CHiME Workshop at Interspeech 2023

GPU-accelerated guided source separation for meeting transcription

Desh Raj, Daniel Povey, Sanjeev Khudanpur
Interspeech 2023

Anchored speech recognition using neural transducers

Desh Raj, Junteng Jia, Jay Mahadeokar, Chunyang Wu, Niko Moritz, Xiaohui Zhang, Ozlem Kalinli
IEEE ICASSP 2023

Adapting self-supervised models to multi-talker speech recognition using speaker embeddings

Zili Huang, Desh Raj, Paola Garcia, Sanjeev Khudanpur
IEEE ICASSP 2023

2022

3 publications

Low-latency speech separation guided diarization for telephone conversations

Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini
IEEE Spoken Language Technology (SLT) Workshop 2022

Continuous streaming multi-talker ASR with dual-path transducers

Desh Raj, Liang Lu, Zhuo Chen, Yashesh Gaur, Jinyu Li
IEEE ICASSP 2022

Injecting text and cross-lingual supervision in few-shot learning from self-supervised models

Matthew Wiesner, Desh Raj, Sanjeev Khudanpur
IEEE ICASSP 2022

2021

7 publications

Joint speaker diarization and speech recognition based on region proposal networks

Zili Huang, Marc Delcroix, Leibny Paola Garcia, Shinji Watanabe, Desh Raj, Sanjeev Khudanpur
Computer Speech & Language, Vol. 72

Reformulating DOVER-Lap label mapping as a graph partitioning problem

Desh Raj, Sanjeev Khudanpur
Interspeech 2021

Auxiliary loss function for target speech extraction and recognition with weak supervision based on speaker characteristics

Katerina Zmolikova, Marc Delcroix, Desh Raj, Shinji Watanabe, Jan Černocký
Interspeech 2021

Multi-class spectral clustering with overlaps for speaker diarization

Desh Raj, Zili Huang, Sanjeev Khudanpur
IEEE Spoken Language Technology (SLT) Workshop 2021

DOVER-Lap: A method for combining overlap-aware diarization outputs

Desh Raj, Paola Garcia, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur
IEEE Spoken Language Technology (SLT) Workshop 2021

Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis

Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Maokui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey
IEEE Spoken Language Technology (SLT) Workshop 2021

Sequential multi-frame neural beamforming for speech separation and enhancement

Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey
IEEE Spoken Language Technology (SLT) Workshop 2021

2020

2 publications

Frustratingly easy noise-aware training of acoustic models

Desh Raj, Jesus Villalba, Daniel Povey, Sanjeev Khudanpur
arXiv, 2020

The JHU multi-microphone multi-speaker ASR system for the CHiME-6 challenge

Ashish Arora*, Desh Raj*, Aswin Shanmugam Subramanian*, Ke Li*, Bar Benyair, Matthew Maciejewski, Piotr Zelasko, Paola Garcia, Shinji Watanabe, Sanjeev Khudanpur
The 6th CHiME Workshop (at ICASSP 2020)

2019

2 publications

Probing the information encoded in x-vectors

Desh Raj, David Snyder, Daniel Povey, Sanjeev Khudanpur
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2019

Using ASR methods for OCR

Ashish Arora, Chun Chieh Chang, Babak Rekabdar, Daniel Povey, David Etter, Desh Raj, Hossein Hadian, Jan Trmal, Paola Garcia, Shinji Watanabe, Vimal Manohar, Yiwen Shao, Sanjeev Khudanpur
International Conference on Document Analysis and Recognition (ICDAR) 2019

2018

1 publication

Uncertain fuzzy self-organization based clustering: interval type-2 approach to adaptive resonance theory

Shakaiba Majeed, Aditya Gupta, Desh Raj, Frank Chung-hoon Rhee
Information Sciences, 2018

2017

2 publications

Learning local and global contexts using a convolutional recurrent neural network for relation classification in biomedical text

Desh Raj, Sunil Kumar Sahu, Ashish Anand
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL) 2017

Analysis of data generated from multidimensional type-1 and type-2 fuzzy membership functions

Desh Raj, Aditya Gupta, Bhuvnesh Garg, Kenil Tanna, Frank Chung-hoon Rhee
IEEE Transactions on Fuzzy Systems, 2017