arXiv cs.SD Sound
cssd-bot.bsky.social
[2025-12-31 Wed (UTC), no new articles found for csSD Sound]
December 31, 2025 at 6:39 AM
Reposted by arXiv cs.SD Sound
Deepak Babu Piskala: PROFASR-BENCH: A Benchmark for Context-Conditioned ASR in High-Stakes Professional Speech https://arxiv.org/abs/2512.23686 https://arxiv.org/pdf/2512.23686 https://arxiv.org/html/2512.23686
December 30, 2025 at 6:30 AM
Reposted by arXiv cs.SD Sound
Yu-Xiang Lin, Cheng-Han Chiang, Hung-yi Lee: Style Amnesia: Investigating Speaking Style Degradation and Mitigation in Multi-Turn Spoken Language Models https://arxiv.org/abs/2512.23578 https://arxiv.org/pdf/2512.23578 https://arxiv.org/html/2512.23578
December 30, 2025 at 6:30 AM
Reposted by arXiv cs.SD Sound
Işık, Işık, Işık, Taylan: Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers https://arxiv.org/abs/2512.22564 https://arxiv.org/pdf/2512.22564 https://arxiv.org/html/2512.22564
December 30, 2025 at 6:35 AM
Reposted by arXiv cs.SD Sound
Hanbeot Park, Yunjeong Cho, Hunhee Kim: EEG-to-Voice Decoding of Spoken and Imagined speech Using Non-Invasive EEG https://arxiv.org/abs/2512.22146 https://arxiv.org/pdf/2512.22146 https://arxiv.org/html/2512.22146
December 30, 2025 at 6:35 AM
Saifelden M. Ismail: Mobile-Efficient Speech Emotion Recognition Using DistilHuBERT: A Cross-Corpus Validation Study https://arxiv.org/abs/2512.23435 https://arxiv.org/pdf/2512.23435 https://arxiv.org/html/2512.23435
December 30, 2025 at 6:34 AM
December 30, 2025 at 6:34 AM
HaeChun Chung: AudioGAN: A Compact and Efficient Framework for Real-Time High-Fidelity Text-to-Audio Generation https://arxiv.org/abs/2512.22166 https://arxiv.org/pdf/2512.22166 https://arxiv.org/html/2512.22166
December 30, 2025 at 6:34 AM
Ni, Yang, Tian, Li, Lyu, Du, Wang, Luo, Zhang: Marco-ASR: A Principled and Metric-Driven Framework for Fine-Tuning Large-Scale ASR Models for Domain Adaptation https://arxiv.org/abs/2512.22165 https://arxiv.org/pdf/2512.22165 https://arxiv.org/html/2512.22165
December 30, 2025 at 6:34 AM
Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han: A Robust framework for sound event localization and detection on real recordings https://arxiv.org/abs/2512.22156 https://arxiv.org/pdf/2512.22156 https://arxiv.org/html/2512.22156
December 30, 2025 at 6:34 AM
Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han: Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification https://arxiv.org/abs/2512.22148 https://arxiv.org/pdf/2512.22148 https://arxiv.org/html/2512.22148
December 30, 2025 at 6:34 AM
[2025-12-30 Tue (UTC), 6 new articles found for csSD Sound]
December 30, 2025 at 6:34 AM
Reposted by arXiv cs.SD Sound
Ruihao Jing, Cheng Gong, Yu Jiang, Boyu Zhu, Shansong Liu, Chi Zhang, Xiao-Lei Zhang, Xuelong Li: Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models https://arxiv.org/abs/2512.21894 https://arxiv.org/pdf/2512.21894 https://arxiv.org/html/2512.21894
December 29, 2025 at 6:35 AM
Most. Sharmin Sultana Samu, Md. Rakibul Islam, Md. Zahid Hossain, Md. Kamrozzaman Bhuiyan, Farhad Uz Zaman: Zero-Shot to Zero-Lies: Detecting Bengali Deepfake Audio through Transfer Learning https://arxiv.org/abs/2512.21702 https://arxiv.org/pdf/2512.21702 https://arxiv.org/html/2512.21702
December 29, 2025 at 6:34 AM
Liuyang Bai, Weiyi Lu, Li Guo: Semantic Codebooks as Effective Priors for Neural Speech Compression https://arxiv.org/abs/2512.21653 https://arxiv.org/pdf/2512.21653 https://arxiv.org/html/2512.21653
December 29, 2025 at 6:34 AM
[2025-12-29 Mon (UTC), 2 new articles found for csSD Sound]
December 29, 2025 at 6:34 AM
[2025-12-26 Fri (UTC), no new articles found for csSD Sound]
December 26, 2025 at 6:39 AM
Reposted by arXiv cs.SD Sound
Zhongren Dong, Haotian Guo, Weixiang Xu, Huan Zhao, Zixing Zhang: Foundation Model-based Evaluation of Neuropsychiatric Disorders: A Lifespan-Inclusive, Multi-Modal, and Multi-Lingual Study https://arxiv.org/abs/2512.20948 https://arxiv.org/pdf/2512.20948 https://arxiv.org/html/2512.20948
December 25, 2025 at 6:30 AM
Wan Ki Wong, Ka Ho To, Chuck-jee Chau, Lucas Wong, Kevin Y. Yip, Irwin King: Towards Practical Automatic Piano Reduction using BERT with Semi-supervised Learning https://arxiv.org/abs/2512.21324 https://arxiv.org/pdf/2512.21324 https://arxiv.org/html/2512.21324
December 25, 2025 at 6:34 AM
Zhongren Dong, Bin Wang, Jing Han, Haotian Guo, Xiaojun Mo, Yimin Cao, Zixing Zhang: SACodec: Asymmetric Quantization with Semantic Anchoring for Low-Bitrate High-Fidelity Neural Speech Codecs https://arxiv.org/abs/2512.20944 https://arxiv.org/pdf/2512.20944 https://arxiv.org/html/2512.20944
December 25, 2025 at 6:34 AM
[2025-12-25 Thu (UTC), 2 new articles found for csSD Sound]
December 25, 2025 at 6:34 AM
Reposted by arXiv cs.SD Sound
Poli, Luthra, Benchekroun, Higuchi, Gleize, Shen, Algayres, Chung, Assran, Pino, Dupoux: SpidR: Learning Fast and Stable Linguistic Units for Spoken Language Models Without Supervision https://arxiv.org/abs/2512.20308 https://arxiv.org/pdf/2512.20308 https://arxiv.org/html/2512.20308
December 24, 2025 at 6:30 AM
Reposted by arXiv cs.SD Sound
Qian Chen, Luyao Cheng, Chong Deng, Xiangang Li, Jiaqing Liu, Chao-Hong Tan, Wen Wang, Junhao Xu, Jieping Ye, Qinglin Zhang, Qiquan Zhang, Jingren Zhou: Fun-Audio-Chat Technical Report https://arxiv.org/abs/2512.20156 https://arxiv.org/pdf/2512.20156 https://arxiv.org/html/2512.20156
December 24, 2025 at 6:29 AM
Reposted by arXiv cs.SD Sound
Chengwei Liu, Haoyin Yan, Shaofei Xue, Xiaotao Liang, Xiaofu Chen, Bin Gong, Zheng Xue, Gang Song: QuarkAudio Technical Report https://arxiv.org/abs/2512.20151 https://arxiv.org/pdf/2512.20151 https://arxiv.org/html/2512.20151
December 24, 2025 at 6:35 AM
Reposted by arXiv cs.SD Sound
Tian, Du, Zhang, Wang, Lee, Bai, Zhu, Niu, Tang: DDAVS: Disentangled Audio Semantics and Delayed Bidirectional Alignment for Audio-Visual Segmentation https://arxiv.org/abs/2512.20117 https://arxiv.org/pdf/2512.20117 https://arxiv.org/html/2512.20117
December 24, 2025 at 6:30 AM