arXiv eess.AS Audio and Speech Processing
eessas-bot.bsky.social
Reposted by arXiv eess.AS Audio and Speech Processing
Seyed Ali Ghazi Asgar, Narasimha Reddy: QuietPrint: Protecting 3D Printers Against Acoustic Side-Channel Attacks https://arxiv.org/abs/2602.02198 https://arxiv.org/pdf/2602.02198 https://arxiv.org/html/2602.02198
February 3, 2026 at 6:30 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Yang, Zhao, Kang, Li, He, Liu, Zhang, Qu, Peng, Wang: Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition https://arxiv.org/abs/2602.01547 https://arxiv.org/pdf/2602.01547 https://arxiv.org/html/2602.01547
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Mariëtte Olijslager, Seyed Sahand Mohammadi Ziabari, Ali Mohammed Mansoor Alsahag: Causally Disentangled Contrastive Learning for Multilingual Speaker Embeddings https://arxiv.org/abs/2602.01363 https://arxiv.org/pdf/2602.01363 https://arxiv.org/html/2602.01363
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Chengyuan Ma, Peng Jia, Hongyue Guo, Wenming Yang: TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection https://arxiv.org/abs/2602.01060 https://arxiv.org/pdf/2602.01060 https://arxiv.org/html/2602.01060
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Zhili Nicholas Liang, Soyeon Caren Han, Qizhou Wang, Christopher Leckie: HierCon: Hierarchical Contrastive Attention for Audio Deepfake Detection https://arxiv.org/abs/2602.01032 https://arxiv.org/pdf/2602.01032 https://arxiv.org/html/2602.01032
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Wei, Liao, Chang, Huang, Chen: Bias in the Ear of the Listener: Assessing Sensitivity in Audio Language Models Across Linguistic, Demographic, and Positional Variations https://arxiv.org/abs/2602.01030 https://arxiv.org/pdf/2602.01030 https://arxiv.org/html/2602.01030
February 3, 2026 at 6:30 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Víctor Yeste, Rodrigo Rivas-Arévalo: A Baseline Multimodal Approach to Emotion Recognition in Conversations https://arxiv.org/abs/2602.00914 https://arxiv.org/pdf/2602.00914 https://arxiv.org/html/2602.00914
February 3, 2026 at 6:30 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Ayuto Tsutsumi, Kohei Tanaka, Sayaka Shiota: The TMU System for the XACLE Challenge: Training Large Audio Language Models with CLAP Pseudo-Labels https://arxiv.org/abs/2602.00604 https://arxiv.org/pdf/2602.00604 https://arxiv.org/html/2602.00604
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Ke Xue, Rongfei Fan, Kai Li, Shanping Yu, Puning Zhao, Jianping An: Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy https://arxiv.org/abs/2602.00568 https://arxiv.org/pdf/2602.00568 https://arxiv.org/html/2602.00568
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Yong Ren, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Tao Wang: Edit Content, Preserve Acoustics: Imperceptible Text-Based Speech Editing via Self-Consistency Rewards https://arxiv.org/abs/2602.00560 https://arxiv.org/pdf/2602.00560 https://arxiv.org/html/2602.00560
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Xinting Liao, Ruinan Jin, Hanlin Yu, Deval Pandya, Xiaoxiao Li: RVCBench: Benchmarking the Robustness of Voice Cloning Across Modern Audio Generation Models https://arxiv.org/abs/2602.00443 https://arxiv.org/pdf/2602.00443 https://arxiv.org/html/2602.00443
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Alabi Ahmed, Vandana Janeja, Sanjay Purushotham: Multi-Speaker Conversational Audio Deepfake: Taxonomy, Dataset and Pilot Study https://arxiv.org/abs/2602.00295 https://arxiv.org/pdf/2602.00295 https://arxiv.org/html/2602.00295
February 3, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Keisuke Kamahori, Wei-Tzu Lee, Atindra Jha, Rohan Kadekodi, Stephanie Wang, Arvind Krishnamurthy, Baris Kasikci: VoxServe: Streaming-Centric Serving System for Speech Language Models https://arxiv.org/abs/2602.00269 https://arxiv.org/pdf/2602.00269 https://arxiv.org/html/2602.00269
February 3, 2026 at 6:33 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Zhipeng Chen, Xinheng Wang, Lun Xie, Haijie Yuan, Hang Pan: LPIPS-AttnWav2Lip: Generic Audio-Driven lip synchronization for Talking Head Generation in the Wild https://arxiv.org/abs/2602.00189 https://arxiv.org/pdf/2602.00189 https://arxiv.org/html/2602.00189
February 3, 2026 at 6:34 AM
Shaoheng Xu, Chunyi Sun, Jihui Zhang, Prasanga N. Samarasinghe, Thushara D. Abhayapala: RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses https://arxiv.org/abs/2602.01861 https://arxiv.org/pdf/2602.01861 https://arxiv.org/html/2602.01861
February 3, 2026 at 6:35 AM
François Deloche, Morgan Thienpont, Sarah Verhulst: Short-wave admittance correction for a time-domain cochlear transmission line model https://arxiv.org/abs/2602.01758 https://arxiv.org/pdf/2602.01758 https://arxiv.org/html/2602.01758
February 3, 2026 at 6:35 AM
Oguzhan Kurnaz, Jagabandhu Mishra, Tomi Kinnunen, Cemal Hanilci: Joint Optimization of ASV and CM tasks: BTUEF Team's Submission for WildSpoof Challenge https://arxiv.org/abs/2602.01722 https://arxiv.org/pdf/2602.01722 https://arxiv.org/html/2602.01722
February 3, 2026 at 6:35 AM
Chenxu Guo, Jiachen Lian, Yisi Liu, Baihe Huang, Shriyaa Narayanan, Cheol Jun Cho, Gopala Anumanchipalli: HuPER: A Human-Inspired Framework for Phonetic Perception https://arxiv.org/abs/2602.01634 https://arxiv.org/pdf/2602.01634 https://arxiv.org/html/2602.01634
February 3, 2026 at 6:35 AM
Yochai Yemini, Yoav Ellinson, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya: SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling https://arxiv.org/abs/2602.01394 https://arxiv.org/pdf/2602.01394 https://arxiv.org/html/2602.01394
February 3, 2026 at 6:35 AM
Yang Xiao, Eun-Jung Holden, Ting Dang: Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages https://arxiv.org/abs/2602.01008 https://arxiv.org/pdf/2602.01008 https://arxiv.org/html/2602.01008
February 3, 2026 at 6:35 AM
Kyung Yun Lee, Nils Meyer-Kahlen, Vesa Välimäki, Sebastian J. Schlecht: Solving Room Impulse Response Inverse Problems Using Flow Matching with Analytic Wiener Denoiser https://arxiv.org/abs/2602.00652 https://arxiv.org/pdf/2602.00652 https://arxiv.org/html/2602.00652
February 3, 2026 at 6:35 AM
Hao Ma, Ruihao Jing, Shansong Liu, Cheng Gong, Chi Zhang, Xiao-Lei Zhang, Xuelong Li: High-Fidelity Generative Audio Compression at 0.275kbps https://arxiv.org/abs/2602.00648 https://arxiv.org/pdf/2602.00648 https://arxiv.org/html/2602.00648
February 3, 2026 at 6:35 AM
[2026-02-03 Tue (UTC), 8 new articles found for eess.AS Audio and Speech Processing]
February 3, 2026 at 6:35 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Seungu Han, Sungho Lee, Kyogu Lee: Rethinking Speech Representation Aggregation in Speech Enhancement: A Phonetic Mutual Information Perspective https://arxiv.org/abs/2601.22480 https://arxiv.org/pdf/2601.22480 https://arxiv.org/html/2601.22480
February 2, 2026 at 6:34 AM
Reposted by arXiv eess.AS Audio and Speech Processing
Chanwoo Park, Chanwoo Kim: An Effective Energy Mask-based Adversarial Evasion Attacks against Misclassification in Speaker Recognition Systems https://arxiv.org/abs/2601.22390 https://arxiv.org/pdf/2601.22390 https://arxiv.org/html/2601.22390
February 2, 2026 at 6:34 AM