Lightnews — Scholar-powered news

Reposted by arXiv eess.AS Audio and Speech Processing

@cscr-bot.bsky.social

Seyed Ali Ghazi Asgar, Narasimha Reddy: QuietPrint: Protecting 3D Printers Against Acoustic Side-Channel Attacks https://arxiv.org/abs/2602.02198 https://arxiv.org/pdf/2602.02198 https://arxiv.org/html/2602.02198

February 3, 2026 at 6:30 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Yang, Zhao, Kang, Li, He, Liu, Zhang, Qu, Peng, Wang: Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition https://arxiv.org/abs/2602.01547 https://arxiv.org/pdf/2602.01547 https://arxiv.org/html/2602.01547

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Mariëtte Olijslager, Seyed Sahand Mohammadi Ziabari, Ali Mohammed Mansoor Alsahag: Causally Disentangled Contrastive Learning for Multilingual Speaker Embeddings https://arxiv.org/abs/2602.01363 https://arxiv.org/pdf/2602.01363 https://arxiv.org/html/2602.01363

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Chengyuan Ma, Peng Jia, Hongyue Guo, Wenming Yang: TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection https://arxiv.org/abs/2602.01060 https://arxiv.org/pdf/2602.01060 https://arxiv.org/html/2602.01060

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Zhili Nicholas Liang, Soyeon Caren Han, Qizhou Wang, Christopher Leckie: HierCon: Hierarchical Contrastive Attention for Audio Deepfake Detection https://arxiv.org/abs/2602.01032 https://arxiv.org/pdf/2602.01032 https://arxiv.org/html/2602.01032

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Wei, Liao, Chang, Huang, Chen: Bias in the Ear of the Listener: Assessing Sensitivity in Audio Language Models Across Linguistic, Demographic, and Positional Variations https://arxiv.org/abs/2602.01030 https://arxiv.org/pdf/2602.01030 https://arxiv.org/html/2602.01030

February 3, 2026 at 6:30 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.CL Computation and Language

@cscl-bot.bsky.social

Víctor Yeste, Rodrigo Rivas-Arévalo: A Baseline Multimodal Approach to Emotion Recognition in Conversations https://arxiv.org/abs/2602.00914 https://arxiv.org/pdf/2602.00914 https://arxiv.org/html/2602.00914

February 3, 2026 at 6:30 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Ayuto Tsutsumi, Kohei Tanaka, Sayaka Shiota: The TMU System for the XACLE Challenge: Training Large Audio Language Models with CLAP Pseudo-Labels https://arxiv.org/abs/2602.00604 https://arxiv.org/pdf/2602.00604 https://arxiv.org/html/2602.00604

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Ke Xue, Rongfei Fan, Kai Li, Shanping Yu, Puning Zhao, Jianping An: Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy https://arxiv.org/abs/2602.00568 https://arxiv.org/pdf/2602.00568 https://arxiv.org/html/2602.00568

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Yong Ren, Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Tao Wang: Edit Content, Preserve Acoustics: Imperceptible Text-Based Speech Editing via Self-Consistency Rewards https://arxiv.org/abs/2602.00560 https://arxiv.org/pdf/2602.00560 https://arxiv.org/html/2602.00560

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Xinting Liao, Ruinan Jin, Hanlin Yu, Deval Pandya, Xiaoxiao Li: RVCBench: Benchmarking the Robustness of Voice Cloning Across Modern Audio Generation Models https://arxiv.org/abs/2602.00443 https://arxiv.org/pdf/2602.00443 https://arxiv.org/html/2602.00443

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Alabi Ahmed, Vandana Janeja, Sanjay Purushotham: Multi-Speaker Conversational Audio Deepfake: Taxonomy, Dataset and Pilot Study https://arxiv.org/abs/2602.00295 https://arxiv.org/pdf/2602.00295 https://arxiv.org/html/2602.00295

February 3, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.LG Machine Learning

@cslg-bot.bsky.social

Keisuke Kamahori, Wei-Tzu Lee, Atindra Jha, Rohan Kadekodi, Stephanie Wang, Arvind Krishnamurthy, Baris Kasikci: VoxServe: Streaming-Centric Serving System for Speech Language Models https://arxiv.org/abs/2602.00269 https://arxiv.org/pdf/2602.00269 https://arxiv.org/html/2602.00269

February 3, 2026 at 6:33 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Zhipeng Chen, Xinheng Wang, Lun Xie, Haijie Yuan, Hang Pan: LPIPS-AttnWav2Lip: Generic Audio-Driven lip synchronization for Talking Head Generation in the Wild https://arxiv.org/abs/2602.00189 https://arxiv.org/pdf/2602.00189 https://arxiv.org/html/2602.00189

February 3, 2026 at 6:34 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Shaoheng Xu, Chunyi Sun, Jihui Zhang, Prasanga N. Samarasinghe, Thushara D. Abhayapala: RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses https://arxiv.org/abs/2602.01861 https://arxiv.org/pdf/2602.01861 https://arxiv.org/html/2602.01861

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

François Deloche, Morgan Thienpont, Sarah Verhulst: Short-wave admittance correction for a time-domain cochlear transmission line model https://arxiv.org/abs/2602.01758 https://arxiv.org/pdf/2602.01758 https://arxiv.org/html/2602.01758

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Oguzhan Kurnaz, Jagabandhu Mishra, Tomi Kinnunen, Cemal Hanilci: Joint Optimization of ASV and CM tasks: BTUEF Team's Submission for WildSpoof Challenge https://arxiv.org/abs/2602.01722 https://arxiv.org/pdf/2602.01722 https://arxiv.org/html/2602.01722

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Chenxu Guo, Jiachen Lian, Yisi Liu, Baihe Huang, Shriyaa Narayanan, Cheol Jun Cho, Gopala Anumanchipalli: HuPER: A Human-Inspired Framework for Phonetic Perception https://arxiv.org/abs/2602.01634 https://arxiv.org/pdf/2602.01634 https://arxiv.org/html/2602.01634

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Yochai Yemini, Yoav Ellinson, Rami Ben-Ari, Sharon Gannot, Ethan Fetaya: SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling https://arxiv.org/abs/2602.01394 https://arxiv.org/pdf/2602.01394 https://arxiv.org/html/2602.01394

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Yang Xiao, Eun-Jung Holden, Ting Dang: Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages https://arxiv.org/abs/2602.01008 https://arxiv.org/pdf/2602.01008 https://arxiv.org/html/2602.01008

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Kyung Yun Lee, Nils Meyer-Kahlen, Vesa Välimäki, Sebastian J. Schlecht: Solving Room Impulse Response Inverse Problems Using Flow Matching with Analytic Wiener Denoiser https://arxiv.org/abs/2602.00652 https://arxiv.org/pdf/2602.00652 https://arxiv.org/html/2602.00652

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Hao Ma, Ruihao Jing, Shansong Liu, Cheng Gong, Chi Zhang, Xiao-Lei Zhang, Xuelong Li: High-Fidelity Generative Audio Compression at 0.275kbps https://arxiv.org/abs/2602.00648 https://arxiv.org/pdf/2602.00648 https://arxiv.org/html/2602.00648

February 3, 2026 at 6:35 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

[2026-02-03 Tue (UTC), 8 new articles found for eessAS Audio and Speech Processing]

February 3, 2026 at 6:35 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Seungu Han, Sungho Lee, Kyogu Lee: Rethinking Speech Representation Aggregation in Speech Enhancement: A Phonetic Mutual Information Perspective https://arxiv.org/abs/2601.22480 https://arxiv.org/pdf/2601.22480 https://arxiv.org/html/2601.22480

February 2, 2026 at 6:34 AM

Reposted by arXiv eess.AS Audio and Speech Processing

arXiv cs.SD Sound

@cssd-bot.bsky.social

Chanwoo Park, Chanwoo Kim: An Effective Energy Mask-based Adversarial Evasion Attacks against Misclassification in Speaker Recognition Systems https://arxiv.org/abs/2601.22390 https://arxiv.org/pdf/2601.22390 https://arxiv.org/html/2601.22390

February 2, 2026 at 6:34 AM
