Full Description
This two-volume set, LNAI 16187 and 16188, constitutes the refereed proceedings of the 27th International Conference on Speech and Computer, SPECOM 2025, held in Szeged, Hungary, during October 13-15, 2025.
The 47 full papers and 1 invited paper included in this book were carefully reviewed and selected from 77 submissions. The papers are organized in the following topical sections:
Part I: Invited Paper; Speech Perception and Synthesis; Computational Paralinguistics; Speech Processing for Healthcare; Speech and Language Resources; Speaker Recognition.
Part II: Automatic Speech Recognition; Speech Processing for Under-Resourced Languages; Digital Speech Processing; Natural Language Processing; Multimodal Systems.
Contents
.- Automatic Speech Recognition.
.- In-Domain SSL Pre-Training and Streaming ASR: Application to Air Traffic Control Communications.
.- Evaluating the Performance of Several ASR Systems in Environmental and Industrial Noise.
.- Ground Truth-Free WER Prediction for ASR via Audio Quality and Model Confidence Features.
.- Enhancing Speech Recognition through Text-to-Speech and Voice Conversion Augmentation.
.- Best Data is more Supervised Data - Even for Hungarian ASR.
.- Arabic ASR on the SADA Large-Scale Arabic Speech Corpus with Transformer-based Models.
.- Speech Processing for Under-Resourced Languages.
.- Effect of Increased Temporal Resolution on Speech Recognition for French Quebec using Features from Speech Self-Supervised Learning Models.
.- Modeling Intra-Word Code-Switching for Karelian ASR.
.- Improving Whisper-based Serbian ASR using Synthetic Speech.
.- Domain Knowledge and Language Embeddings for Low-Resource Multilingual Phoneme ASR.
.- Whistler Identification in Whistled Spanish (Silbo): A Case Study.
.- Digital Speech Processing.
.- PinkVocalTransformer: Neural Acoustic-to-Articulatory Inversion based on the Pink Trombone.
.- CrossMP-SENet: Transformer-based Cross-Attention for Joint Magnitude-Phase Speech Enhancement.
.- Adaptive Singing Voice Enhancement for Live Stages.
.- Revealing the Hidden Temporal Structure of HubertSoft Embeddings based on the Russian Phonetic Corpus.
.- Natural Language Processing.
.- Analyzing Web-Scraped and Generated Inputs for Automatic and Scalable Intent Classification.
.- Enhancing Retrieval Performance via LLM Hard-Negative Filtering.
.- Sector-Wise Backpropagation for Low-Resource Text Classification in Deep Models.
.- High-Frequency Multiword Units and the Typological Distribution of Multiword Units in Spoken Russian.
.- Estimation of the Genre Composition of the English Subcorpus of the Google Books Ngram.
.- Multimodal Systems.
.- Ensembling Synchronisation-based and Face-Voice Association Paradigms for Robust Active Speaker Detection in Egocentric Recordings.
.- Phonetic and Visual Characteristics of Cognitive Load.
.- Cognitive Humor Processing in the Russian and English Internet Meme Chatting: EEG Study.
.- Saudi Sign Language Translation Using T5.



