Man-Machine Speech Communication : 20th National Conference, NCMMSC 2025, Zhenjiang, China, October 16-19, 2025, Proceedings (Communications in Computer and Information Science)


  • Binding: Paperback
  • Language: English
  • Product code: 9789819553815

Full Description

This book constitutes the refereed proceedings of the 20th National Conference on Man-Machine Speech Communication, NCMMSC 2025, held in Zhenjiang, China, during October 16-19, 2025.

The 40 papers included in these proceedings were carefully reviewed and selected from 157 submissions. The conference will feature special events such as a Young Scholars Forum, Student Forum, Industry Forum, and Product and Technology Exhibition. Beyond the main program, the conference will also include public outreach activities, grant-writing workshops, and several special sessions.

Contents

.- Zero- and One-Shot Data Augmentation for Sentence-Level Dysarthric Speech Recognition in Constrained Scenarios.

.- Multilevel and Granular L2 Pronunciation Assessment Using Stress-Based Suprasegmental Features and Proficiency Adaptation.

.- CDMGTU-Net: A Causal Dual-Branch Multi-Channel Speech Enhancement Network with Multi-Scale Gated Feature Fusion.

.- A Two-Stage Band-Split Mamba-2 Network For Music Source Separation.

.- Ideal-LLM: Integrating Dual Encoders and Language-Adapted LLM for Multilingual Speech-to-Text.

.- MambaVoc: State Space Models for High-Fidelity Audio Synthesis.

.- StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding.

.- Automatic Speech Evaluation Method Leveraging Deep Feature Fusion.

.- Curriculum Reinforcement Learning for Robust Low-Resource Chinese Dialect Speech Recognition.

.- An Acoustic Study on Intonation Production of English Learners from Guanzhong Region in Shaanxi Province.

.- Improving Anomalous Sound Detection with Top-M Pseudo-Labeling.

.- Dementia Detection via Speech Temporal Sequences with Shifted Windows.

.- CL-EDiff: Cross-lingual emotional TTS system based on diffusion model.

.- When AI Speaks, Do We Follow? Phonetic Entrainment in Human-AI Dialogues.

.- Aishell1Mix: Towards Robust Mandarin Speech Separation with Scalable Audio Language Models.

.- Study of the Low-Rank Minimum Variance Distortionless Response Beamformer for Speech Enhancement.

.- Exploring Gender Bias in Alzheimer's Disease Detection: Insights from Mandarin and Greek Speech Perception.

.- UniDaugMamba: A Unimodal Data-augmented Mamba for Speech-Based Depression Detection.

.- Serial-Parallel Dual-Path Architecture for Speaking Style Recognition.

.- Knowledge Augmented Finetuning Matters in Both RAG and Agent Based Dialog Systems.

.- NC-KWS: Few-Shot Class-Incremental Keyword Spotting Based on Neural Collapse.

.- ZSEmo-MTVITS: A Zero-Shot Cross-Lingual Emotional Speech Synthesis Model for Mandarin and Tibetan Based on VITS.

.- CUHK-EE Systems for the vTAD Challenge at NCMMSC 2025.

.- Accent Familiarity and Phonological Weighting in Spoken-Word Recognition.

.- Audio Deepfake Detection via Dual Branch Classifier with Self-Supervised Pre-Trained Model.

.- A Multi-Subspace Attention Approach for Robust Speech Spoofing Detection in Silence-Trimming Conditions.

.- Temporally Consistent Teeth Restoration for Talking Heads.

.- EEG as a Biometric Identifier: The Impact of Electrode Arrangement, Brain Areas, and Frequency Bands.

.- The Phonetic Modification and Facial Movements Made During Mandarin Vowel and Tone Production in Noise.

.- Exploring Audio-Visual Fusion for Sound Event Localization and Detection with BEATs.

.- On Multi-Input Multi-Frame MVDR Filter for Speech Enhancement with Heterophasic Presentation.

.- Adaptive Multi-source Fusion for Uyghur ASR Error Correction.

.- The determinants of Chinese lexical stress.

.- Introducing Discriminative Speaker Embeddings for Voice Timbre Attribute Detection.

.- TSELM: Target Speaker Extraction using Discrete Tokens and Language Models.

.- A Timbre Attribute Discrimination System Fusing Pre-trained Speaker Feature Extractors with Gender Prior Features.

.- Improving the Robustness of Audio-Visual Target Speaker Extraction With AV-HuBERT Based Lip Features.

.- A Hierarchical Fusion Modeling from Perception to Prediction with Personalized Features for Multimodal Depression Detection.

.- Revisiting Target Signal Definitions in Distortionless Superdirective Beamforming for Reverberant Speech Enhancement.

.- HiStyle: Hierarchical Style Embedding Prediction for Text-Prompt-Guided Controllable Speech Synthesis.
