Achieving Timestamp Prediction While Recognizing with Non-autoregressive End-to-End ASR Model

Bibliographic Details
Published in: NCMMSC (17. : 2022 : Hefei) Man-machine speech communication
Main Author: Shi, Xian (Author)
Other Authors: Chen, Yanni (Author), Zhang, Shiliang (Author), Yan, Zhijie (Author)
Format: UnknownFormat
Language: English
Published: 2023
Title Year Author
Adversarial Training Based on Meta-Learning in Unseen Domains for Speaker Verification 2023 Zhang, Jian-Tao
Pre-training Techniques for Improving Text-to-Speech Synthesis by Automatic Speech Recognition Based Data Enhancement 2023 Liu, Yazhu
A Time-Frequency Attention Mechanism with Subsidiary Information for Effective Speech Emotion Recognition 2023 Xi, Yu-Xuan
Improving Fine-Grained Emotion Control and Transfer with Gated Emotion Representations in Speech Synthesis 2023 Ye, Jianhao
Transformer-Based Potential Emotional Relation Mining Network for Emotion Recognition in Conversation 2023 Shi, Yunwei
FastFoley: Non-autoregressive Foley Sound Generation Based on Visual Semantics 2023 Li, Sipan
MnTTS2: An Open-Source Multi-speaker Mongolian Text-to-Speech Synthesis Dataset 2023 Liang, Kailin
A Multi-feature Sets Fusion Strategy with Similar Samples Removal for Snore Sound Classification 2023 Zhao, Zhonghao
Multi-hypergraph Neural Networks for Emotion Recognition in Multi-party Conversations 2023 Zheng, Cheng
Multi-speaker Multi-style Speech Synthesis with Timbre and Style Disentanglement 2023 Song, Wei
Multiple Confidence Gates for Joint Training of SE and ASR 2023 Wang, Tianrui
Violence Detection Through Fusing Visual Information to Auditory Scene 2023 Li, Hongwei
VC-AUG: Voice Conversion Based Data Augmentation for Text-Dependent Speaker Verification 2023 Qin, Xiaoyi
Mongolian Text-to-Speech Challenge Under Low-Resource Scenario for NCMMSC2022 2023 Liu, Rui
Dual Learning for Dialogue State Tracking 2023 Chen, Zhi
MCPN: A Multiple Cross-Perception Network for Real-Time Emotion Recognition in Conversation 2023 Liu, Weifeng
Baby Cry Recognition Based on Acoustic Segment Model 2023 Wang, Shuxian
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis 2023 Lu, Ye-Xin
Using Emoji as an Emotion Modality in Text-Based Depression Detection 2023 Zhang, Pingyue
Predictive AutoEncoders Are Context-Aware Unsupervised Anomalous Sound Detectors 2023 Zeng, Xiao-Min