PHONEMEBERT: JOINT LANGUAGE MODELLING OF PHONEME SEQUENCE AND ASR TRANSCRIPT

Bibliographic Details
Published in: INTERSPEECH (22nd : 2021 : Brno; Online) 22nd Annual Conference of the International Speech Communication Association (INTERSPEECH 2021); Volume 5 of 6
Main Author: Sundararaman, Mukuntha Narayanan (Author)
Other Authors: Kumar, Ayush (Author), Vepa, Jithendra (Author)
Pages: 22
Format: UnknownFormat
Language: English
Published: 2021
Title Year Author
HARMONIC WAVEGAN: GAN-BASED SPEECH WAVEFORM GENERATION MODEL WITH HARMONIC STRUCTURE DISCRIMINATOR 2021 Mizuta, Kazuki
INTRA-SENTENTIAL SPEAKING RATE CONTROL IN NEURAL TEXT-TO-SPEECH FOR AUTOMATIC DUBBING 2021 Sharma, Mayank
ADEPT: A DATASET FOR EVALUATING PROSODY TRANSFER 2021 Torresquintero, Alexandra
CONFIDENCE INTERVALS FOR ASR-BASED TTS EVALUATION 2021 Taylor, Jason
A LEARNED CONDITIONAL PRIOR FOR THE VAE ACOUSTIC SPACE OF A TTS SYSTEM 2021 Karanasou, Penny
PARALLEL TACOTRON 2: A NON-AUTOREGRESSIVE NEURAL TTS MODEL WITH DIFFERENTIABLE DURATION MODELING 2021 Elias, Isaac
SEQUENCE-TO-SEQUENCE LEARNING FOR DEEP GAUSSIAN PROCESS BASED SPEECH SYNTHESIS USING SELF-ATTENTION GP LAYER 2021 Nakamura, Taiki
PNG BERT: AUGMENTED BERT ON PHONEMES AND GRAPHEMES FOR NEURAL TTS 2021 Jia, Ye
POLYPHONE DISAMBIGUATION IN MANDARIN CHINESE WITH SEMI-SUPERVISED LEARNING 2021 Shi, Yi
CONVERSION OF AIRBORNE TO BONE-CONDUCTED SPEECH WITH DEEP NEURAL NETWORKS 2021 Pucher, Michael
CTRL-P: TEMPORAL CONTROL OF PROSODIC VARIATION FOR SPEECH SYNTHESIS 2021 Mohan, Devang S. Ram
INVESTIGATING CONTRIBUTIONS OF SPEECH AND FACIAL LANDMARKS FOR TALKING HEAD GENERATION 2021 Kesim, Ege
STYLER: STYLE FACTOR MODELING WITH RAPIDITY AND ROBUSTNESS VIA SPEECH DECOMPOSITION FOR EXPRESSIVE AND CONTROLLABLE NEURAL TEXT TO SPEECH 2021 Lee, Keon
REINFORCEMENT LEARNING FOR EMOTIONAL TEXT-TO-SPEECH SYNTHESIS WITH IMPROVED EMOTION DISCRIMINABILITY 2021 Liu, Rui
ADAPTIVE TEXT TO SPEECH FOR SPONTANEOUS STYLE 2021 Yan, Yuzi
ZERO-SHOT TEXT-TO-SPEECH FOR TEXT-BASED INSERTION IN AUDIO NARRATION 2021 Tang, Chuanxin
TRIPLE M: A PRACTICAL TEXT-TO-SPEECH SYNTHESIS SYSTEM WITH MULTI-GUIDANCE ATTENTION AND MULTI-BAND MULTI-TIME LPCNET 2021 Lin, Shilun
FASTPITCHFORMANT: SOURCE-FILTER BASED DECOMPOSED MODELING FOR SPEECH SYNTHESIS 2021 Bak, Taejun
FAKE AUDIO DETECTION IN RESOURCE-CONSTRAINED SETTINGS USING MICROFEATURES 2021 Dhamyal, Hira
LEVERAGING ASR N-BEST IN DEEP ENTITY RETRIEVAL 2021 Wang, Haoyu