1. Continual Speaker Adaptation for Text-to-Speech Synthesis
Catastrophic forgetting (CF) occurs when a TTS model is fine-tuned on new speakers: performance on the existing speakers degrades.
To mitigate this, experience replay (ER) keeps a buffer of samples from previous speakers and mixes them with the data for the current task.
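The replay idea can be sketched in a few lines. This is only an illustration, not the paper's implementation: the names `ReplayBuffer` and `mixed_batch` are hypothetical, samples are stand-in tuples rather than real audio/text pairs, and the reservoir-sampling policy for deciding which old samples to keep is an assumption (the notes do not say how the buffer is filled).

```python
import random


class ReplayBuffer:
    """Fixed-size store of samples from earlier speakers.

    Assumption: reservoir sampling, so every sample seen so far has the
    same chance of being in the buffer. Other policies are possible.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0  # total number of samples offered to the buffer
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            # Keep the new sample with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def sample(self, k):
        """Draw up to k stored samples without replacement."""
        return self.rng.sample(self.samples, min(k, len(self.samples)))


def mixed_batch(new_batch, buffer, n_replay):
    """Combine current-task samples with replayed samples from old speakers."""
    return list(new_batch) + buffer.sample(n_replay)


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=4)
    for i in range(100):
        buf.add(("old_speaker", i))  # placeholder for an (audio, text) pair
    batch = mixed_batch([("new_speaker", 0), ("new_speaker", 1)], buf, n_replay=3)
    print(len(batch))
```

During fine-tuning, each training step would then use a batch like `batch` instead of new-speaker data alone, so the gradients still reflect the earlier speakers.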
2. Bottleneck Layer Squeeze-and-Excitation Networks
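The SE block computes per-channel weights as “global pooling -> fc -> ReLU -> fc -> sigmoid”. The first fc reduces the channel dimension (the bottleneck), the second restores it, and the resulting weights rescale the input channels.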
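As a rough illustration of that pipeline, here is a NumPy sketch of a single forward pass. The function name `se_block`, the weight names, and the reduction ratio `r = 4` are assumptions made for the example; a real model would use learned weights inside a deep-learning framework.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-excitation forward pass for one feature map.

    x  : (C, H, W) input feature map
    w1 : (C // r, C) weights of the reducing fc layer, b1 : (C // r,)
    w2 : (C, C // r) weights of the restoring fc layer, b2 : (C,)
    """
    z = x.mean(axis=(1, 2))                    # global average pooling -> (C,)
    h = np.maximum(w1 @ z + b1, 0.0)           # fc -> ReLU             -> (C // r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # fc -> sigmoid          -> (C,)
    return x * s[:, None, None]                # rescale each channel by its weight


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, r = 8, 4  # r is the (assumed) bottleneck reduction ratio
    x = rng.standard_normal((C, 5, 5))
    w1, b1 = rng.standard_normal((C // r, C)), np.zeros(C // r)
    w2, b2 = rng.standard_normal((C, C // r)), np.zeros(C)
    y = se_block(x, w1, b1, w2, b2)
    print(y.shape)
```

Because the sigmoid output lies strictly between 0 and 1, each channel is attenuated by its own weight, and the output keeps the input's shape.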