Transfer Learning in Automatic Speech Recognition for the Adaptation of an Adult Acoustic Model to a Child Acoustic Model

Abstract: Although automatic speech recognition (ASR) systems achieve good recognition scores on adult speech, the same systems perform noticeably worse on children’s speech. This gap is due to the high variability of children’s speech and to the lack of large amounts of available child speech data. Some improvements have nevertheless been achieved by using deep neural networks for acoustic modeling. One approach that has been explored is to adapt an adult acoustic model into a child acoustic model using transfer learning. In this work, we study the influence of transfer learning on speech recognition performance using our European Portuguese children’s speech corpora. We use a time-delay neural network (TDNN) as the acoustic model, with Mel-frequency cepstral coefficients (MFCCs) as input features. We present the results of three experiments: (I) training the acoustic model only on adult data; (II) training the acoustic model only on child data; (III) training the model on adult data and then applying transfer learning with child data.
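The three training regimes listed in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the setup of this work: a tiny logistic-regression classifier stands in for the TDNN acoustic model, the data is synthetic rather than MFCC features from real speech, and the helper names (`init_model`, `train`, `make_toy_data`), epoch counts, and learning rates are hypothetical. Transfer learning is shown in one common form: initialize from the adult-trained weights, then continue training on the smaller child set with a lower learning rate.

```python
import copy
import math
import random


def init_model(num_features, seed=0):
    """Return a tiny logistic-regression stand-in for an acoustic model."""
    rng = random.Random(seed)
    return {"w": [rng.uniform(-0.1, 0.1) for _ in range(num_features)], "b": 0.0}


def predict(model, x):
    """Probability of the positive class for one feature vector."""
    z = sum(wi * xi for wi, xi in zip(model["w"], x)) + model["b"]
    return 1.0 / (1.0 + math.exp(-z))


def train(model, data, epochs, lr):
    """Plain SGD on the logistic loss; returns a new model, input is untouched."""
    model = copy.deepcopy(model)
    for _ in range(epochs):
        for x, y in data:
            err = predict(model, x) - y  # gradient of the loss w.r.t. the logit
            model["w"] = [wi - lr * err * xi for wi, xi in zip(model["w"], x)]
            model["b"] -= lr * err
    return model


def make_toy_data(n, shift, seed):
    """Synthetic 2-class data; `shift` mimics an adult/child domain mismatch."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = (1.0 if y else -1.0) + shift
        data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], y))
    return data


adult_data = make_toy_data(200, shift=0.0, seed=1)  # large source-domain set
child_data = make_toy_data(20, shift=0.5, seed=2)   # small target-domain set

# (I) adult data only
model_adult = train(init_model(2), adult_data, epochs=5, lr=0.1)
# (II) child data only, from a fresh initialization
model_child = train(init_model(2), child_data, epochs=5, lr=0.1)
# (III) start from the adult model, then fine-tune on child data
model_transfer = train(model_adult, child_data, epochs=5, lr=0.01)
```

Because `train` copies its input, `model_adult` remains available as the regime (I) baseline after fine-tuning. In a real system the three models would be compared with a recognition metric on held-out child speech; the toy sketch makes no claim about which regime performs best.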
