Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification

dc.contributor.author: Sarria Paja, Milton
dc.contributor.author: Falk, Tiago H.
dc.date.accessioned: 2020-02-10T07:20:06Z
dc.date.available: 2020-02-10T07:20:06Z
dc.date.issued: 2017-03-25
dc.description.abstract: Whispered speech is a natural speaking style that, despite its reduced perceptibility, still contains relevant information regarding the intended message (i.e., intelligibility), as well as the speaker identity and gender. Given the acoustic differences between whispered and normally-phonated speech, however, speech applications trained on the latter but tested with the former exhibit unacceptable performance levels. Within an automated speaker verification task, previous research has shown that i) conventional features (e.g., mel-frequency cepstral coefficients, MFCCs) do not convey sufficient speaker discrimination cues across the two vocal efforts, and ii) multi-condition training, while improving the performance for whispered speech, tends to deteriorate the performance for normal speech. In this paper, we aim to tackle both shortcomings by proposing three innovative features, which, when fused at the score level, are shown to yield reliable results for both normal and whispered speech. Overall, relative improvements of 66% and 63% are obtained for whispered and normal speech, respectively, over a baseline system based on MFCCs and multi-condition training.
dc.identifier.issn: 08852308
dc.identifier.uri: https://repositorio.usc.edu.co/handle/20.500.12421/2744
dc.language.iso: en
dc.publisher: Academic Press
dc.subject: Whispered speech
dc.subject: Speaker verification
dc.subject: Modulation spectrum
dc.subject: Mutual information
dc.subject: System fusion
dc.title: Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification
dc.type: Article

Files

Original bundle
Name: Fusion of auditory inspired amplitude modulation spectrum and cepstral.jpg
Size: 139.97 KB
Format: Joint Photographic Experts Group/JPEG File Interchange Format (JFIF)