Supplementary material from "Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody"
Abstract: Existing emotional speech corpora often either are highly curated to induce certain emotions from
predefined categories, or make it difficult to dissect whether the judgments made by the annotators arise from semantic
or prosodic cues. To overcome this challenge, we propose a new approach called 'Genetic Algorithm with People'
(GAP), which integrates human decision and production into a genetic algorithm. In our design, we allow human
creators and raters to synchronously optimize the emotional prosody over each generation by introducing evolutionary
pressure. We then compare GAP with state-of-the-art emotional speech corpora to examine its robustness. We
demonstrate that GAP is able to efficiently sample from the high-dimensional emotional speech space and capture a wide
variety of emotions, with no presumptions about the dimensionality of the space. Our approach has great potential to
scale and extend to other languages, which will allow for creating multilingual corpora to make cross-cultural
comparisons.
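The general idea of a genetic algorithm driven by production and rating steps can be illustrated with a toy simulation. The sketch below is not the procedure used in the paper: the human creators and raters are replaced by hypothetical stand-in functions (`simulated_creator`, `simulated_rater`), a recording is assumed to be representable as a short list of numeric prosody parameters, and the selection scheme (keeping the top-rated candidates, parents included) is an assumption made only to keep the example small.

```python
import random


def simulated_creator(parent, rng, step=0.3):
    """Hypothetical stand-in for a human creator: vary the parent's parameters."""
    return [x + rng.gauss(0.0, step) for x in parent]


def simulated_rater(candidate, target):
    """Hypothetical stand-in for a human rater: higher means closer to the target."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def run_chain(target, generations=20, n_creators=5, n_survivors=2, seed=0):
    """Run one toy chain and return the best candidate and its rating history."""
    rng = random.Random(seed)
    # Every chain starts from the same neutral parameter vector (an assumption).
    population = [[0.0] * len(target) for _ in range(n_survivors)]
    history = []
    for _ in range(generations):
        # Production step: each creator modifies one of the surviving parents.
        offspring = [
            simulated_creator(rng.choice(population), rng)
            for _ in range(n_creators)
        ]
        # Rating step: parents stay in the pool, so the best rating never drops.
        pool = population + offspring
        pool.sort(key=lambda c: simulated_rater(c, target), reverse=True)
        # Selection pressure: only the top-rated candidates seed the next generation.
        population = pool[:n_survivors]
        history.append(simulated_rater(population[0], target))
    return population[0], history


if __name__ == "__main__":
    best, history = run_chain(target=[1.0, -0.5, 2.0])
    print("best parameters:", [round(x, 2) for x in best])
    print("rating per generation:", [round(r, 2) for r in history])
```

In this simplification the target is known to the rater function. In a study with people, the ratings would come from participants' judgments rather than from a fixed numeric target.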
Contents
Supplementary figures
Sentences used in the experiment
- It's eight o'clock. (inspired by CREMA-D and Juslin & Laukka 2001)
- Is it eleven o'clock? (inspired by CREMA-D and Juslin & Laukka 2001)
- Let me tell you something. (taken from VENEC)
- That's exactly what happened. (taken from VENEC)
- I wonder what this is about. (taken from CREMA-D)
- I think I've seen this before. (taken from CREMA-D)
- We'll stop in a couple of minutes. (taken from CREMA-D)
- The surface is slick. (taken from CREMA-D)
- Help the woman get back to her feet. (taken from Harvard sentences)
- What joy there is in living. (taken from Harvard sentences)
Chain visualization
Figure S1: Interactive visualization of the created chains. Follow the instructions in the prompt.