💣Synthesis Sound Effects from Vocal Imitations💣
not Limited to the Linguistic Pronunciations


Authors
Riki Takizawa1, Shigeyuki Hirai2
1: Department of Frontier Informatics, Graduate School of Kyoto Sangyo University, Japan.
2: Faculty of Information Science and Engineering, Kyoto Sangyo University, Japan.


Abstract

We propose a method for synthesizing sound effects with controllable nuances from vocal imitations, that is, onomatopoeic utterances that do not depend on linguistic pronunciation. Figure 1 shows a schematic of the proposed method, Voice-to-SE, which uses a Transformer to convert utterances into sound effects.

Proposed Method
Figure 1: Proposed Method



Result1

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result2

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result3

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result4

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result5

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result6

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result7

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result8

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result9

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound

Result10

Input
Model8080
Model80256
Model256512
Waveform and Mel-spec
Sound