💣Synthesis Sound Effects from Vocal Imitations💣
not Limited to the Linguistic Pronunciations

Authors
Riki Takizawa¹, Shigeyuki Hirai²
¹: Department of Frontier Informatics, Graduate School of Kyoto Sangyo University, Japan.
²: Faculty of Information Science and Engineering, Kyoto Sangyo University, Japan.

Abstract

We propose a method for synthesizing sound effects with controllable nuances from vocal utterances of onomatopoeia that are not restricted to linguistic pronunciations. Figure 1 shows a schematic of the proposed method, Voice-to-SE, which uses a Transformer to convert utterances into sound effects.

Figure 1: Proposed Method
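For readers unfamiliar with the Transformer, its core operation is scaled dot-product attention, in which every output frame is a weighted mixture of input frames. The following is a minimal, pure-Python sketch of that operation only; it is not the Voice-to-SE implementation. The function names, the toy 3-frame by 4-band "spectrogram", and its values are all assumptions made for illustration, and a real model would add learned projections, multiple heads, and positional information.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Attend from each query frame to all key/value frames.

    Each argument is a list of frames; each frame is a list of floats.
    Returns (outputs, weights): one mixed frame and one weight row per query.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity between this query frame and every key frame,
        # scaled by sqrt(feature dimension).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value frames, band by band.
        mixed = [
            sum(w * v[band] for w, v in zip(weights, values))
            for band in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical stand-in for an utterance mel spectrogram:
# 3 time frames x 4 frequency bands (made-up numbers).
utterance_frames = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Self-attention: the same frames act as queries, keys, and values.
mixed_frames, attention_weights = scaled_dot_product_attention(
    utterance_frames, utterance_frames, utterance_frames
)
```

Each row of `attention_weights` sums to 1, and in this toy input every frame attends most strongly to itself because the frames do not overlap in frequency.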


Result1

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result2

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result3

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result4

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result5

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result6

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result7

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result8

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result9

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound

Result10

	Input	Model⁸⁰₈₀	Model⁸⁰₂₅₆	Model²⁵⁶₅₁₂
Waveform and Mel-spec
Sound