About Me

I’m a senior research scientist in Clova Voice at Naver Corporation, Seongnam, Korea (since Mar 2017), where I lead the DNN TTS team. I’m also an adjunct professor at the Artificial Intelligence Institute at Seoul National University, Seoul, Korea (since Aug 2022).

I received my Ph.D. in electrical and electronic engineering from Yonsei University, Seoul, Korea. During my Ph.D., I completed internships at Microsoft Research Asia, Beijing, China and Qualcomm Technologies Inc., San Diego, CA.

My research interests include speech synthesis and its real-world applications. Specifically, I develop a hybrid TTS system combining deep learning and unit-selection TTS models for cloud-based products such as Clova AI speaker, Clova Dubbing, Naver Maps navigation, and Naver News anchor.

If you are interested in my work, feel free to contact me.

Download my CV

Recent Publications

  • HierSpeech: Bridging the gap between text and speech by hierarchical variational inference using self-supervised representations for speech synthesis
    Sang-Hoon Lee, Seung-Bin Kim, Ji-Hyun Lee, Eunwoo Song, Min-Jae Hwang, Seong-Whan Lee
    Accepted to NeurIPS 2022

  • TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder [paper][demo]
    Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
    Proc. INTERSPEECH, 2022, pp. 1941-1945.

  • Language model-based emotion prediction methods for emotional speech synthesis systems [paper][demo]
    Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang
    Proc. INTERSPEECH, 2022, pp. 4596-4600.

  • Cross-speaker emotion transfer for low-resource text-to-speech using non-parallel voice conversion with pitch-shift data augmentation [paper][demo]
    Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
    Proc. INTERSPEECH, 2022, pp. 3018-3022.

    [See more]

Recent Talks

  • Parallel waveform synthesis [Slides]
    Samsung Research, Sep 2022

  • Data-selective TTS augmentation [Slides]
    Naver Engineering Day, Jul 2022

  • Voice synthesis and applications [Slides]
    KAIST and SNU, Apr 2022

  • Introduction to text-to-speech [Slides]
    Naver Engineering Day, Apr 2021

  • Deep learning-based text-to-speech [Slides]
    Yonsei Univ. and Korea Univ., Apr 2021

    [See more]