Improving Speaker Recognition with Parallel WaveGAN

  • Publication
    Korean Institute of Next Generation Computing (한국차세대컴퓨팅학회) Conference
  • Volume/Issue (Year)
    The 9th International Conference on Next Generation Computing 2023 (2023.12)
  • Pages
    pp.296-299
  • Authors
    Kim Dong Jun, Habib Khan, Hikmat Yar
  • Language
    English (ENG)
  • URL
    https://www.earticle.net/Article/A448175

Full-Text Information

Abstract

English
In recent years, Generative Adversarial Networks (GANs) have emerged as a prevailing solution for combating data scarcity in various domains. This study uses WaveGAN, a specialized GAN architecture, to address the challenges stemming from the limited availability of audio datasets. Our research is driven by the goal of investigating the capacity of a CNN to learn from an extensive corpus of human speech data. A key focus of our work is to demonstrate the effectiveness of WaveGAN in generating synthetic audio data, thereby expanding our audio dataset and improving the robustness of our classification models. Our study aims to yield improved classification results and to provide insight into the viability of this approach for alleviating data scarcity in audio analysis.

Table of Contents

Abstract
I. INTRODUCTION
II. METHOD
A. Extraction of MFCC
B. Parallel WaveGAN
III. EXPERIMENTAL RESULTS
A. Dataset
B. CNN Architecture
C. Results
IV. CONCLUSION
ACKNOWLEDGMENT
REFERENCES

Authors

  • Kim Dong Jun [ Sejong University ] Corresponding Author
  • Habib Khan [ Sejong University ]
  • Hikmat Yar [ Sejong University ]


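    Publication Information

    • Publication
      Korean Institute of Next Generation Computing (한국차세대컴퓨팅학회) Conference
    • Frequency
      Semiannual
    • Coverage period
      2021~2025
    • Decimal classification
      KDC 566 DDC 004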