Nisan Aryal, Sung-Hwan Park, Sung-yoon Ahn, Sang-Woong Lee
Language
English (ENG)
URL
https://www.earticle.net/Article/A409398
Article information
Abstract
English
Speech enhancement is the task of improving speech quality by reducing noise. The magnitude of the short-time Fourier transform (STFT), or spectrogram, is widely used for speech enhancement. However, this approach neglects the noisy phase, which limits the quality of the enhancement. Recently, the short-time discrete cosine transform (STDCT) has been introduced to overcome this limitation of the STFT. The STDCT is a real-valued representation; thus, it does not require phase information to reconstruct the audio. This paper compares the two approaches and analyzes the importance of phase information in speech enhancement. Our experiments show that, when trained under similar conditions, STFT performs better than STDCT in low-noise scenarios, whereas STDCT performs better than STFT in high-noise scenarios.
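The point about phase can be illustrated on a single frame. The sketch below is not the authors' implementation: it uses one unwindowed 16-sample frame rather than a full short-time transform, and every function name and the toy signal are assumptions chosen for illustration. It shows that real DCT coefficients alone reconstruct the frame, while Fourier magnitudes without phase do not.

```python
import cmath
import math


def dct2(frame):
    """Orthonormal DCT-II of one real-valued frame (real output, no phase)."""
    n = len(frame)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(frame)))
    return out


def idct2(coeffs):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def dft(frame):
    """Naive discrete Fourier transform (complex output: magnitude and phase)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)) for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, keeping only the real part of the result."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * math.pi * k * i / n)
                for k, c in enumerate(spectrum)).real / n for i in range(n)]


def max_abs_error(a, b):
    """Largest sample-wise absolute difference between two frames."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy frame: a sinusoid with a non-zero phase offset.
frame = [math.sin(2 * math.pi * 3 * i / 16 + 0.7) for i in range(16)]

# Real DCT coefficients are sufficient for reconstruction.
dct_error = max_abs_error(frame, idct2(dct2(frame)))

# The full complex DFT also reconstructs the frame ...
full_dft_error = max_abs_error(frame, idft(dft(frame)))

# ... but the magnitude alone (phase set to zero) does not.
magnitude_only = [abs(c) for c in dft(frame)]
dft_mag_error = max_abs_error(frame, idft(magnitude_only))

print(f"DCT round trip error:        {dct_error:.2e}")
print(f"Full DFT round trip error:   {full_dft_error:.2e}")
print(f"Magnitude-only DFT error:    {dft_mag_error:.2e}")
```

In a magnitude-based STFT pipeline the missing phase is typically taken from the noisy input, which is the limitation the abstract describes; a real-valued transform sidesteps that step, at the cost of having the network predict signed coefficients.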
Table of contents
Abstract
1. Introduction
2. Related Works
3. Methods
3.1. Unet
4. Experiments
4.1. Experimental setup and dataset
4.2. Experimental result
5. Conclusions
Acknowledgement
References
Keywords
Speech enhancement, spectrogram, discrete cosine transform, short-time Fourier transform
Authors
Nisan Aryal [ Department of IT Convergence Engineering, Gachon University, Gyeonggi-do, South Korea ]
Sung-Hwan Park [ Dept. of Nano Science and Technology, Gachon University ]
Sung-yoon Ahn [ Department of Software, Gachon University ]
Sang-Woong Lee [ Department of Software, Gachon University ]
Corresponding Author