Reverberation degrades speech quality and impairs speech intelligibility. This degradation can also hinder the analysis of speech signals and scientific investigations that rely on them. In addition, because reverberation degrades speech recognition performance, dereverberation is widely employed as a preprocessing step. In this paper, we compare the performance of several vocoders within a dereverberation technique based on a convolutional neural network (CNN). A U-Net architecture was used for dereverberation, and WaveGlow, MelGAN, and Griffin-Lim were employed as vocoders. A vocoder receives speech features as input and reconstructs a time-domain speech signal from them. In particular, recent neural vocoders take a mel-spectrogram as the input feature and can reconstruct high-quality speech signals. To compare the vocoders, we measured the perceptual evaluation of speech quality (PESQ) and confirmed that all scores were relatively high compared with those of the unprocessed reverberant signals.
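As an illustration of the degradation discussed in the abstract, reverberant speech is commonly modeled as a clean signal convolved with a room impulse response. The following is a minimal sketch under stated assumptions, not the data pipeline used in the paper: the function names `synthetic_rir` and `add_reverb`, the exponentially decaying noise model, the 0.5 s decay time, and the 16 kHz sample rate are all hypothetical choices made for this example.

```python
import numpy as np


def synthetic_rir(sample_rate=16000, rt60=0.5, seed=0):
    """Toy room impulse response (assumed model): exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(sample_rate * rt60)
    t = np.arange(n) / sample_rate
    # Envelope whose amplitude has dropped by 60 dB (factor 1000) at t = rt60.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct-path component
    # Scale so the largest tap has magnitude 1.
    return rir / np.max(np.abs(rir))


def add_reverb(clean, rir):
    """Model reverberant speech as the clean signal convolved with the RIR."""
    return np.convolve(clean, rir)


sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
# Stand-in for speech: a 220 Hz tone burst lasting 0.25 s, followed by silence.
clean = np.sin(2 * np.pi * 220 * t) * (t < 0.25)

rir = synthetic_rir(sample_rate)
reverberant = add_reverb(clean, rir)

# Energy after the burst ends: zero in the clean signal, nonzero once reverberation is added.
clean_tail_energy = np.sum(clean[4000:] ** 2)
reverb_tail_energy = np.sum(reverberant[4000:] ** 2)
```

The reverberant tail that persists after the tone burst ends is the smearing that a dereverberation network is trained to remove before the vocoder reconstructs the waveform.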
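Table of Contents
Abstract Ⅰ. Introduction Ⅱ. Speech Dereverberation Based on Convolutional Neural Networks Ⅲ. Performance Evaluation Ⅳ. Conclusion Ⅴ. Acknowledgments Ⅵ. References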
Keywords
Convolutional Neural Network (CNN), Neural vocoder, Reverberation, Room Related Transfer Function, Speech Dereverberation
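Although forensic science is of great value to the realization of social justice, awareness of the field in Korea remains insufficient, and it lags behind the field in more advanced countries. The Korean Society of Forensic Science was therefore founded, bringing together academia, research institutes, investigative agencies, and other organizations related to forensic science, with the aim of revitalizing the field and further advancing scientific investigation.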