A confusion network (CN) is an aligned, compact representation of a word lattice. It stores the intermediate results of decoding and links acoustic decoding to the subsequent processing steps, so its quality matters greatly for recognition methods based on multi-pass decoding. This paper proposes three new quality measures for confusion networks. The first, confusion network word error rate (CWER), gives a lower bound on the recognition word error rate. The second, confusion word density (CWD), measures the size of the confusion sets. The third, word confusion probability (WCP), measures the average distinguishability between the words in a confusion set. Based on these measures, we present a new method for optimizing confusion network quality, called the confusion probability based pruning algorithm. Experiments on a large-vocabulary Chinese continuous speech recognition system show that the new method significantly reduces CWD and WCP without increasing CWER.
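The abstract does not give formulas for the measures, so the following is only a minimal sketch under stated assumptions, not the authors' definitions. It assumes that CWER is the oracle error rate of the network (the smallest edit distance between the reference and any path through the network, divided by the reference length) and that CWD is the average number of non-empty word candidates per confusion set. The names `EPS`, `oracle_errors`, `cwer`, and `confusion_word_density` are hypothetical, and WCP is left out because its definition cannot be inferred from the abstract.

```python
from typing import Dict, List

EPS = "<eps>"  # hypothetical label for the empty (skip) entry in a confusion set

# Assumed layout: one dict per confusion set, mapping word -> posterior probability.
ConfusionNetwork = List[Dict[str, float]]


def oracle_errors(cn: ConfusionNetwork, ref: List[str]) -> int:
    """Smallest edit distance between ref and any path through cn."""
    n = len(ref)
    # prev[j]: minimum cost of aligning the sets processed so far with ref[:j].
    prev = list(range(n + 1))  # no sets processed yet: every ref word is unmatched
    for conf_set in cn:
        # A set can be passed for free only if it contains the empty entry;
        # otherwise the word it must emit counts as an extra (inserted) word.
        skip_cost = 0 if EPS in conf_set else 1
        cur = [prev[0] + skip_cost] + [0] * n
        for j in range(1, n + 1):
            match_cost = 0 if ref[j - 1] in conf_set else 1
            cur[j] = min(
                prev[j] + skip_cost,       # set emits nothing or an extra word
                prev[j - 1] + match_cost,  # set emits a match or a substitution
                cur[j - 1] + 1,            # reference word is left unmatched
            )
        prev = cur
    return prev[n]


def cwer(cn: ConfusionNetwork, ref: List[str]) -> float:
    """Assumed CWER: oracle errors divided by the reference length."""
    return oracle_errors(cn, ref) / len(ref)


def confusion_word_density(cn: ConfusionNetwork) -> float:
    """Assumed CWD: average number of non-empty candidates per confusion set."""
    total = sum(sum(1 for word in conf_set if word != EPS) for conf_set in cn)
    return total / len(cn)


# Toy network with three confusion sets (posteriors are made up for illustration).
toy_cn = [
    {"the": 0.9, "a": 0.1},
    {"cat": 0.6, "cap": 0.4},
    {EPS: 0.7, "sat": 0.3},
]
```

In this toy network the best-scoring path is "the cat", which has two errors against the reference "a cat sat", while `cwer(toy_cn, ["a", "cat", "sat"])` is zero because the reference itself is a path through the network. This illustrates why an oracle-style rate can serve as a lower bound on the word error rate of any later rescoring pass, and why pruning is only safe if it leaves that bound unchanged.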
Table of Contents
Abstract
1. Introduction
2. Quality evaluation measure for CN
2.1 Quality evaluation measure for word lattice
2.2 Quality evaluation measure proposed for CN
3. Confusion probability based pruning algorithm for CN
4. Experiments and evaluation
4.1 Test condition
4.2 Experimental result and analysis
5. Conclusion
References
Authors
Huanliang Wang [ School of Computer Science and Technology, Harbin Institute of Technology ]
Jiqing Han [ School of Computer Science and Technology, Harbin Institute of Technology ]
Tieran Zheng [ School of Computer Science and Technology, Harbin Institute of Technology ]
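Publisher: Korean Language Information Science Society (한국어정보학회)
Year founded
1990
Field
Humanities > Linguistics
About
Through scholarly research, the Society aims to establish a theoretical framework for Korean language information processing. Through close cooperation with industry, it seeks to improve information processing technology and support the growth of the information industry. Through public education and outreach, it works to disseminate advanced information processing technology, thereby raising the cultural value of the Korean language and contributing to the international standing and standardization of Korean language information processing technology.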