The Estimating Method of Statistical Language Models Perplexity and Chinese Entropy
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 1-6
This article gives a quantitative account of perplexity as a measure for evaluating language models, based on the concept of information entropy. The smaller the entropy of the language as estimated by a language model, the more precise the model is. An interpolated model built from two (n‐1)‐gram models is better than either of its (n‐1)‐gram component models, but not better than an n‐gram model. We also explore methods for estimating the entropy of Chinese using language models.
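The relation between entropy and perplexity described in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions (cross‐entropy as the average negative log2 probability per token, perplexity as 2 raised to that value) and a simple linear interpolation; the function names and the weight `lam` are illustrative and are not taken from the paper.

```python
import math


def cross_entropy(token_probs):
    """Average negative log2 probability per token, in bits per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    return -sum(math.log2(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is 2 raised to the cross-entropy (standard definition)."""
    return 2.0 ** cross_entropy(token_probs)


def interpolate(p_a, p_b, lam=0.5):
    """Linear interpolation of two model probabilities (lam is an assumed weight)."""
    return lam * p_a + (1.0 - lam) * p_b


# A model that assigns probability 1/4 to every token of a test text
# has a cross-entropy of 2 bits per token.
uniform = [0.25, 0.25, 0.25, 0.25]
print(cross_entropy(uniform), perplexity(uniform))
```

A lower cross‐entropy on held‐out text gives a lower perplexity, which is the sense in which the abstract calls a model "more precise".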
A Fast Statistical Method for Chinese Unknown Word Detection
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 7-11
A fast statistical method for Chinese unknown word detection is proposed. It is based on an association measure, heuristic features, and the LocalMaxs algorithm. The unknown words it discovers are not limited to particular word patterns. It is also effective for low‐frequency unknown words and responsive to newly arriving texts. Experiments show its effectiveness.
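The abstract does not say which association measure is used. As a hedged illustration only, the sketch below scores adjacent character pairs with pointwise mutual information (PMI), one common association measure; the function name `bigram_pmi` and the choice of PMI are assumptions, and the LocalMaxs selection step is not shown.

```python
import math
from collections import Counter


def bigram_pmi(text):
    """Score each adjacent character pair by PMI (an assumed association measure)."""
    char_counts = Counter(text)
    pair_counts = Counter(zip(text, text[1:]))
    n_chars = len(text)
    n_pairs = len(text) - 1
    scores = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n_pairs
        p_a = char_counts[a] / n_chars
        p_b = char_counts[b] / n_chars
        # PMI is high when the pair co-occurs more often than chance predicts.
        scores[a + b] = math.log2(p_ab / (p_a * p_b))
    return scores


# Pairs that stick together strongly are candidates for (unknown) words.
for pair, score in sorted(bigram_pmi("abcabcabd").items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```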
Research On Adaptive Courseware By Using SCORM
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 12-16
The Sharable Content Object Reference Model (SCORM) is the most widely accepted publication of Advanced Distributed Learning. In this paper, we use the SCORM Sequencing Definition Model to construct adaptive courseware. We first introduce the architecture of the whole system, then describe how to build adaptive courseware, and finally give conclusions and outline future work.
Bayesian Network Model for XML Document Ranking
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 23-28
As more and more data is described, stored, exchanged, and represented in XML, information retrieval over XML documents becomes increasingly important. However, the result sets returned to users are often very large. This paper gives a Bayesian network‐based model for ranking these large result sets. Each XML document is modeled by a Bayesian network, which can handle both the structure and the content of the document. The paper then presents the inference of the probability of each document given the query. Finally, documents are ranked in descending order of these probabilities.
A New Approach for Text Detection Using Homoge
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 29-33
In this paper, a new approach to text detection in images and video based on homogeneity is studied. Texture analysis is applied in the homogeneity domain. Both local and global information are used when calculating the homogeneity feature. The text property of a region is confirmed by a neural network trained to extract text features using a fixed‐size text detector in the homogeneity domain. Comparisons with an edge‐based text detection method show that the proposed method achieves better accuracy.
Automatic Acquisition Of Translation Equivalences From Bilingual Corpus
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 34-39
Translation equivalences are very useful for bilingual lexicography, machine translation systems, cross‐lingual information retrieval, and many other applications in natural language processing. In this paper, a linear combination model of multiple features is used to filter extracted equivalences. Experimental results indicate that the combination model outperforms other classifiers in the open test. A random sample of 1000 equivalences labeled by the linear combination model was evaluated, and its F1 measure reached 88.13%.
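As a minimal sketch of the two ideas named in this abstract, the code below scores a candidate pair with a weighted sum of feature values and computes the F1 measure from counts. The feature values, weights, and threshold are hypothetical; the paper's actual features and weights are not given here.

```python
def linear_score(features, weights):
    """Weighted sum of feature values (a linear combination model)."""
    return sum(w * f for w, f in zip(weights, features))


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical candidate pair with two feature values and assumed weights.
score = linear_score([0.9, 0.4], [0.7, 0.3])
keep = score >= 0.5  # the threshold is an assumption for illustration
print(score, keep)
```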
Research on Domain Self‐Adaptation of Chinese‐English EBMT
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 40-42
An example‐based machine translation (EBMT) system for a specific domain can be developed in a short time and with high translation quality. Although an EBMT system can be ported to a new domain quickly, its advantage in domain adaptability suffers when multi‐domain translation is required. To solve this problem, a new domain‐sensitive EBMT translation model is proposed. By incorporating a text classification technique, the proposed EBMT system classifies the input text and then selects the most appropriate example base for the subsequent translation. The experiments showed that this method can improve the performance of the EBMT system and, to some extent, meet the need for Olympics‐oriented multi‐domain translation.
This paper presents a robust audio retrieval method. In this method, the retrieval target is divided into short segments, each segment is searched for separately, and a retrieval window maintains a list of segments that can be searched simultaneously. The method can quickly detect and locate known sounds in real‐time audio streams, multimedia archives, or the Internet. It maintains high performance even if a large part of the target is absent from the input stream. Its retrieval speed can be adjusted through the length of the retrieval window and is independent of the target length. The recall and precision of the method are 100% and 99.7%, respectively.
Quality Evaluation and Optimization of Confusion Network for LVCSR
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 48-53
A confusion network (CN) is an aligned and compact representation of a word lattice that stores intermediate results of the decoding procedure and connects acoustic decoding with the subsequent processing steps. The quality of the confusion network is very important for recognition methods based on multi‐pass decoding. Three new quality measures for confusion networks are proposed in this paper. The first, confusion network word error rate (CWER), gives a lower bound on the recognition word error rate. The second, confusion word density (CWD), measures the size of a confusion set. The third, word confusion probability (WCP), measures the average distinguishability between words in a confusion set. Based on the proposed measures, we present a new method for optimizing confusion network quality, called the confusion probability based pruning algorithm. Experiments carried out on a large‐vocabulary Chinese continuous speech recognition system demonstrate that the new method leads to a significant reduction in CWD and WCP without an increase in CWER.
A Novel Approach to Digital Audio Watermarking Based on Pre-attack
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 54-57
In this paper, a novel audio watermarking algorithm is presented that is based on pre‐attacking, that is, predicting the attacks that are possible given the specific purpose of the watermark. By dividing the frequency band in the discrete wavelet transform (DWT) domain and calculating a defined distortion measure, the appropriate band can be located in which to embed the watermark using a frequency hopping technique. Imperceptibility in the host audio is maintained by using a psychoacoustic model to constrain the embedding strength. Experimental results show that the proposed method is more robust to common watermarking attacks, especially the predicted ones.
Thesaurus-Based Semantic Smoothing in Language Modeling for Chinese Document Retrieval
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 58-63
Language modeling for information retrieval (IR), proposed a few years ago, has attracted attention and effectively improved the performance of IR systems compared with classic models and approaches. Smoothing in parameter estimation is one of the main problems in applying language models, and effective smoothing methods enhance the performance of an IR system. Semantic smoothing, which brings some knowledge of the language into language modeling, has been developed recently. This paper presents a modification to a smoothing approach in the general language model combined with translation modeling; it takes synonyms in the documents and the collection into account for semantic smoothing and improved performance in Chinese document retrieval. The synonym knowledge comes from a well‐known thesaurus in Chinese NLP, Tongyici Cilin (Extended). A comparison shows that the semantically smoothed approach brings an improvement of approximately 1.33% on average.
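To show what smoothing means in a retrieval language model, here is a minimal sketch of plain linear interpolation (Jelinek‐Mercer) smoothing between a document model and a collection model. This is a generic baseline chosen for illustration, not the thesaurus‐based method of the paper; the function name and the weight `lam` are assumptions.

```python
from collections import Counter


def jm_prob(word, doc_tokens, coll_counts, coll_len, lam=0.5):
    """P(word | doc) smoothed by interpolating with the collection model."""
    p_doc = doc_tokens.count(word) / len(doc_tokens) if doc_tokens else 0.0
    p_coll = coll_counts.get(word, 0) / coll_len if coll_len else 0.0
    # lam is the weight given to the collection (background) model.
    return (1.0 - lam) * p_doc + lam * p_coll


# Toy data: "c" never occurs in the document but still gets a nonzero
# probability because it occurs in the collection.
doc = ["a", "b"]
collection = ["a", "b", "c", "c"]
counts = Counter(collection)
print(jm_prob("c", doc, counts, len(collection)))
```

A semantic smoothing method would add a further component that shifts probability toward synonyms of the document's words; that component is not sketched here.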
The Topic Detection and Tracking with Topic Sensitive Language Model
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 64-68
In this paper, we explore a language model with topic‐sensitive features for topic detection and tracking, and formulate the relationships among new Chinese Internet words, the topic‐sensitive language model, scheduling logic, interval temporal reasoning, and the key techniques. We use new Chinese Internet words to strengthen topic detection and tracking, and try to employ scheduling logic and interval temporal reasoning to infer the reciprocal influences of events. Finally, we summarize the potential issues and future work.
Genetic Algorithm for Eight-Queen Problem
한국어정보학회, 한국어정보학, Vol. 8, No. 1, 2006.06, pp. 69-71
This paper uses a real‐coded genetic algorithm to solve the 8‐queens problem, a classic constraint problem in AI, and reports good simulation results obtained with MATLAB.
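A rough sketch of a genetic algorithm for this problem follows. It differs from the paper in two stated ways: it is written in Python rather than MATLAB, and it uses a permutation encoding (one queen per column, genes are row indices) instead of real coding. The population size, mutation rate, and all function names are assumptions for illustration.

```python
import random


def conflicts(rows):
    """Number of queen pairs that attack each other (0 means a solution)."""
    n = len(rows)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if rows[i] == rows[j] or abs(rows[i] - rows[j]) == j - i
    )


def crossover(a, b, rng):
    """Keep a prefix of parent a, then fill in the rest in parent b's order."""
    cut = rng.randrange(1, len(a))
    head = a[:cut]
    return head + [gene for gene in b if gene not in head]


def mutate(rows, rng, rate=0.2):
    """With probability rate, swap two positions (keeps it a permutation)."""
    rows = rows[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(rows)), 2)
        rows[i], rows[j] = rows[j], rows[i]
    return rows


def solve(n=8, pop_size=50, generations=500, seed=0):
    """Evolve permutations toward fewer conflicts; returns the best one found."""
    rng = random.Random(seed)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=conflicts)
        if conflicts(pop[0]) == 0:
            break
        parents = pop[: pop_size // 2]  # truncation selection keeps the better half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=conflicts)


best = solve()
print(best, conflicts(best))
```

The permutation encoding rules out row and column clashes by construction, so the fitness function effectively only has to penalize diagonal attacks.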