Korean Word Sense Disambiguation by Bayesian Inference
(베이즈 추론에 의한 단어 중의성 해소)

  • Journal
    언어과학 (Journal of Language Sciences), KCI-listed
  • Volume/Issue (Year)
    Vol. 15, No. 2 (2008.06)
  • Pages
    pp.41-59
  • Author
    김종휘
  • Language
    Korean (KOR)
  • URL
    https://www.earticle.net/Article/A73493

Abstract

English
This paper discriminates the multiple senses of selected ambiguous Korean words using Bayesian inference, which rests on conditional probability. A POS-tagged Korean corpus of 8.1 million words served as the source of linguistic information for disambiguation. In a disambiguation experiment on 13 words (9 nouns and 4 verbs), carried out with a program implementing the Bayesian algorithm, overall precision reached 81.5% (25981/31874), with 83.5% (12546/15030) for nouns and 79.8% (13435/16844) for verbs. Several parameters were varied during the experiment to find the optimal conditions for the method. The focus was on the smoothing value, varied from 0.9 to 0.0001, that is substituted for a co-occurrence frequency of 0 for a word in the context; contrary to general expectations, a smoothing value of 0.1 yielded the highest precision. Beyond the automated procedure and its promising result, the paper illustrates in detail how the individual words of corpus sentences are treated under Bayesian inference, clarifying the method.
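
The abstract does not give the paper's program, so the following is only a minimal sketch of the general idea, assuming a naive Bayes formulation in which a context word's zero co-occurrence count is replaced by a fixed smoothing value. The function names (`train`, `disambiguate`), the toy training data, and the choice of denominator are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train(labeled_contexts):
    """Count how often each sense occurs and which context words co-occur with it."""
    sense_counts = Counter()
    cooc = defaultdict(Counter)
    for sense, context_words in labeled_contexts:
        sense_counts[sense] += 1
        cooc[sense].update(context_words)
    return sense_counts, cooc


def disambiguate(context_words, sense_counts, cooc, smoothing=0.1):
    """Return the sense with the highest log posterior score for the context."""
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        words_in_sense = sum(cooc[sense].values())
        score = math.log(n / total)  # log prior P(sense)
        for word in context_words:
            freq = cooc[sense][word]
            if freq == 0:
                # Assumption: a zero count is replaced by the smoothing value,
                # mirroring the substitution described in the abstract.
                freq = smoothing
            score += math.log(freq / words_in_sense)  # log P(word | sense)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Invented toy data for one ambiguous word with two senses.
TRAINING = [
    ("ship", ["sea", "sail", "port"]),
    ("ship", ["port", "captain", "sea"]),
    ("pear", ["eat", "sweet", "fruit"]),
    ("pear", ["fruit", "tree", "sweet"]),
]

sense_counts, cooc = train(TRAINING)
print(disambiguate(["sweet", "fruit", "sea"], sense_counts, cooc, smoothing=0.1))
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on long contexts, and the `smoothing` argument can be varied (for example from 0.9 down to 0.0001) to observe its effect on the chosen sense.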

Table of Contents

Abstract
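 1. Introduction
 2. Bayesian Conditional Probability and Language Data
  2.1. Conditional Probability and Bayesian Inference
  2.2. Bayesian Inference and Language Processing
 3. Word Sense Disambiguation by Bayesian Inference
 4. Evaluation of the Experiment
 5. Conclusion
 References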

Author

  • 김종휘 [ Youngsan University ]

    Journal Information

    • Journal
      언어과학 [Journal of Language Sciences]
    • Frequency
      Quarterly
    • pISSN
      1225-2522
    • Coverage
      1994~2025
    • Indexing status
      KCI-listed
    • Decimal classification
      KDC 705 DDC 405