Classification Accuracy by Deviation-based Classification Method with the Number of Training Documents

Technology

Korean title: 학습문서의 개수에 따른 편차기반 분류방법의 분류 정확도

Journal: 디지털융복합연구 (Journal of Digital Convergence), KCI-listed
Volume/Issue (Year): Vol. 12, No. 6 (2014.06)
Pages: pp. 323-332
Author: Yong-Bae Lee
Language: English (ENG)
URL: https://www.earticle.net/Article/A220384

Article Information

Abstract

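Korean abstract (translated): It is generally known that automatic classification is affected by the number of training documents, but few studies have actually demonstrated how the number of training documents affects automatic text classification. To examine this effect, this study focuses on the recently developed deviation-based classification method and compares it with other classification algorithms. In the experiments, the deviation-based classification model achieved an accuracy of 0.8 with a total of 21 training documents (7 genres), outperforming the Bayesian classifier and the support vector machine. This demonstrates that, because the deviation-based classification model learns from the subject information within each genre, it has better feature selection capability than other learning methods even when training documents are few.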

English abstract: It is generally accepted that classification accuracy is affected by the number of training documents, but few studies show how this number influences automatic text classification. This study evaluates the deviation-based classification model, recently developed for genre-based classification, and compares it with other classification algorithms as the number of training documents changes. Experimental results show that the deviation-based classification model achieves an accuracy of 0.8 when categorizing 7 genres with only 21 training documents, exceeding the accuracy of the Bayesian and SVM classifiers. The deviation-based classification model retains strong feature selection capability even with a small number of training documents because it learns subject information within each genre, whereas the other methods use a different learning process.

Abstract
Abstract (Korean)
1. Introduction
  1.1 Aim of the study
  1.2 Related works
2. Classification models for evaluation
  2.1 Deviation-based classification method
  2.2 Other classification methods
  2.3 Testing environment
3. Classification accuracy by the number of training documents
  3.1 Construction of the documents set
  3.2 Classification accuracy for the levels of training
4. Conclusion
REFERENCES

Author

Yong-Bae Lee [ 이용배 | Dept. of Computer Education, Jeonju National University of Education ] Corresponding Author


Journal Information

Journal: 디지털융복합연구 [Journal of Digital Convergence]
Frequency: Quarterly
pISSN: 2713-6434
eISSN: 2713-6442
Coverage: 2003~2026
Indexing status: KCI candidate
Decimal classification: KDC 569, DDC 620
