A large number of electronic documents are labeled with human-interpretable annotations. Efficient text mining on such data sets requires a generative model that can flexibly capture the significance of observed labels while simultaneously uncovering topics within unlabeled documents. This paper presents a novel, generalized on-line labeled topic model based on global and local topics (GL-OLT) that tracks the time evolution of topics in a sequentially organized multi-labeled corpus. GL-OLT updates incrementally over time slices in an on-line fashion, and each label is associated not only with a set of local topics but also with several global topics. Empirical results show that the proposed model achieves significant improvements in label-prediction accuracy, lower perplexity, and higher performance than the other models it is compared with.
Table of Contents
Abstract
1. Introduction
2. Methodology
2.1. Modeling Documents with Topics
2.2. GL-OLT Model
3. Approximate Variation Inference
4. Experiments
4.1. Perplexity for Different Models
4.2. Classification Accuracy
4.3. Algorithmic Efficiency
5. Summary
References
Keywords
Text Information Processing, Latent Dirichlet Allocation (LDA), Topic Modeling, Natural Language Processing
Authors
YongHeng Chen [ College of Computer Science, Minnan Normal University, Zhangzhou 363000, China ]
Corresponding author
Yaojin Lin [ College of Computer Science, Minnan Normal University, Zhangzhou 363000, China ]
Hao Yue [ College of Computer Science, Minnan Normal University, Zhangzhou 363000, China ]
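Security Engineering Research Support Center (IJHIT) [Science & Engineering Research Support Center, Republic of Korea (IJHIT)]
Year founded
2006
Field
Engineering > Computer Science
About
1. Surveys and research on security engineering
2. Research on and presentation of applied technologies in security engineering
3. Hosting academic conferences and exhibitions on security engineering
4. Mutual cooperation and information exchange on security engineering technology
5. Standardization work and establishment of specifications for security engineering
6. Promotion of industry-academia-research cooperation in security engineering
7. International academic exchange and technical cooperation
8. Publication of journals on security engineering
9. Other activities necessary to achieve the society's objectives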
Publication
Publication title
International Journal of Hybrid Information Technology
Frequency
Bimonthly
pISSN
1738-9968
Coverage period
2008~2016
Decimal classification
KDC 505, DDC 605
Other articles in this issue / International Journal of Hybrid Information Technology Vol.8 No.12