2005, 6th 2005 International Conference on Computers, Communications and System (2005.11)
Pages
pp.21-25
Authors
Kang, Sin-Jae, Kim, Jong-Wan
Language
English (ENG)
URL
https://www.earticle.net/Article/A166139
Article Information
Abstract
English
In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information consists of the mail sender's information, URLs, and a certain spam list; the less definite information consists of the word list and concept codes extracted from the mail body. We first classified the spam mail by using the definite information, and then used the less definite information. In this phase, we used the lexical information and concept codes contained in the email body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were also included as features.
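The two-phase flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the block lists, the word-to-concept-code map, the training messages, and all function names are hypothetical, and a small hand-written linear SVM (hinge loss, subgradient descent) stands in for whatever SVM package the paper used.

```python
import re

# Hypothetical phase-1 resources; the actual lists are not given in the abstract.
BLOCKED_SENDERS = {"promo@bulk-offers.example"}
BLOCKED_URL_HOSTS = {"cheap-pills.example"}

# Hypothetical word -> concept-code map standing in for a thesaurus lookup.
CONCEPT_CODES = {"cash": "C_MONEY", "loan": "C_MONEY",
                 "prize": "C_REWARD", "winner": "C_REWARD"}

def definite_spam(sender, body):
    """Phase 1: flag mail whose sender or linked host is on a known list."""
    if sender.lower() in BLOCKED_SENDERS:
        return True
    hosts = re.findall(r"https?://([^/\s]+)", body.lower())
    return any(host in BLOCKED_URL_HOSTS for host in hosts)

def features(body):
    """Phase 2 features: words in the body plus their concept codes."""
    words = set(re.findall(r"[a-z]+", body.lower()))
    return words | {CONCEPT_CODES[w] for w in words if w in CONCEPT_CODES}

def train_linear_svm(samples, epochs=50, lam=0.01, lr=0.1):
    """Linear SVM via subgradient descent on the regularized hinge loss.
    samples: list of (body, label), label +1 for spam, -1 for legitimate."""
    weights, bias = {}, 0.0
    data = [(features(body), label) for body, label in samples]
    for _ in range(epochs):
        for feats, y in data:
            margin = y * (sum(weights.get(f, 0.0) for f in feats) + bias)
            for f in weights:                 # L2 regularization (weight decay)
                weights[f] *= 1 - lr * lam
            if margin < 1:                    # hinge loss is active: step toward y
                for f in feats:
                    weights[f] = weights.get(f, 0.0) + lr * y
                bias += lr * y
    return weights, bias

def classify(sender, body, model):
    """Apply phase 1 first; fall back to the SVM score only if it does not fire."""
    if definite_spam(sender, body):
        return "spam"
    weights, bias = model
    score = sum(weights.get(f, 0.0) for f in features(body)) + bias
    return "spam" if score > 0 else "legitimate"

def spam_precision_recall(predicted, actual):
    """Spam precision = TP / predicted spam; spam recall = TP / actual spam."""
    tp = sum(p == "spam" and a == "spam" for p, a in zip(predicted, actual))
    pred_spam, true_spam = predicted.count("spam"), actual.count("spam")
    precision = tp / pred_spam if pred_spam else 0.0
    recall = tp / true_spam if true_spam else 0.0
    return precision, recall

# Hypothetical toy training set, only to make the sketch runnable.
TRAINING_MAIL = [
    ("win cash prize now", +1),
    ("cheap loan offer", +1),
    ("meeting agenda for monday", -1),
    ("project report attached", -1),
]
MODEL = train_linear_svm(TRAINING_MAIL)
```

In this sketch a message containing the unseen word "winner" still receives the C_REWARD feature learned from "prize", which illustrates how concept codes could help spam recall, as the abstract reports.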
Table of Contents
Abstract
1. Introduction
2. Training Phase
  2.1. Definite Information
  2.2. Less Definite Information
  2.3. Kadokawa Thesaurus
  2.4. Constructing Feature Vectors
3. Applying Phase
4. Experiments
5. Conclusion
References
Keywords
information filtering; spam-mail filtering; conceptual information; spam recall; thesaurus
Authors
Kang, Sin-Jae [ School of Computer and Information Technology, Daegu University ]
Kim, Jong-Wan [ School of Computer and Information Technology, Daegu University ]