Time is an important dimension of the information space and plays an important role in Web search, because most Web pages contain temporal information and many Web queries are time-related. Exploiting the temporal information in Web pages has therefore become a hot topic in Web search research. In this paper, we focus on time-enhanced topic clustering for news search results. Traditional clustering algorithms are usually based on the common phrases of Web pages and give little consideration to their temporal information. We therefore propose a time-enhanced topic clustering algorithm for news Web pages. It improves on traditional algorithms, which perform only textual clustering, by applying a temporal clustering procedure to the topics returned by a textual clustering algorithm: every Web page in a cluster is arranged along a timeline according to the update time found in the page. We conduct experiments on a real dataset crawled from Google News and compare our algorithm with competitors including K-Means, STC, TFIC, and Minhash Clustering in terms of metrics such as precision and recall. The experimental results show that the proposed algorithm performs better in both offline and online clustering tests.
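The temporal step described in the abstract, arranging the pages of each textual cluster along a timeline by update time, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `arrange_on_timeline`, the input layout (a topic label mapped to `(url, update_date)` pairs), the day-level granularity, and the sample URLs are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def arrange_on_timeline(textual_clusters):
    """Order the pages of each textual cluster by update date.

    textual_clusters: dict mapping a topic label to a list of
    (url, update_date) pairs, as a textual clustering step might return
    (hypothetical layout, not taken from the paper).
    Returns a dict mapping each topic to a list of (update_date, [urls])
    entries sorted from oldest to newest.
    """
    timelines = {}
    for topic, pages in textual_clusters.items():
        # Group the pages of this topic by their update date.
        by_date = defaultdict(list)
        for url, updated in pages:
            by_date[updated].append(url)
        # Lay the groups out along the timeline, oldest first.
        timelines[topic] = [(d, by_date[d]) for d in sorted(by_date)]
    return timelines


# Hypothetical sample data: one topic with three news pages.
clusters = {
    "election": [
        ("example.com/a", date(2012, 11, 7)),
        ("example.com/b", date(2012, 11, 5)),
        ("example.com/c", date(2012, 11, 7)),
    ],
}
timeline = arrange_on_timeline(clusters)
for day, urls in timeline["election"]:
    print(day.isoformat(), urls)
```

Grouping by calendar day is only one possible granularity; a real system would also have to extract the update time from each page, which this sketch takes as given.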
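Science & Engineering Research Support Center, Republic of Korea (IJDTA) [보안공학연구지원센터]
Year established
2006
Field
Engineering > Computer Science
About
1. Surveys and research on security engineering
2. Research on and presentation of applied technologies in security engineering
3. Hosting academic conferences and exhibitions on security engineering
4. Mutual cooperation and information exchange on security engineering technology
5. Standardization projects and establishment of specifications for security engineering
6. Promotion of industry-academia-research cooperation in security engineering
7. International academic exchange and technical cooperation
8. Publication of journals on security engineering
9. Other activities necessary to achieve the purposes of the society
Publication
Publication title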
International Journal of Database Theory and Application
Frequency
Bimonthly
pISSN
2005-4270
Coverage period
2008~2016
Decimal classification
KDC 505, DDC 605
Issue: International Journal of Database Theory and Application Vol.5 No.4