Convergence of Internet, Broadcasting and Communication

Development of Tourism Information Named Entity Recognition Datasets for the Fine-tune KoBERT-CRF Model

  • Publisher
    The International Association for Artificial Intelligence (formerly The Institute of Internet, Broadcasting and Communication)
  • Journal
    International Journal of Internet, Broadcasting and Communication
  • Volume/Issue
    Vol.14 No.2 (2022.05)
  • Pages
    pp.55-62
  • Authors
    Myeong-Cheol Jwa, Jeong-Woo Jwa
  • Language
    English (ENG)
  • URL
    https://www.earticle.net/Article/A412508

Article information

Abstract

English
A smart tourism chatbot is needed as a user interface to efficiently provide tourists with smart tourism services such as recommended travel products, tourist information, personal travel itineraries, and tour guide services. We have developed a smart tourism app and a smart tourism information system that provide these services to tourists. We have also developed a smart tourism chatbot service consisting of the khaiii morpheme analyzer, rule-based intention classification, and a tourism information knowledge base built on the Neo4j graph database. In this paper, we develop the Korean and English smart tourism Named Entity (NE) datasets required to build a Named Entity Recognition (NER) model based on pre-trained language models (PLMs) for the smart tourism chatbot system. We create the tourism information NER datasets by collecting source data through the smart tourism app, the visitJeju website of the Jeju Tourism Organization (JTO), and web search, and by preprocessing the data using Korean and English tourism information Named Entity dictionaries. We train the KoBERT-CRF NER model on the developed Korean and English tourism information NER datasets. The weighted-average precision, recall, and F1 scores are 0.94, 0.92, and 0.94, respectively, on the Korean and English tourism information NER datasets.
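
The abstract mentions preprocessing source text with Named Entity dictionaries, and the table of contents refers to a BIO tagging dictionary. The following is a minimal sketch of how dictionary-based BIO tagging can work in general, not the authors' actual pipeline. The dictionary entries, the entity type labels (`LOC`, `FOOD`), whitespace tokenization, and the greedy longest-match strategy are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical entity dictionary: token sequence -> entity type.
# Entries and type labels are illustrative assumptions, not the paper's data.
NE_DICT: Dict[Tuple[str, ...], str] = {
    ("Hallasan", "National", "Park"): "LOC",
    ("Seongsan", "Ilchulbong"): "LOC",
    ("black", "pork"): "FOOD",
}


def bio_tag(tokens: List[str], ne_dict: Dict[Tuple[str, ...], str]) -> List[str]:
    """Label tokens with BIO tags using greedy longest-match dictionary lookup."""
    max_len = max((len(key) for key in ne_dict), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so longer entities take priority.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = ne_dict.get(tuple(tokens[i:i + n]))
            if label is not None:
                tags[i] = f"B-{label}"          # first token of the entity
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"      # continuation tokens
                i += n
                matched = True
                break
        if not matched:
            i += 1  # token stays "O" (outside any entity)
    return tags


sentence = "We ate black pork near Seongsan Ilchulbong".split()
for token, tag in zip(sentence, bio_tag(sentence, NE_DICT)):
    print(f"{token}\t{tag}")
```

Token/tag pairs in this one-per-line form are a common input format for training sequence labelers such as a BERT-CRF model. Korean text would typically need morpheme-level or subword tokenization instead of whitespace splitting.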

Table of contents

Abstract
1. Introduction
2. The Korean and English Tourism Information NER Datasets
2.1 Source data of smart tourism NER datasets
2.2 Tourism information Name Entity BIO tagging dictionary
2.3 Pre-processing for tourism information NER data generation
3. Tourism Information NER Performance of the KoBERT-CRF NER model
4. Conclusions and Further Study
Acknowledgement
References

Keywords

Tourism Information NER, Smart Tourism Chatbot, KoBERT model, Conditional Random Fields (CRF), pre-trained language models (PLMs)

Authors

  • Myeong-Cheol Jwa [ Student, Korea University of Technology and Education, Cheonan-si, Korea ]
  • Jeong-Woo Jwa [ Professor, Department of Telecommunication Eng., Jeju National University, Jeju, Korea ] Corresponding author

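Publication information

Publisher

  • Publisher name
    The International Association for Artificial Intelligence (formerly The Institute of Internet, Broadcasting and Communication)
  • Year founded
    2000
  • Field
    Engineering > Electronics/Information and Communication Engineering
  • About
    The association aims to contribute to the advancement of scholarship and technology, both in Korea and internationally, in Internet broadcasting, Internet TV, broadcasting and communication networks, and related fields, and to contribute to the knowledge and information society.

Journal

  • Journal title
    International Journal of Internet, Broadcasting and Communication
  • Frequency
    Quarterly
  • pISSN
    2288-4920
  • eISSN
    2288-4939
  • Coverage
    2009~2025
  • Decimal classification
    KDC 326 DDC 380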
