Text Classification Using Parallel Word-level and Character-level Embeddings in Convolutional Neural Networks

  • Publisher
    The Korea Society of Management Information Systems (한국경영정보학회)
  • Journal
    Asia Pacific Journal of Information Systems (KCI-listed, SCOPUS)
  • Issue
    Vol. 29, No. 4 (2019.12)
  • Pages
    pp.771-788
  • Authors
    Geonu Kim, Jungyeon Jang, Juwon Lee, Kitae Kim, Woonyoung Yeo, Jong Woo Kim
  • Language
    English (ENG)
  • URL
    https://www.earticle.net/Article/A367257

Abstract

English
Deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) outperform traditional approaches such as Support Vector Machines (SVMs) and Naïve Bayes in text classification. When CNNs are used for text classification, word embedding or character embedding transforms words or characters into fixed-size vectors before they are fed into the convolutional layers. In this paper, we propose a parallel word-level and character-level embedding approach in CNNs for text classification. The proposed approach can capture word-level and character-level patterns concurrently. To show the usefulness of the proposed approach, we perform experiments with two English and three Korean text datasets. The experimental results show that character-level embedding works better in Korean and word-level embedding performs well in English. The results also reveal that the proposed approach performs better than traditional CNNs with only word-level or only character-level embedding on both Korean and English documents. A more detailed investigation shows that, compared with the traditional embedding approaches, the proposed approach tends to perform better when the amount of data is relatively small.
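
The parallel idea in the abstract can be illustrated with a small forward pass: one branch embeds words, the other embeds characters, each branch applies a convolution with max-over-time pooling, and the two pooled vectors are concatenated before classification. The sketch below is only an illustration under stated assumptions, not the authors' architecture: the function names, the embedding size, the filter count, the filter width, and the random (untrained) weights are all hypothetical, and whitespace splitting stands in for whatever tokenization the paper uses.

```python
import random

def build_vocab(sequences):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab

def init_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def embed(tokens, vocab, table, min_len):
    """Look up one fixed-size vector per token, padding short inputs."""
    ids = [vocab.get(t, 0) for t in tokens]  # unknown tokens fall back to the pad id
    ids += [0] * (min_len - len(ids))
    return [table[i] for i in ids]

def conv_max_pool(embedded, filters, width):
    """1D convolution + ReLU + max-over-time pooling: one value per filter."""
    pooled = []
    for f in filters:  # f holds width * emb_dim weights
        best = 0.0     # starting at 0 applies the ReLU floor
        for start in range(len(embedded) - width + 1):
            window = [x for vec in embedded[start:start + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(f, window)))
        pooled.append(best)
    return pooled

def parallel_features(text, word_vocab, char_vocab, params, width):
    """Run the word branch and the character branch, then concatenate."""
    w_emb = embed(text.split(), word_vocab, params["word_table"], width)
    c_emb = embed(list(text), char_vocab, params["char_table"], width)
    w_feat = conv_max_pool(w_emb, params["word_filters"], width)
    c_feat = conv_max_pool(c_emb, params["char_filters"], width)
    return w_feat + c_feat  # list concatenation joins the two branches

# Hypothetical toy corpus and hyperparameters, chosen only for illustration.
rng = random.Random(0)
docs = ["the movie was great", "the plot was dull"]
word_vocab = build_vocab(d.split() for d in docs)
char_vocab = build_vocab(list(d) for d in docs)
EMB_DIM, N_FILTERS, WIDTH = 4, 5, 3
params = {
    "word_table": init_matrix(len(word_vocab), EMB_DIM, rng),
    "char_table": init_matrix(len(char_vocab), EMB_DIM, rng),
    "word_filters": init_matrix(N_FILTERS, WIDTH * EMB_DIM, rng),
    "char_filters": init_matrix(N_FILTERS, WIDTH * EMB_DIM, rng),
}
features = parallel_features(docs[0], word_vocab, char_vocab, params, WIDTH)
print(len(features))  # N_FILTERS values from each branch
```

In a full classifier the concatenated vector would feed a dense layer with a softmax over the classes. For Korean text, the character branch could operate on syllable blocks or on decomposed jamo; which unit the paper uses is not stated in the abstract.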

Table of Contents

ABSTRACT
Ⅰ. Introduction
Ⅱ. Related Work
2.1. Text Classification
2.2. Deep Learning in Text Mining
2.3. Convolutional Neural Networks (CNNs) for Text Classification
Ⅲ. Proposed Approach
3.1. Hyperparameters Configuration
3.2. Word Vector and Character Vector
3.3. Regularization and Normalization
Ⅳ. Experimental Design and Datasets
4.1. Comparing Models
4.2. Datasets
Ⅴ. Results and Discussion
5.1. Complementary Effect
5.2. Size Effect
5.3. Possibility of Improvement through Hyperparameter and Embedding Optimization
5.4. Implications
Ⅵ. Conclusion and Future Work
Acknowledgements

Keywords

Word-level Embedding, Character-level Embedding, Convolutional Neural Network, Text Classification

Authors

  • Geonu Kim [ Undergraduate Student, School of Business, Hanyang University, Korea ]
  • Jungyeon Jang [ Manager, Hyundai Motor Company, Korea ]
  • Juwon Lee [ Analyst, Korea Ratings, Korea ]
  • Kitae Kim [ Researcher, Hana Institute of Finance, KEB Hana Bank, Korea ]
  • Woonyoung Yeo [ M.S. Student, Business Informatics from Graduate School, Hanyang University, Korea ]
  • Jong Woo Kim [ Professor, School of Business, Hanyang University, Korea ] (Corresponding author)

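Publication Information

Publisher

  • Publisher name
    The Korea Society of Management Information Systems (한국경영정보학회)
  • Founded
    1989
  • Field
    Social Sciences > Business Administration
  • About
    The society aims to promote research and exchange in management information systems and to contribute to the advancement and application of the discipline.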

Journal

  • Journal title
    Asia Pacific Journal of Information Systems
  • Frequency
    Quarterly
  • pISSN
    2288-5404
  • eISSN
    2288-6818
  • Coverage period
    1990~2026
  • Indexing
    KCI-listed, SCOPUS
  • Decimal classification
    KDC 325, DDC 658
