Classification and Comparative Analysis of LIME-based Machine Learning Models

Article Information

Abstract

English
This study compares the performance of four LIME-based machine learning methods, Gaussian Naive Bayes (GNB), Highly-Efficient Logistic Regression (LR), Linear Support Vector Machine (SVM), and Triple-layer Neural Network (TNN), on three medical datasets: the Hepatitis C Prediction Dataset, the Breast Cancer Wisconsin (Prognostic) Dataset, and the Glioma Grading Clinical and Mutation Features Dataset. High-dimensional data makes learning algorithms more prone to overfitting because of the curse of dimensionality. To address this, LIME (Local Interpretable Model-agnostic Explanations) is used to compute the importance of each feature's contribution to the model's predictions, and features are selected on that basis. LIME generates multiple samples by perturbing the data in a local region around an instance, then fits a simple linear model to evaluate the impact of each feature on the predictions. The features with high importance are selected, and the model is retrained on them. The experiments confirmed that training time could be reduced while performance was maintained or even improved with fewer features. Selecting only the necessary features thus alleviates the curse of dimensionality on all three datasets.
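The procedure described in the abstract (perturb locally, fit a simple linear model, rank features, keep the most important ones) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation and does not use the `lime` package: the names `local_importance`, `select_features`, and `black_box`, the Gaussian perturbation, the exponential proximity kernel, and the toy four-feature model are all assumptions made for illustration.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def local_importance(predict, x, rng, n_samples=500, scale=1.0, width=2.0):
    """Perturb x, weight samples by proximity, fit a weighted linear surrogate."""
    d = len(x)
    A = [[0.0] * (d + 1) for _ in range(d + 1)]  # weighted normal equations
    b = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [rng.gauss(0.0, scale) for _ in range(d)]   # local perturbation
        y = predict([xi + zi for xi, zi in zip(x, z)])  # query the black box
        w = math.exp(-sum(zi * zi for zi in z) / width ** 2)  # proximity kernel
        row = [1.0] + z                                 # intercept + offsets
        for i in range(d + 1):
            b[i] += w * row[i] * y
            for j in range(d + 1):
                A[i][j] += w * row[i] * row[j]
    for i in range(1, d + 1):
        A[i][i] += 1e-6                                 # small ridge term
    coef = solve(A, b)
    return [abs(c) for c in coef[1:]]                   # drop the intercept


def select_features(predict, instances, k, seed=0):
    """Average local importances over instances and keep the top-k features."""
    rng = random.Random(seed)
    d = len(instances[0])
    total = [0.0] * d
    for x in instances:
        for j, v in enumerate(local_importance(predict, x, rng)):
            total[j] += v
    importance = [t / len(instances) for t in total]
    ranked = sorted(range(d), key=lambda j: importance[j], reverse=True)
    return ranked[:k], importance


def black_box(x):
    """Hypothetical classifier probability: only features 0 and 2 matter."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[2])))


data_rng = random.Random(1)
instances = [[data_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
selected, importance = select_features(black_box, instances, k=2)
print("selected feature indices:", sorted(selected))
```

In a pipeline like the one the abstract outlines, the retraining step would then fit each model again using only the columns listed in `selected`. How the paper aggregates local importances and chooses the number of retained features is not stated in the abstract, so the averaging and `k=2` here are placeholders.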

Table of Contents

Abstract
I. INTRODUCTION
II. LIME
III. MACHINE LEARNING MODELS AND LIME-BASED FEATURE SELECTION METHODS
A. Gaussian Naive Bayes (GNB)
B. Highly-Efficient Logistic Regression (LR)
C. Linear Support Vector Machine (SVM)
D. Triple-layer Neural Network (TNN)
E. LIME-based Machine Learning Method
IV. EXPERIMENTS AND RESULTS ANALYSIS
V. CONCLUSION
ACKNOWLEDGMENT
REFERENCE

Authors

  • Won-Young Jo [ Department of Electronics Engineering, Chosun University, Gwangju, South Korea ]
  • Chan-Uk Yeom [ Division of AI Convergence College, Chosun University, Gwangju, South Korea ]
  • Keun-Chang Kwak [ Department of Electronics Engineering, Chosun University, Gwangju, South Korea ]

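    Publication Information

    • Publication
      한국차세대컴퓨팅학회 학술대회 (Korean Institute of Next Generation Computing Conference)
    • Frequency
      Semiannual
    • Coverage period
      2021~2025
    • Decimal classification
      KDC 566 DDC 004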