Application of Conditional Tabular Generative Adversarial Networks (GAN) for Addressing Class Imbalance in Nationwide Employment Big Data
This study investigates the use of Conditional Tabular Generative Adversarial Networks (CT-GAN) to generate synthetic data for turnover prediction in large employment datasets. CT-GAN is compared with Adaptive Synthetic Sampling (ADASYN), Synthetic Minority Over-sampling Technique (SMOTE), and Random Oversampling (ROS), using Logistic Regression (LR), Linear Discriminant Analysis (LDA), Random Forest (RF), and Extreme Learning Machines (ELM) as classifiers, with AUC and F1-score as evaluation metrics. The results show that GAN-based techniques, especially CT-GAN, outperform the traditional methods in addressing data imbalance, highlighting the need for advanced oversampling methods to improve classification accuracy in imbalanced datasets.
Table of Contents
Abstract
1. Introduction
2. Related works
3. Materials and Methods
   3.1. Imbalance Ratio (IR)
   3.2. Random Oversampling (ROS)
   3.3. Synthetic Minority Over-Sampling Technique (SMOTE)
   3.4. B-SMOTE
   3.5. Adaptive Synthetic Sampling (ADASYN)
   3.6. Conditional GAN (CGAN)
   3.7. Conditional Tabular GAN (CT-GAN)
   3.8. Modeling
   3.9. Data source
   3.10. Experimental design
   3.11. Performance Evaluation Methods and Metrics
4. Results
5. Discussion
6. Conclusions
References
Keywords
Industrial Accident; Return-to-Work Prediction; RUSBoost; Predictive Modeling
Author
Haewon Byeon (변해원), Department of AI-Software, Inje University, South Korea
Corresponding Author