With the recent spread of artificial intelligence (AI) technology, both the use of data and the volume of newly generated data are increasing rapidly. To obtain meaningful analytical results from such data, the first step is to classify the data well. However, if only a single machine learning classification technique is applied, the resulting analysis may suffer from overfitting. To reduce the problems caused by misclassification, such as overfitting, it is necessary to apply several classification techniques, compare their results, and derive an optimal classification. Interpreting data with only one classification technique leads to poor inference and poor predictions. This study seeks a method for optimally classifying data, as a step that precedes data analysis, by examining the data from various perspectives and applying various classification techniques, both linear and nonlinear, such as linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). To obtain reliable and refined statistics from big data analysis, it is necessary to analyze the meaning of each variable and the correlations among the variables. If the data are classified in a way that does not match the hypothesis being tested, the results will be unreliable even if the analysis itself is performed well. In other words, before big data analysis, it is necessary to ensure that the data are classified appropriately for the purpose of the analysis. This step must be carried out before the data are analyzed and conclusions are drawn, and it can serve as a method of optimal data classification.
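As a rough illustration of the comparison the abstract describes, the sketch below fits a univariate Gaussian discriminant model twice, once with a pooled variance (LDA) and once with per-class variances (QDA), and compares their accuracy on held-out points. The function names (`fit`, `predict`, `accuracy`, `compare`) and the toy data are assumptions made for this example only; they are not taken from the paper, whose experiments are not reproduced here.

```python
import math
from statistics import mean


def fit(xs, ys, shared_variance):
    """Fit a univariate Gaussian discriminant model.

    shared_variance=True pools the variance across classes (LDA);
    shared_variance=False estimates one variance per class (QDA).
    """
    classes = sorted(set(ys))
    groups = {c: [x for x, y in zip(xs, ys) if y == c] for c in classes}
    means = {c: mean(g) for c, g in groups.items()}
    # Within-class sums of squared deviations from the class mean.
    ss = {c: sum((x - means[c]) ** 2 for x in g) for c, g in groups.items()}
    if shared_variance:
        pooled = sum(ss.values()) / (len(xs) - len(classes))
        variances = {c: pooled for c in classes}
    else:
        variances = {c: ss[c] / (len(groups[c]) - 1) for c in classes}
    priors = {c: len(groups[c]) / len(xs) for c in classes}
    return means, variances, priors


def predict(model, x):
    """Return the class with the largest Gaussian log-discriminant score."""
    means, variances, priors = model

    def score(c):
        return (math.log(priors[c]) - 0.5 * math.log(variances[c])
                - (x - means[c]) ** 2 / (2 * variances[c]))

    return max(means, key=score)


def accuracy(model, xs, ys):
    """Fraction of points whose predicted class matches the true label."""
    return sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(xs)


def compare(train, test):
    """Fit LDA and QDA on the training split and score both on the test split."""
    (xtr, ytr), (xte, yte) = train, test
    return {name: accuracy(fit(xtr, ytr, shared), xte, yte)
            for name, shared in (("LDA", True), ("QDA", False))}


# Hypothetical toy data: class "a" is tightly clustered, class "b" is widely
# spread around a nearby center, so the class variances differ strongly.
train = ([-1.0, -0.5, 0.0, 0.5, 1.0, -6.0, -4.0, 1.0, 5.0, 9.0],
         ["a"] * 5 + ["b"] * 5)
test = ([-0.8, 0.2, 0.8, -5.0, -3.0, 4.0],
        ["a"] * 3 + ["b"] * 3)

result = compare(train, test)
print(result)  # {'LDA': 0.5, 'QDA': 1.0}
```

On this toy data the two classes differ mainly in spread rather than in location, so the single linear boundary of LDA separates them poorly while QDA does well. With other data the ranking can reverse, which is the reason for comparing several techniques rather than relying on one.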
Table of Contents
Abstract
1. Introduction
2. Machine Learning Definition
3. Data Mining Definition
  3.1. Features of Data Mining
  3.2. Discriminant Analysis
  3.3. Analysis Method of Discriminant Analysis
  3.4. Analysis Stage of Discriminant Analysis
  3.5. Classification Model Verification of DA
4. Discriminant Analysis Experiment
  4.1. Subject and Method of Experiment
  4.2. Discriminant Analysis Using LDA
  4.3. Discriminant Analysis Using QDA
5. Conclusion
References
Keywords
Machine Learning, LDA, QDA, SVM, Classification Analysis
Authors
SeungJae Kim [ Department of Convergence, Honam University, Gwangju ]
SungHwan Kim [ National Program of Excellence in Software center, Chosun University, Gwangju ]
Corresponding Author
The Natural Science Research Institute of Chosun (조선대학교 기초과학연구원)
Year Established
2008
Field
Natural Sciences > General Natural Sciences
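About
The institute aims to promote the basic sciences through research, education, and dissemination. To achieve this aim, it carries out the following activities.
1. Surveys and research in all fields of the basic sciences
2. Hosting academic events on the basic sciences (conferences, seminars, symposia, invited lectures, and the like)
3. Basic science education for the next generation of scholars and for the general public
4. Publication of the institute's journal, 『조선자연과학논문지』
5. Publication of monographs such as the 『자연과학연구총서』 (Natural Science Research Series) and the 『자연과학번역총서』 (Natural Science Translation Series)
6. Other activities related to the purpose of the institute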
Publication
Publication Title
Journal of Integrative Natural Science (통합자연과학논문집; formerly 조선자연과학논문집)