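Exploring strategies for enhancing the mathematical performance of large language models: Focusing on fine-tuning
(Korean title: 거대언어모델 수학적 성능 개선 방안 탐구 : 파인튜닝을 중심으로)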

Lee, Gima; Kim, Hee-jeong


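Publisher: The Korean School Mathematics Society (한국학교수학회)
Journal: Journal of the Korean School Mathematics Society (한국학교수학회논문집), KCI-indexed
Issue: Vol. 28, No. 1 (March 2025)
Pages: pp. 65-94
Authors: Lee, Gima; Kim, Hee-jeong
Language: Korean (KOR)
URL: https://www.earticle.net/Article/A466054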

Note: Access to the full text may be restricted because the agreement with the full-text provider has ended.

Article information

Abstract

English: This study investigates whether fine-tuning a Large Language Model (LLM) on a Korean-language mathematical dataset can enhance its mathematical performance, and explores the underlying mechanisms. The findings confirm that fine-tuning with a Korean-language mathematical dataset improves the mathematical capabilities of LLMs. Specifically, after fine-tuning, accuracy on mathematical problems increased from 65.79% to 81.25%, an improvement of 15.46 percentage points. The problem-solving process also showed notable improvements in formalization, computational accuracy, and explanatory capability. Additionally, the generation of irrelevant non-mathematical content and language confusion issues were eliminated. Notably, the changes in the problem-solving process suggest that, through fine-tuning, the LLM learns the solution patterns, development structures, content organization, and formatting embedded in the Korean-language mathematical dataset. This improvement in the problem-solving process serves as a key mechanism contributing to the increase in accuracy. However, fine-tuning also introduced challenges, such as continuous text generation and catastrophic forgetting. Based on these findings, this study provides insights into developing domain-specific mathematical LLMs, constructing mathematical datasets, and fine-tuning strategies and their associated challenges. Furthermore, six key directions for future research are suggested. To promote reproducibility and further research, the Python code used in this study has been made publicly available in the researcher's GitHub repositories.

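Korean abstract (translated): This study examined, together with the underlying mechanism, whether fine-tuning with a Korean mathematics dataset can improve the mathematical performance of a large language model (LLM). The results confirmed that fine-tuning with a Korean mathematics dataset can strengthen the mathematical performance of LLMs. Specifically, after fine-tuning, the LLM's accuracy on mathematics problems rose from 65.79% to 81.25%, an increase of 15.46 percentage points. In the problem-solving process, formalization, computation, and explanation of solutions were greatly strengthened, and both the generation of unnecessary non-mathematical content and language confusion disappeared. In particular, the changes in the solution process showed that, through fine-tuning, the LLM can learn the solution patterns, patterns of content development, and content structure and format present in the mathematics dataset, and that the resulting improvement in the solution process is the key mechanism contributing to higher accuracy. Meanwhile, the problems observed after fine-tuning were endless text generation and catastrophic forgetting. Based on these results, implications were discussed in terms of developing mathematics domain-specific LLMs, constructing mathematics datasets, and fine-tuning strategies and responses to the associated problems. In addition, six research directions were proposed for follow-up studies, and the fine-tuning Python code built in this study was released in the researcher's GitHub repository to support replication and extension of the research.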
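The reported rise from 65.79% to 81.25% is an absolute difference between two accuracy rates, so the figure 15.46 is a gain in percentage points. The short sketch below only reworks that arithmetic; the variable names are illustrative, and the relative gain it prints is derived here from the two reported rates and is not a figure stated in the abstract.

```python
# Accuracy figures reported in the abstract (percent of problems answered correctly).
base_accuracy = 65.79
finetuned_accuracy = 81.25

# Absolute gain, measured in percentage points.
gain_points = finetuned_accuracy - base_accuracy

# Relative gain with respect to the base model's accuracy
# (derived here for illustration; not reported in the abstract).
relative_gain = gain_points / base_accuracy * 100

print(f"Absolute gain: {gain_points:.2f} percentage points")
print(f"Relative gain: {relative_gain:.1f}% of the base accuracy")
```

Keeping the two measures apart matters when comparing against other studies, since a percentage-point gain and a relative gain can differ substantially for the same pair of accuracy rates.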

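Table of contents

Korean abstract
Abstract
Ⅰ. Introduction
Ⅱ. Theoretical background
1. The concept of fine-tuning large language models
2. Performance improvement effects of fine-tuning
3. The catastrophic forgetting problem in fine-tuning
4. Use of fine-tuning in mathematics education
Ⅲ. Research methods
1. Fine-tuning process
2. Evaluation of the base model and the fine-tuned model
Ⅳ. Results
1. Accuracy of the base model and the fine-tuned model
2. Comparison of the solution processes of the base model and the fine-tuned model
3. Problems arising after fine-tuning
Ⅴ. Conclusion and discussion
1. Improvement in LLM accuracy through fine-tuning
2. Improvement in the LLM's solution process through fine-tuning
3. Problems caused by fine-tuning and strategies to address them
Ⅵ. Recommendations
References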

Keywords

large language model; Llama; mathematical performance; mathematical dataset; domain-specific mathematical model; fine-tuning

Authors

Lee, Gima (이기마), graduate student, Korea University
Kim, Hee-jeong (김희정), professor, Korea University (corresponding author)


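Publication information

Publisher

Name: The Korean School Mathematics Society (한국학교수학회)
Founded: 1998
Field: Natural sciences > Mathematics
About: The society brings together people who are interested in, or directly engaged in, mathematics education in the field of school mathematics. Its purpose is to encourage in-service teachers' research through theoretical and methodological studies of mathematics education, and thereby to promote the development of mathematics education and school mathematics in Korea.

Journal

Title: Journal of the Korean School Mathematics Society (한국학교수학회논문집)
Frequency: Quarterly
pISSN: 1229-0890
Coverage: 1998-2025
Indexing: KCI-indexed
Classification: KDC 410, DDC 510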
