Fine-tuning and Evaluation of LLaMA Models for Correcting Korean Particle Substitution Errors in Beginner Vietnamese Learners - Focusing on eun/neun (은/는), i/ka (이/가), e (에), and eso (에서)
Korean grammatical particles present a persistent challenge for Vietnamese learners because of fundamental syntactic differences between the two languages. Vietnamese lacks case-marking particles, which often leads to substitution errors involving eun/neun (은/는), i/ka (이/가), e (에), and eso (에서). Traditional teaching methods have had limited success in addressing these errors. Motivated by the need for more adaptive, learner-sensitive solutions, this paper explores fine-tuning the LLaMA 3.2 1B language model to correct Korean particle substitution errors commonly made by beginner Vietnamese learners. A custom dataset was built by generating simulated learner errors from authentic sentence structures. The model was fine-tuned using Low-Rank Adaptation (LoRA) with instruction-based prompts for efficiency and contextual accuracy. On a 5,800-sentence test set, the fine-tuned model reached a sentence-level accuracy of 91.15%, compared with just 8.36% for the pre-trained baseline. These results indicate that, with appropriate fine-tuning, large language models can provide sound grammatical corrections suited to learners' individual needs. The approach shows promise for intelligent tutoring systems that deliver one-to-one, real-time feedback in second language learning environments.
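The abstract does not spell out how the simulated errors were generated or how sentence-level accuracy was computed. The sketch below is one plausible reading, not the authors' pipeline: it assumes sentences are pre-segmented into (stem, particle) pairs, uses a hypothetical confusion table `SWAP` limited to the four particle groups named above, and treats a sentence as correct only on an exact string match with the reference. All function names are illustrative.

```python
# Hypothetical confusion pairs (an assumption, not taken from the paper).
# Each swap keeps a form that is valid after the same final sound.
SWAP = {"은": "이", "이": "은", "는": "가", "가": "는", "에": "에서", "에서": "에"}


def inject_error(tokens, position):
    """Return a copy of tokens with the particle at `position` substituted.

    tokens: list of (stem, particle) pairs; particle is '' when absent.
    """
    corrupted = list(tokens)
    stem, particle = corrupted[position]
    corrupted[position] = (stem, SWAP[particle])
    return corrupted


def render(tokens):
    """Join (stem, particle) pairs back into a space-separated sentence."""
    return " ".join(stem + particle for stem, particle in tokens)


def sentence_accuracy(predictions, references):
    """Fraction of predictions that match their reference exactly."""
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


# "저는 학교에 가요" ("I go to school") with 에 replaced by 에서.
correct = [("저", "는"), ("학교", "에"), ("가요", "")]
learner_error = render(inject_error(correct, 1))
print(learner_error)
```

Under this exact-match reading, a model output counts as correct only if every particle in the sentence is right, which makes the metric stricter than a per-particle score.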
Table of Contents
Abstract
1. INTRODUCTION
2. RELATED WORK AND BACKGROUND
  2.1 Korean Particle Errors Among Vietnamese Learners
  2.2 Limitations of Traditional Teaching Methods
  2.3 Transitioning to LLaMA: From Traditional GEC to Learner-Focused Fine-Tuning
3. METHODOLOGY
  3.1 Workflow
  3.2 Method Pipeline
4. EXPERIMENTS
  4.1 Preparing a Dataset to Fine-tune LLaMA
  4.2 Preparing a Dataset to Fine-tune LLaMA data
  4.3 Parameter-Efficient Fine-tuning via LoRA
5. RESULTS AND DISCUSSION
  5.1 Training Dynamics: Analysis of Training and Validation Loss
  5.2 Manual Testing and Evaluation of Model Performance Across Sentences of Varying Lengths
ACKNOWLEDGEMENT
REFERENCES
Keywords
Grammatical Particle, LLaMA Model, Natural Language Processing, Intelligent Tutoring Systems, Grammar Error Correction
Authors
Linh Pham Thi Dieu [ Researcher, Dept. of Digital Media, Soongsil Univ., Korea ]
Kang-Hee Lee [ Prof., Dept. of Digital Media, Soongsil Univ., Korea ]
Corresponding Author
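The International Promotion Agency of Culture Technology (국제문화기술진흥원)
Year founded
2009
Field
Engineering > General Engineering
About
The Agency is a non-profit organization composed of industry, academia, research institutes, and government bodies related to Culture Technology. Culture Technology (CT) is defined as a New Convergence Technology (NCT) that combines information and communication technology (ICT) with arts grounded in cultural thinking, the humanities, design, and social science and technology. Its purpose is to carry out the necessary activities set out in Article 3 in order to improve the quality of human life, change it in a more advanced direction, and contribute to the development and promotion of scholarship and technology in fields related to Culture Technology.
Publication
Publication title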
International Journal of Advanced Culture Technology(IJACT)
Frequency
Quarterly
pISSN
2288-7202
eISSN
2288-7318
Coverage period
2013~2025
Indexing status
KCI indexed
Decimal classification
KDC 600, DDC 700
Other articles in this issue / International Journal of Advanced Culture Technology (IJACT) Volume 13 Number 2