AI-HUB 중국어 인공지능 학습용 데이터에 대한 품질 검증 방안 연구- ‘방송 콘텐츠 한-중 번역 병렬 말뭉치 데이터’를 중심으로
A Study on Quality Verification of AI-HUB Chinese Artificial Intelligence Learning Data: Focusing on the ‘Broadcast Content Korean-Chinese Translation Parallel Corpus Data’
This study proposes a systematic methodology for evaluating Korean-Chinese translation quality using AI-HUB's ‘Broadcasting Content Korean-Chinese Translation Parallel Corpus Data’. While machine translation technology has advanced significantly, research on Korean-Chinese translation characteristics remains insufficient, particularly for domains featuring extensive colloquial expressions and cultural contexts such as broadcasting content.
The study selected 10,000 sentences from the corpus of 1.2 million sentences by stratified sampling and evaluated translation quality with the BLEU, METEOR, and TER metrics. Translation quality varied significantly by genre and by sentence length. Educational programs achieved the highest BLEU score (0.467), while reality variety shows recorded the lowest (0.371). Quality also declined sharply as sentence length increased, with scores falling from 0.518 for short sentences to 0.287 for long sentences. Error analysis identified mistranslated colloquial expressions (32%), demonstrative errors (21%), and literal translations of idioms (18%) as the major error types.
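The stratified sampling step can be illustrated with a minimal sketch. It assumes proportional allocation by genre, which the abstract does not specify; the function name `stratified_sample`, the record layout, and the toy corpus are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, total, seed=0):
    """Draw about `total` records, giving each stratum a share
    proportional to its size in `records` (assumed allocation rule)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group the records by stratum, e.g. by genre.
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    sample = []
    for name in sorted(strata):  # sorted for a deterministic order
        items = strata[name]
        # Proportional share, at least one record, never more than available.
        k = min(len(items), max(1, round(total * len(items) / len(records))))
        sample.extend(rng.sample(items, k))
    return sample


# Toy corpus: 600 "education" and 400 "variety" sentences (made-up data).
corpus = [{"genre": "education", "id": i} for i in range(600)]
corpus += [{"genre": "variety", "id": i} for i in range(400)]

sample = stratified_sample(corpus, key=lambda r: r["genre"], total=100)
education_count = sum(1 for r in sample if r["genre"] == "education")
```

Because each stratum's share is rounded independently, the sample size can differ from `total` by a few records when the proportions are not whole numbers.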
The methodology is reproducible because it relies on publicly available resources (AI-HUB data, the Naver Papago API, and Python/NLTK). The findings suggest that translations of educational content are more reliable than those of entertainment programs, which require careful post-editing.
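To show what a BLEU-style score measures, the following sketch computes a sentence-level score from clipped n-gram precision and a brevity penalty. It is not the study's implementation: the abstract reports using Python/NLTK, whereas this version uses only the standard library so that it is self-contained. Character-level tokenization of the Chinese text and add-one smoothing are both assumptions, and the name `char_bleu` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU over characters with add-one smoothing
    (both choices are assumptions made for this sketch)."""
    ref, hyp = list(reference), list(hypothesis)
    if not hyp:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: an n-gram is credited at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add-one smoothing keeps the logarithm defined when matches == 0.
        log_precision_sum += math.log((matches + 1) / (total + 1))

    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))

    return brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "我喜欢看电视节目"
identical = char_bleu(reference, "我喜欢看电视节目")
partial = char_bleu(reference, "我喜欢电视")
```

An exact match scores 1, and the score falls as n-gram overlap shrinks or as the hypothesis becomes shorter than the reference. Scores from this sketch are not comparable with the figures reported in the abstract, since tokenization and smoothing differ.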
This study is significant as the first systematic evaluation of Korean-Chinese translation quality in the broadcasting content domain. It provides specific directions for improving the translation of colloquial expressions and cultural context, both of which are characteristic challenges in Chinese-language translation.
동북아시아문화학회 [The Association of North-east Asian Cultures]
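Founded
2000
Field
Interdisciplinary Studies > Interdisciplinary Research
About
The association studies and discusses the diversity and identity of Northeast Asian cultures, examines the varied forms of cultural exchange within the region, and builds up a broad framework for understanding cultural change. Through these scholarly activities it seeks to understand the cultural foundations of the Korean people and of neighboring peoples, to develop the characteristics of a shared cultural community, to encourage stronger mutual ties, and thereby to contribute to the cultural development of Northeast Asia.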