トピックモデルを用いた日本語テキストマイニングの研究 －旧JLPTの読解の既出問題に対する分析を中心に－

金曘泳


Research on Japanese text mining using the Topic Model - Focusing on the analysis of past test reading comprehension questions in the previous format of JLPT -

Publisher: 한국일본언어문화학회
Journal: 일본언어문화 (KCI-listed)
Issue: No. 61 (2022.12)
Pages: pp.27-46
Author: 金曘泳
Language: Japanese (JPN)
URL: https://www.earticle.net/Article/A421820


Abstract (English)

In this paper, as one attempt to make effective use of vast amounts of text data, I introduce a text mining technique called the topic model into the field of Japanese studies. Concretely, I collected the texts of the reading comprehension sections of the previous-format JLPT from the past 20 years and carried out a topic model analysis. The study made the following points clear. First, the actual data confirmed that the makers of the previous-format JLPT tried to avoid topic-specific biases when selecting and producing the texts for the questions. Next, the texts can be statistically classified into four main topics: “Private relationships such as family and work,” “Communications related to schedules,” “Public relations related to the country and society,” and “Economic activity.” The techniques and results of the topic model analysis in this paper come from an empirical analysis of actual past questions, and they are considered significant in that they can be applied to any field of Japanese studies that needs them. Of course, the discussion in this paper is limited to the texts of the previous-format JLPT, not the new-format JLPT, and the amount of data is relatively small even though it covers all the data for the past 20 years. In addition, a comparative analysis with other texts was not possible. There is therefore still room for improvement, which I would like to address as future work.
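To make the kind of analysis summarized in the abstract concrete, the following is a minimal sketch of LDA topic modeling. It is not the paper's implementation: the keywords point to Gensim, whereas this sketch swaps in a small collapsed Gibbs sampler written with only the Python standard library so that it runs without extra packages. The function names `fit_lda` and `top_words`, the hyperparameters `alpha` and `beta`, and the toy corpus `DOCS` are all assumptions made for illustration. The toy tokens are invented, not JLPT data, and real Japanese text would first need morphological analysis to produce such token lists.

```python
import random
from collections import Counter

# Invented, pre-tokenized toy corpus (an assumption; not data from the paper).
DOCS = [
    ["family", "work", "friend", "family"],
    ["schedule", "meeting", "time", "schedule"],
    ["society", "country", "law", "society"],
    ["price", "market", "money", "price"],
]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized probability of each topic for this token,
                # conditioned on all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(row[k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0]
            for counts in topic_word]


# Four topics mirrors the count reported in the abstract; the data is invented.
theta, topic_word = fit_lda(DOCS, num_topics=4)
for k, words in enumerate(top_words(topic_word)):
    print(f"topic {k}: {words}")
```

In a setting like the paper's, the per-document proportions in `theta` would indicate how strongly each reading passage leans toward each topic, and the top words per topic would be what a researcher inspects before giving a topic a label such as “Economic activity.”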

1. Introduction
2. Previous Research
2.1 Text Mining
2.2 Topic Model
3. Research Method
3.1 Data Collection
3.2 Data Preprocessing
3.3 Data Analysis
4. Analysis Results
4.1 Setting the Number of Topics in LDA
4.2 Topic Modeling with LDA
4.3 Topic Analysis
5. Conclusion
References

Keywords

text mining, topic model, latent Dirichlet allocation (LDA), Python, Gensim

Author

金曘泳 [ 김유영 | Associate Professor, Department of Japanese, Dongduk Women's University ]


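Publication information

Publisher

Name: 한국일본언어문화학회 [Japanese Language & Culture Association of Korea]
Founded: 2001
Field: Humanities > Japanese Language and Literature
About: The association covers research on Japanese linguistics and Japanese literature as well as Japanese studies in general, including Japan's politics, economy, culture, and society, along with comparative research with Korea through the medium of Japan's language and culture. Its main goals are to give members opportunities to present research and exchange information and, further, to establish a sound approach to Japanese studies in Korea.

Journal

Title: 일본언어문화 [Journal of Japanese Language and Culture]
Frequency: Quarterly
pISSN: 1598-9585
Coverage: 2002~2025
Indexing: KCI-listed
Classification: KDC 730, DDC 495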
