Comparing Embedding-Based Approaches for Complex Emotion Detection in Online Comments:

Oral Session B-3: Biomedical Applications

Publication

Korean Institute of Next Generation Computing Conference
Volume/Issue (Publication Year)

ICNGC 2025 The 11th International Conference on Next Generation Computing 2025 (2025.12)
Pages

pp.284-286
Authors

Jiwon Kim, Eungkyo Suh
Language

English (ENG)
URL

https://www.earticle.net/Article/A478514

English: This study compares two embedding-based natural language processing techniques, Sentence-BERT (SBERT) combined with HDBSCAN clustering and BERTopic modeling, for detecting complex emotions in short Korean online comments. Using 33,531 comments collected from a YouTube relationship counseling channel, we examined how each method captures nuanced and overlapping sentiments such as affection, avoidance, and conflict. Both models used identical SBERT embeddings and UMAP-based dimensionality reduction, and their clustering performance was evaluated quantitatively using the Silhouette Score, the Davies–Bouldin Index (DBI), and the Calinski–Harabasz Index (CHI). BERTopic achieved higher coherence and clearer topic boundaries (Silhouette = 0.40, DBI = 0.85, CHI = 15,157) than SBERT–HDBSCAN (Silhouette = -0.23, DBI = 1.49, CHI = 1,230). Although both methods yielded high noise ratios because of the leaf-based density clustering, BERTopic reassigned semantically relevant comments through its class-based TF-IDF (c-TF-IDF) weighting, improving topic stability and interpretability. These findings suggest that BERTopic is better suited to analyzing short, emotion-rich Korean text and offer methodological insight for future sentiment analysis research.
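To make the reported Silhouette values easier to read, the following is a minimal pure-Python sketch of the silhouette coefficient, not the authors' evaluation code. It assumes that noise points (label -1, as HDBSCAN-style clusterers assign) are dropped before scoring; the abstract does not say how noise was handled. The function name `silhouette` and the toy points are hypothetical. In practice, scikit-learn's `silhouette_score`, `davies_bouldin_score`, and `calinski_harabasz_score` compute the three indices named above.

```python
from math import dist
from statistics import mean

NOISE = -1  # label that HDBSCAN-style clusterers give to unassigned points


def silhouette(points, labels):
    """Mean silhouette coefficient over non-noise points.

    Assumption: points labelled NOISE are excluded before scoring.
    """
    kept = [(p, l) for p, l in zip(points, labels) if l != NOISE]
    clusters = {}
    for p, l in kept:
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, l in kept:
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance from p to the other members of its own cluster
        # (the sum includes dist(p, p) == 0, hence the len(own) - 1 divisor)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance from p to the members of the nearest other cluster
        b = min(
            mean(dist(p, q) for q in members)
            for other, members in clusters.items()
            if other != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Toy 2-D data: two tight pairs plus one point treated as noise.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 5.0)]
good = [0, 0, 1, 1, NOISE]  # each tight pair forms its own cluster
bad = [0, 1, 0, 1, NOISE]   # each cluster spans both pairs

print(round(silhouette(points, good), 2))
print(round(silhouette(points, bad), 2))
```

On this toy data the well-separated labelling scores roughly 0.90 and the mixed labelling roughly -0.45. A negative mean, such as the -0.23 reported for SBERT–HDBSCAN, indicates that points sit closer on average to a neighbouring cluster than to their own.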

Jiwon Kim [ Data and Knowledge Service Engineering, Dankook University, Gyeonggi-do, South Korea ]
Eungkyo Suh [ Data and Knowledge Service Engineering, Dankook University, Gyeonggi-do, South Korea ]

Data provided by: Naver Academic Information