A Study on Improving B2B Recommendation Performance between Korean Startups and Overseas Buyers Using a Hierarchical Image-Text Multimodal Model
This study addresses the 'semantic interference' problem that arises in B2B recommender systems when more than 17,000 product categories are processed within a single embedding space. We propose a two-stage Hierarchical Multimodal Recommendation Model (Hierarchical Model). In the first stage, the model predicts one of 238 mid-level categories; in the second stage, it performs fine-grained item recommendation over the reduced candidate pool within the predicted category. In comparative experiments against a single-stage model (Baseline Model) built on the same CLIP-based architecture, the Hierarchical Model achieved a consistent and significant improvement of 4.5 to 5.0 percentage points on average across all key metrics, including Precision@K, MAP@K, and NDCG@K. The paper further shows empirically that this gain is not merely a result of search-space reduction but is supported by three theoretical justifications: (1) 'semantic disambiguation' through the specialization of embedding spaces, (2) the efficiency of a 'learned cascade' structure, and (3) the mitigation of data sparsity via the 'information bottleneck' principle.
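The two-stage procedure summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the embeddings are toy vectors rather than CLIP outputs, stage 1 is approximated by nearest-centroid matching instead of a trained category classifier, and all function and variable names (`predict_category`, `recommend_within_category`, `hierarchical_recommend`) are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def predict_category(query_vec, category_centroids):
    """Stage 1 (assumed stand-in): choose the category with the closest centroid."""
    scores = category_centroids @ query_vec
    return int(np.argmax(scores))


def recommend_within_category(query_vec, item_vecs, item_categories, category, k):
    """Stage 2: rank only the items that belong to the predicted category."""
    candidate_ids = np.flatnonzero(item_categories == category)
    scores = item_vecs[candidate_ids] @ query_vec
    top = np.argsort(-scores)[:k]
    return candidate_ids[top].tolist()


def hierarchical_recommend(query_vec, category_centroids, item_vecs, item_categories, k=5):
    """Run both stages and return (predicted category, top-k item indices)."""
    category = predict_category(query_vec, category_centroids)
    items = recommend_within_category(query_vec, item_vecs, item_categories, category, k)
    return category, items


# Toy data: five items in three categories (the paper uses 238 mid-level categories).
item_vecs = l2_normalize(np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0],
]))
item_categories = np.array([0, 0, 1, 1, 2])

# One centroid per category, computed as the normalized mean of its item vectors.
category_centroids = l2_normalize(np.stack([
    item_vecs[item_categories == c].mean(axis=0) for c in range(3)
]))

query_vec = l2_normalize(np.array([0.1, 1.0, 0.0]))
category, items = hierarchical_recommend(
    query_vec, category_centroids, item_vecs, item_categories, k=2
)
print(category, items)
```

Because stage 2 scores only the candidates in the predicted category, a wrong stage-1 prediction cannot be recovered in this sketch; whether the actual model passes more than one category forward is not stated in the abstract.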
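Table of Contents
Abstract
1. Introduction
2. Related Work
2.1 Recommender Systems in B2B Environments
2.2 Multimodal Recommendation Models
2.3 Category-Based Recommendation Using Hierarchical Structures
3. Methodology
3.1 Research Overview
3.2 Data Preprocessing
3.3 Design of the Hierarchical Multimodal Recommendation Model
4. Experiments and Results
4.1 Experimental Setup
4.2 Model Performance Evaluation
5. Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References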