Reference-Guided Automatic Mask Generation with SAM and CLIP for Metaverse Content Editing

Oral Session B-1: Vision Applications

Publication

Korean Institute of Next Generation Computing Conference (한국차세대컴퓨팅학회 학술대회)
Volume/Issue (Year)

ICNGC 2025: The 11th International Conference on Next Generation Computing 2025 (December 2025)
Pages

pp.35-38
Authors

Xingshi Gan, Wenqi Zhang, L. Minh Dang, Yue Zhang, Yanan Wang, Seongwook Lee, Hyeonjoon Moon
Language

English (ENG)
URL

https://www.earticle.net/Article/A478454

Full-Text Information

Abstract

English: We introduce a reference-guided, fully automatic mask generation framework that does not rely on textual prompts or manual annotations. The approach first uses the Segment Anything Model (SAM) with automatic mask generation (AMG) to produce multiple mask candidates. Each candidate is then scored against the reference image in the CLIP semantic space. A robust Top-K selection with prior reweighting favors plausible regions and suppresses masks that are small, off-center, or have abnormal aspect ratios. Finally, morphological closing and Gaussian feathering yield refined hard and soft masks that can be consumed directly by inpainting or blending modules. Experiments on a COCO subset and our in-house images show strong performance on segmentation metrics (IoU, Dice) and perceptual measures (FID, LPIPS, CLIP-Score), while avoiding the cost of manual masks. This enables streamlined asset preparation for metaverse content creation, immersive AR/VR scenes, and large-scale digital twins, where zero-interaction mask generation is crucial.
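The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual prior or its parameters, so everything below is an assumption made for illustration: the `Candidate` fields, the `prior_weight` formula, and the thresholds `min_area_frac`, `max_aspect`, and `sigma` are hypothetical, and the similarity scores are assumed to have been computed beforehand (for example as CLIP cosine similarities). The sketch only shows how a multiplicative prior could down-weight small, off-center, or elongated candidates before a Top-K cut.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One mask candidate, summarized by its bounding box (hypothetical layout)."""
    x: float     # left edge of the bounding box, in pixels
    y: float     # top edge of the bounding box, in pixels
    w: float     # box width, assumed > 0
    h: float     # box height, assumed > 0
    area: float  # number of pixels inside the mask
    sim: float   # similarity to the reference image, assumed precomputed


def prior_weight(c, img_w, img_h, min_area_frac=0.01, max_aspect=4.0, sigma=0.35):
    """Assumed prior in [0, 1]; not the paper's formula."""
    # Size term: grows linearly until the mask covers min_area_frac of the image.
    size_term = min(1.0, (c.area / (img_w * img_h)) / min_area_frac)

    # Center term: Gaussian falloff in the normalized distance between
    # the box center and the image center.
    dx = (c.x + c.w / 2) / img_w - 0.5
    dy = (c.y + c.h / 2) / img_h - 0.5
    center_term = math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2))

    # Aspect term: 1 up to max_aspect, then inversely proportional to the ratio.
    aspect = max(c.w / c.h, c.h / c.w)
    aspect_term = 1.0 if aspect <= max_aspect else max_aspect / aspect

    return size_term * center_term * aspect_term


def select_top_k(cands, img_w, img_h, k=3):
    """Rank candidates by (clamped similarity x prior) and keep the k best."""
    scored = [(max(c.sim, 0.0) * prior_weight(c, img_w, img_h), c) for c in cands]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [c for _, c in scored[:k]]


if __name__ == "__main__":
    cands = [
        Candidate(200, 150, 240, 180, 30000, 0.80),  # large, centered region
        Candidate(5, 5, 20, 20, 300, 0.90),          # tiny patch in a corner
        Candidate(0, 230, 640, 20, 12000, 0.85),     # thin horizontal strip
    ]
    for c in select_top_k(cands, 640, 480, k=2):
        print(c)
```

In this toy input the tiny corner patch has the highest raw similarity, but the assumed prior ranks the large centered region first. In a full pipeline the selected masks would then be merged and refined; that step is not shown here.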

Abstract
I. INTRODUCTION
II. RELATED WORK
III. METHODOLOGIES
A. Overall Framework
B. Candidate Mask Generation (SAM-AMG)
C. Mask–Reference Similarity (CLIP Matching)
D. Top-K Robust Selection and Union
E. Prior Reweighting
F. Methodologies
G. Extensions and Robustness
IV. EXPERIMENTS
A. Evaluation Setup
B. Quantitative Results
C. Qualitative Results
V. CONCLUSION
Acknowledgment
REFERENCES

Authors

Xingshi Gan [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]
Wenqi Zhang [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]
L. Minh Dang [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]
Yue Zhang [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]
Yanan Wang [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]
Seongwook Lee [ Department of Artificial Intelligence Data Science, Sejong University, Seoul, Republic of Korea ]
Hyeonjoon Moon [ Department of Computer Science & Engineering, Sejong University, Seoul, Republic of Korea ]

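Publication Information

Publication

Korean Institute of Next Generation Computing Conference (한국차세대컴퓨팅학회 학술대회)
Frequency
Semiannual
Coverage
2021~2025
Decimal classification
KDC 566 DDC 004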
