The 9th International Conference on Next Generation Computing 2023 (2023.12)
Pages
pp.229-232
Authors
Beomjo Kim, Sangjin Ahn, Kyung-Ah Sohn
Language
English (ENG)
URL
https://www.earticle.net/Article/A448156
Full-Text Information
Abstract
English
This work presents a novel fine-tuning scheme for improving the quality of subject-driven image generation. Motivated by recent work on fine-tuning pre-trained diffusion models, the proposed method extracts information from visual patch embeddings to optimize the performance of the image encoder. In addition, the loss function of the conventional U-Net model is replaced with a masked diffusion loss. At inference time, the model can control the degree of similarity between the generated image and the reference image by using classifier-free guidance. Experimental results indicate that the proposed model achieves better image generation quality than previous schemes.
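The abstract names two techniques without giving formulas: a masked diffusion loss and classifier-free guidance. The sketch below illustrates the commonly used textbook form of each, not the authors' implementation. It assumes the loss is a mean squared error on predicted noise restricted to a binary subject mask, and that guidance linearly extrapolates from the unconditional to the conditional noise prediction. The function names, the flat-list representation, and the `scale` parameter are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def masked_diffusion_loss(
    pred_noise: Sequence[float],
    true_noise: Sequence[float],
    mask: Sequence[float],
) -> float:
    """Mean squared noise-prediction error over masked positions only.

    Assumption: `mask` is 1 inside the subject region and 0 elsewhere,
    and the loss is normalized by the number of masked positions.
    """
    if not (len(pred_noise) == len(true_noise) == len(mask)):
        raise ValueError("inputs must have the same length")
    weight = sum(mask)
    if weight == 0:
        # No masked positions: nothing contributes to the loss.
        return 0.0
    total = sum(
        m * (p - t) ** 2 for p, t, m in zip(pred_noise, true_noise, mask)
    )
    return total / weight


def classifier_free_guidance(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    scale: float,
) -> List[float]:
    """Combine unconditional and conditional noise predictions.

    scale = 0 returns the unconditional prediction, scale = 1 returns
    the conditional one, and larger values push the result further
    toward the condition (here, the reference image).
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("inputs must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

Under these assumptions, raising `scale` at inference time is what would increase the similarity between the generated image and the reference image, while the mask keeps background pixels from contributing to the training objective.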
Table of Contents
Abstract
I. INTRODUCTION
II. RELATED WORK
III. METHOD
  A. Image Encoder
  B. Model Training Strategy
IV. EXPERIMENT
V. RESULTS AND DISCUSSION
VI. CONCLUSION
ACKNOWLEDGMENT
REFERENCES