Improving Global and Local Feature Extraction with Swin Transformer on Monocular Depth Estimation

Session Ⅰ: Computer Vision and Image Analysis

Publication

Korean Institute of Next Generation Computing Conference
Volume/Issue (Year)

The 9th International Conference on Next Generation Computing 2023 (2023.12)
Pages

pp.55-56
Authors

Yun-Young Chang, Joo-Hee Oh, Abrar Alabdulwahab, Chan-Young Choi, Sang-Woong Lee
Language

English (ENG)
URL

https://www.earticle.net/Article/A448116

Abstract (English): The Global-Local Path Network (GLPN) is a monocular depth estimation network that integrates global features from an encoder with local features from a decoder through a Selective Feature Fusion module. In this paper, we replace the SegFormer encoder of GLPN with the Swin Transformer and call the resulting network the Swin Transformer-Global-Local-Path-Network. We train the network on modified NYU Depth V2 datasets. Using the tiny version of Swin Transformer, our network achieves 0.034 RMSE, 0.075 AbsRel, 0.033 log10, 0.951 Delta 1, 0.994 Delta 2, and 0.999 Delta 3, outperforming the previous GLPN model.
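
The abstract reports six evaluation metrics without defining them. The sketch below shows how these metrics are conventionally computed in the monocular depth estimation literature; it assumes the paper follows those conventions, including the usual 1.25^k thresholds for Delta 1 to Delta 3. The function name `depth_metrics` and the flat-list input format are hypothetical choices made for illustration, not taken from the paper.

```python
import math

def depth_metrics(pred, gt):
    """Conventional depth metrics over paired, strictly positive depth values.

    Assumed definitions (not confirmed by the abstract):
      RMSE    = sqrt(mean((pred - gt)^2))
      AbsRel  = mean(|pred - gt| / gt)
      log10   = mean(|log10(pred) - log10(gt)|)
      Delta k = fraction of pixels with max(pred/gt, gt/pred) < 1.25^k
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    n = len(pred)
    pairs = list(zip(pred, gt))

    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    log10 = sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n

    # Threshold accuracy: higher is better, unlike the three error metrics.
    ratios = [max(p / g, g / p) for p, g in pairs]
    deltas = {
        f"delta{k}": sum(r < 1.25 ** k for r in ratios) / n
        for k in (1, 2, 3)
    }
    return {"rmse": rmse, "abs_rel": abs_rel, "log10": log10, **deltas}

# Toy example: two exact predictions and one that is twice the true depth.
metrics = depth_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 2.0])
```

Under these definitions, lower is better for RMSE, AbsRel, and log10, while higher is better for the Delta accuracies. In practice the metrics are computed only over pixels with valid ground-truth depth.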

Source: Naver Academic
