The 8th International Conference on Next Generation Computing 2022 (2022.10)
Pages
pp.118-121
Authors
Yi Ren, Xin He
Language
English (ENG)
URL
https://www.earticle.net/Article/A419754
Full-Text Information
Abstract
English
In this paper, we propose a new framework that enables an object detector trained with only point-level annotations to estimate the centroids and sizes of objects in dense scenes. The framework is based on the Swin Transformer and introduces a custom resolution feature fusion module into its hierarchical structure. Object centroids are estimated directly under point supervision, object pseudo-sizes are initialized under the assumption of a locally uniform distribution, and the regression of object size is guided by an improved congestion-aware loss function. On the NWPU-Crowd dataset, our method outperforms existing state-of-the-art detection-based counting methods in terms of F1-measure, precision, and MSE.
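The abstract does not say how pseudo-sizes are computed from point annotations. The sketch below shows one common reading of a "locally uniform distribution" assumption: if objects are spread evenly in a neighborhood, the distance from a point to its nearest annotated neighbor approximates the object's extent. The function name `init_pseudo_sizes`, the `scale` factor, and the `default_size` fallback are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def init_pseudo_sizes(points, default_size=32.0, scale=1.0):
    """Assumed pseudo-size rule: size = scale * nearest-neighbor distance.

    points: iterable of (x, y) annotated centroids.
    default_size: fallback used when fewer than two points exist (assumption).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if n < 2:
        # No neighbor to measure against, so fall back to a fixed size.
        return np.full(n, default_size)
    # Pairwise Euclidean distances via broadcasting, shape (n, n).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Ignore each point's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    return scale * dist.min(axis=1)

points = [(10, 10), (13, 14), (40, 10)]
sizes = init_pseudo_sizes(points)
```

Under this assumed rule, points in crowded regions receive small initial sizes and isolated points receive large ones. The paper's congestion-aware loss then guides the size regression from that starting point.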
Table of Contents
Abstract
I. INTRODUCTION
II. METHOD
  A. Swin Transformer
  B. Resolution feature fusion module
  C. Congestion-aware loss function
III. EXPERIMENTS
  A. Evaluation Criteria
  B. Dataset
  C. Parameter Setting
  D. Ablation experiments
  E. Experiment results
IV. CONCLUSION
REFERENCES