Recent object detection research has increasingly focused on Detection Transformers, which predict bounding boxes directly. However, Detection Transformers suffer from slow convergence and have difficulty detecting small objects. We attribute these issues to the insufficient feature extraction capability of the backbone. We therefore apply a high-performing backbone, the Pyramid Pooling Transformer (P2T), to the Detection Transformer (DETR). However, we observe that, despite rapid initial convergence, the model fails to converge effectively beyond a certain point in training. In this study, we discuss the underlying causes of this issue.
Table of Contents
Abstract
I. INTRODUCTION
II. RELATED WORKS
  A. P2T
  B. DETR
III. METHOD AND EXPERIMENTS
  A. Method
  B. Dataset
  C. Evaluation Metrics
  D. Experimental Results
IV. DISCUSSION
ACKNOWLEDGMENT
REFERENCES
Authors
Chan-Young Choi [ School of Computing, Gachon University ]
Sung-Yoon Ahn [ School of Computing, Gachon University ]
Sang-Woong Lee [ School of Computing, Gachon University ]
Corresponding Author