

Challenges in Implementing Vision Transformer as a Detection Transformer

Article Information

Abstract

English
Recent object detection research has increasingly focused on Detection Transformers, which predict bounding boxes directly. However, Detection Transformers suffer from slow convergence and difficulty in detecting small objects. We attribute these issues to the insufficient feature extraction capability of the backbone. We therefore adopt a high-performing backbone, the Pyramid Pooling Transformer (P2T), as the backbone of a Detection Transformer. We observe, however, that despite rapid initial convergence, the model fails to converge effectively after a certain point in training. In this study, we discuss the underlying causes of this issue.

Table of Contents

Abstract
I. INTRODUCTION
II. RELATED WORKS
A. P2T
B. DETR
III. METHOD AND EXPERIMENTS
A. Method
B. Dataset
C. Evaluation Metrics
D. Experiment Results
IV. DISCUSSION
ACKNOWLEDGMENT
REFERENCES

Authors

  • Chan-Young Choi [ School of Computing, Gachon University ]
  • Sung-Yoon Ahn [ School of Computing, Gachon University ]
  • Sang-Woong Lee [ School of Computing, Gachon University ] Corresponding Author

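    Publication Information

    • Publication
      Korean Institute of Next Generation Computing Conference (한국차세대컴퓨팅학회 학술대회)
    • Frequency
      Semiannual
    • Coverage period
      2021~2025
    • Decimal classification
      KDC 566 DDC 004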