Leveraging Visual Language Models for Information Extraction from Semi-Structured Business Documents

  • Publication
    KMIS Conference (한국경영정보학회 정기 학술대회)
  • Volume/Issue (Year)
    2025 KMIS Fall Conference (2025 한국경영정보학회 추계학술대회) (2025.10)
  • Pages
    pp.81-84
  • Authors
    Bongjin Sohn, Gunwoong Lee
  • Language
    English (ENG)
  • URL
    https://www.earticle.net/Article/A476014

Full-Text Information

Abstract

English
Modern enterprises maintain extensive repositories of business documents within their intranet systems, creating a critical need for automated processing of image-based documents to enhance operational efficiency. Unlike standardized forms, most business documents are semi-structured, with layouts and field positions varying widely across organizations and document types. This complexity has generated substantial demand for advanced information extraction and organization technologies capable of handling irregular structures and diverse schemas. However, conventional Optical Character Recognition (OCR) approaches, which prioritize text recognition, encounter significant limitations when processing complex forms due to their reliance on location-based extraction. Similarly, Key Information Extraction (KIE) techniques often require domain-specific pre-training, resulting in considerable training and adaptation costs for novel document formats. To address these challenges, this study proposes an innovative process for effectively extracting and organizing key elements from semi-structured documents by employing Visual Language Models (VLMs), which process documents as image inputs and concurrently analyze visual and linguistic information. By exploiting both semantic textual content and spatial positioning as visual cues, the proposed framework achieves superior extraction accuracy, economic efficiency, and even user satisfaction. Experimental results demonstrate that the VLM-based framework outperforms existing OCR and KIE solutions across multiple evaluation dimensions, while the integration of human-in-the-loop verification establishes a practical framework for automating semi-structured documents (e.g., commercial invoices) with immediate applicability in fast-changing enterprise environments.

Table of Contents

Abstract
Extended Abstract
References

Authors

  • Bongjin Sohn [ Korea University Business School, Information Systems ]
  • Gunwoong Lee [ Korea University Business School, Information Systems ]

    Publication Information

    • Publication
      한국경영정보학회 정기 학술대회 [KMIS Conference]
    • Frequency
      Semiannual
    • Coverage period
      1990~2025
    • Decimal classification
      KDC 325 DDC 658