A Study on Context Filtering for Efficient Inference Acceleration of LLM (LLM의 효율적인 추론 가속을 위한 Context Filtering 기법 연구)

Hyun-seok Chung; Sun-young Hyun; Yoon-Seok Choi; Chang-eun Lee; Young-guk Ha


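Publisher: 국제차세대융합기술학회 (International Next-generation Convergence technology Association)
Journal: 차세대융합기술학회논문지 (The Journal of Next-generation Convergence Technology Association), KCI-indexed
Issue: Vol. 9, No. 11 (November 2025)
Pages: pp. 2823-2830
Language: Korean (KOR)
URL: https://www.earticle.net/Article/A476296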

Abstract

Since the advent of ChatGPT, research on applying large language models (LLMs) to various fields has been active. Representative methods for improving the reasoning performance of LLMs include Chain-of-Thought and Retrieval-Augmented Generation (RAG). However, these methods lengthen the input prompt, which increases inference time and cost, and unnecessary context can cause hallucination. To address this problem, this paper proposes Context Filtering, a technique that compresses the context of a prompt by keeping only the information relevant to the query. Context Filtering selects context by its temporal (time-series) and semantic relevance to the query, so that the LLM performs inference using only the necessary information as input. The experiments were conducted on large-scale time-series data in a Timestamp + Triplet structure, generated from a battlefield simulation for a situational awareness scenario. The results confirm that a compressed prompt using only about 25% of the full context maintained situational awareness accuracy similar to that of the original prompt, while inference time and token usage were also reduced.
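The two-stage idea in the abstract (filter by time, then by meaning, then build a shorter prompt) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the page does not describe how either filter is computed, so every name here (`Record`, `temporal_filter`, `semantic_filter`, `token_overlap`, `build_prompt`, `window`, `top_k`) and the sample data are hypothetical, and plain word overlap (Jaccard similarity) stands in for whatever semantic relevance measure the paper actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One context entry: a timestamp plus a (subject, relation, object) triplet."""
    timestamp: int  # assumed unit: seconds since scenario start
    subject: str
    relation: str
    obj: str

    def text(self) -> str:
        return f"{self.subject} {self.relation} {self.obj}"


def temporal_filter(records, query_time, window):
    """Keep records whose timestamp lies within `window` before `query_time`."""
    return [r for r in records if query_time - window <= r.timestamp <= query_time]


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets; a stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


def semantic_filter(records, query, top_k):
    """Keep the `top_k` records most similar to the query, returned in time order."""
    ranked = sorted(records, key=lambda r: token_overlap(r.text(), query), reverse=True)
    return sorted(ranked[:top_k], key=lambda r: r.timestamp)


def build_prompt(records, query):
    """Assemble the compressed prompt from the records that survived both filters."""
    lines = [f"[{r.timestamp}] ({r.subject}, {r.relation}, {r.obj})" for r in records]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


# Hypothetical sample data, invented for illustration only.
context = [
    Record(10, "unit_a", "moves_to", "hill_1"),
    Record(20, "unit_b", "detects", "vehicle_x"),
    Record(95, "unit_a", "reports", "fuel_low"),
    Record(100, "unit_b", "engages", "vehicle_x"),
]
query = "what is unit_b doing to vehicle_x"

recent = temporal_filter(context, query_time=100, window=30)
selected = semantic_filter(recent, query, top_k=1)
prompt = build_prompt(selected, query)
```

In this toy run, one of the four records reaches the prompt. Running the cheaper temporal filter first shrinks the set that the semantic scoring has to rank, which is one plausible reason for ordering the stages this way; the ordering in the paper itself is not stated on this page.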

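Table of Contents

Abstract (Korean)
Abstract (English)
Ⅰ. Introduction
Ⅱ. Related Work
2.1 LLM
2.2 Prompt Compression Techniques
2.3 Time-series Data Inference
Ⅲ. Context Filtering Technique
3.1 Overview of Context Filtering
3.2 Temporal Filtering
3.3 Semantic Filtering
3.4 Final Prompt
Ⅳ. Experiments and Analysis
4.1 Dataset
4.2 Experiments
4.3 Results and Analysis
Ⅴ. Conclusion
REFERENCES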

Keywords

Large Language Model (LLM), Prompt Compression, Context Filtering, Retrieval-Augmented Generation (RAG), Time-series Inference

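Authors

Hyun-seok Chung (정현석) [ Researcher, 스마트랩스 ]
Sun-young Hyun (현선영) [ Researcher, 스마트랩스 ]
Yoon-Seok Choi (최윤석) [ Principal Researcher, Electronics and Telecommunications Research Institute (한국전자통신연구원) ]
Chang-eun Lee (이창은) [ Principal Researcher, Electronics and Telecommunications Research Institute (한국전자통신연구원) ]
Young-guk Ha (하영국) [ Professor, Department of Computer Engineering, Konkuk University (건국대학교 컴퓨터공학부) ] Corresponding Author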


Publication Information

Publisher

Name: 국제차세대융합기술학회 [International Next-generation Convergence technology Association]
Established: 2017
Field: Interdisciplinary Studies > Technology Policy
About: Ever since next generation convergence technology became one of the most important industries in the nation, computing professionals have encountered a growing number of challenges. Along with scholars and colleagues in related fields, they have gathered in a variety of forums and meetings over the last few decades to share their knowledge, experiences, and the outcomes of their research. These exchanges led to the founding of the International Next-generation Convergence technology Association (INCA) on December 1, 2015. INCA was registered as an incorporated association under the Ministry of Information and Communications. The main purpose of the organization is to improve our society by achieving the highest capability possible in next generation convergence technology.

Journal

Title: 차세대융합기술학회논문지 [The Journal of Next-generation Convergence Technology Association]
Frequency: Monthly
pISSN: 2508-8270
Coverage: 2017~2026
Indexing: KCI-indexed
Classification: KDC 506, DDC 606

