The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3 :: Korean Institute of Next Generation Computing

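WebSocket Communication Technique Between Edge Nodes and Digital Twin Simulation for Real-Time Large-Scale Data Processing in a Connected Car and Distributed Edge Environment

이충렬, 신병석, 이연

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.1-11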

Research on WebSocket communication in edge and connected car environments is actively under way to process the real-time data generated by connected cars. However, because WebSocket connections are persistent and fixed, WebSocket has structural limitations when building a distributed edge environment for large-scale data processing. Previous studies used session sharing or Apache Kafka communication, but these approaches require additional resources and raise dependency and scalability problems. This study proposes a new WebSocket-based data communication technique designed to provide real-time performance, scalability, and low resource usage. The technique was applied to a Digital Twin model, and its efficiency was verified through vehicle collision event processing.

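A Study on a Multimodal Human Activity Recognition Deep Learning Model Using Body Landmarks Extracted with MediaPipe and IMU Data

김예은, 최재용

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.12-21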

Human Activity Recognition (HAR) using deep learning is an active research area. Various AI techniques detect body landmarks in video and estimate a person's pose in order to predict the activity. However, HAR that relies on image information alone loses accuracy under disturbances such as camera position, lighting, and occlusion. To raise the recognition rate, this paper proposes a multimodal system that combines RGB-D camera based landmark extraction with an IMU. Ten different types of motion were collected with the IMU sensor and the RGB-D camera. 3-axis acceleration data were extracted from the IMU sensor, and the MediaPipe framework was used to extract, frame by frame, the 3-axis position coordinates of 33 body landmarks from the video, so that all motion data were unified as time series. The video-derived data and the IMU acceleration data were integrated into a single time series, and techniques such as sliding windows and linear interpolation were used to augment the amount of data. The resulting data were used to train the 1D-CNN and LSTM networks newly proposed in this study. Classifier performance was also checked when limiting the number of landmarks used for training according to the characteristics of the motion, and when training on IMU data, landmark data, and integrated data separately. As a result, training on motion data that integrates landmarks and IMU sensor data gave better classification performance than using features obtained from the camera only.
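The data preparation described in the abstract above (bringing both streams to the same length, merging them per frame, then cutting sliding windows) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the choice to concatenate 99 landmark values with 3 acceleration values per frame, and the use of linear interpolation for length alignment are all assumptions made for the example.

```python
from typing import List, Sequence

Frame = List[float]


def resample_linear(frames: Sequence[Sequence[float]], target_len: int) -> List[Frame]:
    """Linearly interpolate a sequence of feature vectors to target_len frames."""
    if target_len < 1 or not frames:
        raise ValueError("need at least one input frame and target_len >= 1")
    if len(frames) == 1 or target_len == 1:
        return [list(frames[0]) for _ in range(target_len)]
    scale = (len(frames) - 1) / (target_len - 1)
    out: List[Frame] = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        t = pos - lo
        out.append([(1 - t) * a + t * b for a, b in zip(frames[lo], frames[hi])])
    return out


def fuse(landmarks: Sequence[Sequence[float]], imu: Sequence[Sequence[float]]) -> List[Frame]:
    """Concatenate per-frame landmark and IMU features (assumed fusion scheme)."""
    if len(landmarks) != len(imu):
        raise ValueError("streams must have the same number of frames")
    return [list(l) + list(a) for l, a in zip(landmarks, imu)]


def sliding_windows(frames: Sequence[Frame], size: int, stride: int) -> List[List[Frame]]:
    """Cut overlapping fixed-length windows out of one recording."""
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be positive")
    return [list(frames[i:i + size]) for i in range(0, len(frames) - size + 1, stride)]
```

With 33 landmarks of 3 coordinates each and a 3-axis accelerometer, each fused frame in this sketch has 102 values; overlapping windows then multiply the number of training samples obtained from one recording.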

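Data Augmentation Techniques for Improving YOLO Performance

이준기, 장민호, 황영배

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.22-35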

Computer vision has achieved strong results in many fields thanks to advances in models such as CNNs and Transformers. However, training these models requires diverse and abundant data, and obtaining such data takes considerable time and effort. This high cost leads to data scarcity and data imbalance, which data augmentation can help address. This paper studies Copy-Paste data augmentation for object recognition models. Previous work pastes instance-segmented objects or pastes objects according to visual context. We found, however, that model performance also improves with a simpler approach that does not use instance segmentation: pasting a bounding box as-is, at the same size, onto an existing object location or at a random location. We also propose a method that uses SAM (Segment Anything Model) to extract object instances and paste them. Additional experiments apply data augmentation to the pasted objects themselves. To prevent existing objects from being occluded by pasted objects, we also apply a method that pastes the objects and then overwrites them with the objects already in the image. Training the object recognition model YOLO (You Only Look Once) v5 with the proposed augmentation gave higher performance than training on the Pascal VOC12 dataset alone.
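The simplest variant mentioned in the abstract above, pasting a bounding-box crop unchanged into another image, can be sketched as below. This is an illustrative example only: images are plain nested lists, the function name and the `(x0, y0, x1, y1)` box convention with exclusive upper bounds are assumptions, and the SAM-based and occlusion-handling variants are not shown.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive (assumed)


def paste_box(src: Image, box: Box, dst: Image, top_left: Tuple[int, int]) -> Tuple[Image, Box]:
    """Copy the pixels inside `box` from `src` into a copy of `dst`.

    Returns the augmented image and the bounding box of the pasted object,
    which would be appended to the target image's label list.
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    px, py = top_left
    if w <= 0 or h <= 0:
        raise ValueError("box must have positive size")
    if px < 0 or py < 0 or py + h > len(dst) or px + w > len(dst[0]):
        raise ValueError("pasted box does not fit inside the target image")
    out = [row[:] for row in dst]  # leave the original target untouched
    for dy in range(h):
        out[py + dy][px:px + w] = src[y0 + dy][x0:x1]
    return out, (px, py, px + w, py + h)
```

Choosing `top_left` from an existing object's location or at random corresponds to the two placement strategies named in the abstract.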

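A Bit-Level Parameter Manipulation Framework for DNNs and Analysis of the Correlation Between Parameters and Accuracy

이동인, 김정헌, 임승호

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.36-46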

With the recent spread of DNNs across industries, research on lightweight models suited to IoT devices and edge computing has surged. This paper develops an automated framework, not previously available, that can manipulate the parameters of a deep learning model at the level of a single bit, and uses it to study experimentally the relationship between parameter bits and model accuracy. Using the framework, we took CNN models pre-trained on the ImageNet dataset and replaced the lower n bits of their parameters with 0, 1, or random values, three methods that each induce information loss, and tested the robustness of accuracy to the parameters bit by bit. The models evaluated were InceptionV3, InceptionResnetV2, ResNet50, Xception, DenseNet121, MobileNetV1, and MobileNetV2. The results show that lower-performing models are more sensitive to information loss in the lower bits, so the number of bits over which accuracy is maintained is smaller than for higher-performing models, and that robustness between parameters and accuracy is high. Based on these experiments, an effective number of parameter bits can be set per model to reduce parameter size while maintaining accuracy.
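The three replacement modes described in the abstract above (lower n bits set to 0, to 1, or to random values) can be illustrated on a single parameter. The sketch below assumes parameters are stored as IEEE 754 32-bit floats, which the abstract does not state; the function names are hypothetical and this is not the authors' framework.

```python
import random
import struct
from typing import Optional


def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as a float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def replace_low_bits(x: float, n: int, mode: str,
                     rng: Optional[random.Random] = None) -> float:
    """Overwrite the lowest n bits of a float32 parameter.

    mode is "zero", "one", or "random", matching the three methods in the abstract.
    """
    if not 0 <= n <= 32:
        raise ValueError("n must be between 0 and 32")
    mask = (1 << n) - 1
    kept = float_to_bits(x) & ~mask & 0xFFFFFFFF  # upper bits stay unchanged
    if mode == "zero":
        fill = 0
    elif mode == "one":
        fill = mask
    elif mode == "random":
        fill = (rng or random).getrandbits(n) if n else 0
    else:
        raise ValueError("mode must be 'zero', 'one', or 'random'")
    return bits_to_float(kept | fill)
```

In a float32 the lowest 23 bits are the mantissa, so small n only perturbs precision, while larger n starts to reach the exponent and sign bits. Sweeping n per model while tracking accuracy is the kind of experiment the abstract describes.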

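A Drowsiness Detection System Using Multiple Behavioral Features of Drivers Based on SERN

김민준, 김원열, 최규호

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.47-55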

Driver drowsiness is one of the major causes of traffic accidents, and drowsiness recognition is being actively studied to prevent such accidents. Existing systems recognize drowsiness from the driver's physical features, but they are limited when body parts are occluded. This paper proposes a drowsiness recognition system based on SERN (Squeeze and Excitation Resnet Network) that uses multiple behavioral features of the driver. The system consists of a feature extraction process based on multiple behavioral features, a hierarchical label refinement process for the data, and a drowsiness recognition process using the SERN model. In experiments on the public NTHU-DDD database, the proposed SERN-based driver drowsiness detection outperformed the existing network model by 1.03% in accuracy.

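Software Design for Quantitative Analysis of the Geometric Characteristics of Vessels at Cerebral Aneurysm Sites in 3D MR Angiography Images

오정민, 김윤철

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.56-65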

The geometric structure of cerebral blood vessels has been reported to be associated with the occurrence of cerebral aneurysms. Although there have been attempts to evaluate aneurysm risk from the morphology of the cerebral vasculature, most studies have analyzed 2D X-ray angiography images. Because no publicly available software supports segment-wise analysis of 3D cerebrovascular images, this paper proposes a software design for assessing the risk of cerebral aneurysms from 3D magnetic resonance angiography data. A graphical user interface was developed in Python using PyQt, and Plotly was used for interactive marking of points along the 3D vessel centerlines, including bifurcations, for use in the analysis. The angle between two vessels at a bifurcation is measured and compared with values from ImageJ. For aneurysms arising at the middle of the internal carotid artery (ICA), tortuosity-related features are extracted and compared with data from patients without aneurysms. The proposed software could be useful for quantitatively analyzing the geometric characteristics of an individual patient's cerebral vessels in aneurysm research.
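Two of the measurements named in the abstract above can be sketched from marked centerline points. The example below is an assumption-laden illustration, not the proposed software: the bifurcation angle is taken as the angle between the vectors from the branch point to one marked point on each vessel, and tortuosity is taken as path length divided by straight-line distance, a common definition that the abstract does not confirm.

```python
import math
from typing import Sequence

Point = Sequence[float]


def bifurcation_angle(branch_point: Point, p1: Point, p2: Point) -> float:
    """Angle in degrees between the two vessels leaving a bifurcation point."""
    u = [a - b for a, b in zip(p1, branch_point)]
    v = [a - b for a, b in zip(p2, branch_point)]
    norm_u, norm_v = math.hypot(*u), math.hypot(*v)
    if norm_u == 0 or norm_v == 0:
        raise ValueError("marked points must differ from the branch point")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def tortuosity_index(centerline: Sequence[Point]) -> float:
    """Path length along the centerline divided by the end-to-end distance."""
    if len(centerline) < 2:
        raise ValueError("need at least two centerline points")
    path = sum(math.dist(a, b) for a, b in zip(centerline, centerline[1:]))
    chord = math.dist(centerline[0], centerline[-1])
    if chord == 0:
        raise ValueError("centerline endpoints coincide")
    return path / chord
```

A perfectly straight segment gives a tortuosity index of 1, and larger values indicate a more winding vessel.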

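Efficient Redundant-Path-Based Data Transmission Control Using Virtual Infrastructure in Mobile Multi-Hop IoT Networks

노원종

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.66-75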

This paper proposes SCORE (synchronization and core node based redundant routing), a new routing method for large-scale mobile ad hoc networks. First, we propose the concepts of core node, synchronization, and virtual infrastructure. Second, to set the optimal core node path, we formulate an Integer Programming (IP) problem and solve it using Lagrangian relaxation. Third, we present a method for establishing stable local redundant paths between core nodes based on node movement synchronization. Finally, simulations confirm that the proposed method outperforms existing transmission control methods in terms of average throughput, end-to-end delay, and the number of re-routing requests.

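Real-Time Visualization of Deformable Volumes Using NeRF

구현우, 이은우, 김수빈, 조세홍, 계희원

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.76-87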

This study proposes a virtual medical planning method that can quickly and simply predict outcomes after a medical procedure. The method generates volume data from recorded video using Neural Radiance Fields (NeRF) and, through spatial separation, divides the coordinate system into a real space and a virtual space. Deformation is visualized using inverse transformations in the virtual space. Users set the size of the region to deform and the deformation strength, so that deformation looks natural across correction areas of various sizes. The Manhattan distance between coordinates is used as a weight on the deformation strength, so that the strength attenuates with distance from the center of the deformation region. Finally, GPU parallelization allows deformation at interactive rates.
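The distance-weighted deformation in the abstract above can be sketched for a single sample point. The abstract only states that the Manhattan distance from the deformation center weights the strength and that an inverse transformation is used; the linear falloff, the `radius` and `strength` parameters, and the function names below are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def falloff_weight(p: Sequence[float], center: Sequence[float], radius: float) -> float:
    """Weight in [0, 1] that decays linearly with Manhattan distance (assumed falloff)."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    manhattan = sum(abs(a - b) for a, b in zip(p, center))
    return max(0.0, 1.0 - manhattan / radius)


def inverse_warp_coord(p: Sequence[float], center: Sequence[float], radius: float,
                       displacement: Sequence[float], strength: float) -> Tuple[float, ...]:
    """Map a point in the deformed space back to the coordinate to sample in the source volume.

    Points outside the deformation region (weight 0) map to themselves.
    """
    w = falloff_weight(p, center, radius)
    return tuple(pi - strength * w * di for pi, di in zip(p, displacement))
```

Because each output sample is computed independently of the others, this kind of inverse mapping parallelizes naturally, which is consistent with the GPU implementation mentioned in the abstract.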

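Medical Image Representation Using a Subsurface Scattering Technique Optimized for Volume Data

정예은, 윤상연, 신병석

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.20 No.3, 2024.06, pp.88-99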

Medical applications require high image quality because human tissue must be analyzed precisely. To this end, volume data are visualized with global illumination effects so that tissue materials are rendered more realistically. Because most human organs are translucent, subsurface scattering is a suitable technique. Algorithms used in mesh-based rendering are difficult to apply directly to volume data, so a subsurface scattering method optimized for volume data is needed. In volume rendering, where no explicit surface is defined as in a geometric model, surface curvature and thickness are difficult to compute. This paper proposes curvature and thickness estimation methods optimized for volume data. Curvature is estimated from the differences in position and normal vector between voxels, which models scattering more precisely. Thickness is estimated by accumulating voxel density values, which efficiently represents the translucency of thin tissue and enables realistic images that account for subsurface scattering.
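The two estimates named in the abstract above can be sketched with simple finite differences. This is a hedged illustration rather than the paper's method: curvature is approximated here as the change in unit normal divided by the change in position between two neighboring voxels, and thickness as the sum of density samples along a ray multiplied by the step size. Both formulas and the function names are assumptions.

```python
import math
from typing import Sequence


def estimate_curvature(p0: Sequence[float], n0: Sequence[float],
                       p1: Sequence[float], n1: Sequence[float]) -> float:
    """Approximate curvature as |n1 - n0| / |p1 - p0| for two neighboring samples.

    n0 and n1 are expected to be unit normals (for example, normalized gradients).
    """
    dp = math.dist(p0, p1)
    if dp == 0:
        raise ValueError("sample positions must differ")
    return math.dist(n0, n1) / dp


def accumulate_thickness(densities: Sequence[float], step: float) -> float:
    """Approximate thickness by accumulating density samples along a ray."""
    if step <= 0:
        raise ValueError("step must be positive")
    return sum(densities) * step
```

As a sanity check on the curvature estimate, two points on a sphere of radius R with outward unit normals give exactly 1/R, the curvature of the sphere.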

The Journal of Korean Institute of Next Generation Computing