Synthetic Dataset for Single-View Object Detection and Model Benchmarking
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 ICNGC 2025 The 11th International Conference on Next Generation Computing 2025 2025.12 pp.15-17
Object detection (OD) is a fundamental task in computer vision. However, progress is often hindered by limitations in existing datasets, including human annotation errors, reliance on manual annotation, missing annotations due to occlusion, and domain specificity. To address these challenges, this work proposes an automatically generated synthetic single-view dataset for OD. The dataset was generated in Unity by constructing a 3D virtual city with a single-camera surveillance system, providing diverse perspectives and calibrated viewpoints. Object metadata, including position and dimensions, was automatically extracted and projected into the 2D image plane to generate accurate bounding boxes. Annotations were normalized into YOLO format, with invalid boxes removed, resulting in a single-view dataset that is consistent, precise, and free from manual labeling errors, while still reflecting real-world challenges such as occlusion and object variation. Two versions of the dataset, original and refined, were created to evaluate the effect of bounding box quality on detection performance. An experimental evaluation using the YOLOv11 model demonstrated that the proposed dataset substantially improved detection performance, yielding notable gains in precision, recall, and mean average precision (mAP). These results underscore the importance of accurate dataset curation and highlight the potential of synthetic datasets to advance single-view OD in applications such as surveillance, autonomous systems, and robotics.
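The annotation step described above (pixel-space boxes normalized into YOLO format, with invalid boxes removed) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the clipping rule, and the `min_size` threshold are assumptions introduced here.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def to_yolo(box: Box, img_w: int, img_h: int, min_size: float = 1.0) -> Optional[Box]:
    """Clip a pixel box to the image; return (cx, cy, w, h) in [0, 1], or None if invalid."""
    x_min, y_min, x_max, y_max = box
    # Clip to the image bounds so projected boxes that spill outside the frame are trimmed.
    x_min, x_max = max(0.0, x_min), min(float(img_w), x_max)
    y_min, y_max = max(0.0, y_min), min(float(img_h), y_max)
    w, h = x_max - x_min, y_max - y_min
    # Drop boxes that are empty or smaller than the (hypothetical) size threshold.
    if w < min_size or h < min_size:
        return None
    return ((x_min + w / 2) / img_w, (y_min + h / 2) / img_h, w / img_w, h / img_h)


def yolo_lines(labeled_boxes: Sequence[Tuple[int, Box]], img_w: int, img_h: int) -> List[str]:
    """Format the valid boxes as 'class cx cy w h' lines, one per object."""
    lines = []
    for class_id, box in labeled_boxes:
        yolo = to_yolo(box, img_w, img_h)
        if yolo is not None:
            lines.append(f"{class_id} " + " ".join(f"{v:.6f}" for v in yolo))
    return lines
```

For a 640x480 image, `yolo_lines([(0, (100, 50, 300, 250)), (1, (700, 10, 800, 20))], 640, 480)` keeps the first box and discards the second, which lies entirely outside the frame.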
A GNN-Based Framework for Modeling City-to-City Population Movement in China
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 ICNGC 2025 The 11th International Conference on Next Generation Computing 2025 2025.12 pp.361-364
Managing population movement among cities is important for many sectors, such as urban planning, emergency management, and traffic management. Early approaches, including the radiation and gravity models, provide basic insights but struggle to capture the complex topological and directional nature of inter-city mobility networks. This paper presents a Graph Convolutional Network (GCN)-based framework for modeling and predicting population movement between Chinese cities using Baidu mobility data. Cities are represented as nodes in a directed graph, with weighted edges indicating monthly outbound flows. A multi-layer GCN learns node embeddings that encode both local and global spatial dependencies, enabling the prediction of continuous relationship scores that reflect the intensity of movement. Experimental results, with a MedAPE of 5.59 and a MAPE of 19.36, together with relationship scores for major cities such as Shanghai and Shijiazhuang, demonstrate that the model effectively identifies stable mobility corridors and evolving connections over time. Overall, the proposed approach provides interpretable insights into population mobility dynamics and supports data-driven decision-making in urban forecasting and regional policy design.
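The two error metrics reported above can be computed as in the sketch below. It assumes the standard definitions (mean and median of the absolute percentage error, expressed in percent) and skips zero-valued targets; the abstract does not state either convention, and the function names are introduced here for illustration.

```python
from statistics import mean, median
from typing import List, Sequence


def absolute_percentage_errors(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Per-sample |actual - predicted| / |actual| in percent, skipping zero targets."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted) if a != 0]


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(actual, predicted))


def medape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Median absolute percentage error; less sensitive to a few badly predicted flows."""
    return median(absolute_percentage_errors(actual, predicted))
```

Because the median ignores a small number of large outliers, a MedAPE well below the MAPE, as reported here, is consistent with most flows being predicted closely while a few are predicted poorly.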
Quality of Localization: Bounding Box Precision in MS-COCO vs. MJ-COCO
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 ICNGC 2025 The 11th International Conference on Next Generation Computing 2025 2025.12 pp.11-14
High-quality annotations are crucial for accurate object detection, but widely used datasets like MS-COCO face issues such as missing objects, duplicate labels, and inaccurate bounding boxes. To overcome these problems, MJ-COCO was created through model-driven refinement, increasing annotations from 860,001 to 1,221,970 instances. This paper presents a comparative analysis of MS-COCO and MJ-COCO, with a focus on bounding box accuracy. We designed a human-in-the-loop evaluation framework with custom software that enables side-by-side visualization of annotations, allowing evaluators to classify outcomes as improved, worse, or ambiguous. Fifteen human evaluators collectively assessed 41,754 annotations. The results show that 25,754 annotations were improved, 2,398 were worsened, and 13,623 were ambiguous, for a total quality score of 89.49%. These findings show that MJ-COCO considerably enhances annotation quality and precision over MS-COCO, making it a more consistent and accurate standard for advancing object detection studies. The dataset and software code are publicly available on Kaggle: https://www.kaggle.com/datasets/mjcoco2025/mj-coco-2025.
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 ICNGC 2025 The 11th International Conference on Next Generation Computing 2025 2025.12 pp.189-192
Accurately predicting human migration patterns is crucial for effective urban planning, yet it remains a challenging task. Existing methods, such as Graph Neural Network approaches, often overlook dynamic temporal variations and directional dependencies in large-scale migration data. To overcome this challenge, we propose MiGA-Net (Migration Graph Attention Network), a graph neural network-based framework enhanced with an attention mechanism to capture complex spatiotemporal dependencies and highlight significant migration flows at the domestic and international levels. We use two datasets for Shinan-gun, South Korea, one covering international and one covering domestic migration. Experimental results show that the proposed MiGA-Net achieved superior performance on both datasets. The model achieved an MAE of 0.0027 for domestic flow and 0.0155 for international flow, demonstrating the effectiveness of the proposed framework.
Semi-Supervised Learning for Audio-Visual Anomaly Recognition
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2025 한국차세대컴퓨팅학회 춘계학술대회 2025.05 pp.231-232
Anomaly recognition in visual and audio data has gained increasing significance in computer vision, as it plays a crucial role in protecting human lives and property. In this work, we developed a semi-supervised multimodal framework for anomaly recognition that combines audio and visual data for better performance. The proposed framework employs a hybrid network consisting of a convolutional neural network, Bi-Directional Long Short-Term Memory, a multi-head attention module, and a fully connected layer for anomalous pattern recognition. We created a novel real-time visual-audio anomaly recognition dataset and evaluated our framework on it, achieving promising results.
Analyzing City-Level Population Movement in China with Graph Neural Networks
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2025 한국차세대컴퓨팅학회 춘계학술대회 2025.05 pp.70-71
Graph neural networks have recently played a crucial role across various fields. In this paper, we design a Graph Convolutional Network (GCN) to analyze population movement at the city level. It consists of four Graph Convolution (GC) layers, each of which aggregates information from a city's neighboring nodes and updates that city's feature representation. We use population mobility data from China, which includes daily city-to-city movement data. The GCN estimates the strength of relationships among all cities. Experimental results demonstrate that the proposed GCN achieves improved performance in estimating city-to-city migration flow relationships.
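The neighbor aggregation performed by each GC layer can be illustrated with a small NumPy sketch. It assumes a row-normalized adjacency matrix with self-loops and a ReLU after every layer; the abstract does not specify the normalization, activations, or layer widths, so those choices, the toy flow matrix, and the function names are illustrative only.

```python
import numpy as np


def gcn_layer(adj: np.ndarray, features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One graph convolution: average each node's neighborhood, project, apply ReLU."""
    n = adj.shape[0]
    adj_self = adj + np.eye(n)                     # self-loops keep each city's own features
    row_sum = adj_self.sum(axis=1, keepdims=True)  # weighted out-degree including the self-loop
    norm_adj = adj_self / row_sum                  # row-normalize so each row sums to 1
    return np.maximum(norm_adj @ features @ weight, 0.0)


def gcn_forward(adj: np.ndarray, features: np.ndarray, weights: list) -> np.ndarray:
    """Stack the layers; with four weight matrices this is a four-layer GCN."""
    hidden = features
    for weight in weights:
        hidden = gcn_layer(adj, hidden, weight)
    return hidden


rng = np.random.default_rng(0)
# Toy directed flow matrix for three cities (illustrative values, not real mobility data).
adj = np.array([[0.0, 5.0, 0.0],
                [1.0, 0.0, 2.0],
                [0.0, 3.0, 0.0]])
features = rng.normal(size=(3, 4))
dims = [4, 8, 8, 8, 2]  # hypothetical layer widths
weights = [rng.normal(scale=0.5, size=(dims[i], dims[i + 1])) for i in range(4)]
embeddings = gcn_forward(adj, features, weights)  # one 2-dimensional embedding per city
```

In practice the weight matrices would be learned by gradient descent against observed flows rather than sampled at random as they are here.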
Learning Inter-City Migration Flow Centered on Shinan-gun Using Graph Neural Networks
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2025 한국차세대컴퓨팅학회 춘계학술대회 2025.05 pp.210-212
Predicting human mobility flow has significant applications in domains ranging from urban planning to public health. This study proposes a hybrid model based on a Graph Neural Network and a Long Short-Term Memory network for nationwide human mobility prediction, effectively capturing inter-urban movement patterns. We validate the feasibility and effectiveness of our model using the Korean inter-city mobility dataset, which captures real-world population movement patterns across various urban regions. Experimental results show that the model accurately predicts inter-city mobility, which can support urban planning, health, and transport.
Feature importance analysis for population projection
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 10th International Conference on Next Generation Computing 2024 2024.11 pp.95-97
Identifying key features plays a significant role in accurate population projection, which is an essential aspect of demographic statistics. The goal of this paper is to investigate the importance of different features in population projection using four feature analysis techniques: Canonical Correlation Analysis (CCA), Linear Discriminant Analysis (LDA), Fast Independent Component Analysis (FICA), and Principal Component Analysis (PCA). This analysis is important for determining the major factors that affect population change. Identifying and ranking these predictors can enhance demographic forecasting and policy planning. We used Korean population data from the UN Population Division dataset and evaluated the four methods. The experimental results reveal that LDA achieved the lowest performance in selecting the most appropriate features, while PCA was the most efficient at selecting effective features with the highest variance. These insights add to the understanding of population change and help refine projection models.
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 10th International Conference on Next Generation Computing 2024 2024.11 pp.98-101
Recent advancements in data-driven methodologies have brought significant attention to the computational prediction of material properties. Traditional machine learning (ML) approaches have struggled to achieve high accuracy due to the complex relationships between a material's structure and its properties. To address this challenge, we present an ML framework for predicting the stability of silicon (Si) and Si-based alkaline metal alloys with reduced error. The framework emphasizes model transferability for discovering new silicon alloys with diverse electronic configurations and structures. We explore the effectiveness of two atomic structural descriptors: X-ray diffraction (XRD) and the sine Coulomb matrix (SCM). The dynamic ensemble learning (DEL) model is trained and evaluated using 750 Si alloys from the Materials Project database (MPD) and optimized via ensemble learning. The results indicate that the XRD descriptor with DEL performs most reliably for formation energy, total energy, and packing fraction prediction, demonstrating the model's robustness and transferability toward efficient synthesis of silicon anode materials.
Polynomial Regression Modeling for Efficient Prediction of Battery Rate Capability
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 10th International Conference on Next Generation Computing 2024 2024.11 pp.78-81
The battery market is experiencing rapid growth due to advancements in technology and increased recycling efforts. Verifying the suitability of developed batteries through rate capability experiments, which measure capacity based on charging and discharging speeds, is essential but resource-intensive and time-consuming. This research proposes a method to predict battery rate capability using a polynomial regression model based on similar data groups, aiming to shorten these experiments. The research was conducted in two main stages, namely the construction of the dataset and the development of the predictive model. Data was collected from experimental graphs in existing literature and new experiments on Coin Cell batteries. Through preprocessing steps including deduplication, interpolation, and extrapolation, a comprehensive dataset was created. A combined Quadratic and Linear Piecewise Interpolation method was developed to handle missing data efficiently. In the model development stage, polynomial regression models were created for groups of similar battery data, allowing accurate predictions for partial rate capability experiments. Experimental results demonstrated high accuracy, significantly reducing the need for extensive testing. The proposed method offers substantial time and resource savings, enhancing the efficiency of the battery development process.
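The core idea above, fitting a polynomial to a partial rate capability measurement and predicting the remaining points, can be sketched with NumPy. This is an illustration under stated assumptions: the quadratic degree, the function names, and the synthetic capacity curve are introduced here, and the paper's grouping of similar batteries and its interpolation method are not reproduced.

```python
import numpy as np


def fit_rate_capability(c_rates: np.ndarray, capacities: np.ndarray, degree: int = 2) -> np.ndarray:
    """Least-squares polynomial fit of capacity as a function of C-rate."""
    return np.polyfit(c_rates, capacities, degree)


def predict_capacity(coeffs: np.ndarray, c_rates: np.ndarray) -> np.ndarray:
    """Evaluate the fitted polynomial at the requested C-rates."""
    return np.polyval(coeffs, c_rates)


# Synthetic illustration only: capacities generated from a known quadratic, not measured data.
measured_rates = np.array([0.1, 0.2, 0.5, 1.0])
measured_capacity = 160.0 - 12.0 * measured_rates + 0.8 * measured_rates**2

coeffs = fit_rate_capability(measured_rates, measured_capacity)
unmeasured_rates = np.array([2.0, 5.0])
predicted = predict_capacity(coeffs, unmeasured_rates)
```

With real measurements the fit would not be exact, and extrapolating far beyond the measured C-rates can be unreliable, which is presumably one reason the paper fits models per group of similar batteries.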
Active Learning for Anomaly Recognition: Leveraging Visual and Audio Data Fusion
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 10th International Conference on Next Generation Computing 2024 2024.11 pp.102-104
Recognizing anomalies in surveillance is crucial for public safety to identify events that deviate from normal patterns. Visual information is essential for effective anomaly recognition; however, audio data can enhance recognition accuracy by providing additional context. Despite this, existing systems only utilize visual information, overlooking the potential of audio modalities in anomaly recognition. This paper introduces a multi-modal framework for anomaly recognition through active learning, integrating audio and visual modalities to enhance anomaly prediction. The framework extracts features using a pretrained ResNet-50 convolutional neural network (CNN) model from the visual and audio data. The extracted features are then forwarded to the Bi-Directional Long Short-Term Memory (Bi-LSTM) network for temporal feature learning. These features are then fused and fed into a classification layer for final prediction. The proposed framework's performance is assessed on a benchmark dataset and yields promising results.
A Modified Vision Transformer-based Anomaly Recognition using Audio Data
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2024 한국차세대컴퓨팅학회 춘계학술대회 2024.04 pp.337-340
In recent years, audio-based anomaly recognition has attracted the attention of the research community because the number of abnormal situations keeps increasing. In the past, researchers mainly focused on video-based anomaly recognition. However, occlusion is one of the main reasons an anomalous object can become unidentifiable in video. Therefore, in this paper, we propose a modified vision transformer that uses Shifted Patch Tokenization (SPT) and a Local Self-Attention (LSA) mechanism and reduces the number of multilayer perceptrons in the head, enabling the model to capture rich spatial information within the spectrogram of anomalous data. The proposed model is evaluated on the Sound Events for Surveillance Applications (SESA) dataset and obtains 87% testing accuracy. Thus, the proposed model is an efficient and effective solution for audio-based anomaly recognition.
Surveillance Abnormal Activity Recognition Using Residual Deep Bidirectional LSTM Network
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2024 한국차세대컴퓨팅학회 춘계학술대회 2024.04 pp.345-348
Surveillance systems play a pivotal role in monitoring various sectors to ensure public safety and security. These systems generate massive amounts of video data, so effective analysis of these streams is an important research area with multiple applications. Several methods have been reported for the automatic recognition of abnormal activities, but these techniques show limited performance when learning the complex temporal dependencies of real-world abnormal activities in surveillance footage. We introduce a Deep Learning (DL)-assisted framework that is divided into two main parts. First, the surveillance video stream is preprocessed, and BoTNeT-152 is employed to extract spatial features. Second, a Residual Deep Bidirectional Long Short-Term Memory (RBLSTM) network is introduced to learn the complex temporal dependencies across multiple frames for abnormal activity recognition. To assess the effectiveness of the proposed method, we evaluated its performance on the benchmark real-world UCFCrime2Local dataset. It achieved an accuracy of 86%, an improvement of up to 2% over existing methods, which shows the advantage of the proposed technique in addressing the challenges posed by complex surveillance environments.
Survey of AI-Empowered Methods for Detecting Electricity Theft in Smart Grids
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 9th International Conference on Next Generation Computing 2023 2023.12 pp.239-242
This survey explores electricity theft detection in smart grids, where traditional power systems meet modern technology. Smart grids, designed for efficient energy management and continuous integration of renewables, face a pressing challenge: electricity theft, which costs utility companies over $96 billion annually. The survey traces the evolution from conventional to smart grids, emphasizing their core components. It underscores the economic impact of theft, which drives researchers to explore Artificial Intelligence (AI) and Deep Learning (DL) techniques for detection. A comprehensive literature review reveals various approaches, with a focus on DL's growing influence. Public datasets are explored as invaluable resources, and methods for theft detection, including advanced AI and DL, are dissected. Performance metrics such as accuracy and precision are discussed, and challenges, including imbalanced data and privacy concerns, are highlighted. In conclusion, the survey emphasizes the need for diverse AI and DL approaches, data sources, and features to create robust theft detection systems for smart grids, ensuring their secure and efficient operation.
Dataset Standardization for Effective Solar Power Forecasting: A Comprehensive Analysis
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 9th International Conference on Next Generation Computing 2023 2023.12 pp.110-113
This paper introduces a comprehensive approach to dataset standardization aimed at enhancing the effectiveness and reliability of solar power forecasting models. Leveraging multiple datasets, this study incorporates additional attributes such as atmospheric pressure and sunshine duration. These enrichments bridge critical gaps in meteorological and environmental data, facilitating more robust and precise solar power forecasting. The paper underscores the significance of these attributes, furnishes detailed equations for their computation, and presents the outcomes of their integration. It underscores their pivotal role in enabling solar energy stakeholders to make informed decisions and optimize energy production effectively.
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 The 9th International Conference on Next Generation Computing 2023 2023.12 pp.292-295
Sustainable power systems should include solar energy generation. However, effective grid management and the integration of renewable energy sources require accurate predictions of solar power generation. Therefore, this study compares solar power forecasting in Italy and Bulgaria, two countries that have similar latitudes but different populations and levels of solar energy production. The historical solar power generation and meteorological data from these countries are preprocessed and then used to train four deep learning models: a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a Long Short-Term Memory (LSTM) network, and a Gated Recurrent Unit (GRU). The results are analyzed to gain insights into how the proximity of geographical locations and the quality and quantity of data affect the precision of prediction algorithms.
Industrial Defective Chip Inspection Using an Attention Mechanism-Based Deep Convolutional Neural Network
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2023 한국차세대컴퓨팅학회 춘계학술대회 2023.06 pp.51-54
The identification of anomalies in industrial settings poses a significant challenge, especially when there is a lack of negative samples and when the anomalous regions are small. Although existing computer vision methods have automated this task to some extent, these approaches struggle to extract salient features for inspecting defective chips. To tackle this problem, a deep learning-based framework is proposed for detecting anomalies in industrial settings. The framework uses a fine-tuned backbone convolutional neural network model and incorporates an enhanced attention mechanism. The attention module processes intermediate features obtained from the backbone model to generate discriminative attention maps along two dimensions: channel and spatial. These attention maps are then multiplied with the input feature map to dynamically enhance the relevant features. Extensive experiments demonstrate the effectiveness of the proposed method in maintaining a high level of detection accuracy for industrial product inspections. Consequently, the proposed method is a suitable solution for optical chip inspection systems in industrial settings.
Fire detection is important for preserving public safety in complex surveillance environments. Despite advances in deep learning for fire detection, the task remains challenging due to the natural irregularity of fire images, including differences in lighting conditions, occlusions, and background complexity. To address these challenges, we present a novel framework for fire detection named the fire channel attention network (FCAN), which is capable of differentiating challenging fire scenes. Our approach is motivated by the need to enhance the accuracy of fire detection by selectively emphasizing the most informative channels of the input image through channel attention (CA). Furthermore, our model captures the salient features from the input image and suppresses the irrelevant ones, thereby overcoming the aforementioned challenges of fire detection. The FCAN is evaluated on two benchmark datasets and surpasses existing methods in terms of accuracy and F1 score. The results demonstrate the effectiveness of the proposed model, highlighting its potential for practical applications in fire safety and prevention.
Improved Power Consumption Prediction Using an Attention-Based Dual-Stream Deep Learning Network for Building Power Consumption Forecasting
한국차세대컴퓨팅학회 한국차세대컴퓨팅학회 학술대회 2023 한국차세대컴퓨팅학회 춘계학술대회 2023.06 pp.273-276
Electricity consumption forecasting is a crucial component of designing intelligent and ecologically friendly environments. Predicting future electricity consumption allows energy generation to be adjusted to meet the population's rising requirements. However, the broad variety of consumption patterns makes it difficult to anticipate the energy requirements of buildings. To address this issue and produce precise predictions, this work uses a dual-stream approach with multi-head attention to forecast the power consumption of a building. The proposed network concurrently learns temporal representations through a Bidirectional Gated Recurrent Unit (BGRU) and spatial patterns through an Atrous Convolutional Neural Network (ACNN). The obtained features are combined into a single feature vector that is used as the input to the multi-head attention, which selects the features best suited to forecasting the electricity consumption of a building. Finally, the dense layer receives these features and uses them to forecast short-term power consumption. In experiments on the household electricity consumption dataset, the proposed dual-stream network with attention outperforms competing models, achieving the lowest error value for hourly building power consumption prediction.
Accurate detection of small targets in aerial images is crucial but challenging due to the limited computational resources of UAVs. This paper presents an efficient approach based on YOLO-V5S for detecting and classifying distant vehicles in aerial scenes. An extensive ablation study is conducted to find the optimal YOLO architecture. The proposed method is efficient and effective, making it applicable for real-time deployment. A dataset of 1000 annotated images is developed to validate the proposed method's effectiveness. The proposed network outperforms existing state-of-the-art methods in accuracy, speed, and resource efficiency, making it a promising solution for aerial vision-based applications.