The Journal of Korean Institute of Next Generation Computing Vol.9 No.2 :: Korean Institute of Next Generation Computing

Papers

김명섭, 송표, 임철수, 조철회, 이현우, 허의남

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.6-15

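Mobile thin-client technology lets users access thin-client services from a mobile device anywhere. Two issues are discussed in building a mobile thin client: 1) the cost of server resource allocation and 2) client QoE (Quality of Experience: user satisfaction as determined by the client's resource consumption, graphic quality, and response time). To address the server-side cost problem, this paper proposes a cross-layer multi-support technique in place of conventional VM-based resource allocation. To guarantee client QoE, it proposes a hybrid display protocol that uses MJPEG and RFB. It also discusses performance in comparison with existing techniques.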

The introduction of the mobile thin client made it possible to provide thin-client service to multiple devices without location constraints. The mobile device receives the display image of an application running on the server side. In designing such a mobile thin-client system, 1) cost efficiency on the server side and 2) Quality of Experience (resource utilization, graphic quality, response time) on the client side are considered the two major issues. In this paper, we design Cross Layer Multi-Support on the server side and a Hybrid Display Protocol on the client side. We also discuss the main results of the performance evaluation of the proposed system.

HOB Architecture and Bandwidth Reservation Protocol Supporting QoS in Integrated EPON-WiMAX Networks

정복래

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.16-23

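In combining EPON and WiMAX networks, reconciling the EPON and WiMAX MAC layers, which have different characteristics, is important for guaranteeing end-to-end QoS. To address this problem, this paper first proposes, from the hardware-architecture side, a hybrid optical network unit-base station (HOB) architecture that supports QoS continuity between the heterogeneous networks. Next, on the protocol side, it introduces QBRP, which performs coordinated bandwidth allocation and QoS mapping between the WiMAX base station and the EPON optical network unit to guarantee end-to-end QoS. Running on the HOB architecture, QBRP outperforms the existing MHTM scheme, in which the optical network unit and the base station do not cooperate, in average delay, packet delay variation (jitter), and packet loss ratio.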

In integrating Ethernet Passive Optical Network (EPON) and WiMAX (IEEE 802.16), supporting end-to-end (ETE) Quality of Service (QoS) is a key requirement because of the heterogeneous media access control (MAC) features of EPON and WiMAX. To address this issue, this paper first presents a Hybrid Optical Network Unit-Base Station (HOB) architecture that provides a basic structure for QoS-related functions. On the basis of HOB, the QoS-enabled Bandwidth Reservation Protocol (QBRP) is proposed to achieve ETE QoS through cooperative bandwidth allocation and QoS mapping. The simulation results show that QBRP outperforms the existing Multi-Hop Transmission Mechanism (MHTM) in terms of packet delay, packet delay variation (jitter), and packet loss ratio.

Effects of the Attributes of Advertising Using Smartphone Augmented Reality on Ad Attitude, Brand Attitude, and Purchase Intention

김중규

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.24-35

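This study examines how the advertising attributes of augmented-reality advertising can affect consumers' ad attitude, brand attitude, and purchase intention. To understand clearly how these concepts influence one another, it approaches the question from both a technical and a management perspective. On the technical side, it focuses on augmented reality and on marketing cases that use it. On the management side, it focuses on advertising attributes, ad attitude, brand attitude, and purchase intention. Building on these prior studies and concepts, a survey was conducted on the basis of informativeness, entertainment, individuality, and interactivity, and the responses were analyzed using exploratory factor analysis, principal components analysis, and varimax rotation. The analysis showed that 1) the entertainment, interactivity, and informativeness of augmented-reality advertising, 2) ad attitude, and 3) brand attitude each have a significant positive (+) effect on purchase intention.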

This study draws on the theory of augmented reality and on the many cases in which augmented reality technology is applied to advertising and reaches consumers directly or indirectly. It examines whether the characteristics of advertising that uses augmented reality affect consumers' ad attitude, brand attitude, and purchase intention. The underlying concepts are approached from a technical aspect and a management aspect. On the technical side, the study focuses on augmented reality itself and on marketing examples that use it, in order to describe clearly the technology underlying this new form of advertising. On the management side, the study focuses on ad attitude, brand attitude, and purchase intention, both to define each concept clearly and to understand how they influence one another. The study proposes a conceptual model of the effect of the characteristics of augmented-reality advertising on ad attitude, brand attitude, and purchase intention, and verifies the model through a survey of consumers of different sexes and ages.

Package-Type Development of a Data Center Energy Monitoring Device and Building-Block Energy Optimization Technology for Improved Power Efficiency

조창희, 김경남

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.36-43

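This study enables precise measurement and analysis of power usage at each stage of the data center (switchgear, UPS, distribution panel, rack, IT equipment) and measurement of per-rack and per-quarter power consumption for usage-based billing. With support for PUE Level 3 measurement and with measurement, analysis, and optimization control of server-room air-conditioning energy use, it achieved a 10% energy saving in the data center. We developed high-performance data processing software (SW) and measurement hardware (HW) with a building-block product structure. The paper presents a design direction for data center infrastructure facilities and a model of an energy management system that is efficient and easy to deploy on site.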

In this paper, we achieve an energy reduction by 1) measuring detailed electrical energy consumption at each stage (incoming panel, distribution board, UPS, panel board, rack, IT equipment), and 2) measuring air-conditioning energy consumption in server rooms and supporting PUE Level 3 measurement by metering power consumption quarterly (or by rack) for metered-rate billing. We provide a solution that achieves a 10 percent reduction in data center energy consumption. We also show that the technology is easy to deploy in building-block style by developing high-performance data processing software and measurement hardware. The paper suggests a direction for infrastructure design and its practical application, and establishes a model of an efficient energy management system.
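
For readers unfamiliar with the metric, PUE (Power Usage Effectiveness) is the ratio of total facility energy to IT equipment energy over the same period. The sketch below only illustrates that ratio. The function name and the meter readings are hypothetical and are not taken from the paper, and the assumption that a 10% saving applies to total facility energy with unchanged IT load is ours.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness = total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical readings (kWh) for one measurement period.
it_kwh = 1000.0
facility_kwh = 1800.0

pue_before = pue(facility_kwh, it_kwh)

# Assumed scenario: total facility energy drops by 10%, IT load unchanged.
pue_after = pue(facility_kwh * 0.9, it_kwh)

print(f"PUE before: {pue_before:.2f}, after: {pue_after:.2f}")
```

A lower PUE means a smaller share of energy goes to cooling and power distribution overhead. A value of 1.0 would mean all energy reaches the IT equipment.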

An OpenCL-Based Offloading Framework for High-Performance Remote Execution of GPGPU Applications

박세진, 마정현, 박찬익

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.44-53

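This paper proposes a framework that offloads a local GPGPU application and runs it at a remote site, so that high-performance GPGPU applications can run even on systems with no GPGPU or only a low-performance one. The framework sends API calls made by the application to the remote site through a newly defined remote transmission library that takes the place of the OpenCL library. The remote site executes the received API calls and sends the results back to the local machine. Because the results are passed back to the application through the local remote transmission library, remote execution is transparent to the application below the library layer. The prototype supports OpenCL version 1.1 and requires no modification of existing applications for remote execution. In experiments with Matrix Multiplication at various sizes, remote execution was up to 4.9 times faster than local execution, and in the LavaMD experiment it was up to about 3.2 times faster.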

In this paper, we propose a framework that enables a local GPGPU application to run at a remote site by offloading, in order to support systems that have no GPGPU or only a low-performance one. The framework sends the APIs invoked by the application to the remote site using a newly defined remote transmission library instead of the original OpenCL library. The remote site executes the received APIs and sends the results back to the local machine. Since these results are passed to the user application through the remote transmission library on the local side, remote execution is transparent to the application. The prototype supports OpenCL version 1.1 and requires no modification of the application for remote execution. Experimental results show that remote execution is up to 4.9 times faster than local execution across various Matrix Multiplication sizes and up to 3.2 times faster on the LavaMD workload.

Analysis of GPU Performance According to Warp Scheduling Schemes

최홍준, 김철홍, 김종면

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.54-66

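The shader core, which performs the actual computation in a GPU, maximizes utilization of computational resources by being assigned many warps and executing them concurrently. GPU performance is expected to vary depending on which of the assigned warps the shader core selects to execute. Developing an efficient warp scheduling scheme first requires analyzing the characteristics of warp scheduling schemes. This paper analyzes GPU performance under different warp scheduling schemes: random scheduling, round-robin scheduling, and first-come first-served scheduling. According to the experimental results, the scheduling scheme makes almost no difference in performance for applications that contain no branch instructions, but makes a considerable difference for applications that contain many branch instructions. The analysis attributes this to the fact that, depending on the warp scheduling scheme, branch instructions relieve or worsen the bottleneck caused by random memory accesses. The results are expected to provide guidelines for developing warp scheduling schemes for GPUs that run general-purpose applications.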

A shader core can process multiple warps concurrently, which lets the GPU improve the utilization of its computational resources. The performance of the shader core depends on the warp scheduling scheme, which selects the warp to execute from among the assigned warps. In this work, we therefore analyze GPU performance under three warp scheduling schemes: random scheduling, round-robin scheduling, and first-ready first-come first-served scheduling. Experimental results show that the performance gap between the simulated schemes is negligible for applications without branch instructions, whereas the gap widens for applications with many branch instructions. The gap is caused by branch instructions, whose effect on the memory bottleneck depends strongly on the warp scheduling scheme.

Performance Improvement and Prediction for Tree Search through Locality Enhancement and Dynamic Load Balancing in Multicore Environments

성운, 박준석

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.67-77

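Multicore processors are now widely used, but it is very difficult for programmers to write programs that achieve optimal performance improvement. OpenMP enables parallelization merely by inserting directives, but the performance gain of the final parallel code depends heavily on the number of threads and the distribution of work among threads. This paper converts a quad-tree-based query processing problem into an array-based tree to improve locality and parallelizes it in a way suited to the data structure. It proposes a method that improves performance by using a dynamic load balancing technique during parallelization to allocate the amount of data processed by each thread evenly. Using the results of applying dynamic load balancing and parallelization to an array-based quad-tree database query search program, the correlation between the number of threads and program performance is analyzed and expressed as a formula. When the total execution time predicted for a given number of threads is compared with the actual execution time, the difference is 5-10% of total execution time on average. This makes it possible to predict the number of threads that yields the best performance improvement.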

Multicore processors are now widely used, yet it remains very difficult to achieve optimal performance improvement through parallel programming on them. OpenMP provides reliable parallel programming interfaces that enable parallelism through the insertion of directives. Nevertheless, the performance of the final parallel code is determined by the number of threads and the distribution of workload among them. In this paper, we improve locality and parallelize quad-tree-based query problems by transforming the tree into an array-based tree. For the parallelization, we propose a dynamic load balancing method that balances the load across multiple threads when processing queries on the tree. The experimental results show that the proposed methods successfully parallelize the tree traversal and dynamically balance the load between threads. We also analyze the experimental results to correlate the number of threads with performance improvement and derive a performance estimation formula. The estimated execution time for a given number of threads differs from the actual execution time by 5-10% on average. Using this result, we can predict the number of threads that yields the best performance improvement.
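
The abstract gives no implementation details, so the following is only a sketch of the general idea under our own assumptions: a complete quad-tree stored in a flat, level-order array (the children of node i sit at indices 4i+1 to 4i+4), with many small subtree tasks handed to a thread pool so that idle workers pull the next task. All function names, the leaf payload, and the `split_level` parameter are hypothetical. The paper uses OpenMP, not Python threads, so this shows only the task distribution pattern and not the reported speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def children(i):
    """Indices of the four children of node i in a level-order array."""
    return range(4 * i + 1, 4 * i + 5)


def build(depth):
    """Complete quad-tree of the given depth as a flat list; only leaves hold data."""
    n_nodes = (4 ** (depth + 1) - 1) // 3
    first_leaf = (4 ** depth - 1) // 3
    tree = [0] * n_nodes
    for i in range(first_leaf, n_nodes):
        tree[i] = i - first_leaf  # hypothetical payload: the leaf's rank
    return tree, first_leaf


def count_matches(tree, first_leaf, root, predicate):
    """Visit the subtree under root iteratively and count leaves that match."""
    stack, hits = [root], 0
    while stack:
        i = stack.pop()
        if i >= first_leaf:
            hits += bool(predicate(tree[i]))
        else:
            stack.extend(children(i))
    return hits


def parallel_count(tree, first_leaf, predicate, workers=4, split_level=2):
    """Create one task per node at split_level; workers take tasks as they free up."""
    start = (4 ** split_level - 1) // 3
    roots = range(start, start + 4 ** split_level)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda r: count_matches(tree, first_leaf, r, predicate), roots
        )
        return sum(partial)


tree, first_leaf = build(depth=3)  # 85 nodes, 64 leaves
even_leaves = parallel_count(tree, first_leaf, lambda v: v % 2 == 0)
```

Creating more tasks than workers is what lets the pool even out uneven subtrees, which is roughly the effect a dynamic schedule has in OpenMP. `split_level` must not exceed the tree depth.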

Parallelization of the Computation Step of the Four-Russians' Algorithm for the k-Difference Problem

김영호, 김진욱, 심정섭

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.78-88

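Approximate string matching problems have long been studied actively. The edit distance problem and the $k$-difference problem are well-known examples. Given two strings $P$ and $T$ of lengths $m$ and $n$ over an alphabet $\Sigma$, the edit distance problem is to compute the edit distance between $P$ and $T$, and the $k$-difference problem is to find all positions in $T$ where $P$ occurs within edit distance $k$. Both problems can be solved with dynamic programming in $O(mn)$ time and space, and with suffix trees the $k$-difference problem can be solved in $O(kn)$ time and space. The edit distance problem can also be solved with the Four-Russians' algorithm. With block size $t$, the Four-Russians' algorithm computes the edit distance using $O((3|\Sigma|)^{2t} t^2)$ time and $O((3|\Sigma|)^{2t} t)$ space in the preprocessing step and $O(mn/t)$ time and $O(mn)$ space in the computation step. This paper presents an algorithm that solves the $k$-difference problem by parallelizing the computation step of the Four-Russians' algorithm to run in $O(m+n)$ time using $m/t$ threads. In experiments, the proposed parallel Four-Russians' algorithm was about 41 times faster at $t = 2$ and about 19 times faster at $t = 4$ than the existing sequential Four-Russians' algorithm for the DNA alphabet ($|\Sigma| = 4$), and about 45 times faster at $t = 2$ and about 15 times faster at $t = 5$ for the binary alphabet ($|\Sigma| = 2$).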

Approximate string matching problems have been studied extensively for a long time. Two of the best-known approximate string matching problems are the edit distance problem and the $k$-difference problem. Given two strings $P$ ($|P| = m$) and $T$ ($|T| = n$) over an alphabet $\Sigma$, the edit distance problem is to compute the edit distance between $P$ and $T$. The $k$-difference problem is to find all occurrences of $P$ in $T$ with edit distance at most $k$. The edit distance problem and the $k$-difference problem can be solved in $O(mn)$ time and space using dynamic programming. If suffix trees are used, the $k$-difference problem can be solved in $O(kn)$ time and space. The edit distance problem can also be solved using the Four-Russians' algorithm, whose preprocessing step runs in $O((3|\Sigma|)^{2t} t^2)$ time and $O((3|\Sigma|)^{2t} t)$ space and whose computation step runs in $O(mn/t)$ time and $O(mn)$ space, where $t$ is the block size. In this paper, we propose a parallelized version of the computation step of the Four-Russians' algorithm for the $k$-difference problem, which runs in $O(m+n)$ time using $m/t$ threads. For the DNA alphabet, experimental results show that our parallel algorithm is about 41 times and 19 times faster than the sequential Four-Russians' algorithm when $t = 2$ and $t = 4$, respectively. For the binary alphabet, our parallel algorithm is about 45 times and 15 times faster than the sequential Four-Russians' algorithm when $t = 2$ and $t = 5$, respectively.
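
As a point of reference for the $O(mn)$ dynamic programming baseline mentioned above, here is a minimal sketch of the standard column-by-column recurrence for the $k$-difference problem. It is not the paper's Four-Russians' parallel algorithm. The function name `k_difference` and the example strings are our own.

```python
def k_difference(pattern: str, text: str, k: int) -> list[int]:
    """Return 1-based end positions j in text such that some substring of text
    ending at j is within edit distance k of pattern.
    Runs in O(mn) time; keeps one column at a time, so O(m) extra space."""
    m = len(pattern)
    # Column for the empty text prefix: matching pattern[:i] costs i deletions.
    prev = list(range(m + 1))
    ends = []
    for j, tc in enumerate(text, start=1):
        # cur[0] = 0 because an occurrence may start at any position in text.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in text
                cur[i - 1] + 1,      # extra character in pattern
            )
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


exact = k_difference("abc", "xabcx", 0)
one_error = k_difference("abc", "xabcx", 1)
```

Each cell depends only on its left, upper, and upper-left neighbors, so cells on the same anti-diagonal are independent of one another. That independence is the usual starting point for parallelizing tables of this kind. How the paper schedules its $m/t$ threads is not described in the abstract.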

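Society News

News on the International Conference on ICT for Smart Society 2013, and more

Korean Institute of Next Generation Computing

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.9 No.2, 2013.04, pp.89-97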