Earticle


Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments

  • Publisher
    Science & Engineering Research Support Center, Republic of Korea (IJGDC)
  • Journal
    International Journal of Grid and Distributed Computing
  • Issue
    Vol.6 No.5 (2013.10)
  • Pages
    pp.71-82
  • Authors
    Tao Gu, Chuang Zuo, Qun Liao, Yulu Yang, Tao Li
  • Language
    English (ENG)
  • URL
    https://www.earticle.net/Article/A205419

※ The agreement with the full-text provider has expired, so access to the full text may be restricted.

Article Information

Abstract

English
MapReduce is an effective programming model for large-scale data-intensive computing applications. Hadoop, an open-source implementation of MapReduce, has been widely used. The communication overhead of transmitting large data sets greatly affects Hadoop's performance. To exploit data locality, Hadoop preferentially schedules tasks on nodes near the data to reduce data transmission overhead, which works well in homogeneous and dedicated MapReduce environments. However, due to practical considerations of cost and resource utilization, it is common to maintain heterogeneous clusters or to share resources among multiple users. Unfortunately, it is difficult to take advantage of data locality in these heterogeneous or shared environments. To improve the performance of MapReduce in heterogeneous or shared environments, this paper proposes a data prefetching mechanism that fetches data to the corresponding compute nodes in advance. Theoretical analysis shows that the proposed mechanism effectively reduces data transmission overhead. The mechanism is implemented and evaluated on Hadoop-1.0.4. Experimental results on real applications show that the data prefetching mechanism can reduce data transmission time by up to 94%.
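
The abstract's central idea, fetching input data before a task needs it, can be illustrated with a toy timing model. The sketch below is not the paper's mechanism or its theoretical analysis; it is a minimal illustration under assumptions made here: a single compute node runs non-local map tasks one after another, transfers are serialized on one link, and the node can buffer prefetched blocks without limit. The function name `total_time` and the example durations are hypothetical.

```python
def total_time(fetch, compute, prefetch):
    """Completion time (seconds) for a sequence of non-local map tasks on one node.

    fetch[i]   -- time to transfer task i's input block to the node (assumed value)
    compute[i] -- time to process task i once its input is local (assumed value)
    prefetch   -- if True, the next block's transfer overlaps with computation
    """
    if not prefetch:
        # Without prefetching, each task waits for its own transfer, then computes.
        return sum(f + c for f, c in zip(fetch, compute))

    fetch_done = 0.0    # time at which the most recent transfer finished
    compute_done = 0.0  # time at which the most recent task finished
    for f, c in zip(fetch, compute):
        # Transfers run back to back on the link, independent of computation.
        fetch_done += f
        # A task starts once the node is free AND its input has arrived.
        compute_done = max(compute_done, fetch_done) + c
    return compute_done


if __name__ == "__main__":
    fetch = [4, 4, 4, 4, 4]         # hypothetical transfer times
    compute = [10, 10, 10, 10, 10]  # hypothetical compute times
    for flag in (False, True):
        total = total_time(fetch, compute, flag)
        # "Exposed" transfer time is the part not hidden behind computation.
        print(f"prefetch={flag}: total={total}, exposed transfer={total - sum(compute)}")
```

In this toy setting only the first transfer remains on the critical path when prefetching is enabled, because every later transfer completes while the previous task is still computing. The numbers are illustrative only and say nothing about the results reported in the paper.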

Table of Contents

Abstract
 1. Introduction
 2. Background and Motivation
  2.1. MapReduce overview
  2.2. Problem statement and motivation
 3. Proposed Data Prefetching Mechanism
  3.1. Data prefetching's design
  3.2. Theoretical analysis
 4. Experimental Evaluation
  4.1. Performance improvement in heterogeneous environment
  4.2. Performance improvement in shared environment
 5. Related Work
 6. Conclusions and Future Work
 Acknowledgements
 References

Keywords

MapReduce, Hadoop, Data Prefetching, Data Transmission

Authors

  • Tao Gu [ The College of Information Technical Science, Nankai University, Tianjin 300071, China ]
  • Chuang Zuo [ The College of Information Technical Science, Nankai University, Tianjin 300071, China ]
  • Qun Liao [ The College of Information Technical Science, Nankai University, Tianjin 300071, China ]
  • Yulu Yang [ The College of Information Technical Science, Nankai University, Tianjin 300071, China ]
  • Tao Li [ The College of Information Technical Science, Nankai University, Tianjin 300071, China ]


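Publication Information

Publisher

  • Publisher name
    Science & Engineering Research Support Center, Republic of Korea (IJGDC)
  • Founded
    2006
  • Field
    Engineering > Computer Science
  • About
    1. Surveys and research on security engineering 2. Research on and presentation of applied technologies in security engineering 3. Hosting of academic conferences and exhibitions on security engineering 4. Mutual cooperation and information exchange on security engineering technology 5. Standardization work and establishment of specifications for security engineering 6. Promotion of industry-academia-research cooperation in security engineering 7. International academic exchange and technical cooperation 8. Publication of journals on security engineering 9. Other activities necessary to achieve the organization's purposes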

Journal

  • Journal title
    International Journal of Grid and Distributed Computing
  • Frequency
    Bimonthly
  • pISSN
    2005-4262
  • Coverage
    2008~2016
  • Decimal classification
    KDC 505 DDC 605
