Performance and Energy Efficiency Evaluation of Big Data Systems Presented by Yingjie Shi Institute of Computing Technology, CAS 2013-10-31


BPOE 2013 | HPCChina 2013

Goals of Big Data Systems

Larger

Faster

Greener

Performance vs. Energy Efficiency

Performance (Faster & More Powerful)
• More servers
• Bigger clusters
• Powerful processors
• Sophisticated processing algorithms

Energy Efficiency (Greener & Cheaper)
• Lightweight servers
• Efficient processors
• Simpler processing algorithms
• …

Tradeoff

Evaluation

Evaluation of Performance & Energy Efficiency Tradeoff

How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Big Data Systems

How to get balance? The Implications from Benchmarking Three Big Data Systems

Motivation

"If you cannot measure it, you cannot improve it." – Lord Kelvin

PUE (Power Usage Effectiveness): a measure of how efficiently a data center uses its power; specifically, how much of the power is actually used by the information technology (IT) equipment.

PUE & Its Variants

Metric           Time   Organization   Computing Formula
PUE              2007   GreenGrid      $\dfrac{\text{Total Facility Energy}}{\text{IT Equipment Energy}}$
DCiE             2008   GreenGrid      $\dfrac{\text{IT Equipment Energy}}{\text{Total Facility Energy}} \times 100\%$
DCeP             2008   GreenGrid      $\dfrac{\text{Useful Work Produced}}{\text{Total Quantity of Resource Consumed Producing this Work}}$
pPUE             2012   GreenGrid      $\dfrac{\text{Total Facility Energy inside the Boundary}}{\text{IT Equipment Energy inside the Boundary}}$
PUE Scalability  2013   GreenGrid      $\dfrac{m_{\text{Actual}}}{m_{\text{PUE}}} \times 100\%$
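As a rough illustration of the first two formulas in the table, the sketch below computes PUE and DCiE from a pair of energy readings. The function names and the sample readings (1500 kWh for the facility, 1000 kWh for the IT equipment) are hypothetical and are not taken from the slides.

```python
def pue(total_facility_energy: float, it_equipment_energy: float) -> float:
    """PUE = Total Facility Energy / IT Equipment Energy (same unit for both)."""
    if it_equipment_energy <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_energy / it_equipment_energy


def dcie_percent(total_facility_energy: float, it_equipment_energy: float) -> float:
    """DCiE = IT Equipment Energy / Total Facility Energy * 100%."""
    if total_facility_energy <= 0:
        raise ValueError("total facility energy must be positive")
    return it_equipment_energy / total_facility_energy * 100.0


# Hypothetical readings over the same measurement period, in kWh.
total_kwh = 1500.0
it_kwh = 1000.0

print(f"PUE  = {pue(total_kwh, it_kwh):.2f}")
print(f"DCiE = {dcie_percent(total_kwh, it_kwh):.1f}%")
```

Both values depend only on how energy is split between the facility and the IT equipment, so changing the workload that runs on the servers does not necessarily move either of them.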

Motivation

• Scenario 1: Data Management Researcher
– An improved data classification algorithm: does it contribute to greening the data center?
– Run the algorithms on the data center
– Compare the PUEs
– No obvious variations!

PUE cannot measure the effectiveness of changes made on top of the data center infrastructure!

Motivation

• Scenario 2: Data Center Administrators
– Give a budget plan for the data center's energy consumption in the next year
– Estimate the data volume based on business development
– How to estimate the increase in energy consumption?

PUE provides little reference information for data center planning according to data scale and application complexity.

Calculation Framework

[Figure: calculation framework, with PUE and AxPUE labeled]

Definition - ApPUE

• ApPUE (Application Performance Power Usage Effectiveness): a metric that measures the power usage effectiveness of IT equipment; specifically, how much of the power entering the IT equipment is used to improve application performance.

• Computation formula:

$$\text{ApPUE} = \frac{\text{Application Performance}}{\text{IT Equipment Power}}$$

– Application Performance: the data processing performance of applications
– IT Equipment Power: the average rate at which IT equipment energy is consumed
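A minimal sketch of the ApPUE formula for a data analysis workload, assuming performance is measured as megabytes processed per second and IT equipment power is approximated by averaging periodic power-meter samples. The function name, the units, and all sample values are hypothetical illustrations rather than the authors' measurement procedure.

```python
from statistics import mean
from typing import Sequence


def appue(data_processed_mb: float,
          elapsed_seconds: float,
          power_samples_watts: Sequence[float]) -> float:
    """ApPUE = Application Performance / IT Equipment Power.

    Performance is taken as MB processed per second; IT equipment power is
    the mean of the sampled readings in watts. The result is in MB per joule.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    if not power_samples_watts:
        raise ValueError("at least one power sample is required")
    performance_mb_per_s = data_processed_mb / elapsed_seconds
    it_power_watts = mean(power_samples_watts)
    return performance_mb_per_s / it_power_watts


# Hypothetical run: 10240 MB processed in 512 s while the meter read about 200 W.
value = appue(10240, 512, [190, 200, 210])
print(f"ApPUE = {value:.3f} MB/J")
```

Because the numerator is application-level throughput, a faster algorithm on the same hardware raises ApPUE even when the power draw is unchanged.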

Definition - AoPUE

• AoPUE (Application Overall Power Usage Effectiveness): a metric that measures the power usage effectiveness of the overall data center system; specifically, how much of the total facility power is used to improve application performance.

• Computation formulas:

$$\text{AoPUE} = \frac{\text{Application Performance}}{\text{Total Facility Power}}$$

– Total Facility Power: the average rate at which total facility energy is used

$$\text{AoPUE} = \frac{\text{ApPUE}}{\text{PUE}}$$
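The relationship between AoPUE, ApPUE, and PUE follows directly from the definitions. The short derivation below assumes that all three quantities are measured over the same time interval, so that the ratio of energies in PUE equals the ratio of average powers.

```latex
\begin{align*}
\text{AoPUE}
  &= \frac{\text{Application Performance}}{\text{Total Facility Power}} \\
  &= \frac{\text{Application Performance}}{\text{IT Equipment Power}}
     \cdot \frac{\text{IT Equipment Power}}{\text{Total Facility Power}} \\
  &= \text{ApPUE} \cdot \frac{1}{\text{PUE}}
   = \frac{\text{ApPUE}}{\text{PUE}}
\end{align*}
```

Since PUE is at least 1, AoPUE can never exceed ApPUE for the same workload.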

Acquisition – Application Performance

Application Category               Examples                                                 Metric
Service Application                Search engine, ad-hoc queries                            Number of requests answered in unit time
Data Analysis Application          Data mining, reporting, decision support, log analysis   Volume of data processed in unit time
Interactive Real-time Application  E-commerce, profile data management                      Number of transactions completed in unit time
High Performance Computing         Scientific computing                                     Number of floating-point operations in unit time

Acquisition – Benchmark

• Requirements of benchmarks
– Provide representative workloads for big data applications
– Provide a scalable data generation tool

• BigDataBench
– A big data benchmark suite, recently open-sourced and publicly available
– Fulfills all of the requirements above

Experiment Overview

• Testbed
– Data center of 18 racks, 362 servers
– Sample 8 servers

• Workloads

• Two experiments
– Different applications
– Different implementation algorithms

Experiments on Different Applications

[Figure: chart of PUE, ApPUE, and AoPUE per workload. Workloads: BigDataBench SVM, Sort, Grep; Linpack. Y-axis: 0 to 9. Data labels shown on the chart: 17.2, 11.5, 269.9, 179.7.]

Experiments on Different Algorithms

• Two implementations of Sort
– Several reducers with random sampling partitioning
– One reducer without partitioning

[Figure: PUE(Sort1), ApPUE(Sort1), PUE(Sort2), and ApPUE(Sort2) versus data size (10G, 25G, 50G, 100G); y-axis: 0 to 30]
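To make the difference between the two Sort implementations concrete, here is a single-process sketch of the idea behind each one. It is only an illustration in plain Python, not the MapReduce code used in the experiments; the function names, the sample size, and the fixed seed are assumptions.

```python
import bisect
import random
from typing import List


def sort_single_reducer(records: List[int]) -> List[int]:
    """One reducer, no partitioning: every record is sorted in one place."""
    return sorted(records)


def sort_with_sampled_partitions(records: List[int],
                                 num_reducers: int,
                                 sample_size: int = 100,
                                 seed: int = 0) -> List[int]:
    """Several reducers: split points come from a random sample of the keys."""
    if num_reducers < 1:
        raise ValueError("num_reducers must be at least 1")
    if not records:
        return []
    rng = random.Random(seed)
    sample = sorted(rng.sample(records, min(sample_size, len(records))))
    # Choose num_reducers - 1 split points evenly spaced through the sample.
    splits = [sample[len(sample) * i // num_reducers]
              for i in range(1, num_reducers)]
    partitions: List[List[int]] = [[] for _ in range(num_reducers)]
    for record in records:
        # Range partitioning: partition i receives keys between split i-1 and split i.
        partitions[bisect.bisect_right(splits, record)].append(record)
    result: List[int] = []
    for partition in partitions:
        # Each "reducer" sorts only its own partition; partitions are already ordered
        # relative to one another, so concatenation yields a fully sorted output.
        result.extend(sorted(partition))
    return result


data = [5, 3, 9, 1, 7, 3, 8, 2, 6, 4]
print(sort_with_sampled_partitions(data, num_reducers=3))
```

Both functions return the same sorted output; what differs is how the work is divided, which is the kind of implementation change that PUE alone would not be expected to reflect.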

Conclusions

• We analyze the requirements of application-level energy effectiveness metrics, AxPUE, in data centers.

• We propose two novel application-level metrics, ApPUE and AoPUE, to measure the energy consumed to improve the application performance.

• The experimental results show that AxPUE could provide meaningful guidance for data center design and optimization.

Evaluation of Performance & Energy Efficiency Tradeoff

How to measure? AxPUE: Application Level Metrics for Power Usage Effectiveness in Data Centers

How to get balance? The Implications from Benchmarking Three Big Data Systems

New Solutions

……

Experimental Platforms

Basic Information
• Xeon (common processor)
• Atom (low-power processor)
• Tilera (many-core processor)

Brief Comparison
               Xeon                Atom                 Tilera
CPU Type       Intel Xeon E5310    Intel Atom D510      Tilera TilePro36
CPU Core       4 cores @ 1.6GHz    2 cores @ 1.66GHz    36 cores @ 500MHz
L1 I/D Cache   32KB                24KB                 16KB/8KB
L2 Cache       4096KB              512KB                64KB

Benchmark Selection

BigDataBench
• A big data benchmark suite drawn from big data applications
• Representative applications
• An innovative data generation tool

Application   Time Complexity    Characteristics
Sort          $O(n \log_2 n)$    Integer comparison
WordCount     $O(n)$             Integer comparison and calculation
Grep          $O(n)$             String comparison
Naïve Bayes   $O(m \cdot n)$     Floating-point computation
SVM           $O(n^3)$           Floating-point computation

Metrics

Performance: Data processed per second (DPS)

Energy Efficiency: Application Performance Power Usage Effectiveness (ApPUE), reported as data processed per joule (DPJ)
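The two metrics can be written out as follows, reading DPJ as data processed per joule by analogy with DPS. The symbols are introduced here for illustration only: $D$ is the volume of data processed, $T$ the elapsed time, $E$ the energy consumed by the IT equipment during the run, and $\bar{P}$ the average power.

```latex
\begin{align*}
\text{DPS} &= \frac{D}{T}, &
\text{DPJ} &= \frac{D}{E} = \frac{D}{\bar{P}\,T} = \frac{\text{DPS}}{\bar{P}}
\end{align*}
```

Under this reading, DPJ has the same form as ApPUE: application performance divided by the average power of the IT equipment.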

General Observations

[Figure: DPS and DPJ on Xeon, Atom, and Tilera]

• Data scale has a significant impact on the performance and energy efficiency of big data systems.
• The performance and energy efficiency trends of different applications are diverse.

Xeon vs. Atom – DPS

Xeon vs. Atom – DPJ

Xeon vs. Atom – DPS & DPJ

Application   Metric   500MB   1GB    10GB   25GB   50GB   100GB
Sort          DPS      3.67    4.51   1.89   1.54   1.36   1.40
              DPJ      0.87    1.08   0.45   0.36   0.32   0.33
Wordcount     DPS      2.27    2.38   2.74   2.84   2.82   2.79
              DPJ      0.55    0.58   0.61   0.61   0.62   0.60
Grep          DPS      1.83    1.82   2.30   2.79   2.87   2.89
              DPJ      0.48    0.46   0.54   0.62   0.63   0.64
Naïve Bayes   DPS      3.83    3.89   4.52   4.64   4.54   4.58
              DPJ      0.89    0.87   1.01   0.99   0.97   0.90

SVM (four data sizes): DPS 3.19, 3.06, 3.17, 3.14; DPJ 0.69, 0.64, 0.66, 0.67

• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-saving than Xeon when dealing with applications with simple computation logic.
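One way to read the table is sketched below, under the assumption that each entry is the Xeon value divided by the Atom value; the slides do not state this explicitly. Under that assumption, a DPJ entry below 1 means Atom processes more data per joule, and dividing the DPS ratio by the DPJ ratio gives the implied ratio of average power draw. The function names are hypothetical; the numbers are the Sort row of the table.

```python
from typing import List

# Sort row of the table, assumed to be Xeon-to-Atom ratios.
sizes: List[str] = ["500MB", "1GB", "10GB", "25GB", "50GB", "100GB"]
sort_dps_ratio: List[float] = [3.67, 4.51, 1.89, 1.54, 1.36, 1.40]
sort_dpj_ratio: List[float] = [0.87, 1.08, 0.45, 0.36, 0.32, 0.33]


def more_energy_efficient(dpj_ratio: float) -> str:
    """Name the platform with the higher DPJ, given the Xeon/Atom DPJ ratio."""
    if dpj_ratio > 1.0:
        return "Xeon"
    if dpj_ratio < 1.0:
        return "Atom"
    return "tie"


def implied_power_ratio(dps_ratio: float, dpj_ratio: float) -> float:
    """Since DPJ = DPS / average power, the power ratio is DPS ratio / DPJ ratio."""
    return dps_ratio / dpj_ratio


for size, dps, dpj in zip(sizes, sort_dps_ratio, sort_dpj_ratio):
    print(f"{size:>6}: more efficient = {more_energy_efficient(dpj)}, "
          f"implied power ratio = {implied_power_ratio(dps, dpj):.2f}")
```

If the implied power ratio stays roughly constant across data sizes, as it does for this row, that is at least consistent with the ratio interpretation.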

Xeon vs. Atom – Summary

• Xeon is more powerful than Atom in processing capacity.
• Atom is more energy-efficient than Xeon when dealing with applications with simple computation logic.
• Atom shows no energy advantage when dealing with complex applications.

Xeon vs. Tilera – DPS

Xeon vs. Tilera – DPJ

Xeon vs. Tilera – DPS & DPJ

Application   Metric   500MB   1GB    10GB   25GB
Sort          DPS      3.67    3.39   2.41   2.60
              DPJ      0.48    0.45   0.31   0.34
Wordcount     DPS      5.19    5.04   7.35   7.78
              DPJ      0.67    0.65   0.87   0.92
Grep          DPS      3.60    3.52   7.45   9.93
              DPJ      0.51    0.48   0.94   1.21
Naïve Bayes   DPS      5.91    5.78   7.59   7.94
              DPJ      0.75    0.70   0.89   0.92

• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-saving than Xeon when dealing with applications that have simple computation logic and are I/O intensive.
• Tilera shows no energy advantage when dealing with complex applications.

Xeon vs. Tilera

[Figure panels: The DPS of Xeon; The DPS of Atom; The DPS of Tilera]

Tilera is more suitable for processing I/O-intensive applications.

Xeon vs. Tilera – Summary

• Xeon is more powerful than Tilera in processing capacity.
• Tilera is more energy-efficient than Xeon when dealing with applications that have simple computation logic and are I/O intensive.
• Tilera shows no energy advantage when dealing with complex applications.
• Tilera is more suitable for processing I/O-intensive applications.

Implications

The performance of a big data system depends not only on the hardware itself, but also on the application type and data volume of its workloads.

Weak processors are not suitable for complex applications. Even though they have a lower TDP, they show no energy-cost advantage.
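The second point can be made precise with a short worked inequality. The symbols are introduced here for illustration: a job of data volume $D$ runs on a "weak" and a "strong" processor, each with its own throughput (DPS) and average power $\bar{P}$. Note that TDP is only a rough proxy for average power, so this is a sketch of the reasoning rather than a model of the measured systems.

```latex
\begin{align*}
E &= \bar{P}\,T = \bar{P}\,\frac{D}{\text{DPS}} \\[4pt]
E_{\text{weak}} < E_{\text{strong}}
  &\iff \frac{\bar{P}_{\text{weak}}}{\bar{P}_{\text{strong}}}
      < \frac{\text{DPS}_{\text{weak}}}{\text{DPS}_{\text{strong}}}
\end{align*}
```

In words, the low-power processor saves energy on a job only if its power draw shrinks by a larger factor than its throughput does. For complex applications, where the slowdown is large, the longer running time can cancel the lower power.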

Implications Cont.

Xeon generally offers better processing capacity, but at the cost of high energy consumption, especially for some light scale-out applications.

Atom and Tilera show an energy consumption advantage when dealing with light scale-out applications.

Tilera shows an energy advantage when processing I/O-intensive applications.
