Article in Computer Science – Research and Development, August 2011
DOI: 10.1007/s00450-011-0198-5
Authors: Christian Bischof (Technische Universität Darmstadt), Dieter an Mey and Christian Iwainsky (RWTH Aachen University)


Brainware for Green HPC
ENA-HPC
International Conference on Energy-Aware High Performance Computing
September 07–09, 2011
Hamburg

Christian Bischof
[email protected]
Dieter an Mey, Christian Iwainsky
{anmey,iwainsky}@rz.rwth-aachen.de

Bischof, an Mey, Iwainsky Brainware for Green HPC Slide 1


Overview

- TCO of HPC and Impact of Brainware
- Tuning Complexity
- HECToR dCSE Success Stories
- A Throughput Case Study
- Summary





Total Cost of Ownership for HPC as a Service

Assumptions:
- 2 Mio € hardware investment per year, 5 years lifetime with 4 years of maintenance through the vendor
- Building: 7.5 Mio € depreciated over 25 years
- Power: 850 kW at PUE = 1.5 and 0.14 € per kWh => ~1.5 Mio € per year
- ISV software provided by the users
- Commercial batch system, free Linux distribution
- 12 FTE staff, of which 4 FTE are for "brainware"

Cost item                  | Costs per year | Percentage
Building                   |      300,000 € |   5%
Investment compute servers |    2,000,000 € |  36%
Hardware maintenance       |      800,000 € |  14%
Power                      |    1,564,000 € |  28%
Linux                      |            0 € |   0%
Batch system               |      100,000 € |   2%
ISV software               |            0 € |   0%
HPC software               |       50,000 € |   1%
Staff (12 FTE)             |      720,000 € |  13%
Sum                        |    5,354,000 € | 100%

Note that code performance does not enter this TCO calculation.

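The power figure in the table follows directly from the stated assumptions; a quick sanity check of the arithmetic, with all numbers taken from the slide:

```python
# Sanity check of the TCO power-cost line: 850 kW IT load,
# PUE of 1.5, 0.14 EUR per kWh, running around the clock.
it_load_kw = 850
pue = 1.5
eur_per_kwh = 0.14
hours_per_year = 24 * 365  # 8760

annual_power_cost = it_load_kw * pue * eur_per_kwh * hours_per_year
print(round(annual_power_cost))  # 1563660, i.e. ~1.56 Mio EUR per year
```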


Usage Distribution

[Figure: Accumulated usage of the top accounts — cumulative share of the total system load (0–100%) over account rank 1 to about 200.]



Does it pay to hire HPC Experts? – 1 of 2

- Start tuning the top user projects first:
  - 15 projects account for 50% of the load
  - 64 projects account for 80% of the load
- Assumptions:
  - It takes 2 months to tune one project
  - One analyst can handle 5 projects per year
  - A project profits from tuning for 2 years
- As a consequence, one HPC expert can on average keep 10 projects profiting from tuning at any given time
- One FTE costs 60,000 €



Does it pay to hire HPC Experts? – 2 of 2

[Figure: Return on investment [€] versus the number of tuned projects (10 to 160), shown for savings of 5%, 10%, and 20% performance improvement, with 10 projects handled by one FTE (60,000 €/y). Example break-even point: 7.5 HPC analysts improving the top 75 projects by 10%.]

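The break-even reasoning can be sketched as a toy model. The Zipf-like load distribution below is a hypothetical stand-in (the slides only state that 15 projects carry 50% of the load and 64 carry 80%); the total cost and the FTE figures are the ones from the TCO slide:

```python
# Toy ROI model for hiring HPC analysts: savings from tuning the top
# projects minus analyst salaries. The load distribution is a
# hypothetical assumption; total cost and salary are from the slides.
TOTAL_ANNUAL_COST = 5_354_000   # EUR, sum from the TCO slide
FTE_COST = 60_000               # EUR per analyst per year
PROJECTS_PER_FTE = 10           # projects kept tuned at a time per FTE

def roi(n_projects, improvement, load_share):
    """ROI of tuning the top n_projects by `improvement` (e.g. 0.10).

    load_share[i] is the assumed fraction of total load of project i."""
    savings = improvement * sum(load_share[:n_projects]) * TOTAL_ANNUAL_COST
    analysts = n_projects / PROJECTS_PER_FTE
    return savings - analysts * FTE_COST

# Hypothetical Zipf-like load distribution over 200 projects, normalized.
weights = [1 / (rank + 1) for rank in range(200)]
total = sum(weights)
load_share = [w / total for w in weights]

# 7.5 analysts tuning the top 75 projects by 10%: close to break-even.
print(round(roi(75, 0.10, load_share)))
```

Under this assumed distribution the example from the figure (75 projects, 10% improvement) indeed lands near the break-even point.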


The Impact of Brainware

- Brainware: tuning experts enhancing software performance and the software life cycle in light of changing operating environments.
- Even very moderate improvements in computational efficiency result in considerable savings.
- For example, a rather minuscule improvement of 5% on the top 30 projects "pays" for three HPC specialists.
- If performance is improved by 20%, 0.5 Mio € are saved.
- Energy savings account for a substantial part of the gain thus realized, i.e. brainware is an essential ingredient of green computing.





Opportunities for Tuning without Code Access

- Sanity check
  - Use hardware counters
  - Employ performance analysis tools
  - I/O behavior
  - System call statistics
- Hardware
  - Choose the optimal hardware platform
  - File system, I/O parameters
- Parameterization
  - Choose the optimal number of threads / MPI processes
  - Thread / process placement (NUMA)
  - Mapping the MPI topology to the hardware topology
  - MPI parameterization (buffers, protocols)
  - Optimal libraries (MKL, …)

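Choosing the optimal number of threads can be done purely empirically, without touching the code. A minimal sketch of such a sweep — the command below is a stand-in workload; in practice it would be the user's unmodified binary with a representative input:

```python
# Empirically pick a thread count for an application without code access:
# run the (unmodified) program under different OMP_NUM_THREADS settings
# and keep the fastest. The Python one-liner is a stand-in workload;
# replace it with the real application.
import os
import subprocess
import sys
import time

def time_run(num_threads):
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    start = time.perf_counter()
    subprocess.run(
        [sys.executable, "-c", "sum(i * i for i in range(10**6))"],
        env=env, check=True,
    )
    return time.perf_counter() - start

candidates = [1, 2, 4, 8]
timings = {n: time_run(n) for n in candidates}
best = min(timings, key=timings.get)
print(f"best OMP_NUM_THREADS: {best}")
```

The same harness extends naturally to sweeping MPI process counts, placement settings, or MPI buffer parameters.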


Opportunities for Tuning with Code Access

- Without code changes
  - Choose the optimal compiler and optimal compiler options
  - Autoparallelization, compiler profile / feedback
- Adapt the dataset – partitioning / blocking – load balancing
- Cache tuning
  - Padding, blocking, loop-based optimization techniques, inlining/outlining
- MPI optimization
  - Avoid global synchronization, coalesce communications
  - Hide / reduce communication overhead, use non-blocking communications
- OpenMP optimization
  - Extend parallel regions, avoid false sharing
  - NUMA optimization: first touch, migration
  - In vogue: add OpenMP to an MPI code to improve scalability
- Of course, it is crucial to choose the optimal algorithm
  - To be handled by or with the domain expert

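Cache blocking, mentioned above, can be illustrated with a small sketch. Pure Python will not show the cache effect itself, but the structure of a blocked loop nest — here a blocked matrix transpose — is exactly the one that pays off in C or Fortran:

```python
# Blocked (tiled) matrix transpose: instead of striding through the
# whole matrix at once, work on BLOCK x BLOCK tiles that fit into cache.
# In a compiled language this keeps both the read and the write stream
# cache-resident; the loop structure shown here carries over directly.
BLOCK = 32

def transpose_blocked(a):
    n = len(a)
    m = len(a[0])
    out = [[0] * n for _ in range(m)]
    for ii in range(0, n, BLOCK):
        for jj in range(0, m, BLOCK):
            # Transpose one tile; min() handles the ragged edge tiles.
            for i in range(ii, min(ii + BLOCK, n)):
                for j in range(jj, min(jj + BLOCK, m)):
                    out[j][i] = a[i][j]
    return out

a = [[i * 100 + j for j in range(70)] for i in range(50)]
assert transpose_blocked(a) == [list(row) for row in zip(*a)]
```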


Building Brainware

- The skills just shown are typically not taught to code developers.
- It takes experience and skill to pick the most efficient tuning path and tools on a particular hardware platform.
- As academic computing is typically "free", appreciation for those skills is often lacking.
- As a result, "tuning expert" is a rare career path at academic institutions.
- Unless brainware becomes a standard ingredient in HPC operations (i.e. software is viewed as part of the HPC infrastructure), money is being wasted.





HECToR Computational Science & Engineering Service

- HECToR is the UK supercomputer service (a Cray XE6 system): http://www.hector.ac.uk/cse
- Part of the procurement was a service to make sure that users were supported and trained in making good use of the hardware.
- This bid was won by the Numerical Algorithms Group (NAG).
- Central component (staff at NAG):
  - Advice on using the system, non-invasive tuning, profiling
- Distributed component (staff on-site):
  - A panel reviews proposals for code improvement (software engineering, not implementation of new science).
  - Early grants ran for up to 2 years, currently up to 1 year.
  - Contracts (not grants) are awarded.


HECToR Distributed CSE Service Success Stories

Code          | Domain                        | Effect                            | Effort | Saving
CASTEP        | Key materials science         | 4x speed and 4x scalability       | 8 PMs  | 320k–480k £ (p.a.)
NEMO          | Oceanography                  | Speed and I/O performance         | 6 PMs  | 95k £ (p.a.)
CASINO        | Quantum Monte Carlo           | 4x performance and 4x scalability | 12 PMs | 760k £ (p.a.)
CP2K          | Materials science             | 12% speed and scalability         | 12 PMs | 1,500k £ (in total)
GLOMAP/TOMCAT | Atmospheric chemistry         | 15% performance                   | ?      |
CITCOM        | Geodynamic thermal convection | 30% performance                   | ?      | significant
Incompact3D   | Fluid turbulence              | 6.75x speed and 16x scalability   | 12 PMs |
ChemShell     | Catalytic chemistry           | 8x performance                    | 9 PMs  |
Fluidity-ICOM | Ocean modelling               | Scalability                       | ?      |
DL_POLY_3     | Molecular dynamics            | 20x performance                   | 6 PMs  |
CARP          | Heart modelling               | 20x performance                   | 8 PMs  |

Source: http://www.hector.ac.uk/cse/reports/


Impact of Brainware on Throughput

- XNS code, developed at the Chair for Computational Analysis of Technical Systems at RWTH Aachen University (Prof. M. Behr, www.cats.rwth-aachen.de)
- Parallel finite element (FE) solver
- Satisfactory scalability on up to 4,096 processors on a Blue Gene/L system using MPI parallelization
- Also extensively used in parameter studies involving smaller problems on a few cluster nodes
- In an effort of roughly six weeks, nine parallel regions were introduced into the most compute-intensive program parts.
- Experimental results on a QDR InfiniBand cluster, nodes with two Nehalem EP processors each (3 GHz, 4 cores per processor chip). Serial time ~20 minutes.



XNS: Impact of Hybrid Parallelization

[Figure: Parallel efficiency (best effort) and performance improvement of the hybrid version for 1, 2, 4, 6, 8, 16, 32, 48, and 64 compute nodes. Efficiency declines from 100% on one node to roughly 50% on 8 nodes and about 33% at the largest node counts; the hybrid code improves performance by up to about 39% (8 nodes), while on 48–64 nodes it degrades performance by roughly 14–16%.]

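Parallel efficiency as plotted above relates the serial time to the time on N nodes. A short sketch of the bookkeeping — the per-node timings below are made-up placeholders; only the ~20-minute serial time is from the slides:

```python
# Speedup and parallel efficiency bookkeeping: efficiency on n nodes is
# serial_time / (n * parallel_time). The parallel timings below are
# hypothetical; only the 20-minute serial baseline is from the slides.
SERIAL_TIME = 20 * 60  # seconds, ~20 minutes serial runtime

def efficiency(n_nodes, parallel_time):
    return SERIAL_TIME / (n_nodes * parallel_time)

# Hypothetical measured runtimes (seconds) per node count.
measurements = {1: 1200, 2: 642, 4: 405, 8: 303}

for n, t in measurements.items():
    print(f"{n:2d} nodes: speedup {SERIAL_TIME / t:5.2f}, "
          f"efficiency {efficiency(n, t):6.1%}")
```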


Impact of Brainware on Throughput Computing

- We are interested in the impact of code tuning on configurations where the parallel efficiency is relatively high (i.e. where adding hardware is an economically sensible way to improve code performance).
- If we accept a decline of efficiency down to 50 percent, then the tuning effort delivers an improvement of up to 39 percent on 8 nodes.
- So brainware is as important for capacity computing as it is for capability computing.





Summary

- We need to take a holistic view of cost effectiveness and computing efficiency: it makes more sense to invest in brainware than to buy more inefficiently used "green" hardware.
- Higher investment in brainware pays off.
- HPC experts are a rare species requiring extensive training.
- Current and upcoming architectures require even more expertise (e.g. vector/multicore/distributed/cloud programming paradigms), so the brainware component becomes ever more important.
- HPC funding policies, educational curricula, and career development paths must recognize the need for brainware.



Acknowledgements

 Thanks to
N. Berr, J. Dietter, A. ElShekh, I. Ierotheou, C. Iwainsky, L. Jerabkova,
S. Johnson, A. Gerndt, J.H. Goebbert, T. Haarmann, I. Hörschler, P. Leggett,
D. Schmidl, Z. Peng, H. Pflug, T. Preuß, S. Sarholz, S. Siehoff, A. Spiegel,
A. Wolf


