
Proc. Fourth IDEA Workshop, Magnetic Island, 17-20 May 1997, and Technical Note DHPC-006.

Trends in High Performance Computing

K.A.Hawick
Department of Computer Science, University of Adelaide, South Australia 5005.
email: [email protected]

1. Introduction

Perceptions in the High Performance Computing community of what HPC is really about have changed considerably in the last decade, with the emphasis having moved somewhat away from fast processing towards fast communications and distributed computing. What used to be referred to as "Supercomputing" became "High Performance Computing and Communications (HPCC)" in the USA and "High Performance Computing and Networking (HPCN)" in Europe. The emphasis is now moving towards "Distributed Computing", and parallel computing groups worldwide are re-inventing themselves as distributed computing groups and centres. High performance storage is being recognized as a possible next hurdle, since storage device speeds are no longer in balance with processing and communications speeds. A possible new acronym for the collective field is "Distributed, High Performance, Computing, Communications and Storage (DHPCCS)". It may be interesting to examine this refocusing of direction, to consider some trends in the development of the hardware and software, and in particular to examine the applications areas that will need "DHPCCS" systems.

2. Application Categories

In the distant past (the early and mid 1980s) supercomputer systems were largely driven by the science and engineering applications sectors. Large and expensive supercomputers were needed for applications such as: Computational Fluid Dynamics (CFD); Computational Electromagnetics (CEM); Molecular Dynamics; Quantum Electrodynamics (QED); and Quantum Chromodynamics (QCD). These applications are still there, still requiring many machine cycles, and have been joined by newer fields such as: Ab-Initio Chemistry; Solid State Physics; and Financial and Economic simulations. This group might broadly be termed numerical simulation applications; they depend for their success largely on the fast processing, or floating-point, computational aspects of a supercomputer.

Another group of applications depends for its success on the computer's ability to store and manipulate large quantities of data. Applications such as data mining, seismic and statistical data analysis, market analysis, and healthcare and legal administration all require very large, reliable and fast access to stored data. These might be termed the storage-dominated applications.

A relatively new applications area has drawn much attention in the 1990s. It might broadly be termed Services On-Demand, and consists of applications which deliver information and rely heavily on substantial communications bandwidth between computers on a network. Examples include: online transaction processing; collaboratory systems, especially on the World Wide Web (WWW); text, audio and video services on-demand; and even simulation on-demand. These applications are dominated by communications.

The three areas above clearly rely to some extent on all aspects of computing - compute, storage and communications - but appear to be groupable by their dominant characteristic. Other emerging sectors rely on highly complex combinations of these three characteristics and are in consequence more dependent on specially customised, integrated computer systems.
These systems of systems involve applications like: decision support for individuals, companies and governments; Command, Control and Intelligence (C2I) systems; and agile manufacturing. The categories given above are expanded somewhat in the following list of examples (adapted from the 2nd Pasadena Workshop report):

a) Information Simulation - Compute-Dominated
1 Computational Fluid Dynamics
2 Structural Dynamics
3 Electromagnetic Simulation
4 Scheduling
5 Environmental Modeling
6 Health and Biological Modeling
7 Basic Chemistry
8 Molecular Dynamics
9 Economic and Financial Modeling
10 Network Simulations
11 Particle Flux Transport Simulations
12 Graphics Rendering
13 Integrated Complex Simulations

b) Information Repository - Storage-Dominated
14 Seismic Data Analysis
15 Image Processing
16 Statistical Analysis and Legal Data Inference
17 Healthcare and Insurance Fraud
18 Market Segmentation Analysis

c) Information Access - Communications-Dominated
19 Online Transaction Processing (OLTP)
20 Collaboratory Systems (eg WWW)
21 Text on-Demand
22 Video on-Demand
23 Imagery on-Demand
24 Simulation on-Demand

d) Information Integration - Systems of Systems
25 Command, Control and Intelligence (C2I)
26 Personal Decision Support
27 Corporate Decision Support
28 Government Decision Support
29 Real Time Control Systems
30 Electronic Banking
31 Electronic Shopping
32 Agile Manufacturing
33 Education

3. Applications Performance and Hardware

There appears to be a strong correspondence between the evolution of computing technology and the trends in successful emerging applications areas. The hardware in particular seems to have had a large influence on the success of some applications areas. The simulation applications were stimulated by vector and latterly massively parallel processors, which provided fast (in absolute terms) floating-point performance originally and good price/performance more recently. Data-mining or storage-dominated applications have been driven by the improved performance, and especially price/performance, of persistent storage media such as RAID and tape-robot technology. Communications-dominated applications such as WWW-based collaboratory tools and online services on-demand originally grew steadily with proprietary LAN technology, but appear to be growing exponentially with the widespread adoption of Internet technology for WANs.

For the most part these applications have a static allocation of resources and owe their success to the performance of those compute resources. The interesting and challenging applications of the future are likely to be those that make use of complex combinations of compute, storage and communications, possibly making tradeoffs dynamically depending on available resources. Complex systems of systems, running on distributed computing systems and making dynamic resource allocation choices, will rely much more heavily on software integration than applications that could be optimised for a static hardware system.

4. Trends in DHPCCS

A study of trends in advanced computing hardware reveals a number of major milestones against which it is hard to place exact dates. The following ordering is a personal perception:
Serial Computing Era (IBM mainframes and competitors)
Vector Computers (Cray and imitators)
SIMD Parallelism (AMT DAP, Thinking Machines CM, MasPar)
MIMD Parallelism (Transputers and other proprietary chip combinations)
Workstations (Sun and competitors)
Massively Parallel Processors (MPPs) of various fixed topologies (hypercubes, tori, meshes)
Personal Computers (IBM, Intel and the Microsoft saga)
Emergence of commercial MPPs
Commodity chips gradually take over in MPPs
Networks of Workstations
Large scale shared memory machines fail
Enterprise servers use small scale shared memory technology
SPMD/Data Parallelism (MPI and HPF) become accepted parallel models
WANs become accessible to universities (ATM technology)
Distributed Computing replaces Parallel Computing as the trendy area
Client/Server Computing becomes a widespread software model
Distributed Objects, Java and the Agent/Servlet model become popular

Possibly the most significant legacy of these changes is the "casualties of war" list of architectures of the past. An incomplete list of these, together with a description of the organizations that manufactured them, is given as part of the USA National HPCC Software Exchange. The general trend seems to be away from specialisation in hardware that is highly optimised for one aspect of performance, and towards distributed computing hardware. The software trend is slower, but is hopefully towards better standardised software systems such as MPI and HPF for exploiting parallelism. It is not yet clear what combination of Java and/or CORBA will persist as the software model for distributed computing, or whether some new paradigm will emerge with useable performance efficiency.

5. Software for Distributed, High-Performance Computing

To address the software challenges of applying distributed, high-performance computing, a number of parallel computing centres are re-positioning themselves as distributed computing groups. The Distributed, High-Performance Computing Projects at Adelaide University are described in detail elsewhere, but include an infrastructure research and development project and a series of focussed demonstrator projects covering the areas of geographic information systems and environmental applications. Some of the interesting results that have emerged from this work so far concern: the construction and performance analysis of an ATM WAN; the design of data repositories and processing systems for geographic applications; and the construction of client/server strategies and models for making these services available.
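As a simple illustration of the client/server style of service delivery referred to above, the sketch below shows a toy Java server that answers a one-line metadata query about a named dataset, together with a matching client. This is not the DHPC project code: the port number, request format and dataset name are invented for illustration, and a real service would consult an actual data repository rather than returning a canned reply.

    import java.io.*;
    import java.net.*;

    // Toy repository server: answers one-line metadata queries over a TCP socket.
    public class RepositoryServer {
        public static void main(String[] args) throws IOException {
            ServerSocket listener = new ServerSocket(5005);   // illustrative port only
            System.out.println("Repository server listening on port 5005");
            while (true) {
                Socket client = listener.accept();
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(client.getInputStream()));
                PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                String request = in.readLine();               // e.g. "GET METADATA adelaide-dem"
                // A real service would look the dataset up in a repository here;
                // this sketch simply acknowledges the request.
                out.println("OK " + request);
                client.close();
            }
        }
    }

    // Toy client: sends one request and prints the reply.
    class RepositoryClient {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket("localhost", 5005);
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            out.println("GET METADATA adelaide-dem");
            System.out.println("Server replied: " + in.readLine());
            socket.close();
        }
    }

Even at this trivial scale the pattern makes the point of the preceding sections: the usefulness of the client depends as much on communications performance and on how the repository behind the server is organised as it does on raw processing speed.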
6. Conclusions

In general it appears that the field of High Performance Computing is still changing, with a definite move towards an emphasis on distributed computing, in which parallelism and the other technologies for optimising particular aspects of performance, such as storage and communications, become embedded components of this broader field. In summary: "Parallel" is no longer enough, and all aspects of performance must be recognized. DHPCCS may be an appropriate new acronym to recognize that communications and storage are as important as computation. Integration software for interoperability is a most interesting challenge. The ongoing mission for DHPCCS is to make the system more useable - that is, more flexible, so that performance can be obtained while the software remains maintainable.

7. References

1. "DISCWorld - Distributed Information Systems Cloud of High Performance Computing Resources - Concepts Discussion Document", K.A.Hawick and F.A.Vaughan, DHPC Working Note, December 1996.
2. "2nd Pasadena Workshop, Working Group 2 Report", G.C.Fox, K.A.Hawick and A.B.White, December 1995.
3. "Survey of High Performance Computing Systems in the National High Performance Computing Software Exchange", K.A.Hawick, J.Dongarra and G.C.Fox, 1994.
4. "Java as Persistent Glue", F.A.Vaughan, IDEA Workshop, May 1997.
5. "Geostationary-Satellite Imagery Applications on Distributed, High-Performance Computing", K.A.Hawick, H.A.James, K.J.Maciunas, F.A.Vaughan, A.L.Wendelborn, M.Buchhorn, M.Rezny, S.R.Taylor and M.D.Wilson, Proc. High Performance Computing Asia, May 1997.
6. "Geographic Information Systems Applications on an ATM-Based Distributed High Performance Computing System", K.A.Hawick, H.A.James, K.J.Maciunas, F.A.Vaughan, A.L.Wendelborn, M.Buchhorn, M.Rezny, S.R.Taylor and M.D.Wilson, Proc. High Performance Computing and Networks, Vienna, May 1997.
7. "An ATM-based Distributed High Performance Computing System", K.A.Hawick, H.A.James, K.J.Maciunas, F.A.Vaughan, A.L.Wendelborn, M.Buchhorn, M.Rezny, S.R.Taylor and M.D.Wilson, Proc. High Performance Computing and Networks, Vienna, May 1997.
8. "The Distributed High Performance Computing Projects", a project funded by the Research Data Networks Cooperative Research Centre of the Australian Commonwealth Government, and managed by the Advanced Computational Systems CRC.
9. "Distributed Geographic Information Systems: Project Concepts Discussion Document", DHPC Project Team, November 1996.

8. Acknowledgements

The author acknowledges support provided by the Cooperative Research Centres (CRC) for Research Data Networks and Advanced Computational Systems, established under the Australian Government's CRC Programme.