TY - GEN
T1 - To upgrade or not to upgrade? Catamount vs. Cray Linux Environment
AU - Hammond, S. D.
AU - Mudalige, G. R.
AU - Smith, J. A.
AU - Davis, J. A.
AU - Jarvis, S. A.
AU - Holt, J.
AU - Miller, I.
AU - Herdman, J. A.
AU - Vadgama, A.
PY - 2010
Y1 - 2010
N2 - Modern supercomputers are growing in diversity and complexity: the arrival of technologies such as multi-core processors, general-purpose GPUs and specialised compute accelerators has increased the potential scientific delivery possible from such machines. This is not, however, without cost, including significant increases in the sophistication and complexity of supporting operating systems and software libraries. This paper documents the development and application of methods to assess the potential performance of one hardware, operating system (OS) and software stack combination against another. This is of particular interest to supercomputing centres, which routinely examine prospective software/architecture combinations and possible machine upgrades. A case study is presented that assesses the potential performance of a particle transport code on AWE's Cray XT3 8,000-core supercomputer running images of the Catamount and the Cray Linux Environment (CLE) operating systems. This work demonstrates that by running a number of small benchmarks on a test machine and network, and observing factors such as operating system noise, it is possible to speculate as to the performance impact of upgrading from one operating system to another on the system as a whole. This use of performance modelling represents an inexpensive method of examining the likely behaviour of a large supercomputer before and after an operating system upgrade; this method is also attractive if it is desirable to minimise system downtime while exploring software-system upgrades. The results show that benchmark tests run on fewer than 256 cores would suggest that the impact (overhead) of upgrading the operating system to CLE was less than 10%; model projections suggest that this is not the case at scale.
AB - Modern supercomputers are growing in diversity and complexity: the arrival of technologies such as multi-core processors, general-purpose GPUs and specialised compute accelerators has increased the potential scientific delivery possible from such machines. This is not, however, without cost, including significant increases in the sophistication and complexity of supporting operating systems and software libraries. This paper documents the development and application of methods to assess the potential performance of one hardware, operating system (OS) and software stack combination against another. This is of particular interest to supercomputing centres, which routinely examine prospective software/architecture combinations and possible machine upgrades. A case study is presented that assesses the potential performance of a particle transport code on AWE's Cray XT3 8,000-core supercomputer running images of the Catamount and the Cray Linux Environment (CLE) operating systems. This work demonstrates that by running a number of small benchmarks on a test machine and network, and observing factors such as operating system noise, it is possible to speculate as to the performance impact of upgrading from one operating system to another on the system as a whole. This use of performance modelling represents an inexpensive method of examining the likely behaviour of a large supercomputer before and after an operating system upgrade; this method is also attractive if it is desirable to minimise system downtime while exploring software-system upgrades. The results show that benchmark tests run on fewer than 256 cores would suggest that the impact (overhead) of upgrading the operating system to CLE was less than 10%; model projections suggest that this is not the case at scale.
UR - http://www.scopus.com/inward/record.url?scp=77954067671&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2010.5470885
DO - 10.1109/IPDPSW.2010.5470885
M3 - Conference contribution
AN - SCOPUS:77954067671
SN - 9781424465347
T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, IPDPSW 2010
BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, IPDPSW 2010
T2 - 2010 IEEE International Symposium on Parallel and Distributed Processing, Workshops and PhD Forum, IPDPSW 2010
Y2 - 19 April 2010 through 23 April 2010
ER -