Exploring SIMD for molecular dynamics, using Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors

Simon J. Pennycook, Chris J. Hughes, M. Smelyanskiy, S. A. Jarvis

Research output: Contribution to conference (unpublished), Paper, peer-review

125 Citations (Scopus)

Abstract

We analyse gather-scatter performance bottlenecks in molecular dynamics codes and the challenges that they pose for obtaining benefits from SIMD execution. This analysis informs a number of novel code-level and algorithmic improvements to Sandia's miniMD benchmark, which we demonstrate using three SIMD widths (128-, 256- and 512-bit). The applicability of these optimisations to wider SIMD is discussed, and we show that the conventional approach of exposing more parallelism through redundant computation is not necessarily best. In single precision, our optimised implementation is up to 5x faster than the original scalar code running on Intel® Xeon® processors with 256-bit SIMD, and adding a single Intel® Xeon Phi™ coprocessor provides up to an additional 2x performance increase. These results demonstrate: (i) the importance of effective SIMD utilisation for molecular dynamics codes on current and future hardware, and (ii) the considerable performance increase afforded by the use of Intel® Xeon Phi™ coprocessors for highly parallel workloads.

Original language: English
Pages: 1085-1097
Number of pages: 13
DOIs
Publication status: Published - 2013
Event: 27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 - Boston, MA, United States
Duration: 20 May 2013 - 24 May 2013

Conference

Conference: 27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013
Country/Territory: United States
City: Boston, MA
Period: 20/05/13 - 24/05/13

Keywords

  • accelerator architectures
  • high performance computing
  • parallel programming
  • performance analysis
  • scientific computing

ASJC Scopus subject areas

  • Software
