Cray and Intel Expand Parallelism at Kyoto University

Kyoto University (Kyoto U) is a world-class research and education institution with campuses in Japan and extended schools around the world. Its broad research community relies on supercomputing to contribute to worldwide knowledge across many disciplines, including economics, weather and climate, and genetics. "We support a wide variety of research fields," said Hiroshi Nakashima, a professor at the Academic Center for Computing and Media Studies (ACCMS) at Kyoto University. "Our supercomputers are open to any HPC researcher in Japan, so they are very general-purpose."

Nakashima leads ACCMS' Supercomputing Operation Committee. His department is responsible for the acquisition and operation of Kyoto U's supercomputers. "As a professor, I'm also pursuing various research work on supercomputing, mainly in high-performance programming. Some of my research topics, such as a framework for manycore-aware particle simulations with automatic load balancing, are being pursued in collaboration with our supercomputer users."

In 2015, Kyoto U's aging supercomputing complex was in need of a refresh. The existing systems were based on Intel and other processors, and some were several generations old. The specifications for a new system called for a mix of dual-socket and four-socket x86 nodes "with Intel® Xeon® processor (formerly known as Haswell or Broadwell) class performance," stated Nakashima. The university was looking in particular at the value of increased parallelism gained by exploiting advances in SIMD operations. University users had already been using the MPI and OpenMP programming models to increase parallelism in their codes. "We were well aware of 256- and 512-bit SIMD-vectorized computations," explained Nakashima. "Our belief is that the improvement of an application to exploit the wide SIMD mechanism will be consistent with the past improvements with MPI and OpenMP, because the SIMD mechanism allows us to make gradual improvements without discontinuous programming changes, such as with CUDA or OpenACC," he added. So, for the new system, the university also specified nodes with Intel® Xeon Phi™ processor-class performance.
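To illustrate what a "gradual improvement" toward wide SIMD can look like, here is a minimal sketch in C. It is not taken from any Kyoto U application: the function names axpy_scalar and axpy_simd and the choice of kernel are assumptions made purely for illustration. The second function keeps the loop body of the first unchanged and adds only a restrict qualifier and a standard OpenMP 4.0 directive, which asks the compiler to spread iterations across threads and to vectorize each thread's share.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline (hypothetical kernel): plain scalar loop, y[i] = a * x[i] + y[i]. */
void axpy_scalar(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Same loop body, incrementally annotated.
 * - restrict tells the compiler that x and y do not overlap.
 * - The OpenMP directive requests thread-level and SIMD-level parallelism.
 * A compiler without OpenMP enabled ignores the pragma, so the function
 * still computes the same result serially. */
void axpy_simd(size_t n, double a, const double *restrict x, double *restrict y)
{
    assert(x != NULL && y != NULL);
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

With GCC or Clang, the directive takes effect when the file is compiled with -fopenmp. Whether the loop actually uses 256-bit or 512-bit vector instructions depends on the target architecture options and on the compiler's own decisions, so this should be read as a sketch of the programming style rather than a performance claim.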