# 2020: Modern Techniques and Algorithms in HPC

The 20th anniversary edition of the Geilo Winter Schools in eScience took place in Geilo, Norway. The school started in the afternoon on Sunday January 19, 2020 and ended after lunch on Friday January 24.

## Abstract

Modern multicore and parallel CPU and GPU architectures have made it possible to handle problems that were far out of reach not so long ago. Moreover, there is an increasing interest in quantum computers, which have the potential of opening entirely new avenues of algorithmic research. Exploiting the numerical potential of modern computing platforms requires knowledge and training in high-performance computing (HPC).

The 20th Geilo Winter School in eScience will cover modern trends in HPC. This includes abstractions in HPC codes, GPU computing with CUDA, parallel computing, and applications. We will also learn about automatic differentiation techniques, which are integral in applications such as deep learning and many branches of numerical simulation. In addition, we will get the opportunity to learn more about quantum computing and its practical applications.

As in previous years, the school will consist of lectures and tutorials including interactivity, hands-on examples and computer exercises. There will also be a poster session where we strongly encourage participants to present their own research. The overall goal is that participants will leave the school with a set of new tools in their toolbox, and new connections to collaborate with in the future.

## Program

**Andreas Kloeckner**

Software for large-scale simulation problems is stretched between three key requirements: high-performance, typically parallel implementation, optimal algorithms, and often highly technical application domains. This tension contributes considerably to making HPC software difficult to write and hard to maintain. If you are faced with this problem, this class can help you find and develop possible solution approaches.

We will focus on GPUs, an important target architecture for such programs and one that poses unique challenges to programmers. Why do they look the way they do? What design pressures influence their behavior? How do we talk to them? What abstractions exist to make them easier to program? Focusing on OpenCL (and complemented by accompanying lectures by Johannes Langguth on the very similar CUDA), we will learn how to build high-performance programs in the high-level programming language Python, and then how to build tools that make writing such programs easier.

- Why GPUs? GPU programming model, basic abstraction (grids/work groups), basic PyOpenCL, n-dimensional arrays, GPU parallel primitives (map, reduce, scan)
- Intra-core data exchange and synchronization, inter-kernel/host synchronization, GPU performance, warp/wavefront scheduling, issue slots, concurrency vs latency (occupancy, register file)
- OpenCL abstract machine model, mapping to CPUs, ISPC, code transformation, abstraction building
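The three parallel primitives named in the list above (map, reduce, scan) can be illustrated with a minimal sequential sketch. This is plain Python that only mimics the data-access pattern a GPU implementation might use; it is not PyOpenCL code and not taken from the lecture material. The function names `gpu_style_map`, `tree_reduce` and `inclusive_scan` are hypothetical, and the scan follows the well-known Hillis-Steele pattern as one possible choice.

```python
from itertools import accumulate
import operator


def gpu_style_map(f, xs):
    # Map: each output depends on exactly one input element, so every
    # "work item" could run independently of all the others.
    return [f(x) for x in xs]


def tree_reduce(op, xs):
    # Reduce: combine neighbouring pairs level by level, the way a work
    # group might do in local memory. Assumes `op` is associative and
    # `xs` is non-empty.
    level = list(xs)
    while len(level) > 1:
        paired = [op(level[i], level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired last element is carried to the next level.
            paired.append(level[-1])
        level = paired
    return level[0]


def inclusive_scan(op, xs):
    # Scan (Hillis-Steele pattern): about log2(n) passes; in each pass
    # element i is combined with the element `stride` places to its left.
    out = list(xs)
    stride = 1
    while stride < len(out):
        out = [out[i] if i < stride else op(out[i - stride], out[i])
               for i in range(len(out))]
        stride *= 2
    return out


data = [3, 1, 4, 1, 5, 9, 2, 6]
squares = gpu_style_map(lambda x: x * x, data)
total = tree_reduce(operator.add, data)
prefix = inclusive_scan(operator.add, data)

# The parallel-style scan agrees with the sequential library version.
assert prefix == list(accumulate(data))
```

On a real device each list comprehension would correspond to one parallel step over all work items; the loops around them are the only inherently sequential part.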

**Barak Pearlmutter**

**Theory:** Fundamental ideas of AD: the Jacobian as a generalized sparse matrix; forward and reverse accumulation modes; compositionality; nonstandard interpretation of computer programs; numeric basis; other modes (D*, higher-order forward mode, cross-country elimination, checkpoint reverse); implementation techniques (overloading, taping, source-to-source transformation); tradeoffs; nesting; existing systems.

**Practice:** We will use a production-quality system to write some code that uses AD. The focus will be on small datasets and toy examples of machine learning methods that are easier with AD: not just backpropagation, but end-to-end optimization, models including physical simulations, differential equations, and multilevel models like hyperparameter optimization and learning-to-learn.

**Frontiers and Gotchas:** Frontiers of AD: formalization and proofs; theory that hasn't made its way into practice yet (like "while loops"); things that are not really automatic but should be; numeric issues introduced by AD; performance issues; gotchas with current systems; limitations of current methods and systems and ideas.
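To make the idea of forward accumulation via operator overloading concrete, here is a minimal sketch using dual numbers. It is an illustration only, not the production-quality system used in the course; the class `Dual` and the helpers `sin`, `derivative` and `f` are hypothetical names, and only addition, multiplication and sine are supported.

```python
import math


class Dual:
    """A value paired with its derivative (tangent) w.r.t. one input."""

    def __init__(self, value, tangent=0.0):
        self.value = value
        self.tangent = tangent

    @staticmethod
    def lift(x):
        # Plain numbers are constants, so their derivative is zero.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value,
                    self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.tangent * other.value + self.value * other.tangent)

    __rmul__ = __mul__


def sin(x):
    # Chain rule for an elementary function: (sin u)' = cos(u) * u'
    x = Dual.lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.tangent)


def derivative(f, x):
    # Seed the input tangent with 1 and read off the output tangent.
    return f(Dual(x, 1.0)).tangent


def f(x):
    return x * x * x + 2 * x + sin(x)


slope = derivative(f, 1.5)
exact = 3 * 1.5 ** 2 + 2 + math.cos(1.5)  # hand-derived f'(1.5)
```

The function `f` is ordinary Python; the derivative comes from running it on `Dual` inputs, which is the "nonstandard interpretation" of a program in miniature. Reverse accumulation would instead record the operations and propagate sensitivities backwards from the output.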

**Franz Fuchs**

Quantum computers have the potential to outperform classical computers on certain classes of problems and solve problems that are intractable even on any future (classical) supercomputer. The development of chips for quantum computers has followed Moore’s law in recent years and the technology is edging ever closer to commercialization. Most of today’s small-scale quantum computers are accessible through cloud interfaces. The objective of these lectures is to give an introduction to quantum computing including hands-on experience. During the week, the lectures will cover the following:

- Session I: I will give an overview of the field of quantum computing and an introduction to the fundamental (mathematical) concepts necessary to understand quantum computing.
- Session II: The first half will be a practical coding session covering the fundamental concepts of quantum computing. In the second half I will introduce some basic quantum algorithms.
- Session III: The first half consists again of a practical coding session. We will implement some of the quantum algorithms from Session II and execute them on simulators. The second half will describe methods to mitigate the inherent errors of noisy small- and intermediate-scale quantum computers.
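As a flavour of what executing a circuit on a simulator involves, the following minimal sketch prepares a two-qubit Bell state with a hand-written state-vector simulation. It assumes nothing about the framework used in the sessions; `apply_gate`, `apply_cnot` and `probabilities` are hypothetical helper names, and qubit `k` is taken to be bit `k` of the basis-state index.

```python
import math

# Hadamard gate as a 2x2 matrix.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def apply_gate(state, gate, target):
    # Apply a 2x2 gate to qubit `target`: amplitudes are updated in pairs
    # of basis states that differ only in the target bit.
    new_state = list(state)
    bit = 1 << target
    for i in range(len(state)):
        if (i & bit) == 0:
            j = i | bit
            a, b = state[i], state[j]
            new_state[i] = gate[0][0] * a + gate[0][1] * b
            new_state[j] = gate[1][0] * a + gate[1][1] * b
    return new_state


def apply_cnot(state, control, target):
    # Where the control bit is 1, swap the two amplitudes that differ
    # only in the target bit (that is, flip the target qubit).
    new_state = list(state)
    cbit, tbit = 1 << control, 1 << target
    for i in range(len(state)):
        if (i & cbit) != 0 and (i & tbit) == 0:
            j = i | tbit
            new_state[i], new_state[j] = state[j], state[i]
    return new_state


def probabilities(state):
    # Born rule: measurement probability is the squared amplitude modulus.
    return [abs(a) ** 2 for a in state]


state = [1.0, 0.0, 0.0, 0.0]                    # start in |00>
state = apply_gate(state, H, target=0)          # superposition on qubit 0
state = apply_cnot(state, control=0, target=1)  # entangle the two qubits
probs = probabilities(state)
```

After these two gates only the basis states 00 and 11 carry probability, each about one half, which is the signature of the Bell state. Real simulators use the same linear algebra but with optimized array libraries and noise models.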

**Susanne Kunkel**

NEST is a simulator for large-scale spiking neuronal networks with a history dating back to the late 1990s. The definition of what is considered a large-scale network has changed since, as ever more powerful HPC facilities have become available.

Contemporary supercomputers even provide the resources to represent brain-scale spiking neuronal networks. However, in order to enable neuronal simulators to exploit such resources, simulation code had to undergo fundamental design changes. Today, NEST works efficiently for a broad variety of models and on various platforms, from laptops to supercomputers. In the first lecture, I will give an introduction to the NEST simulator and the development infrastructure around it. In the second lecture, I will present the technological advances of recent years that have made NEST such an extremely scalable simulation tool.

**Johannes Langguth**

In recent years, the world of high-performance computing has become more and more heterogeneous, with GPUs replacing CPUs as the main driver of computational performance.

We will be looking at GPU programming as a central building block for larger applications that make use of today’s supercomputers. What are the low-level features that determine performance, and how can we exploit them within a CUDA program? How can we divide work between multiple GPUs at the node and system level? How can we use today’s middleware to simplify these tasks? And what future developments should we be prepared for?

- More on GPUs. Using CUDA for high performance computing. How does the GPU architecture affect programming and performance?
- Irregular and heterogeneous computations. How can we make CPU and GPU work together?
- Multi-GPU programming. How to exchange data between the GPUs?
- Multi-node programming. Using MPI to communicate within supercomputers.
- Advanced data exchange: PGAS and CUDA-aware MPI.
- Beyond GPUs: manycore, IPUs, and the future of high-performance computing.

## Lecturers

Andreas Klöckner is an associate professor in the scientific computing group within the Department of Computer Science at the University of Illinois at Urbana-Champaign. He works on high-order numerical methods for the simulation of wave problems as well as issues in high-performance scientific computing that relate to bringing these methods to life on large-scale parallel computers. In support of his research, Dr. Klöckner has released numerous scientific software packages. He obtained his PhD degree at the Division of Applied Mathematics at Brown University with Jan Hesthaven, working on large-scale finite element simulations of wave problems in the time domain. From Brown, Klöckner moved to the Courant Institute of Mathematical Sciences at New York University as a Courant Instructor, where he worked on integral equation methods and fast algorithms within Leslie Greengard’s group.

Prof Barak Pearlmutter received a BS in Mathematics from CWRU, a PhD in Computer Science from Carnegie Mellon University (where he worked on neural networks the second time they were cool) and postdoctoral training in Neuroscience at Yale University. He is currently in the Department of Computer Science at Maynooth University, in Ireland. His main current research interests are two-fold: understanding information processing in the brain, and figuring out how to build artificial systems that exhibit brain-like performance. The focus of the former is currently on exploring criticality in the brain, while the latter is on building mathematical formalizations and programming languages that support the construction of complex adaptive systems. To that end, he has been collaborating with Prof Jeffrey Mark Siskind on building theoretical frameworks and prototype implementations of systems that support more general, more robust, and more performant first-class automatic differentiation.

Franz G. Fuchs is a research scientist at SINTEF, currently working in the field of quantum computing with a particular focus on practical applications on noisy intermediate-scale quantum (NISQ) devices. He completed his master’s degree in mathematics at the Technical University of Munich in 2006 and earned his PhD in applied mathematics from the University of Oslo in 2009.

Susanne Kunkel works as a postdoc at the Norwegian University of Life Sciences (NMBU). She received a Diploma in Bioinformatics from the University of Jena, Germany and then started her doctoral research in Neuroinformatics at the Bernstein Center Freiburg, Germany. She received her doctoral degree (Dr. rer. nat.) from the University of Freiburg in 2015. Before moving to Norway, she worked at the Jülich Research Centre, Germany and at KTH Stockholm, Sweden. Susanne is one of the core developers of NEST, a simulator for large-scale spiking neuronal networks.

Johannes Langguth is a research scientist at Simula Research Laboratory, Oslo, Norway. He received his PhD in computer science from the University of Bergen in 2011, followed by a postdoctoral appointment at ENS Lyon, France. His research interests include computer architecture, parallel algorithms, computational social science, combinatorial algorithms, and high-performance scientific computing on multi-core CPUs and GPUs.

## Schedule, lecture notes and materials

Lecture notes:

- Fuchs: Slides, Code
- Kloeckner: Slides, Code
- Kunkel: Slides 1, Slides 2, Code
- Langguth: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6

Posters:

- Håvard H. Holm, André R. Brodtkorb and Martin L. Sætra: Drift Trajectory Predictions using Massive Ensembles of Simplified Ocean Models
- Benjamin T. Speake, Andrew Johnson, Tom J. P. Irons, Grègoire David, Meilani Wibowo and Andrew M. Teale: An Embedded Fragment Approach to Large Molecular Clusters in Strong Magnetic Fields
- Ivar T. Hovden, Ingrid Digernes, Oliver M. Geier, Grethe Løvland, Einar Vik-Mo, Torstein R. Meling and Kyrre E. Emblem: The Impact of EPI-based Distortion Correction of Dynamic Susceptibility Contrast MRI in Patients with Glioblastoma
- M. Borregales, S. Krogstad and K.-A. Lie: A very simple data-driven model based on flow diagnostics for reservoir management
- Magnus Ulimoen: Estimation of Source Parameters of Radionuclides applied to the ¹⁰⁶Ru case
- Rebecca Robinson, Mats Carlsson and Carlos Quintero Noda: The WHOLE SUN Project: Understanding the Energy Budget of the Solar Atmosphere
- Sangita Sen and Erik Tellgren: Electron Correlation, Excitation and Magnetic Fields

## Important information

See the About page for general information about the winter school.

## Applying for the winter school

To apply, you only need to register on the __registration page__.

## Cost of participating

There is no registration fee for the winter school, but participants must cover their own travel costs and hotel costs at Dr. Holms.

## Room allocation

The winter school has a limited number of rooms at Dr. Holms, which will be reserved on a first-come, first-served basis. **In previous years we have exceeded our room allocation, so please register as early as possible**!

## Posters

The winter school welcomes poster presentations from all participants. The poster session is an informal event whose aim is to make new contacts and share your research. Please indicate in your registration if you want to present a poster. Please limit your poster to A0 in portrait orientation.

## Organizing Committee

The organizing committee for the Geilo Winter School consists of

- Torkel Andreas Haufmann, Research Scientist (Department of Mathematics and Cybernetics, SINTEF).
- Øystein Klemetsdal, Research Scientist (Department of Mathematics and Cybernetics, SINTEF).
- Signe Riemer-Sørensen, Research Scientist (Department of Mathematics and Cybernetics, SINTEF).
- André R. Brodtkorb, Scientist (Division for Climate Modelling and Air Pollution, Norwegian Meteorological Institute), Associate Professor (Department of Computer Science, Oslo Metropolitan University).

To get in touch, please contact Torkel at .