May

03

3105 Engineering Building and Zoom

Doctoral Defense - Abdullah Alperen

About the Event

The Department of Computer Science & Engineering

Michigan State University

Ph.D. Dissertation Defense

May 3, 2024 at 10:00 AM EDT

3105 Engineering Building and Zoom
Contact Department or Advisor for Zoom Information

Accelerating Sparse Eigensolvers Through Asynchrony, Hybrid Algorithms and Heterogeneous Architectures

By: Abdullah Alperen

Advisor: Dr. Hasan Metin Aktulga

Sparse matrix computations form the core of a broad range of scientific applications, in fields ranging from molecular dynamics and nuclear physics to data mining and signal processing. Among sparse matrix computations, the eigenvalue problem holds a significant place because of its widespread use in high-performance scientific computing. In nuclear physics simulations, for example, one of the most challenging problems is solving the large-scale eigenvalue problems that arise from nuclear structure calculations. Numerous iterative algorithms have been developed over the years to solve this problem.

Lanczos and locally optimal block preconditioned conjugate gradient (LOBPCG) are two such popular iterative eigensolvers. Together, they present a good mix of the computational motifs encountered in sparse solvers. In this work, we describe our efforts to accelerate large-scale sparse eigensolvers by employing asynchronous runtime systems, developing hybrid algorithms, and utilizing GPU resources.
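
For readers unfamiliar with the method, the following is a minimal pure-Python sketch of the textbook Lanczos iteration with full reorthogonalization. It is not the implementation from the dissertation; the function names (`lanczos`, `dot`) and the small diagonal test matrix are assumptions chosen only for illustration. The iteration reduces a symmetric operator to a small tridiagonal matrix, with diagonal entries `alphas` and off-diagonal entries `betas`, whose eigenvalues approximate those of the original matrix.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def lanczos(matvec, v0, m):
    """Run up to m Lanczos steps on a symmetric operator.

    matvec: function computing A @ v for a list v (illustrative interface).
    Returns (alphas, betas, Q): tridiagonal entries and the Lanczos vectors.
    """
    norm0 = math.sqrt(dot(v0, v0))
    Q = [[x / norm0 for x in v0]]
    alphas, betas = [], []
    for j in range(m):
        w = matvec(Q[j])
        alphas.append(dot(w, Q[j]))
        # Full reorthogonalization (modified Gram-Schmidt) against every
        # previous Lanczos vector; this subsumes the three-term recurrence.
        for q in Q:
            c = dot(w, q)
            w = [wk - c * qk for wk, qk in zip(w, q)]
        beta = math.sqrt(dot(w, w))
        if j == m - 1 or beta < 1e-12:
            break  # requested steps done, or an invariant subspace was found
        betas.append(beta)
        Q.append([wk / beta for wk in w])
    return alphas, betas, Q

# Hypothetical test operator: A = diag(1, 2, 3), applied without storing A.
diag = [1.0, 2.0, 3.0]
matvec = lambda v: [d * x for d, x in zip(diag, v)]
alphas, betas, Q = lanczos(matvec, [1.0, 1.0, 1.0], 3)
```

The only operation that touches the matrix is `matvec`, which is why the sparse matrix-vector product dominates the cost of Lanczos on large problems.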

We first evaluate three task-parallel programming models, OpenMP, HPX and Regent, for Lanczos and LOBPCG. We demonstrate the merit of these asynchronous frameworks on two architectures, Intel Broadwell (a multicore processor) and AMD EPYC (a modern manycore processor). We achieve up to an order-of-magnitude improvement in both execution time and cache performance.

We then examine and compare several iterative methods for solving large-scale eigenvalue problems arising from nuclear structure calculations. In particular, besides Lanczos and LOBPCG, we discuss the possibility of using the block Lanczos method and the residual minimization method accelerated by direct inversion of the iterative subspace (RMM-DIIS). We show that RMM-DIIS can be effectively combined with either block Lanczos or LOBPCG to yield a hybrid eigensolver that has several desirable properties.
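
To give a rough idea of what a residual-minimization refinement stage does, the sketch below takes a rough eigenvector estimate (standing in for the output of a first-stage solver such as block Lanczos or LOBPCG) and repeatedly minimizes the residual norm along the current residual direction. This is a deliberately simplified illustration and not RMM-DIIS itself: it keeps no DIIS history of previous iterates and uses no preconditioner. The names `refine_eigenpair` and `rayleigh_residual`, and the test matrix, are assumptions made for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def rayleigh_residual(matvec, x):
    """Return (theta, r, ||r||) for a unit vector x, with r = A x - theta x."""
    ax = matvec(x)
    theta = dot(x, ax)  # Rayleigh quotient
    r = [a - theta * xi for a, xi in zip(ax, x)]
    return theta, r, math.sqrt(dot(r, r))

def refine_eigenpair(matvec, x0, max_steps=100, tol=1e-10):
    """Simplified residual minimization (no DIIS history, no preconditioner)."""
    x = normalize(x0)
    theta, r, res = rayleigh_residual(matvec, x)
    for _ in range(max_steps):
        if res < tol:
            break
        # z = (A - theta I) r; choose tau minimizing ||r + tau * z||.
        ar = matvec(r)
        z = [a - theta * ri for a, ri in zip(ar, r)]
        zz = dot(z, z)
        if zz == 0.0:
            break
        tau = -dot(r, z) / zz
        x = normalize([xi + tau * ri for xi, ri in zip(x, r)])
        theta, r, res = rayleigh_residual(matvec, x)
    return theta, x, res

# Hypothetical setup: A = diag(1, 2, 4) and a rough estimate near the first
# eigenvector, as a first-stage solver might provide.
diag = [1.0, 2.0, 4.0]
matvec = lambda v: [d * x for d, x in zip(diag, v)]
theta, x, res = refine_eigenpair(matvec, [1.0, 0.1, 0.1])
```

Because each eigenpair is refined independently from its own starting vector, a stage of this kind converges toward the eigenpair nearest its starting estimate, which is why it benefits from a good initial guess supplied by another solver.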

Finally, we demonstrate the challenges that the emergence of accelerator-based computer architectures poses for achieving high performance in large-scale sparse computations. We focus in particular on the scalability of the sparse matrix-vector multiplication (SpMV) and sparse matrix multi-vector multiplication (SpMM) kernels of Lanczos and LOBPCG. We scale their performance up to hundreds of GPUs by improving their computation and communication aspects through hand-optimized CUDA kernels and hybrid communication methods.
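
As a reference for what the two kernels compute, here is a minimal serial sketch of SpMV and SpMM over a matrix stored in compressed sparse row (CSR) format. It is a plain Python illustration, not the hand-optimized CUDA kernels described above; the CSR layout, the function names `spmv_csr` and `spmm_csr`, and the 3x3 example matrix are assumptions made for this example.

```python
def spmv_csr(indptr, indices, data, x):
    """y = A @ x for a CSR matrix A and a dense vector x (list of floats)."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc
    return y

def spmm_csr(indptr, indices, data, X):
    """Y = A @ X for a CSR matrix A and a dense block X given as a list of rows."""
    n_rows = len(indptr) - 1
    b = len(X[0])  # number of vectors in the block
    Y = [[0.0] * b for _ in range(n_rows)]
    for i in range(n_rows):
        row = Y[i]
        for k in range(indptr[i], indptr[i + 1]):
            a = data[k]
            x_row = X[indices[k]]
            # Each nonzero of A is loaded once and reused for all b columns.
            for j in range(b):
                row[j] += a * x_row[j]
    return Y

# Hypothetical example matrix:
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [2.0, 1.0, 3.0, 4.0, 5.0]

y = spmv_csr(indptr, indices, data, [1.0, 2.0, 3.0])
Y = spmm_csr(indptr, indices, data, [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]])
```

Lanczos applies the matrix to one vector per iteration (SpMV), while LOBPCG applies it to a block of vectors (SpMM). In SpMM each matrix nonzero is reused across the whole block, which generally gives it a higher ratio of arithmetic to memory traffic than SpMV.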

Tags

Doctoral Defenses

Date

Friday, May 03, 2024

Time

10:00 AM

Location

3105 Engineering Building and Zoom

Organizer

Abdullah Alperen