iPuma: High-Performance Sequence Alignment on the Graphcore IPU

Abstract

String alignment algorithms are an essential tool for understanding DNA and protein sequences. They demand substantial computation in real-world applications and are thus a prime target for hardware acceleration. However, GPUs struggle to provide sufficient acceleration. Meanwhile, recent MIMD-capable AI accelerators such as the Graphcore Intelligence Processing Unit (IPU) have become technologically viable. In this paper we present iPuma, a new implementation of Smith-Waterman sequence alignment for the IPU, which offers generalized short- and medium-length, one-to-one, and many-to-many high-throughput alignments for both DNA and protein sequences. iPuma is integrated into two bioinformatics pipelines, MetaHipMer2 and PASTIS. On protein datasets, iPuma shows speedups of 2.7x and 1.6x over state-of-the-art GPU and CPU implementations, respectively. We test the scalability on up to 64 IPUs, attaining a peak scoring performance of 1763 GCUPS for protein and 1168 GCUPS for DNA sequences.

Category

Academic chapter

Language

English

Author(s)

Affiliation

  • SINTEF Digital / Sustainable Communication Technologies
  • Charité - Universitätsmedizin Berlin
  • University of Bergen
  • University of Oslo
  • Simula Research Laboratory

Year

2024

Publisher

IEEE (Institute of Electrical and Electronics Engineers)

Book

ISC High Performance 2024 Research Paper Proceedings (39th International Conference)

ISBN

9783982633602