RawBench: A Comprehensive Benchmarking Framework for Raw Nanopore Signal Analysis Techniques

Abstract

Nanopore sequencing technologies continue to advance rapidly, offering critical benefits such as real-time analysis, the ability to sequence extremely long DNA fragments (up to millions of bases in a single read), and the option to selectively stop sequencing a molecule before completion. Traditionally, the raw electrical signals generated during sequencing are converted into DNA sequences through a process called basecalling, which typically relies on large neural network models. While accurate, these models are computationally intensive and often require high-end GPUs to process the vast volume of raw signal data. This presents a significant challenge for real-time processing, particularly on edge devices with limited computational resources, and ultimately restricts the scalability and deployment of nanopore sequencing in resource-constrained settings. Raw signal analysis, which operates on the electrical signal directly rather than on basecalled sequences, has emerged as a promising alternative to these resource-intensive approaches. While attempts have been made to benchmark conventional basecalling methods, existing evaluation frameworks 1) overlook raw signal analysis techniques, 2) lack the flexibility to accommodate new raw signal analysis tools easily, and 3) fail to include the latest improvements in nanopore datasets. Our goal is to provide an extensible benchmarking framework that enables designing and comparing new methods for raw signal analysis. To this end, we introduce RawBench, the first flexible framework for evaluating raw nanopore signal analysis techniques. RawBench provides modular evaluation of three core pipeline components: 1) reference genome encoding (using different pore models), 2) signal encoding (through various segmentation methods), and 3) representation matching (via different data structures). We extensively evaluate raw signal analysis techniques in terms of 1) quality and performance for read mapping, 2) quality and performance for read classification, and 3) quality of raw signal analysis-assisted basecalling.
Our evaluations show that raw signal analysis can achieve competitive quality while significantly reducing resource requirements, particularly in settings where real-time processing or edge deployment is necessary.
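
To make the three pipeline components named in the abstract concrete, the following is a minimal toy sketch in Python. It is not RawBench's implementation or API: every function name, the 3-mer "pore model" (`kmer_level`), the jump-threshold segmentation, and the hash-based seed voting are assumptions chosen purely for illustration, and real pore models, segmentation methods, and matching data structures are considerably more sophisticated.

```python
import random
from collections import Counter, defaultdict

K = 3  # toy k-mer length (assumption; real pore models use longer k-mers)
W = 4  # number of consecutive events per seed (assumption)
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}

def kmer_level(kmer):
    """Hypothetical pore model: one distinct integer current level per k-mer."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + BASE[base]
    return 60 + idx

def encode_sequence(seq, k=K):
    """Component 1: encode a sequence as the expected level of each k-mer."""
    return [kmer_level(seq[i:i + k]) for i in range(len(seq) - k + 1)]

def segment(signal, threshold=0.5):
    """Component 2: start a new event when a sample jumps away from the
    running mean of the current event; return the mean of each event."""
    events = []
    total, count = signal[0], 1
    for sample in signal[1:]:
        if abs(sample - total / count) > threshold:
            events.append(total / count)
            total, count = sample, 1
        else:
            total += sample
            count += 1
    events.append(total / count)
    return events

def build_index(ref_levels, w=W):
    """Component 3 (indexing): hash every window of w levels to its positions."""
    index = defaultdict(list)
    for pos in range(len(ref_levels) - w + 1):
        index[tuple(ref_levels[pos:pos + w])].append(pos)
    return index

def map_read(signal, index, w=W):
    """Component 3 (matching): quantize events, look up seeds, and vote for
    the reference start position. Returns (start, votes) or None."""
    events = [round(e) for e in segment(signal)]
    votes = Counter()
    for j in range(len(events) - w + 1):
        for pos in index.get(tuple(events[j:j + w]), []):
            votes[pos - j] += 1
    return votes.most_common(1)[0] if votes else None

def simulate_signal(seq, seed=1):
    """Toy simulator: each k-mer emits 5 to 10 samples with bounded noise."""
    rng = random.Random(seed)
    signal = []
    for level in encode_sequence(seq):
        dwell = rng.randint(5, 10)
        signal.extend(level + rng.uniform(-0.1, 0.1) for _ in range(dwell))
    return signal

reference = "ACGTACGGTCATGCATTGCAGTCCATGA"  # made-up toy reference
index = build_index(encode_sequence(reference))
read_signal = simulate_signal(reference[8:20])  # read drawn from position 8
print(map_read(read_signal, index))  # (8, 7): start position 8, 7 seed votes
```

The sketch works only because the toy noise is bounded well below the spacing between levels, so segmentation and quantization recover every k-mer exactly. Real signals violate this assumption (noisy levels, variable dwell times, homopolymer runs that produce no level change), which is why the choice of pore model, segmentation method, and matching data structure is worth benchmarking component by component.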

CCS Concepts

• Computing methodologies → Bioinformatics; Evaluation methodologies; • Applied computing → Computational genomics.

ACM Reference Format

Furkan Eris, Ulysse McConnell, Can Firtina, and Onur Mutlu. 2025. RawBench: A Comprehensive Benchmarking Framework for Raw Nanopore Signal Analysis Techniques. In Proceedings of the 16th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '25), October 11–15, 2025, Philadelphia, PA, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3765612.3767302
