Accurate genotyping of three major respiratory bacterial pathogens with ONT R10.4.1 long-read sequencing
- Nora Zidane¹
- Carla Rodrigues¹,²
- Valérie Bouchez¹,²
- Martin Rethoret-Pasty¹
- Virginie Passet¹,³
- Sylvain Brisse¹,²,³
- Chiara Crestani¹
- ¹Institut Pasteur, Université Paris Cité, Biodiversity and Epidemiology of Bacterial Pathogens, 75015 Paris, France
- ²Institut Pasteur, National Reference Center for Whooping Cough and Other Bordetella Infections, 75015 Paris, France
- ³Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Species Complex, 75015 Paris, France
Abstract
High-throughput massively parallel sequencing has significantly improved bacterial pathogen genomics, diagnostics, and epidemiology. Despite its high accuracy, short-read sequencing struggles with complete genome reconstruction and with the assembly of extrachromosomal elements such as plasmids. Long-read sequencing with Oxford Nanopore Technologies (ONT) presents an alternative that offers benefits including real-time sequencing and cost efficiency, which are particularly useful in resource-limited settings. However, the historically higher error rates of ONT data have so far limited its application in high-precision genomic typing. The recent release of ONT's R10.4.1 chemistry, with significantly improved raw read accuracy (Q20+), offers a potential solution to this problem. The aim of this study is to evaluate the performance of ONT's latest chemistry for bacterial genomic typing against the gold-standard Illumina technology, focusing on three respiratory pathogens of public health importance, Klebsiella pneumoniae, Bordetella pertussis, and Corynebacterium diphtheriae, and their related species. Using the Rapid Barcoding Kit V14, we generate and analyze genome assemblies with different basecalling models and at different simulated depths of coverage. ONT assemblies are compared to the Illumina reference for completeness and for core genome multilocus sequence typing (cgMLST) accuracy, measured as the number of allelic mismatches. Our results show that accuracy is optimized for all bacterial species tested when raw ONT data are basecalled with Dorado SUP v0.9.0 and assembled with Flye, with a minimum coverage depth of 35×. Error rates are consistently <0.5% for each cgMLST scheme, indicating that ONT R10.4.1 data are suitable for high-resolution genomic typing applied to outbreak investigations and public health surveillance.
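The cgMLST comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that each assembly has already been reduced to a cgMLST profile (a mapping from locus name to allele identifier), that loci without a call in either profile are skipped, and that the error rate is the share of compared loci whose alleles differ. The abstract does not spell out these details, and the function names and locus names below are hypothetical.

```python
from typing import Dict, Optional, Tuple

# A cgMLST profile: locus name -> allele identifier (None if the locus was not called).
Profile = Dict[str, Optional[str]]


def cgmlst_mismatches(reference: Profile, query: Profile) -> Tuple[int, int]:
    """Count allelic mismatches between a reference and a query profile.

    Returns (mismatches, compared_loci). Loci missing or uncalled in either
    profile are skipped; this handling is an assumption of the sketch.
    """
    mismatches = 0
    compared = 0
    for locus, ref_allele in reference.items():
        query_allele = query.get(locus)
        if ref_allele is None or query_allele is None:
            continue  # no call on one side: not counted as a mismatch here
        compared += 1
        if ref_allele != query_allele:
            mismatches += 1
    return mismatches, compared


def error_rate_percent(mismatches: int, compared: int) -> float:
    """Express mismatches as a percentage of the loci that were compared."""
    if compared == 0:
        raise ValueError("no loci were compared")
    return 100.0 * mismatches / compared


if __name__ == "__main__":
    # Hypothetical toy profiles; real cgMLST schemes contain far more loci.
    illumina = {"locus_1": "1", "locus_2": "5", "locus_3": "12", "locus_4": None}
    ont = {"locus_1": "1", "locus_2": "7", "locus_3": "12", "locus_4": "3"}

    n_mismatch, n_compared = cgmlst_mismatches(illumina, ont)
    print(n_mismatch, n_compared, error_rate_percent(n_mismatch, n_compared))
```

Under these assumptions, an ONT assembly would fall within the <0.5% range reported in the abstract when `error_rate_percent` returns a value below 0.5 for the scheme in question.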
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at https://www.genome.org/cgi/doi/10.1101/gr.279829.124.
- Received October 3, 2024.
- Accepted May 28, 2025.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.