Bioinformatics Seminar
Time: 11AM
Venue: Davis Auditorium and Online
26 March 2024
Using nanopore-sequenced 16S and 23S rRNA genes for identification and quantification of bacterial microbiota
Chris Woodruff, WEHI Bioinformatics
The feasibility of achieving near-strain taxonomic resolution and reliable strain-level microbiota abundance estimation using amplicon-based nanopore sequencing has been investigated. Denoising was applied to separate 16S and 23S genes extracted from a metagenomic dataset generated by Sereika et al. (2023) using nanopore sequencing. Denoising used Robust Amplicon Denoising (RAD) of Kumar et al. (2023) and generated amplicon sequence variants (ASVs). The Sereika et al. dataset was generated from the Zymo D6322 7-bacterial-species mock microbiome. A sub-sampling procedure generated additional datasets that allowed sensitivity assessment over multiple orders of magnitude of relative abundance. Alignment to bespoke 16S and 23S rRNA gene databases provided both identification and abundance data for sub-species analyses. Sub-species identification was clearly achieved, both with the even D6322 dataset and with the 4 sub-sampled datasets, in which relative abundances spanning 3 orders of magnitude were present. Multiple ASVs of approximately 2500 bases in length were generated, many of which gave perfect alignments to reference 23S rRNA genes. The same held for the approximately 1500-base 16S rRNA genes. Despite strain ambiguity for some species, the strain of each species known to be present was identified in all but 1 of the 70 cases examined across all species. A process to merge the 16S and 23S rRNA genes reduces ambiguity, allowing better sub-species resolution and approximate estimates of relative cellular abundance at strain level. Principled, generally applicable methods for dealing with strain-level ambiguity in microbiota analysis have been developed. Their application to nanopore sequencing with the current state of Oxford Nanopore Technology, together with state-of-the-art denoising, allows near-strain resolution of bacterial microbiota identity and good estimates of species and sub-species relative abundances.