Benchmarking assembly free nanopore read mappers to classify complex millipede gut microbiota via Oxford Nanopore Sequencing Technology


Orlando J. Geli-Cruz
Carlos J. Santos-Flores
Matías J. Cafaro
Alexander J. Ropelewski
Alex Van Dam


Metagenomics, Myriapoda, Nanopore sequencing, Gut microbiota, DNA extraction


Millipedes are key players in recycling leaf litter into soil in tropical ecosystems. To elucidate their gut microbiota, we collected millipedes from different municipalities of Puerto Rico. Here we benchmark which method is best for metagenomic skimming of this highly complex millipede microbiome. We sequenced the gut DNA with Oxford Nanopore Technologies’ (ONT) MinION sequencer, then classified the resulting long reads using MEGAN-LR, Kraken2 protein mode, Kraken2 nucleotide mode, GraphMap, and Minimap2. Our two samples yielded 87,110 and 99,749 ONT reads, respectively. Kraken2 nucleotide mode classified more reads than any other method at the phylum and class taxonomic levels, classifying 75% of the reads in the two samples. The other methods failed to assign enough reads to either phylum or class to yield asymptotes in the taxa rarefaction curves, indicating that they would require more sequencing depth to fully classify this community. The community is hyperdiverse, with all methods classifying 20‒50 phyla in the two samples. There was significant overlap in the reads used and phyla classified among the five methods benchmarked. Our results suggest that Kraken2 nucleotide mode is the most appropriate tool for metagenomic skimming of this highly complex community.
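
The asymptote criterion for the taxa rarefaction curves can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes the classical analytical rarefaction formula (expected number of taxa in a random subsample of n reads drawn without replacement), and the per-phylum read counts and the function name `rarefaction_curve` are hypothetical, chosen only for illustration.

```python
from math import comb


def rarefaction_curve(taxon_counts, depths):
    """Expected number of taxa seen when subsampling n reads without replacement.

    taxon_counts: classified reads per taxon (e.g. per phylum).
    depths: subsample sizes n, each between 0 and the total read count.
    """
    total = sum(taxon_counts)
    curve = []
    for n in depths:
        if not 0 <= n <= total:
            raise ValueError(f"depth {n} is outside 0..{total}")
        denom = comb(total, n)
        # A taxon with c reads is missed with probability C(total - c, n) / C(total, n).
        expected = sum(1 - comb(total - c, n) / denom for c in taxon_counts if c > 0)
        curve.append(expected)
    return curve


# Hypothetical per-phylum read counts (NOT data from the study).
counts = [500, 300, 120, 50, 20, 5, 3, 1, 1]
depths = [0, 10, 100, 500, 1000]
curve = rarefaction_curve(counts, depths)

for n, s in zip(depths, curve):
    print(f"{n:>5} reads -> {s:.2f} expected phyla")
```

A curve that flattens well before the full read count suggests that additional sequencing would add few new taxa. A curve that is still rising at the maximum depth suggests the classifier has not yet assigned enough reads to capture the community.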



