Genomic record breaking: Largest animal genome sequenced

By Katie Jones

A team of researchers from the Research Institute of Molecular Pathology (IMP) and the universities of Vienna, Hamburg, Würzburg, and Konstanz have successfully sequenced the complete genome of the Australian lungfish (Neoceratodus forsteri).

The lungfish genome is the largest animal genome ever deciphered, a record-breaking feat made possible by a novel sequencing technique developed by Oxford Nanopore Technologies. The achievement sheds light on the evolutionary history of the lungfish, which has long been a topic of debate. The lungfish, often called a living fossil, has been proposed as the closest living relative of the first terrestrial vertebrates, which moved from water onto land some 380 million years ago.

A DNA molecule is composed of many small units, called bases, which pair up to form its signature double-helix shape. This pioneering nanopore sequencing project, published in the prestigious journal Nature, has revealed the lungfish genome to be 43 billion base pairs long. That is a striking 30% larger than the genome of the axolotl salamander (Ambystoma mexicanum), the former record holder, which was sequenced by the same research group. The lungfish genome is about 14 times the size of the human genome, its extreme size being attributed to large sections of non-coding DNA and abundant regions of repetitive sequence.
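
As a rough check on these comparisons, the arithmetic can be sketched in a few lines of Python. The lungfish figure is the one quoted in this article; the axolotl and human genome sizes (about 32 billion and 3.1 billion base pairs) are approximate values assumed here for illustration and are not taken from the study.

```python
# Approximate genome sizes in base pairs.
LUNGFISH_BP = 43e9  # figure quoted in the article
AXOLOTL_BP = 32e9   # assumed approximate value, for illustration
HUMAN_BP = 3.1e9    # assumed approximate value, for illustration


def percent_larger(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100


def fold_size(a, b):
    """Return how many times the size of b the value a is."""
    return a / b


print(f"Lungfish vs axolotl: {percent_larger(LUNGFISH_BP, AXOLOTL_BP):.0f}% larger")
print(f"Lungfish vs human: {fold_size(LUNGFISH_BP, HUMAN_BP):.1f} times the size")
```

With these assumed inputs the lungfish genome comes out roughly a third larger than the axolotl's and close to 14 times the size of the human genome, in line with the rounded figures above.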

Analysis of the lungfish genome has provided evidence that this peculiar air-breathing fish is the closest living fish relative of four-limbed terrestrial vertebrates. By identifying specific genes associated with the development of limb-like extensions, such as hoxc13 and sall1, the study has also given insight into the initial adaptations required for life on land. The researchers further suggest that the dynamic nature of the lungfish genome, which is responsible for its colossal size, may facilitate such adaptive changes to body structure. Genome dynamism is caused by several factors, including gene duplication and active transposable elements. Active transposable elements, otherwise known as jumping genes, are DNA sequences that can change their position within the genome, altering the genetic code and often becoming more prevalent as a result. Two genetic modifications are suggested to be associated with the transition to air-breathing: the expansion of odorant receptor gene families, which encode proteins for detecting airborne odours, and the duplication of genes necessary for lung function.

Key to this research project’s discoveries was the use of a state-of-the-art nanopore sequencing technique developed by Oxford Nanopore Technologies. Nanopore sequencing does away with the optics traditionally used to determine base sequence, replacing them with an electrical approach. The technology uses naturally occurring, microbe-derived, pore-forming proteins that make nano-scale holes in biological membranes. Sequencing works by monitoring changes in electrical current across these membranes while single-stranded RNA or DNA molecules are fed through the pore. The resulting fluctuations in ionic current are recorded and translated into the base sequence of the genetic material in real time.
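
The idea of translating current fluctuations into bases can be illustrated with a deliberately simplified sketch. It assumes that each base produces one characteristic current level (the picoamp values below are invented for illustration) and assigns each measurement to the nearest level. Real basecalling is considerably more involved: several bases influence the signal at once, and the signal is typically decoded with trained statistical or neural network models.

```python
# Hypothetical mean current levels (picoamps) per base; invented for illustration.
LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "T": 125.0}


def call_base(current):
    """Return the base whose hypothetical level is closest to the measured current."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))


def decode(signal):
    """Translate a list of per-base current measurements into a base sequence."""
    return "".join(call_base(current) for current in signal)


# A noisy, made-up signal with one measurement per base.
signal = [81.2, 124.1, 109.3, 96.4, 79.5]
print(decode(signal))  # prints ATGCA
```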

Nanopore sequencing technology offers a key advantage over alternative sequencing techniques: the length of sequence it can read. Traditional genomic sequencing technology requires sequences of interest to be fragmented into millions of shorter pieces, sequenced individually, and later pieced back together using sophisticated computer algorithms. This procedure presents a significant hurdle for sequencing large, repetitive genomes, such as that of the lungfish. Because of this repetition, piecing together the lungfish genome quickly becomes a complicated task that requires accurately ordering a large number of near-identical fragments. Imagine assembling a jigsaw puzzle that shows only clear blue sky, where every piece must still go in its one correct place.
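
The difficulty that repeats create for short fragments can be shown with a toy example. The sketch below builds a made-up sequence containing two copies of a 30-base repeat and counts what fraction of all possible error-free reads of a given length occur at more than one position, and so cannot be placed unambiguously. The sequence and read lengths are invented for illustration, and the exact-match counting is a simplification rather than a real assembly algorithm.

```python
from collections import Counter


def ambiguous_fraction(genome, read_length):
    """Fraction of all possible reads of this length that occur at more than one position."""
    reads = [genome[i:i + read_length] for i in range(len(genome) - read_length + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Toy genome: two copies of a 30-base repeat, flanked by short unique sequences.
repeat = "TTAGGG" * 5
genome = "ACGTCA" + repeat + "CATGAC" + repeat + "GGCATC"

for length in (8, 20, 36):
    print(length, round(ambiguous_fraction(genome, length), 2))
```

In this toy case most 8-base reads fall entirely inside a repeat and match several positions, whereas every 36-base read is longer than the repeat, includes some unique flanking sequence, and can be placed in exactly one location.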

The key property of nanopore-based sequencing is that the individual stretches of DNA it reads, known as reads, are far longer than those generated by traditional technologies. Overlapping reads are merged into contigs, continuous sections of reconstructed genetic code. Reads generated in nanopore sequencing overlap more extensively than traditional short reads and, being longer, can span entire repetitive genome regions. Contig length is so fundamental to whole-genome sequence quality that a commonly used assessment metric, N50, is based on it: N50 is the contig length at which contigs of that length or longer account for at least half of the assembled sequence. Longer contigs generally indicate a less fragmented genome reconstruction. The researchers decoding the lungfish genome exploited the long- and ultra-long-read capabilities of nanopore sequencing to minimise overall sequence fragmentation. The record-breaking sequencing feat was achieved by combining this Oxford Nanopore technology with highly advanced sequence reconstruction algorithms.
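
The N50 metric can be made concrete with a short sketch. It uses the common definition: sort the contigs from longest to shortest, add up their lengths, and report the length of the contig at which the running total first reaches half of all assembled bases. The contig lengths below are made up for illustration and are not taken from the lungfish assembly.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Two hypothetical assemblies of the same 100 bases.
fragmented = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
contiguous = [60, 20, 10, 5, 5]
print(n50(fragmented), n50(contiguous))  # prints 10 60
```

Both assemblies cover the same number of bases, but the one built from fewer, longer contigs scores a much higher N50, which is why the metric is used as a shorthand for how fragmented an assembly is.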

Challenges remain when reassembling genomes as lengthy as that of the Australian lungfish. Nevertheless, like assembling a puzzle made up of fewer pieces, longer reads drastically improve the ease and accuracy of genome assembly. The increasing use of nanopore long-read sequencing technology offers enormous potential for genomic sequencing projects. The insights gleaned from this perplexing fish are, much like the emergence of its prehistoric ancestors onto land, only the first steps in a long journey.

Image from DavidRockDesign on Pixabay.com

  1. Niedringhaus, T. P., Milanova, D., Kerby, M. B., Snyder, M. P. & Barron, A. E. Landscape of next-generation sequencing technologies. Anal. Chem. 83, 4327–4341 (2011)

  2. Meyer, A. et al. Giant lungfish genome elucidates the conquest of land by vertebrates. Nature 590, 284–289 (2021)

  3. Maitra, R. D., Kim, J. & Dunbar, W. B. Recent advances in nanopore sequencing. Electrophoresis 33, 3418–3428 (2012)

  4. Amarasinghe, S. L. et al. Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 21, 30 (2020)

  5. Oxford Nanopore Technologies. Nanopore Sequencing: The advantages of long reads for genome assembly. https://nanoporetech.com/sites/default/files/s3/white-papers/WGS_Assembly_white_paper.pdf?submissionGuid=40a7546b-9e51-42e7-bde9-b5ddef3c3512

  6. Research Institute of Molecular Pathology. Record-breaking lungfish genome reveals how vertebrates conquered land. https://www.imp.ac.at/news/article/record-breaking-lungfish-genome/?fbclid=IwAR3GlkwCZLkKQHXz7spfLw2jaHydXvZg453OpkOGCUmaf1DjlkREZLbrEzg

  7. Lu, D. Australian lungfish has largest genome of any animal sequenced so far. New Scientist (2021)