Generation of Sequencing Technology
Abstract
DNA sequencing uses biochemical methods to determine the order of nucleotide bases in a DNA
macromolecule with the help of sequencing machines. Ten years ago, sequencing relied on a single
method, Sanger sequencing. In 2005, Next Generation Sequencing (NGS) technologies emerged and
changed how living organisms are analyzed and understood. Over the last decade, considerable progress
has been made on new sequencing machines. In this paper, we present a non-exhaustive overview of
sequencing technologies, beginning with the history of the first methods and continuing to the NGS
platforms commonly used today. Our goal is to give beginners in the field, as well as science
enthusiasts, a simple and understandable description of NGS technologies that provides basic
knowledge as an introduction to this rapidly growing field.
Keywords: Next generation sequencing technologies; DNA; Sequencing; Long reads; Short reads

Introduction
The discovery of the double helix structure of Deoxyribonucleic Acid (DNA), composed of four bases
{A, T, C, G}, by Watson and Crick in 1953 [1] led to the decoding of genomic sequences and to
knowledge of the DNA composition of organisms. DNA sequencing uses this composition to understand
and decrypt the code of all biological life on earth, as well as to understand and treat genetic
diseases [2].

The appearance of sequencing technologies has played an important role in the analysis of the genomic
sequences of organisms. A DNA sequencer produces files containing DNA sequences [3]. These sequences
are strings, called reads, over an alphabet of five letters {A, T, C, G, N}, where the symbol N
represents an ambiguous base. The first sequencing technologies were developed in 1977 by Sanger et al.
[4] at Cambridge University, work that was awarded a Nobel Prize in Chemistry in 1980, and by Maxam
et al. [5] at Harvard University. Their discovery opened the door to the study of the genetic code of
living beings and inspired researchers to develop faster and more efficient sequencing technologies.
Sanger sequencing became the most widely applied sequencing technique because of its high efficiency
and low radioactivity [6], and it was commercialized and automated as the "Sanger Sequencing
Technology".

Sanger and Maxam-Gilbert sequencing were the most common sequencing technologies used by biologists
until the emergence of a new era of sequencing technologies that opened new perspectives for genome
exploration and analysis. These technologies first appeared with Roche’s 454 technology in 2005 [7]
and were commercialized as technologies capable of producing sequences with very high throughput and
at much lower cost than the first sequencing technologies. They are generally known as “Next
Generation Sequencing (NGS) Technologies” or “High Throughput Sequencing Technologies”.

NGS technologies perform massively parallel analysis with high throughput from multiple samples at
much reduced cost [8]. They can sequence millions to billions of reads in parallel in a single run,
and generating gigabases of reads takes only a few hours or days, which makes them superior to first
generation methods such as Sanger sequencing. The human genome, for example, consists of 3 billion bps
and is made up of DNA macromolecules of lengths varying from 33 to ~247 million bps, distributed over
the 23 pairs of chromosomes located in each human cell nucleus. Sequencing the human genome with
Sanger sequencing took almost 15 years, required the cooperation of many laboratories around the
world and cost approximately 100 million US dollars, whereas sequencing it with the NGS-based 454
Genome Sequencer FLX took two months at approximately one hundredth of the cost [9]. Unfortunately,
NGS technologies cannot read the complete DNA sequence of a genome in one piece; they are limited to
sequencing small DNA fragments and generate millions of reads. This limit remains a drawback,
especially for genome assembly projects, because assembling the reads requires substantial computing
resources.

NGS technologies continue to improve, and the number of sequencers has increased in recent years. The
literature divides NGS technologies into two types [3,10]. The second generation sequencing
technologies are those developed in the NGS environment after the first generation [11,12]; they are
characterized by the need to prepare amplified sequencing libraries before sequencing the amplified
DNA clones [13]. The third generation sequencing technologies appeared more recently [6]; in contrast
to the second generation, they are classified as Single Molecule Sequencing Technologies [14] because
they can sequence a single molecule without the need to create amplification libraries, and they can
generate longer reads at much lower cost and in a shorter time.

Several previous reports and studies have presented the sequencing technologies and detailed the
chemical mechanisms of each sequencing platform [10,11,13,15-23]. In the following, we present a brief
review of the three existing generations of sequencing technologies (the first, second and third). We
focus on the sequencing methods and platforms that characterize each generation (Table 1).
Table 1

Platform     Instrument        Reads per run    Read length (bp)  Read type  Error type  Error rate (%)  Data per run (Gb)  Year
First Generation
Second Generation
454          GS FLX Titanium+  1M               700               SE, PE     indel       1               0.7                2011
Illumina     MiniSeq           25M (maximum)    150               SE, PE     mismatch    1               7.5 (maximum)      2013
Illumina     MiSeq             25M (maximum)    300               SE, PE     mismatch    0.1             15 (maximum)       2011
Illumina     NextSeq           400M (maximum)   150               SE, PE     mismatch    1               120 (maximum)      2014
Illumina     HiSeq             5B (maximum)     150               SE, PE     mismatch    0.1             1.5 Tb (maximum)   2012
Illumina     HiSeq X           6B (maximum)     150               SE, PE     mismatch    0.1             1.8 Tb (maximum)   2014
Ion Torrent  PGM 314 chip v2   400,000-550,000  400               SE         indel       1               0.06 to 0.1        2011
Ion Torrent  PGM 318 chip v2   4M-5.5M          400               SE         indel       1               1.2 to 2           2013
Ion Torrent  Ion S5/S5XL 540   60M-80M          400               SE         indel       1               NA                 2015
Third Generation

*depending on run module; NA: Not available; SE: Single End; PE: Paired End; M: Million; B: Billion; Gb: Gigabases; Tb: Terabases
The First Generation of Sequencing
Sanger and Maxam-Gilbert sequencing technologies are classified as the First Generation Sequencing
Technology [10,16]; they initiated the field of DNA sequencing with their publication in 1977.

Sanger sequencing
Sanger sequencing is known as the chain termination method, the dideoxynucleotide method or the
sequencing by synthesis method. It uses one strand of the double stranded DNA as the template to be
sequenced. The sequencing relies on chemically modified nucleotides called dideoxynucleotides
(ddNTPs), one for each DNA base: ddG, ddA, ddT and ddC. Ordinary deoxynucleotides (dNTPs) are used to
elongate the new strand; once a ddNTP is incorporated into the DNA strand, it prevents further
elongation and the chain is terminated. The result is a set of DNA fragments of different sizes, each
ending with a ddNTP. The fragments are separated according to their size on a slab gel, where the
resulting bands corresponding to the DNA fragments can be visualized by an imaging system (X-ray or UV
light) [24,25]. Figure 1 details the Sanger sequencing technology.

The first genomes sequenced by Sanger sequencing were the phiX174 genome, with a size of 5374 bp [26],
and, in 1980, the bacteriophage λ genome, with a length of 48501 bp [27]. After years of improvement,
Applied Biosystems became the first company to automate Sanger sequencing. In 1995, Applied Biosystems
built an automatic sequencing machine called ABI Prism 370, based on capillary electrophoresis, which
allowed fast and accurate sequencing. Sanger sequencing was used in several sequencing projects of
different plant species such as Arabidopsis [28], rice [29] and soybean [30], and the most emblematic
achievement of this sequencing technology is the decoding of the first human genome [31].

Sanger sequencing was widely used for three decades and is still used today for single or
low-throughput DNA sequencing. However, its speed of analysis is difficult to improve further, which
prevents the sequencing of complex genomes such as plant genomes, and the sequencing remained
extremely expensive and time consuming.

Figure 1: Sanger sequencing technology. (a) The sequencing reaction is performed in the presence of
denatured DNA template, radioactively labeled primer, DNA polymerase, dNTPs and ddNTPs. The DNA
polymerase is used to incorporate the nucleotides into the elongating DNA strand. Each of the four
ddNTPs is run in a separate reaction so the polymerization can randomly terminate at each base
position. The end result of each reaction is a population of DNA fragments with different lengths,
with the length of each fragment dependent on where the ddNTP is incorporated. (b) Illustrates the
separation of these DNA fragments in a denaturing gel by electrophoresis. The radioactive labeling on
the primer enables visualization of the fragments as bands on the gel. The bands on the gel represent
the respective fragments shown to the right. The complement of the original template (read from bottom
to top) is given on the left margin of the sequencing gel. (From P Moran, Overview of commonly used
DNA techniques, in LK Park, P Moran, and RS Waples, eds., Application of DNA Technology to the
Management of Pacific Salmon, 1994, 15-26, Department of Commerce, NOAA Technical Memorandum
NMFS-NWFSC-17. © Paul Moran, NOAA’s Northwest Fisheries Science Center. With permission).

Maxam-Gilbert sequencing
Maxam-Gilbert sequencing is another first generation sequencing method, known as the chemical
degradation method. It relies on the cleavage of nucleotides by chemicals and is most effective with
small nucleotide polymers. Chemical treatment generates breaks at a small proportion of one or two of
the four nucleotide bases in each of the four reactions (C, T+C, G, A+G). This reaction leads to a
series of labeled fragments that can be separated according to their size by electrophoresis [5,24].
The sequencing here is performed without DNA cloning. However, the development and improvement of the
Sanger sequencing method favored the latter over the Maxam-Gilbert sequencing method, which is also
considered dangerous because it uses toxic and radioactive chemicals.
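
The four-reaction scheme of the chemical degradation method (C, T+C, G, A+G) can be illustrated with a
short Python sketch. It is only an idealized illustration, not the authors' procedure: the function
name `read_maxam_gilbert_gel` and the input layout are hypothetical, and the sketch assumes that a band
of length n in a lane reports the base at position n of the labeled strand and that every band is
clean. A band present in both G and A+G is read as G, a band in A+G alone as A, a band in both C and
T+C as C, and a band in T+C alone as T.

```python
def read_maxam_gilbert_gel(lanes):
    """Deduce a sequence from an idealized four-lane Maxam-Gilbert gel.

    lanes maps each reaction name ("G", "A+G", "C", "T+C") to the set of
    fragment lengths (bands) seen in that lane. Assumption: a band of
    length n reports the base at position n of the labeled strand.
    """
    # Every band length seen in any lane, from the shortest fragment up.
    lengths = sorted(set().union(*lanes.values()))
    bases = []
    for n in lengths:
        in_g, in_ag = n in lanes["G"], n in lanes["A+G"]
        in_c, in_tc = n in lanes["C"], n in lanes["T+C"]
        if in_g and in_ag:
            bases.append("G")   # cleaved by both purine reactions
        elif in_ag:
            bases.append("A")   # purine, but not G
        elif in_c and in_tc:
            bases.append("C")   # cleaved by both pyrimidine reactions
        elif in_tc:
            bases.append("T")   # pyrimidine, but not C
        else:
            bases.append("N")   # inconsistent band pattern: ambiguous base
    return "".join(bases)


# Hypothetical gel for the four-base strand G-A-T-C.
example_lanes = {"G": {1}, "A+G": {1, 2}, "C": {4}, "T+C": {3, 4}}
print(read_maxam_gilbert_gel(example_lanes))  # GATC
```

The ambiguous symbol N is used for band patterns that fit none of the four rules, in line with the
five-letter read alphabet {A, T, C, G, N}.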
Roche/454 sequencing
Roche/454 sequencing appeared on the market in 2005. It uses the pyrosequencing technique, which is
based on the detection of the pyrophosphate released after each nucleotide incorporation in the newly
synthesized DNA strand (http://www.454.com). The pyrosequencing technique is a sequencing-by-synthesis
approach.

DNA samples are randomly fragmented and each fragment is attached to a bead whose surface carries
primers with oligonucleotides complementary to the DNA fragments, so that each bead is associated with
a single fragment (Figure 2A). Then, each bead is isolated and amplified using emulsion PCR, which
produces about one million copies of each DNA fragment on the surface of the bead (Figure 2B). The
beads are then transferred to a plate containing many wells, called a picotiter plate (PTP), and the
pyrosequencing technique is applied: it activates a series of downstream reactions that produce light
at each nucleotide incorporation. By detecting the light emission after each incorporation, the
sequence of the DNA fragment is deduced (Figure 2C) [15]. The picotiter plate allows hundreds of
thousands of reactions to occur in parallel, considerably increasing sequencing throughput [14]. The
latest instrument launched by Roche/454, the GS FLX+, generates reads with lengths of up to 1000 bp
and can produce ~1 million reads per run (454.com GS FLX+ Systems
http://454.com/products/gs-flx-system/index.asp). Other characteristics of Roche/454 instruments are
listed in [16,25].

Roche/454 is able to generate relatively long reads, which are easier to map to a reference genome.
The main sequencing errors are insertions and deletions caused by homopolymer regions [33,34]. The
length of a homopolymer has to be determined from the intensity of the light emitted during
pyrosequencing. Signals with too high or too low an intensity lead to over- or underestimation of the
number of nucleotides, which causes nucleotide identification errors.

Figure 2: Roche/454 sequencing technology [39].

Ion torrent sequencing
Life Technologies commercialized the Ion Torrent semiconductor sequencing technology in 2010
(https://www.thermofisher.com/us/en/home/brands/ion-torrent.html). It is similar to 454 pyrosequencing
technology, but it does not use fluorescently labeled nucleotides like other second-generation
technologies. It is based on the detection of the hydrogen ion released during the sequencing process
[35].

Specifically, Ion Torrent uses a chip that contains a set of micro wells, each holding a bead with
several identical fragments. When a nucleotide is incorporated into a fragment on the bead, a hydrogen
ion is released, which changes the pH of the solution. This change is detected by a sensor attached to
the bottom of the micro well and converted into a voltage signal that is proportional to the number of
nucleotides incorporated (Figure 3).

The Ion Torrent sequencers are capable of producing read lengths of 200 bp, 400 bp and 600 bp, with a
throughput that can reach 10 Gb for the Ion Proton sequencer. The major advantages of this sequencing
technology are its read lengths, which are longer than those of other second generation sequencing
(SGS) sequencers, and its fast sequencing time of between 2 and 8 hours. The major disadvantage is the
difficulty of interpreting homopolymer sequences (more than 6 bp) [21,36], which causes insertion and
deletion (indel) errors at a rate of about 1%.
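
The link between signal intensity and homopolymer length can be made concrete with a small Python
sketch. This is a simplified illustration rather than a vendor base-calling algorithm: the function
name `decode_flowgram`, the flow order "TACG" and the example intensities are assumptions chosen for
the example. Each nucleotide is flowed in turn, and the normalized signal of a flow is rounded to the
nearest whole number to estimate how many identical bases were incorporated.

```python
def decode_flowgram(flow_order, signals):
    """Turn per-flow signal intensities into a base sequence.

    flow_order: nucleotides flowed cyclically, e.g. "TACG" (assumed here).
    signals: one normalized intensity per flow, where a value near k
             suggests a run of k identical bases (a homopolymer).
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporated bases.
        count = max(0, int(signal + 0.5))
        bases.append(nucleotide * count)
    return "".join(bases)


# Clean signals: T once, no A, C twice, G once.
print(decode_flowgram("TACG", [1.02, 0.05, 2.10, 0.97]))  # TCCG

# A long homopolymer: 3.45 and 3.55 differ only slightly in intensity,
# yet they are called as three and four bases respectively.
print(decode_flowgram("A", [3.45]))  # AAA
print(decode_flowgram("A", [3.55]))  # AAAA
```

The last two calls show why homopolymers produce indels: the longer the run, the smaller the relative
difference between neighboring signal levels, so a little noise adds or removes a base.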
Illumina/Solexa sequencing
The Solexa company developed a new sequencing method. Illumina (http://www.illumina.com) then
purchased Solexa and started to commercialize the Illumina/Solexa Genome Analyzer (GA) sequencer
[3,37]. Illumina technology is a sequencing-by-synthesis approach and is currently the most widely
used technology in the NGS market.

The sequencing process is shown in Figure 4. During the first step, the DNA samples are randomly
fragmented into sequences and adapters are ligated to both ends of each sequence. These adapters then
bind to complementary adapters that are anchored on a solid plate covered with many such adapters
(Figure 4A). During the second step, each sequence attached to the solid plate is amplified by “PCR
bridge amplification”, which creates several identical copies of each sequence; a set of sequences
made from the same original sequence is called a cluster. Each cluster contains approximately one
million copies of the same original sequence (Figure 4B). The last step is to determine each
nucleotide in the sequences. Illumina uses a sequencing-by-synthesis approach that employs reversible
terminators [38]: the four modified nucleotides, sequencing primers and DNA polymerases are added as a
mix, and the primers hybridize to the sequences. The polymerases then extend the primers using the
modified nucleotides. Each type of nucleotide carries its own fluorescent label so that the four types
can be told apart. The nucleotides have a blocked 3’-hydroxyl group, which ensures that only one
nucleotide is incorporated per cycle. The clusters are excited by a laser and emit a light signal
specific to each nucleotide, which is detected by a charge-coupled device (CCD) camera; computer
programs translate these signals into a nucleotide sequence (Figure 4C). The process continues with
the removal of the terminator and the fluorescent label, and a new cycle starts with a new
incorporation [21,39].

Figure 4: Illumina sequencing technology [39].

The first Illumina/Solexa GA sequencers produced very short reads of ~35 bp, but they had the
advantage of producing paired-end (PE) short reads, in which the sequence at both ends of each DNA
cluster is recorded. The output of the latest Illumina sequencers is currently higher than 600 Gbp,
and the length of the short reads is about 125 bp. Details on Illumina sequencers are given in [13].

One of the main drawbacks of the Illumina/Solexa platform is the high requirement for sample loading
control, because overloading can result in overlapping clusters and poor sequencing quality. The
overall error rate of this sequencing technology is about 1%. Substitutions of [...]

ABI/SOLiD sequencing
Sequencing by Oligonucleotide Ligation and Detection (SOLiD) is an NGS sequencer marketed by Life
Technologies (http://www.lifetechnologies.com). In 2007, Applied Biosystems (ABI) acquired SOLiD and
developed the ABI/SOLiD sequencing technology, which adopts the sequencing by ligation (SBL) approach
[3].

The ABI/SOLiD process consists of multiple sequencing rounds. It starts by attaching adapters to the
DNA fragments, which are fixed on beads and cloned by emulsion PCR. These beads are then placed on a
glass slide, 8-mers with a fluorescent label at the end are sequentially ligated to the DNA fragments,
and the color emitted by the label is recorded (Figure 5A). The output format is color space, an
encoded form of the nucleotides in which four fluorescent colors are used to represent the 16 possible
combinations of two bases. The sequencer repeats this ligation cycle; after each cycle the
complementary strand is removed and a new sequencing cycle starts at position n-1 of the template. The
cycle is repeated until each base has been sequenced twice (Figure 5B). The data recovered in color
space can be translated into the letters of the DNA bases, and the sequence of the DNA fragment can be
deduced [15].

ABI/SOLiD launched a first sequencer that produced short reads with a length of 35 bp and an output of
3 Gb/run, and continued to improve its sequencing, which increased the read length to 75 bp with an
output of up to 30 Gb/run [22,23]. The strength of the ABI/SOLiD platform is its high accuracy,
because each base is read twice, while its drawbacks are the relatively short reads and long run
times. Sequencing errors in this technology are due to noise during the ligation cycle, which causes
bases to be misidentified. The main type of error is substitution.
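
The color-space idea, four colors covering the 16 possible pairs of bases, can be sketched in Python.
The concrete mapping below (each base written as a 2-bit number, with the color of a pair equal to the
XOR of its two bases) is the commonly described two-base encoding and is an assumption here, since the
text does not spell out the color table; the function names are hypothetical. Decoding needs one known
starting base, after which each color determines the next base from the previous one.

```python
# 2-bit codes for the bases; the color of a dinucleotide is the XOR of its
# two codes (assumed mapping, giving four colors 0-3 for the 16 pairs).
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_color_space(sequence):
    """Return one color (0-3) per pair of adjacent bases."""
    return [BASE_TO_CODE[a] ^ BASE_TO_CODE[b]
            for a, b in zip(sequence, sequence[1:])]


def decode_color_space(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    bases = [first_base]
    for color in colors:
        previous = BASE_TO_CODE[bases[-1]]
        bases.append(CODE_TO_BASE[previous ^ color])
    return "".join(bases)


colors = encode_color_space("ATGGC")
print(colors)                           # [3, 1, 0, 3]
print(decode_color_space("A", colors))  # ATGGC

# A single miscalled color (1 read as 2) changes every base after it, which
# is why color-space reads are usually compared with a reference in color
# space before being translated to bases.
print(decode_color_space("A", [3, 2, 0, 3]))  # ATCCG
```

Because every base contributes to two adjacent colors, a true single-base difference changes two
neighboring colors, whereas an isolated color change is more likely a measurement error; this is the
intuition behind the accuracy gained from reading each base twice.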
... length and is connected through a USB 3.0 port to a laptop computer. This device has been released
for testing by a community of users as part of the MinION Access Program (MAP) to examine the
performance of the MinION sequencer [50].

In this sequencing technology, the first strand of a DNA molecule is linked by a hairpin to its
complementary strand. The DNA fragment is passed through a protein nanopore (a nanopore is a nanoscale
hole made of proteins or synthetic materials [39]). When the DNA fragment is translocated through the
pore by the action of a motor protein attached to the pore, it generates a variation of an ionic
current caused by differences in the moving nucleotides occupying the pore (Figure 7A). This variation
of ionic current is recorded progressively on a graphic model and then interpreted to identify the
sequence (Figure 7B). The direct strand is sequenced first, generating the “template read”; then the
hairpin structure is read, followed by the inverse strand, generating the “complement read”. These
reads are called "1D". If the “template” and “complement” reads are combined, the resulting consensus
sequence is called a “two direction read” or "2D" [51,52].

This sequencer offers several advantages. First, it has a low cost and a small size. Second, the
sample is loaded into a port on the device, and data is generated and displayed on the screen without
having to wait until the run is complete. Third, MinION can provide very long reads exceeding 150 kbp,
which can improve the contiguity of de novo assemblies. However, MinION has a high error rate of ~12%,
distributed as ~3% mismatches, ~4% insertions and ~5% deletions [53].

The Oxford Nanopore Technologies (ONT) technology has continued to evolve. Recently, a new instrument
called "PromethION" has emerged [54]; it is the bigger brother of the MinION [55]. It is a standalone
benchtop sequencer with 48 individual flow cells, each with 3000 pores (equivalent to 48 MinIONs),
operating at 500 bp per second [51], which is sufficiently powerful to achieve the ultra-high
throughput needed for sequencing large genomes such as the human genome. Although the PromethION is
not commercially available, ONT announces that it is capable of producing ~2 to 4 Tb in a run of 2
days, with read lengths [22] that can reach 200 kbp. This puts the sequencer in competition with the
PacBio RS II sequencer from Pacific Biosciences in terms of read length and with the HiSeq sequencer
from Illumina in terms of cost.

Figure 7: Oxford nanopore MinION sequencing [21].

Conclusion
The first sequencing method appeared about half a century ago, and sequencing technologies have
continued to evolve ever since, especially after the appearance of the first NGS sequencers in 2005.
These technologies are characterized by their high throughput, which makes it possible to produce
millions of reads at low cost. NGS technologies are now the starting point for several areas of
research based on the study and analysis of biological sequences.

In this review, we presented a concise overview of the generations of sequencing technologies,
beginning with the history of first-generation sequencing and continuing with the main NGS platforms
in common use. Nevertheless, NGS technologies face significant challenges, including the difficulty of
storing and analyzing the data they generate, which is mainly due to the high number of reads
produced. In the coming years, new sequencing platforms will appear that produce even larger amounts
of data (in the terabyte range), which will require the development of new approaches and applications
capable of analyzing this large amount of data.

Conflict of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.

References
1. Watson JD, Crick FH (1953) Molecular structure of nucleic acids: A structure for deoxyribose
nucleic acid. Nature 171: 737-738.
2. Le Tourneau C, Kamal M (2015) Pan-cancer integrative molecular portrait towards a new paradigm in
precision medicine. Springer.
3. Shendure J, Ji H (2008) Next-generation DNA sequencing. Nature Biotechnology 26: 1135-1145.
4. Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl
Acad Sci 74: 5463-5467.
5. Maxam AM, Gilbert W (1977) A new method for sequencing DNA. Proc Natl Acad Sci 74: 560-564.
6. Pareek CS, Smoczynski R, Tretyn A (2011) Sequencing technologies and genome sequencing. J Appl
Genet 52: 413-435.
7. Qiang-long Z, Shi L, Peng G, Fei-shi L (2014) High-throughput sequencing technology and its
application. Journal of Northeast Agricultural University 21: 84-96.
8. Mardis ER (2011) A decade’s perspective on DNA sequencing technology. Nature 470: 198-203.
9. Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, et al. (2008) The complete genome of an
individual by massively parallel DNA sequencing. Nature 452: 872-876.
10. Thudi M, Li Y, Jackson SA, May GD, Varshney RK (2012) Current state-of-art of sequencing
technologies for plant genomics research. Brief Funct Genomics 11: 3-11.
11. Metzker ML (2010) Sequencing technologies - the next generation. Nature Reviews Genetics 11:
31-46.
12. Schatz MC, Delcher AL, Salzberg SL (2010) Assembly of large genomes using second generation
sequencing. Genome Research 20: 1165-1173.
13. Kulski JK (2016) Next-generation sequencing: An overview of the history, tools, and “Omic”
applications, next generation sequencing-advances, applications and challenges. InTech.
14. Vezzi F (2012) Next generation sequencing revolution challenges: Search, assemble, and validate
genomes. Ph.D, Universita degli Studi di Udine, Italy.
15. Mardis ER (2008) Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet 9: 387-402.
16. Liu L, Li Y, Li S, Hu N, He Y, et al. (2012) Comparison of next-generation sequencing systems.
Journal of Biomedicine and Biotechnology.
17. Guzvic M (2013) The history of DNA sequencing. J Med Biochem 32: 301-312.
18. Hui P (2012) Next generation sequencing: Chemistry, technology and applications. Top Curr Chem
336: 1-18.
19. van Dijk EL, Auger H, Jaszczyszyn Y, Thermes C (2014) Ten years of next generation sequencing
technology. Trends in Genetics 30: 418-426.
20. Heather JM, Chain B (2015) The sequence of sequencers: The history of sequencing DNA. Genomics
107: 1-8.
21. Reuter JA, Spacek DV, Snyder MP (2015) High-throughput sequencing technologies. Molecular Cell 58:
586-597.
22. Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: Ten years of next-generation sequencing
technologies. Nature Reviews Genetics 17: 333-351.
23. Alic AS, Ruzafa D, Dopazo J, Blanquer I (2016) Objective review of de novo stand-alone error
correction methods for NGS data. In: WIREs Computational Molecular Science.
24. Masoudi-Nejad A, Narimani Z, Hosseinkhan N (2013) Next generation sequencing and sequence
assembly. Methodologies and algorithms. Springer.
25. El-Metwally S, Ouda OM, Helmy M (2014) Next generation sequencing technologies and challenges in
sequence assembly. Springer.
26. Sanger F, Coulson AR (1975) A rapid method for determining sequences in DNA by primed synthesis
with DNA polymerase. J Mol Biol 94: 441-448.
27. Sanger F, Coulson AR, Barrell BG, Smith AJ, Roe BA (1980) Cloning in single stranded bacteriophage
as an aid to rapid DNA sequencing. J Mol Biol 143: 161-178.
28. The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant
Arabidopsis thaliana. Nature 408: 796-815.
29. Goff SA, Ricke D, Lan TH, Presting G, Wang R, et al. (2002) A draft sequence of the rice genome
(Oryza sativa L. ssp. japonica). Science 296: 92-100.
30. Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, et al. (2010) Genome sequence of the
paleopolyploid soybean. Nature 463: 178-183.
31. Durbin RM (2010) A map of human genome variation from population-scale sequencing. Nature 467:
1061-1073.
32. Myllykangas S, Buenrostro J, Ji HP (2012) Overview of sequencing technology platforms,
bioinformatics for high throughput sequencing.
33. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, et al. (2005) Genome sequencing in
microfabricated high-density picolitre reactors. Nature 437: 376-380.
34. Huse SM, Huber JA, Morrison HG, Sogin ML, Welch DM (2007) Accuracy and quality of massively
parallel DNA pyrosequencing. Genome Biol 8: R143.
35. Rothberg JM, Hinz W, Rearrick TM, Schultz J, Mileski W, et al. (2011) An integrated semiconductor
device enabling non-optical genome sequencing. Nature 475: 348-352.
36. Loman NJ, Misra RV, Dallman TJ, Constantinidou C, Gharbia SE, et al. (2012) Performance comparison
of benchtop high-throughput sequencing platforms. Nature Biotechnol 30: 434-439.
37. Balasubramanian S (2015) Solexa sequencing: Decoding genomes on a population scale. Clin Chem 61:
21-24.
38. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, et al. (2008) Accurate whole human
genome sequencing using reversible terminator chemistry. Nature 456: 53-59.
39. Heo Y (2015) Improving quality of high-throughput sequencing reads.
40. Dohm JC, Lottaz C, Borodina T, Himmelbauer H (2008) Substantial biases in ultra-short read data
sets from high-throughput DNA sequencing. Nucleic Acids Res 36: e105.
41. Eid J, Fehr A, Gray J (2009) Real-time DNA sequencing from single polymerase molecules. Science
323: 133-138.
42. Braslavsky I, Hebert B, Kartalov E, Quake SR (2003) Sequence information can be obtained from
single DNA molecules. Proceedings of the National Academy of Sciences of the USA 100: 3960-3964.
43. Harris TD, Buzby PR, Babcock H, Beer E, Bowers J, et al. (2008) Single-molecule DNA sequencing of
a viral genome. Science 320: 106-109.
44. McCoy RC, Taylor RW, Blauwkamp TA, Kelley JL, Kertesz M, et al. (2014) Illumina TruSeq synthetic
long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements. PLoS
ONE 9: e106689.
45. Rhoads A, Au KF (2015) PacBio sequencing and its applications. Genomics, Proteomics &
Bioinformatics 13: 278-289.
46. Chin CS, Peluso P, Sedlazeck FJ, Nattestad M, Concepcion GT, et al. (2016) Phased diploid genome
assembly with single molecule real-time sequencing. BioRxiv 13: 1050-1054.
47. Koren S, Schatz M, Walenz B, Martin J, Howard J, et al. (2012) Hybrid error correction and de novo
assembly of single-molecule sequencing reads. Nature Biotechnology 30: 693-700.
48. Mikheyev AS, Tin MMY (2014) A first look at the Oxford Nanopore MinION sequencer. Molecular
Ecology Resources 14: 1097-1102.
49. Laehnemann D, Borkhardt A, McHardy AC (2015) Denoising DNA deep sequencing data: high-throughput
sequencing errors and their correction. Brief Bioinformatics 17: 154-179.
50. Laver T, Harrison J, O’Neill PA, Moore K, Farbos A, et al. (2015) Assessing the performance of the
Oxford Nanopore Technologies MinION. Biomolecular Detection and Quantification 3: 1-8.
51. Lu H, Giordano F, Ning Z (2016) Oxford Nanopore MinION sequencing and genome assembly. Genomics
Proteomics Bioinformatics 14: 265-279.
52. Jain M, Olsen HE, Paten B, Akeson M (2016) The Oxford Nanopore MinION: Delivery of nanopore
sequencing to the genomics community. Genome Biology 17: 239.
53. Ip CL, Loose M, Tyson JR, de Cesare M, Brown BL, et al. (2015) MinION analysis and reference
consortium: Phase 1 data release and analysis. F1000Research 4: 1075.
54. Karow J (2014) Oxford Nanopore presents details on new high-throughput sequencer, improvements to
MinION.
55. http://www.opiniomics.org/doing-the-maths-on-promethion-throughput