Sanger sequencing: The gold standard in DNA analysis

The field of molecular analysis has evolved significantly, moving from conventional PCR techniques to real-time PCR, and now to advanced next-generation sequencing (NGS) platforms. Despite these technological advancements, Sanger sequencing remains the definitive standard for accurately sequencing DNA fragments under 1,000 base pairs (bp), primarily because of its high per-base accuracy and long contiguous reads.

Originally developed by Frederick Sanger in 1977, this first-generation DNA sequencing technique employs the chain-termination approach, which selectively incorporates dideoxynucleotides to determine nucleotide sequences. This groundbreaking method earned Sanger his second Nobel Prize in Chemistry in 1980 and quickly became a cornerstone in clinical genetics and molecular diagnostics. 

Its exceptional accuracy and ability to generate long sequence reads made Sanger sequencing the foundation for landmark projects, including the Human Genome Project. The technique has been refined continually to meet increasing demands for precision and throughput.

Sanger sequencing is also extensively employed for mitochondrial DNA analysis and short tandem repeat (STR) profiling. Moreover, it facilitates the association of biological evidence with individuals, supports the diagnosis of genetic disorders, and contributes to the investigation of drug metabolism within forensic medicine.

Principle and methodology

Sanger sequencing is based on the principle of selective chain termination. The process begins with an amplified DNA template or complementary DNA (cDNA) hybridized to a specific oligonucleotide primer. DNA polymerase extends the DNA strand by adding either standard deoxynucleotide triphosphates (dNTPs) or chain-terminating dideoxynucleotide triphosphates (ddNTPs). The ddNTPs lack a 3′-hydroxyl group, which prevents the formation of the phosphodiester bond required for strand elongation, thereby terminating DNA synthesis at specific points. 
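
The chain-termination idea can be illustrated with a minimal sketch. It assumes an idealized reaction in which every possible termination occurs, and it counts fragment lengths from the first base added after the primer. The function names are hypothetical and chosen only for illustration.

```python
# Idealized model of chain termination: every position where a ddNTP
# could be incorporated yields one fragment ending at that position.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template: str) -> str:
    """Return the strand the polymerase builds (5'->3') from a 5'->3' template."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def termination_fragments(template: str) -> dict[str, list[int]]:
    """Map each ddNTP base to the lengths of the fragments it terminates."""
    fragments: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand(template), start=1):
        # A ddNTP of this base incorporated here stops synthesis at `length`.
        fragments[base].append(length)
    return fragments


if __name__ == "__main__":
    print(termination_fragments("TTAGC"))
```

For the template TTAGC, the newly built strand is GCTAA, so the ddATP fragments end at lengths 4 and 5, ddCTP at 2, ddGTP at 1, and ddTTP at 3. A real reaction is stochastic and depends on the ddNTP-to-dNTP ratio, which this sketch ignores.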

Traditionally, Sanger sequencing required four separate reactions, each containing a specific ddNTP corresponding to one of the four DNA bases. These parallel reactions generate DNA fragments terminated at each nucleotide, which are then resolved by size in adjacent lanes using polyacrylamide gel electrophoresis. Because the fragments are radioactively labeled, exposing the gel to X-ray film produces a characteristic ladder pattern, which is read from the smallest fragment at the bottom upward to reveal the DNA sequence.
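
Reading the ladder from the bottom upward amounts to sorting the bands from all four lanes by fragment length. The sketch below assumes each lane is given as a list of fragment lengths and that exactly one lane has a band at each length. The name read_ladder and the input layout are illustrative assumptions.

```python
def read_ladder(lanes: dict[str, list[int]]) -> str:
    """Reconstruct the sequence from four lanes of fragment lengths.

    `lanes` maps each ddNTP base ("A", "C", "G", "T") to the lengths of
    the bands seen in that lane. The shortest fragment sits at the
    bottom of the gel, so sorting by length reads the gel bottom-up.
    """
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    gel = {"A": [4, 5], "C": [2], "G": [1], "T": [3]}
    print(read_ladder(gel))  # GCTAA
```

The sequence obtained this way is that of the newly synthesized strand, which is complementary to the template.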

Advancements in sequencing technology have replaced this approach with the use of fluorescently labeled ddNTPs and capillary electrophoresis, enabling the simultaneous analysis of all four nucleotides (adenine, guanine, cytosine, and thymine) within a single reaction vessel. Each ddNTP is conjugated to a distinct fluorescent dye, facilitating the identification of the terminal base on each DNA fragment. During capillary electrophoresis, these fragments are size-separated, and the emitted fluorescence is captured by charge-coupled device (CCD) detectors. The resulting data are processed by sophisticated software algorithms to accurately reconstruct the nucleotide sequence.
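
As a rough illustration of what base-calling software does, the sketch below picks the strongest dye channel at each detected peak and reports N when no channel clearly dominates. The input format, the min_ratio threshold, and the function name are assumptions made for this example. Production base callers also correct for dye mobility shifts and spectral overlap and model peak shape, none of which is attempted here.

```python
def call_bases(peaks: list[dict[str, float]], min_ratio: float = 2.0) -> str:
    """Call one base per peak from four-channel fluorescence intensities.

    Each element of `peaks` maps a base to the signal of its dye at one
    peak position. A base is called only if its signal is at least
    `min_ratio` times the second-strongest channel; otherwise "N".
    """
    calls = []
    for signal in peaks:
        ranked = sorted(signal.items(), key=lambda item: item[1], reverse=True)
        (top_base, top), (_, second) = ranked[0], ranked[1]
        if top > 0 and top >= min_ratio * second:
            calls.append(top_base)
        else:
            calls.append("N")  # ambiguous or empty peak
    return "".join(calls)


if __name__ == "__main__":
    trace = [
        {"A": 900.0, "C": 30.0, "G": 20.0, "T": 10.0},
        {"A": 15.0, "C": 40.0, "G": 850.0, "T": 25.0},
        {"A": 400.0, "C": 10.0, "G": 20.0, "T": 380.0},  # two overlapping peaks
    ]
    print(call_bases(trace))  # AGN
```

The third position shows two channels of similar height, the pattern a heterozygous site or a mixed template would produce, so this sketch declines to call it.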

Applications

DNA sequencing remains indispensable in multiple domains because it reveals the genetic information essential for life, and Sanger sequencing in particular is valued for its accuracy.

Clinical diagnostics

Sanger sequencing is routinely used to validate genetic variants identified through NGS. It serves as the reference standard owing to its precision, reproducibility, and long read lengths, typically 700–900 bp, which suit targeted analysis of small genomic regions.

Its application extends to the characterization of germline mutations implicated in inherited disorders such as cystic fibrosis (CFTR mutations), hereditary cancers (BRCA1/2), and muscular dystrophies. Despite the emergence of high-throughput sequencing, Sanger remains preferred for analyzing single-exon regions and confirming somatic mutations detected in tumor specimens, facilitating diagnosis and therapeutic decision-making.

Genetic research

Sanger sequencing excels in pinpointing single nucleotide variants, small insertions, and deletions within PCR amplicons, making it a preferred method for targeted mutation screening and small-scale genetic investigations. Its role is critical in both foundational research and translational studies focusing on gene function and mutation impact.
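
A minimal sketch of single nucleotide variant screening within an amplicon is shown below. It assumes the Sanger read has already been aligned to the reference with no insertions or deletions, so both strings have the same length. The function name and the sequences are made up for illustration.

```python
def find_substitutions(reference: str, read: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, read_base) for each mismatch.

    Positions are 1-based. Bases called as "N" in the read are skipped
    because they carry no information about the variant.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (position, ref_base, read_base)
        for position, (ref_base, read_base) in enumerate(zip(reference, read), start=1)
        if ref_base != read_base and read_base != "N"
    ]


if __name__ == "__main__":
    print(find_substitutions("ATGGCCATTG", "ATGACCATTG"))  # [(4, 'G', 'A')]
```

Detecting small insertions and deletions would require a proper alignment step, which is outside the scope of this sketch.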

Forensic science

The technique’s robustness in analyzing mitochondrial DNA (mtDNA) is pivotal in forensic identification. Mitochondrial sequencing enables the examination of hypervariable regions in degraded or minute samples due to mtDNA’s high copy number per cell. Although NGS approaches are gaining traction, Sanger sequencing continues to provide reliable results for human identification in forensic casework.

Advantages 

  • Exceptional accuracy: Sanger sequencing is fast, accurate, and reliable, achieving over 99.9% raw base accuracy, which makes it the benchmark for nucleotide identification. It also provides reads of up to 1,000 bp at a relatively low cost.
  • Long read lengths: Capable of producing reads up to 1,000 bp, it surpasses many short-read NGS platforms for single-gene or locus-specific analysis. The original Human Genome Project produced around 30 million reads using Sanger sequencing.
  • Well-established protocols: Decades of standardized use facilitate interlaboratory reproducibility and data validation.
  • Cost-effectiveness: For projects requiring sequencing of limited samples or targeted regions, Sanger remains more economical than whole-genome or large-panel NGS approaches.
  • Compatibility with low-quality samples: It yields reliable results even from low-quality or degraded DNA, including archival clinical specimens and forensic samples.
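
To put the 99.9% accuracy figure in perspective, the short sketch below converts a per-base error probability to a Phred quality score, using the standard definition Q = −10·log10(P), and estimates the expected number of miscalled bases in a read. The Phred scale is not mentioned above and is brought in here only as a common way of expressing base accuracy. The function names are illustrative.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    if not 0 < error_probability <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(error_probability)


def expected_errors(read_length: int, accuracy: float) -> float:
    """Expected number of miscalled bases in a read of the given length."""
    return read_length * (1 - accuracy)


if __name__ == "__main__":
    print(round(phred_quality(0.001), 1))       # 99.9% accuracy -> Q30
    print(round(expected_errors(800, 0.999), 2))  # about 0.8 errors per 800 bp
```

At 99.9% accuracy, an 800 bp read is expected to contain fewer than one miscalled base on average.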

Limitations 

  • Labor-intensive and time-consuming: The stepwise processing and electrophoresis-based separation limit throughput and increase turnaround time compared with NGS.
  • Fragment length constraints: Sanger sequencing is inefficient for genomic regions exceeding 1,000 bp without laborious fragmentation and assembly.
  • Limited throughput: It cannot parallelize large-scale sequencing and thus is unsuitable for whole-genome or transcriptome studies.
  • Background noise and quantification limits: Although sensitive enough to detect mosaic mutations present in roughly 20% of cells, Sanger sequencing cannot accurately quantify variant allele frequencies, necessitating complementary methods for precise measurements. For instance, differences in mutation levels, such as between 25% and 40%, cannot be accurately determined from peak sizes alone, requiring further testing.
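
The fragment length constraint means that longer regions must be covered by several overlapping reads that are assembled afterward. The sketch below plans such a tiling. The default read length of 800 bp and overlap of 100 bp are illustrative assumptions, not recommended values, and the function name is hypothetical.

```python
def tile_region(
    region_length: int, read_length: int = 800, overlap: int = 100
) -> list[tuple[int, int]]:
    """Return 0-based, end-exclusive (start, end) read windows covering a region.

    Consecutive windows share `overlap` bases so the reads can be
    joined during assembly. The last window is truncated at the end of
    the region.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    if not 0 <= overlap < read_length:
        raise ValueError("overlap must be non-negative and below read_length")
    step = read_length - overlap
    tiles = []
    start = 0
    while True:
        end = min(start + read_length, region_length)
        tiles.append((start, end))
        if end == region_length:
            break
        start += step
    return tiles


if __name__ == "__main__":
    for window in tile_region(2500):
        print(window)
```

Under these assumptions, a 2,500 bp region needs four reads, which illustrates why the workload grows quickly as the target region gets longer.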

Conclusion

Sanger sequencing has fundamentally transformed molecular genetics by enabling accurate base-by-base determination of DNA sequences. It remains the method of choice for short DNA reads and mutation verification owing to its high accuracy, relatively long read lengths, and robust performance with low-quality samples. Despite rapid advances in high-throughput sequencing technologies, Sanger sequencing retains its status as the gold standard for clinical diagnostics, forensic analyses, and targeted genetic research. Its enduring relevance highlights its integral role in both the historical and contemporary genomics landscape.

 
