Next-Generation Sequencing for Disease Biomarkers

Next-generation sequencing (NGS) is beginning to provide powerful insights into the genetic mutations and molecular networks underlying disease across a broad set of genes. NGS technologies promise to transform a number of areas in clinical practice, including the validation and analysis of sequencing-based biomarkers, a tool increasingly relevant for diagnosis in multiple disease areas and for the selection and monitoring of therapeutic treatments. The recent emergence of positional sequencing is enabling access to more accurate sequence information. This article gives an overview of the strengths of this technology and its future implications for the field of molecular biomarkers.

Next-generation sequencing: historic context

The traditional methods for sequencing DNA, first developed in the late 1970s, were based on chemical degradation or on chain-terminating inhibitors of enzymatic DNA synthesis, with readout through radioactive labeling of DNA or, later, light-based detection. The introduction in 1998 of automated capillary sequencers led to dramatically improved throughput, making it possible to embark on the sequencing of the first entire human genome, a draft of which was published in 2001. Since then, the need for higher sequencing throughput at reduced cost, combined with developments in bioinformatics and microfabrication, has driven the development of novel sequencing technologies. These are collectively called next-generation sequencing, and are capable of sequencing large numbers of DNA templates in parallel. While the Human Genome Project involved the collaboration of over 20 laboratories for 13 years at a cost of roughly $3 billion, NGS now allows researchers to sequence, in a single run, more than five human genomes in a matter of days, for less than $5000 per genome in reagents.

NGS is a rapidly evolving field, offering a wide spectrum of platforms. While no single system provides the optimal attributes for all applications, collectively they offer complementary approaches to sequence analysis and leave ample room for the emergence of new technologies. The initial, second-generation (2G) sequencers, which rely on clonal amplification of DNA templates, carry tradeoffs: high throughput comes at the price of shorter read lengths, amplification bias, and increased error rates. To address these limitations, third-generation (3G) systems were introduced in 2008, employing single-molecule templates and cycle-free chemistry.

More recently, fourth-generation (4G) sequencers incorporating nanopore technologies have been developed. They have the potential to generate much longer reads, do not require expensive reagents, and employ electronic detection, a much less expensive alternative to light-based detection.

Positional sequencing technology, developed by Nabsys (Providence, RI), a Brown University spin-off, is the most recent innovation in the field. It promises to deliver accurate measurements of the distance between predictable DNA landmarks, and offers broad applicability in DNA analysis, from genome sequence assembly and finishing to sequence-based diagnostics and biomarkers. Each detector module can run samples independently or simultaneously, with up to eight modules per instrument.

Beyond the Human Genome Project—clinical applications of NGS

Now that routine sequencing of whole exomes or genomes from patients is affordable for many academic institutions, there is an increasing realization that the biology underlying disease does not depend on the genome sequence alone. A complete clinical perspective continues to rely on other types of testing, based on imaging or on the analysis of molecular biomarkers. Herein lies an additional strength of NGS platforms: they offer the versatility to examine a number of cellular properties beyond the genomic sequence itself. The range of additional applications is very broad, including transcriptomic profiling, analysis of alternatively spliced forms (and hence of the encoded proteins), and profiling of small RNA populations. NGS can also generate information on DNA methylation patterns across the entire genome and gather epigenetic information from multiple genes, which is extremely valuable in the prognosis of tumors through methylation status. NGS is well positioned to move the field of personalized medicine beyond mere data acquisition to the identification of reliable biomarkers.

Biomarkers offer great promise for improving prevention and treatment of complex, common diseases like cancer, cardiovascular disease and diabetes. Disease-related biomarkers can indicate the presence of disease (diagnostic biomarkers); the probable effect of a given treatment (predictive biomarkers); or how a disease may develop in an individual case, regardless of the type of treatment (prognostic biomarkers). In addition, drug-related biomarkers may serve as indicators of the effectiveness of certain drugs in a specific type of patient. Up to now, the clinical applicability of basic and translational research in molecular biomarkers has been relatively modest because of the genetic variation among patients with the same disease.

The revolution brought about by NGS has the potential to identify a broader range of genetic differences and support the generation of more robust biomarkers in medicine. Before the advent of NGS, diverse strategies had been utilized to discover clinically useful molecular biomarkers, all of which encountered limitations. Single-gene approaches have resulted in the development of some markers with diagnostic value, but are often inadequate for the screening of the general population or for accurate prediction of individual risk in the context of complex multifactorial diseases like cancer.

More sophisticated, multigene strategies based on gene expression profiling using microarrays and real-time PCR techniques have been used successfully in some cases, such as the treatment of breast cancer. There, they have led to a molecular classification of the disease and to two assays for node-negative early breast cancer, the 21-gene and 70-gene signatures.

However, gene-expression profiling alone has limited predictive power. In the prevention setting, genome-wide association studies (GWAS) for the identification of genetic variants linked to disease risk have also met with limitations. The approach, based on the use of microchips containing probes for tens of thousands of genetic variants, in conjunction with large databases, has led to the identification of numerous variants associated with diseases. However, these variants fall into just two categories: 1) mutations that are associated with a high risk of disease but are extremely rare in the population, and therefore have the potential to benefit very few individuals, and 2) more common mutations that are associated with a very small increase in disease risk, and are therefore of little clinical utility.

NGS and long-read technology for biomarker research

NGS offers a novel approach to overcome some of these limitations. The technology provides a high degree of flexibility and can be tailored to reach the level of resolution required for any given experiment, which is ideal in biomarker research. Assay conditions can be tuned to generate more or less data, to zoom in on defined regions of the genome at high resolution, or to provide more extensive data with lower resolution. For instance, cancer biomarkers are often based on somatic mutations present in just a small proportion of cells in clinical samples. In this case, the genome region containing the mutation must be sequenced at high levels of coverage, above 1000×, which can only be done in a cost-effective way by NGS technology. At the other end of the resolution spectrum, such as discovery of genome-wide variants, it is more useful to sequence at lower resolution, while processing a large number of patient samples to achieve greater statistical power.
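The link between sequencing depth and the ability to see a mutation carried by only a small proportion of cells can be illustrated with a short calculation. The sketch below is not a method described in this article: it assumes idealized binomial sampling of reads, no sequencing error, and a hypothetical calling rule that requires a minimum number of variant-supporting reads. The function name and its parameters are illustrative only.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of seeing at least `min_alt_reads` variant reads at one site.

    Assumes each of `depth` reads independently carries the variant with
    probability `allele_fraction` (a simplified binomial model, no errors).
    """
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("allele_fraction must be between 0 and 1")

    # Sum the probabilities of observing fewer variant reads than required.
    upper = min(min_alt_reads, depth + 1)
    p_too_few = sum(
        comb(depth, k) * allele_fraction**k * (1.0 - allele_fraction) ** (depth - k)
        for k in range(upper)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    # Hypothetical scenario: variant present in 1% of reads, 5 reads required.
    for depth in (30, 100, 1000):
        p = detection_probability(depth, 0.01, 5)
        print(f"depth {depth:>4}x: detection probability {p:.4f}")
```

Under these assumptions, a variant at a 1% allele fraction is expected to appear in about ten reads at 1000× but in fewer than one read at 30×, which is why high coverage matters for low-frequency somatic mutations. Real variant callers also model base quality and error rates, so actual sensitivity will differ.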

Figure 1 – Semiconductor-based detector chip.

Long-read NGS is enabling the generation of more accurate and extensive datasets, providing information about the location, size, and orientation of large-scale structural variants. Nabsys has developed the first solid-state measurement of individual DNA molecules. The technology uses highly scalable semiconductor chips (see Figure 1). These are similar to those found in commercial electronic devices, except that DNA molecules physically flow through the chip and are read electronically as they pass through its nano-channels at rates above one million bases per second per channel. The electronic detection system identifies sequence tags placed at different intervals, generating long-range information for assembling or mapping genomes.
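To put the quoted rate in perspective, the following back-of-the-envelope sketch converts a per-channel read rate into raw transit time. Only the figure of one million bases per second per channel comes from the text. The genome size of roughly 3.1 billion bases, the channel counts, and the assumption of continuous, uninterrupted reading are illustrative, and the function name is hypothetical.

```python
def read_time_seconds(
    total_bases: int,
    bases_per_second_per_channel: int = 1_000_000,  # rate quoted in the text
    channels: int = 1,  # hypothetical number of channels running in parallel
) -> float:
    """Raw time for `total_bases` to pass the detectors, ignoring all overhead."""
    if bases_per_second_per_channel <= 0 or channels <= 0:
        raise ValueError("rate and channel count must be positive")
    return total_bases / (bases_per_second_per_channel * channels)


if __name__ == "__main__":
    human_genome_bases = 3_100_000_000  # approximate haploid size (assumption)
    for n_channels in (1, 8):
        minutes = read_time_seconds(human_genome_bases, channels=n_channels) / 60
        print(f"{n_channels} channel(s): about {minutes:.1f} minutes per 1x pass")
```

This counts only the time DNA spends passing the detectors. Sample preparation, loading, the coverage needed for a reliable map, and data analysis are all excluded, so it should be read as a lower bound rather than a run time.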

Figure 2 – NPS8000 cart with monitor.

Nabsys technology delivers higher resolution and accuracy, together with a very rapid time to results. The DNA sequence reads are also much longer, allowing the generation of long-range information in a cost-effective way. In addition, because the detection is done electronically, the technology is inherently much less expensive than standard sequencing methods. The Nabsys NPS8000 enables a rapid, easily automated workflow (see Figure 2).

Conclusion

Biomarkers are an increasingly important tool in clinical practice. Despite extensive efforts using traditional single- or multigene molecular research or preliminary GWAS, clinical applications have been relatively modest. This is due to factors such as the currently limited knowledge of the staggering range of somatic mutations associated with disease, and the diversity of genetic variation present in populations of patients with a given disease. NGS promises to overcome some of these challenges and is invigorating biomarker research on various fronts, from the cataloguing of key mutations driving common complex diseases like cancer to the elucidation of the molecular networks underlying disease. In addition to mutations or genetic variants driving disease, different biomarkers associated with therapeutic interventions (efficacy, safety, resistance) are expected to be tested using NGS in the near future. The cost of NGS-based assays is rapidly declining and is becoming competitive with that of currently available multigene panels.

In particular, long-read NGS technology is capable of providing information over length scales large enough to reveal genome structural variants. The Nabsys positional sequencing platform offers an approach capable of generating information with unprecedented accuracy, speed, throughput and scalability, all with extremely low data burden and cost. Because the method involves analysis of single DNA molecules, it enables discrimination of multiple genome variants that may be present in a single sample containing a heterogeneous cell population. The technology is currently being used for genome mapping and assembly or structural variant detection, applications that are key in the field of biomarker research. As the throughput increases, additional applications that are difficult to carry out with current technologies will also be feasible, including mapping of heterogeneous tumor samples. While the unique challenges inherent to molecular biomarker research will require a few years to overcome, improvements are occurring rapidly, and Nabsys is well poised to keep pace with these new developments.

Darren Lee is Vice President, Business Development, Nabsys, Inc., 60 Clifford St., Providence, RI 02903, U.S.A.; tel.: 401-276-9100; e-mail: [email protected]; www.nabsys.com
