Translational medicine research (TransMed) is a new environment in which groups of stakeholders come together to review increasingly complex data with the goals of improving outcomes, getting more value from health-care spending, and developing precision therapeutics. These new stakeholders include clinicians, patients, academics, and pharmaceutical and diagnostics companies. The data they are using combine patient, ’omics, public domain, and sample sources. The advent of next-generation sequencing (NGS) and proteomics and metabolomics technologies adds to the complexity, and also to the opportunity.
The solution to making this new ecosystem work is not just about how the ’omics are sequenced and analyzed; it is about how all the data are stored and made usable by others. Here is where the modern generation of electronic laboratory notebooks (ELNs) offer not just transactional efficiency, but enable TransMed. The modern ELN is an enterprise data system with configurable user interfaces, analytics, and process control.
We remain “drinking from the firehose” of sequencers, proteomics instruments, and nuclear magnetic resonance (NMR) systems. Although the ability to sequence the genome has been revolutionized by advances in NGS, our ability to deal with the output of those systems has not kept up with the awesome advances in hardware and chemistry. In sequencing laboratories in academia and industry and—more recently—Clinical Laboratory Improvement Amendments (CLIA) labs serving diagnosis and treatment, legacy approaches have largely been used to stem the tide. This has led to a transactional approach to the problem, driven by pragmatism but destined to be overwhelmed.
Sequencing is a game of two halves
Initially, samples are managed through a “wet” process of data generation using instruments and inventories. These may be tracked by laboratory information management systems (LIMS), the workhorse of the QA/QC lab for over 30 years. What follows is a “dry” process of data manipulation and analysis that is conducted using a growing array of algorithms, data compressions, alignments, and annotations. These are the domain of bioinformaticians—the new masters of the universe in TransMed. Raw data are moved through primary, secondary, and tertiary rounds of analysis and annotation accordingly. Variants and other features are highlighted, and the analytical results are visualized.
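The primary/secondary/tertiary chain described above can be sketched in miniature. Everything below is an illustrative toy under stated assumptions, not a real sequencing pipeline: the stage functions, the naive exact-match "alignment," and the sample data are all hypothetical.

```python
# Toy sketch of the "dry" analysis chain: primary (raw output -> reads),
# secondary (alignment to a reference), tertiary (flagging features).
# All function names and logic are illustrative placeholders.

def primary_analysis(raw_signal):
    """Convert raw instrument output into normalized sequence reads."""
    return [chunk.upper() for chunk in raw_signal]

def secondary_analysis(reads, reference):
    """Locate each read in a reference sequence (naive exact match)."""
    return {read: reference.find(read) for read in reads}

def tertiary_analysis(alignments):
    """Flag reads that did not align as candidates for annotation."""
    return [read for read, pos in alignments.items() if pos == -1]

reference = "ACGTACGTGGCA"
raw = ["acgt", "ggca", "ttta"]  # "ttta" does not occur in the reference

reads = primary_analysis(raw)
alignments = secondary_analysis(reads, reference)
flagged = tertiary_analysis(alignments)
print(flagged)  # -> ['TTTA']
```

Real pipelines replace each toy stage with heavyweight tools (base callers, aligners, annotators), but the data-management problem is the same: each stage's inputs, outputs, and parameters must be captured if the chain is to be audited later.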
Many of the algorithms and tools used to do this, such as Bowtie, MATLAB® (MathWorks, Natick, MA), and R (R Foundation for Statistical Computing), are ubiquitous, but they are often supplemented by bespoke or in-house tools whose provenance is hard to identify or regulate. As a result, data generated by unmanaged analytics often cannot be compared, and retrieving data for co-workers often requires database or IT expertise. Each of these represents a critical point of failure in a data chain that needs strengthening to cope with the increasing future load.
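One remedy for unmanaged analytics is to attach a provenance record to every analysis run, so that results can be compared and audited. A minimal sketch follows; the schema, the field names, and the version string are assumptions for illustration, not any vendor's or standard's format.

```python
# Minimal provenance record for one analysis run: which tool, which
# version, which parameters, and a checksum of the exact input used.
# The field names here are illustrative, not a standard schema.
import datetime
import hashlib
import json

def provenance_record(tool, version, params, input_bytes):
    """Build an auditable record tying a result to its tool and input."""
    return {
        "tool": tool,
        "version": version,
        "params": params,
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "run_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

rec = provenance_record(
    tool="bowtie",                 # example aligner; version is illustrative
    version="1.x",
    params={"mismatches": 2},
    input_bytes=b"@read1\nACGT\n+\nIIII\n",
)
print(json.dumps(rec, indent=2))
```

With records like this stored alongside results, two analyses can be declared comparable only when their tool, version, and parameters match, which is exactly the governance the unmanaged "dry" process lacks.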
This process is highly frustrating, even for bioinformaticians. They are a scarce resource in today's research world: smart individuals with great analytical insight. They should not be spending their time raking through spreadsheets for missing records and typographical errors; they should be orchestrating and interpreting, not cutting and pasting.
For both research and regulated laboratories, while the “wet” process can be established, governed, and audited effectively with legacy tools, the “dry” process often remains highly variable, unaudited, and ungoverned. The same focus on efficiency and informatics (the combination of technology, process, and people) needs to be applied to this “dry” process if we are all to cope with, and benefit from, this new data explosion. In short, we need end-to-end data management in genomics.
Data management in TransMed and genomics: Can ELNs do the job?
Researchers know what their notebook is for. It is where they store hypotheses and a record of what they did, when, and how. It is critical during active scientific debate and, once archived, invaluable for retrospective research and IP filings.
All of these capabilities are as important to genomics today as they were to Watson, Crick, and Franklin as they developed the hypotheses and data to support the double-helix model. What they did not have is today's computing power to lighten the analytical load and to store and provide access to hundreds of analyses and raw data files.
ELN use is growing at over 15% per year in sectors of today's research and development (R&D) industry, notably pharma, nutrition, and chemicals.1 But can ELNs offer a route to address today's issues in genomics and TransMed?
It is important to itemize what this environment ideally needs from an ELN. If all you are after is paper replacement in genomics, then there is little benefit. A TransMed ELN must confront the problem by harnessing computing power and software.
First, the ELN needs process execution to govern the "wet" work and even drive LIMS to manage samples and generate the data. Second, it needs to enable all researchers to orchestrate and capture the output from bioanalytical workflows written and optimized by bioinformaticians. It also must store sample-centric ’omics data (not just genomics) and patient-centric clinical data in data stores that are built to house them. In addition, the ELN should allow secure yet easy data searching, review, and sharing. Last, it must be able to work in regulated and unregulated environments and provide an audit trail for all data items.
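The audit-trail requirement above can be pictured as an append-only log: every change to a data item is recorded with who, what, and when, and nothing is ever overwritten. The structure below is a hypothetical sketch, not an ELN product's API.

```python
# Append-only audit trail sketch: each change is a new entry; history
# is never edited in place. Field names are illustrative assumptions.
import datetime

audit_log = []

def record_change(item_id, user, action, detail):
    """Append one immutable entry describing a change to a data item."""
    audit_log.append({
        "item": item_id,
        "user": user,
        "action": action,
        "detail": detail,
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

record_change("sample-042", "cmolloy", "create", "registered sample")
record_change("sample-042", "cmolloy", "update", "attached variant results")
print(len(audit_log))  # -> 2: full history retained
```

In a regulated (e.g., CLIA) setting the same idea is backed by access control and tamper-evident storage, but the principle is identical: the trail grows, it never shrinks.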
Unless an ELN system has this power and breadth, it will become a legacy approach faster than today’s #1 on the App Store. Fortunately, there are enterprise class systems available that provide robust capability in this area, such as the E-WorkBook (IDBS, Guildford, Surrey, U.K.). IDBS was recently rated “Very High” by Gartner Inc. in TransMed.2 According to the report, a “Very High” rating “means the vendor has excellent scientific domain capabilities and is approaching the ‘de facto’ standard.”
One thing is certain: The volume, variety, and velocity of data in TransMed are on the rise. It is essential to manage those data from sample to insight so that their veracity can be assured. Today's more advanced ELNs do this; if put to work across translational medicine research, every stakeholder in this brave new world will see the benefits.
1. Atrium Research. 5th Electronic Laboratory Notebook Survey; http://www.atriumresearch.com/library/A12-02%20Atrium%20Research%205th%20ELN%20Survey%20Brochure.pdf.
2. Shanler, M. Manufacturers Must Consider Scientific Domain Expertise During ELN Selection; Gartner, Jan 11, 2013; http://www.gartner.com/id=2301516.
Chris Molloy is VP Corporate Development, IDBS, 2 Occam Ct., Surrey Research Park, Guildford, Surrey GU2 7QB, U.K.; tel.: +44 1483 595 000; e-mail: CMolloy@idbs.com.