Imagine a scientific enterprise in which instruments are fully enabled for access anywhere at any time, with the proper authorization. As instruments produce data, they are buffered locally but then stored either in an enterprise-level informatics system belonging to an organization such as a large pharmaceutical or chemical company, or in the cloud if the organization does not want to invest in traditional information technology infrastructure. Chromatography data systems (CDS), electronic lab notebooks (ELN), and enterprise content management (ECM) systems will soon become more scalable and integrated. Architecture will be completely open, and major vendors will work together to create a community that provides more useful applications than individual manufacturers can now furnish. Researchers will be able to collaborate more easily across global work groups, sharing new data instantaneously as data are transformed into increasingly useful formats.
There is great potential for productivity gains in all organizations in the future, regardless of size. The first step toward achieving these gains is for the software and instrument teams in the industry to work together to achieve common standards. Legacy laboratory information management architecture is built around an individual data system for each instrument. These data systems can be networked, but are not designed to access all the benefits of a networked environment. Each data system is its own island. Administration of these systems is largely performed manually, an expensive and time-consuming process.
Good laboratory practices (GLP), regulatory compliance, and security concerns required improvements, including authorization of system access and control over the versions of software in use. It became obvious that organizations could no longer treat their laboratories as islands, and they began aligning instrument data systems with their centralized IT programs.
Today, instrument manufacturers like Agilent Technologies (Santa Clara, CA) have developed network-ready, standardized software that can be scaled from the desktop to the work group as well as across the entire enterprise. These systems allow laboratory and IT managers to centralize administration, leverage infrastructure, improve security, facilitate regulatory compliance, reduce costs, and share data across the enterprise. For example, shared services can now deliver benefits such as single sign-on, the ability to group projects, and user identity-based permissions. Access is controlled and the systems are secure. Processes can now be distributed across multiple workstations. When advanced reporting tools are coupled to the shared services environment, users can perform and automate cross-sample, cross-batch reporting. Chromatography systems can now be integrated with ECM systems so that there is an underlying archival and retrieval system that makes it possible for organizations to organize, share, retrieve, and thus raise the value of their data.
Today’s systems have improved tremendously over the last 10 years, but gaps remain. One is managing, interpreting, and archiving the mountains of data generated from diverse instruments. Increasingly, complex and large data files (sometimes measured in petabytes, in multiple formats) must be interpreted and archived. This trend shows no signs of slowing. The number of isolated pools of data is also growing, and there are no standards that allow data integration and sharing among different instrument types and brands.
Managers are increasingly seeing value in integrating data from multiple life science disciplines to obtain both scientific and business gains. Predictive analyses driven by systems biology are becoming more commonplace approaches to solving challenges in agriculture and health care. With the growth in systems biology, complex data from multiple, interdisciplinary sources around the globe must be integrated, analyzed, correlated, and visualized, increasing the demand for fast, unconstrained, yet secure access.
Future-facing managers are investing in “collaboration services.” To facilitate collaboration, software vendors must develop and implement standards and open systems. Many collaborative research teams are globally distributed and use multiple unique systems with numerous different functions in various languages and interfaces. The informatics environment of the future must provide a common foundation by which distributed users can integrate their expertise and perspectives in different languages.
Emerging tools: Web services and cloud computing
The advent of Web services and cloud computing is providing economical solutions to many organizations, especially those not investing in large, traditional informatics infrastructure. Web services extend users’ working environments into their community of collaborators, or enable them to broker their services across a larger community of interest, without the overhead of a central IT organization and a large data center.
Cloud computing, a relatively new delivery model for IT and business applications, is made possible by remote computing over the Internet. Users access applications through their Web browser as if they were installed on their own computer, and do not need to have expertise in or control over the infrastructure that supports them. Application software and data are stored on servers managed by the provider. The pharmaceutical industry is at the forefront of this trend.
Cloud computing is particularly well-suited for organizations with smaller, specific requirements, such as academia and small contract research organizations (CROs). They can scale systems to their cost and flexibility needs. Small organizations can use many applications at a lower cost of entry compared with the traditional model. Cloud computing is also highly flexible, scaling up or down as needs change. Web services also enable software vendors to provide remote support and monitoring to small CROs that may not have their own IT support infrastructure.
There are additional benefits. Users can access their systems from anywhere at any time. This architecture readily accommodates distributing work flows across remote work groups. For example, many experiments are run overnight, and it is not uncommon for technicians to return the following morning to discover problems. Imagine the productivity gain if the analyst receives alerts on his or her smartphone when problems occur. However, like any major emerging technology, cloud computing is raising some concerns. Applications can vary in the level of security supplied. Organizations are asking: Are the servers physically secure? Could my data be hacked?
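The alerting work flow described above can be sketched in a few lines. This is an illustrative example only; the status-record fields and function names are invented for the sketch and do not correspond to any vendor's actual API.

```python
# Minimal sketch: scan a (hypothetical) set of overnight run records and
# collect alert messages that could be pushed to an analyst's smartphone.
# All field and function names here are illustrative, not a real vendor API.

def check_runs(status_records):
    """Return alert messages for any runs that reported a problem."""
    alerts = []
    for record in status_records:
        if record.get("state") == "error":
            alerts.append(
                f"Run {record['run_id']}: {record.get('detail', 'unknown fault')}"
            )
    return alerts

overnight = [
    {"run_id": "A-101", "state": "complete"},
    {"run_id": "A-102", "state": "error", "detail": "pressure limit exceeded"},
]
print(check_runs(overnight))
```

In practice the alert delivery itself (e-mail, SMS, or push notification) would be handled by the networked data system, but the essential gain is the same: the analyst learns of the failed run at 2 a.m. instead of 8 a.m.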
Natural language processing and semantic Web tools
Two other technologies that promise to facilitate the development of collaborative informatics are natural language processing and semantic Web tools. Natural language processing is a field of computer science focused on enabling computers to interpret and generate human language. The goal of semantic Web tools is to make Web-based information understandable by computers, so that computers can perform the tedious work of finding and integrating information across applications and systems.
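The semantic Web idea can be illustrated with a toy example: data expressed as subject-predicate-object triples that a program can query directly, rather than text a human must read. The vocabulary below is invented for illustration.

```python
# Illustrative sketch of semantic-web-style data: facts stored as
# subject-predicate-object triples. The terms used are hypothetical.

triples = [
    ("sample_42", "analyzed_by", "LCMS_unit_3"),
    ("sample_42", "contains", "caffeine"),
    ("LCMS_unit_3", "located_in", "lab_B"),
]

def query(pattern, data):
    """Match a (subject, predicate, object) pattern; None is a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]

# Everything known about sample_42, found by the machine, not a reader:
print(query(("sample_42", None, None), data=triples))
```

Real semantic Web systems use standardized formats (RDF) and query languages (SPARQL) for this, but the principle is the same: once facts are structured, computers can do the finding and integrating.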
The need for industry-wide standards
System vendors can help to advance laboratory computing in a number of ways. The development and implementation of industry standards is a critical step. There is a fundamental need for a standard format to enable data integration and collaboration across communities of users in a straightforward and easy way. The current trend is an XML format, but the industry has not standardized on a specific format. It is not clear that this format will work across all of the tools and disciplines that must be integrated in, for example, a systems biology approach.
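To make the appeal of an XML interchange format concrete, here is a minimal sketch of serializing an instrument run to XML. The schema (element and attribute names) is invented purely for illustration; as noted above, the industry has not standardized on one.

```python
# Hedged sketch of an XML interchange format for instrument results.
# The element and attribute names below are hypothetical, not a standard.
import xml.etree.ElementTree as ET

def run_to_xml(run_id, instrument, peaks):
    """Serialize one run's peak list to an XML string."""
    root = ET.Element("run", id=run_id, instrument=instrument)
    for compound, area in peaks:
        ET.SubElement(root, "peak", compound=compound, area=str(area))
    return ET.tostring(root, encoding="unicode")

xml_text = run_to_xml("R-001", "LC-2010", [("caffeine", 12345.6)])
print(xml_text)
```

The value of a shared schema is that any vendor's data system could emit and consume records like this one, making cross-platform integration a parsing exercise rather than a reverse-engineering project.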
Although it has been difficult for manufacturers to arrive at a data structure that works across the range of analytical instruments that laboratories use, forward-thinking informatics suppliers are working in this direction. In addition to standards, manufacturers need to provide open instrument control. To do their work, laboratories must use a variety of analytical instruments from a variety of vendors. Laboratories also need standard operating procedures (SOPs) that enable them to work across all instrument types. Manufacturers like Agilent now provide an open instrument control framework (ICF), enabling laboratories to control their instruments from one data system. The open ICF also makes it easier for third-party software developers to develop supporting products or quickly add new instrument control to existing data systems.
Reporting is another opportunity in standardization. Laboratories should not need to use a unique reporting engine for each vendor’s instruments. Today, when the same experiment is run on two different vendors’ systems, laboratory managers have to review two different sets of reports. They waste time determining the differences because of the reporting, versus the differences between the samples. Manufacturers need to provide a single reporting engine that is able to function across instrument types and vendors.
Figure 1 - Priorities of laboratory data stakeholders.
While standards and open systems provide the foundation for collaboration, organizations need systems that also facilitate collaboration, leverage experts across their organization, and help them extract value from huge amounts of data, all while protecting their intellectual property (Figure 1). The Agilent OpenLAB, for example, is an operating system for the laboratory that integrates analytical instrument control and data analysis, enterprise content management, and laboratory business process management into a single, scalable Web-based system. For collaboration, the OpenLAB electronic laboratory notebook is an open, scalable, integrated platform for creating, managing, sharing, and protecting data in a complex global environment. It is available in Japanese, Chinese, and Korean, enabling researchers to collaborate easily in their own languages and in compliance with local requirements.
Electronic laboratory notebooks
ELNs are gaining popularity throughout the research community, helping organizations better collaborate, access data more easily, and archive knowledge more effectively. However, there is still room for better tools to present data in ways that can be understood and interpreted by a variety of disciplines and languages. Advances in data reduction, visualization, modeling, and simulation are desperately needed to help find true insight among the petabytes of data.
Many organizations utilize a hierarchy of software capability with instrument control and data collection at the bottom. Accelerating decision-making and problem-solving requires a higher level of capability that integrates public and private data from disparate sources to facilitate knowledge-sharing across disciplines and geographies, while protecting intellectual property. Thus, today’s manufacturers now focus product enhancements on data analysis, interpretation, visualization, and reporting. The goal is to help scientific organizations arrive at answers faster and more efficiently than ever before, even across global enterprises.
Linda Doherty, Ph.D. is Senior Product Manager, Enterprise Software Marketing, Agilent Technologies, Inc., 5301 Stevens Creek Blvd., Santa Clara, CA 95052, U.S.A.; 408-553-7504; e-mail: firstname.lastname@example.org.