The last decade has seen
huge advances in biological
research. The
advent of genomics
and proteomics, as well as experimental
approaches such as next-generation sequencing and high-throughput screening, has led to
a more in-depth understanding of
systems biology and the discovery
of new biomarkers. Biotherapeutics
innovation is thriving
as research organizations move
beyond small molecules to explore
antibodies, vaccines, siRNA, and
other biologics as potential drug
candidates. Researchers are gaining
new insights every day from
their work with stem cell lines, plasmids,
algae strains, and more. A key factor in the
success of these initiatives is an ability to
“connect the dots,” i.e., have a way to systematically
track biologic research data in
order to compare and analyze experimental
findings; better understand the relationships
between biologic entities; and,
most importantly, make discoveries that
have not been made before. The Accelrys
Biological Registration system (Accelrys,
San Diego, CA) is a tool to help researchers
reach these goals.
How, for example, can an organization
quickly determine whether a promising
siRNA candidate has already been patented?
What is the safety code associated with a specific
biologic entity? Has another department
or project team within the organization conducted
experiments involving the same ribonucleic
acid sequence currently being studied?
Figure 1 - Registration, in this case of biological entities, is a fundamental
process to ensure subsequent data aggregation. The registration number/ID
acts as the key used to track all activity against that entity. This is a
requirement for any overall master data management program.
The problem is that making these types
of critical associations is often far easier
said than done in modern scientific
enterprises, where researchers are challenged
to keep up with enormous volumes
of data. To alleviate these challenges,
speed innovation, and boost
competitive advantage, a reliable and
scalable system for biological registration
is needed (Figure 1).
Millions of needles; thousands of haystacks
Biological research data are vast and
extremely complex, often spanning
thousands and even millions of experimental
protocols, proteins, cell lines,
compounds, and more. As research
operations become increasingly global
in scope, these data are typically distributed
across geographies, organizational
departments, and project teams,
and locked within discipline- and
format-specific systems, instruments,
and databases. Adding to the chaos are
data that need to be incorporated from
the literature and public databases such
as GenBank and other government or
academic databases. Researchers need
these external sources so that they can compare
their work against what has already
been done by other organizations and
take advantage of existing knowledge in
their fields.
What all of this means is that accessing
information on just a single biological
entity—in order to find out what
is known about it, which scientists are
working with it, and what processes are
involved in producing it—can be like
finding a needle in a haystack, or, more
accurately, finding a needle that may be
located within hundreds or thousands of
haystacks. Multiply this problem across
the millions of possible entities
a research organization may
want to study, and it is easy to
see the importance of being able
to uniquely identify and track
biologics across data systems and
knowledge sources.
Biological registration: The next informatics frontier
Figure 2 - The registration process protects intellectual property
(IP) by establishing a unique ID for the biological entity and
giving the entity a time stamp (Original Electronic File, OEF).
The ID acts as a unique key to track the entity through the screening
database, inventory system, and documents. This key is what
allows the entity and all of its data to be aggregated throughout the
R&D enterprise.
Pharmaceutical organizations
have relied on registration systems
in the chemical realm for
years, using them to identify and
track chemical compounds during
the drug discovery process. Similarly,
a system for biological registration
can help scientists protect their life science
discoveries, keep tabs on experimental
progress, access related information
about promising biologic candidates, and
build on the valuable research that has
come before, both within their own organizations
and across the broader scientific
community (Figure 2).
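
The role of the registration ID as a shared key can be illustrated with a short Python sketch. This is not the Accelrys system or its API; the class names, the "BIO-" ID format, and the choice to treat an entity type plus sequence as the identity are all assumptions made for illustration. The sketch assigns a unique ID and a time stamp on first registration, returns the existing record when the same entity is submitted again, and lets other systems attach their records to that ID.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count


@dataclass
class RegisteredEntity:
    reg_id: str              # unique key; the "BIO-000001" format is hypothetical
    entity_type: str         # e.g. "siRNA", "protein", "plasmid"
    sequence: str
    registered_at: datetime  # time stamp taken at first registration
    links: dict = field(default_factory=dict)  # system name -> record references


class Registry:
    """In-memory stand-in for a registration database (illustrative only)."""

    def __init__(self):
        self._counter = count(1)
        self._by_id = {}
        self._by_content = {}

    def register(self, entity_type, sequence):
        # Assumption: type plus normalized sequence defines identity.
        key = (entity_type, sequence.upper())
        if key in self._by_content:
            # Already registered, possibly by another team: reuse the record.
            return self._by_content[key]
        reg_id = f"BIO-{next(self._counter):06d}"
        entity = RegisteredEntity(
            reg_id, entity_type, sequence.upper(), datetime.now(timezone.utc)
        )
        self._by_id[reg_id] = entity
        self._by_content[key] = entity
        return entity

    def link(self, reg_id, system, record_ref):
        # Attach a record from another system (screening, inventory, documents).
        self._by_id[reg_id].links.setdefault(system, []).append(record_ref)

    def aggregate(self, reg_id):
        # Everything known about the entity, grouped by source system.
        return dict(self._by_id[reg_id].links)
```

In this sketch, a second team that registers the same siRNA sequence gets back the original ID and can call `aggregate` to see the screening and inventory records already attached to it, which is the "connect the dots" step described above.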
Improved operational efficiency
It is not unusual for scientists to spend
50% or even 75% of their time searching
through databases, formatting and collating
information for analysis, and comparing
results across departments and disciplines.
This is hugely wasteful. An ability
to quickly find information on biologics
empowers researchers to spend less time
managing data and more time on actual
science. The resulting efficiency gains will
not only speed the discovery process, but
will also save money and resources.
Reduced redundancies
Registration systems can also help scientists
more easily find out if similar research is being undertaken elsewhere in
the organization, and thus reduce
duplicate efforts. This is an issue
that has been exacerbated by the
trend toward distributed global
operations. For example, a project
team located in China may have
already run numerous experiments
on a protein that another group
of scientists located in the U.S. is
also interested in studying. With
a registration system, this existing
knowledge can be reused, avoiding
redundant experimentation
(including its associated costs).
Increased safety
By linking safety codes to a
unique biologic ID, organizations
can more effectively ensure that their
researchers are aware of, and can mitigate,
biohazard risks. Without registering
safety information, organizations can miss
vital information about health and safety
issues associated with the entities they are
working with, especially when research is
handed off to separate teams and specialists
during the course of a project.
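
As a rough illustration of linking safety codes to a biologic ID, the Python sketch below keeps a lookup from registration ID to safety codes and produces a summary when work is handed to another team. The function names, the ID format, and the example codes are hypothetical and are not taken from any particular registration product.

```python
from collections import defaultdict

# Registration ID -> set of safety codes (illustrative in-memory store).
_safety_codes = defaultdict(set)


def attach_safety_code(reg_id, code):
    """Record a safety code against a registered entity."""
    _safety_codes[reg_id].add(code)


def safety_codes_for(reg_id):
    """Return the codes on record for an entity, sorted for stable display."""
    return sorted(_safety_codes.get(reg_id, ()))


def handoff_summary(reg_id, to_team):
    """Build the safety line that travels with an entity at handoff."""
    codes = safety_codes_for(reg_id)
    if not codes:
        # A missing record is flagged rather than silently treated as safe.
        return f"{reg_id} -> {to_team}: no safety codes on record; review before use"
    return f"{reg_id} -> {to_team}: handle under {', '.join(codes)}"
```

Because the codes are keyed by the registration ID rather than stored in one team's notebook, the receiving team sees the same safety information as the originating team.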
IP protection
Imagine being hit with a $1 billion patent
lawsuit after spending years and millions
of dollars bringing a biotherapeutic
to market. Such cases do happen,
and they are more likely
when researchers have no systematic way
of comparing their biologic candidates
with patent databases and other public
sources of information. It is absolutely
critical to be able to identify potential
patent conflicts before investing a great
deal of time and money in R&D, and
equally critical to be able to protect
potentially lucrative IP from competitive
infringement. By tracking the history,
experimental protocols, and processing
steps associated with a biologic entity,
and by identifying similar efforts, registration
offers a cost-effective insurance
policy against such risks.
A foundational, integrated approach
Biological registration systems are
needed so that research organizations
can track biologics and their relationships,
creating critical intellectual
property positions as well as connections
to past research and manufacturing
processes. Yet, unlike the small molecules
tracked by chemical registration
systems, biological entities such as proteins,
antibodies, vaccines, viruses, or
siRNA are notoriously complex and difficult
to identify in a consistent manner.
For example, if a protein is expressed in
two different cell lines, one scientist
may consider them to be the same thing
because they share the same amino acid
sequence, while another scientist may
consider them to be different because of
different glycosylation patterns. Additionally,
biologics typically comprise
anywhere from hundreds to millions of
atoms, compared to the 20–100 found
in small molecules. Finally, the knowledge
base surrounding biological entities
is continually evolving. Two observations
that are seemingly unrelated
today may lead to an unexpected connection
tomorrow.
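
The protein example above can be made concrete with a short Python sketch in which the identity rule is a pluggable function. The field names, the two rules, and the sample values are assumptions chosen for illustration; a real system would need far richer descriptions of a biologic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinBatch:
    sequence: str       # amino acid sequence
    cell_line: str      # expression host
    glycosylation: str  # simplified label for the glycosylation pattern


def key_by_sequence(p):
    # Rule 1: two proteins are the same entity if their sequences match.
    return (p.sequence,)


def key_by_sequence_and_glycoform(p):
    # Rule 2: the glycosylation pattern is part of the identity as well.
    return (p.sequence, p.glycosylation)


def same_entity(a, b, key):
    """Compare two batches under the chosen identity rule."""
    return key(a) == key(b)


# Hypothetical values: one sequence expressed in two different cell lines.
from_cho = ProteinBatch("MKTAYIAK", "CHO", "complex")
from_hek = ProteinBatch("MKTAYIAK", "HEK293", "high-mannose")

print(same_entity(from_cho, from_hek, key_by_sequence))                # True
print(same_entity(from_cho, from_hek, key_by_sequence_and_glycoform))  # False
```

Keeping the rule separate from the data is one way to let the definition of "the same entity" change as the science evolves without re-entering the underlying records.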
Getting biological registration right
requires a foundational approach that
takes into account the complexity inherent
in the field, and one flexible enough
to evolve with the science. The Accelrys
Biological Registration system is an “intelligent”
solution for registering, associating,
searching, and retrieving data for entities
such as siRNA, plasmids, cell lines,
proteins, antibodies, vaccines, and future
biological entities. The system was developed
through a consortium approach that
involved close collaboration with leading pharmaceutical companies, including
Merck & Co., Inc. (Whitehouse Station,
NJ) and Abbott Laboratories (Abbott
Park, IL). The consortium approach was
critical because it enabled Accelrys to
incorporate “in the trenches” insight from
real-world end users about what capabilities
are most important.