 Wayne R. Kubick
In clinical development, the gold standard for establishing a therapy's safety and effectiveness is, of course, the controlled
clinical trial. Yet we all know that the body of data accumulated during trials is not sufficient to adequately predict a
product's future performance in the real world.
Clinical trials generally involve only a very small, carefully selected patient population, which is hardly representative of
the much larger and more diverse expanse of patients who may be exposed to a new product after launch.
Thus, the spontaneous reporting system was established to help fill the gap by monitoring potential drug-related events for
marketed products. But despite its undeniable value for pharmacovigilance, it too suffers from such limitations as underreporting,
bias, and uneven data quality. And the spontaneous reporting system can't really compare the number of reports received to
the number of patients who have been exposed to a drug.
Taken together, these two data sources, while necessary and important, are hardly sufficient to fully understand the safety
and effectiveness profile of a new therapy. That's why the promise of using longitudinal health care data for research is
so appealing. While not exactly a new idea—epidemiologists have been conducting formal, protocol-driven, noninterventional studies using
observational health care data for quite some time—it's an idea that has begun to captivate a much broader community of researchers
in government, academia, and industry.
Mounting attention
It seems like everyone is getting interested in using real-life observational patient data for multiple purposes, such as
evaluating drug safety and effectiveness, and improving both the quality and economy of health care treatments.
We're seeing evidence of this mounting interest all around us:
- The FDA Amendments Act of 2007 established the Reagan-Udall Foundation and, among other goals, mandates a postmarket
risk identification and analysis system able to analyze safety data from at least 100 million patients by July 1, 2012.
- The FDA is initiating or participating in a number of Critical Path Initiatives such as the Sentinel Network and the eHealth
Initiative.
- The public/private Observational Medical Outcomes Partnership (OMOP) is establishing a laboratory to explore the potential
of using such data sources for drug safety evaluation purposes.
- The American Recovery and Reinvestment Act, among its many other features, includes funding to accelerate the adoption of
electronic medical records (which will provide much of this observational data), and an intriguing investment for the "comparative
effectiveness" study of health care treatments.
Data power
Why the recent spike in interest? A primary factor is that more electronic health data are now becoming accessible, and the
increased traction of rapidly improving data standards promises to make data more suitable for different types of research
and analysis than could be contemplated previously.
This means that more data will be coming at us from many different directions. But what types of things do people want to
do with these data? Well, for starters, some typical objectives include:
- Providing early warnings and supporting rapid assessment of potential safety issues
- Exploring and verifying safety signals in real-world patient populations
- Improving understanding of disease characteristics, disease progression, and treatment outcomes, which in turn may help guide
the development of new or improved therapies
- Mining the data for interesting new relationships (e.g., dependencies, interactions) that the data may suggest
However, there are many issues to confront first. Reliable de-identification is critical, yet once data are de-identified, it often
becomes difficult to link together related records from different data sources, such as tying patient records to prescriptions
and lab results.
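One common way to ease this tension, offered here only as a minimal sketch and not as a method any of the initiatives above is known to use, is to replace each patient identifier with a keyed hash before the data leave the source. The same patient then receives the same pseudonym in every data set, so records can still be joined without exposing the original identifier. The key, field names, and sample records below are all hypothetical, and real de-identification involves far more than hashing a single field.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held by a trusted party and never
# shipped alongside the research data.
LINKAGE_KEY = b"example-key-not-for-real-use"


def pseudonym(patient_id: str, key: bytes = LINKAGE_KEY) -> str:
    """Return a stable keyed pseudonym (HMAC-SHA256) for a patient identifier."""
    normalized = patient_id.strip().upper()  # tolerate minor formatting differences
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records):
    """Drop the raw identifier from each record and add its pseudonym as 'pid'."""
    cleaned = []
    for record in records:
        rec = {k: v for k, v in record.items() if k != "patient_id"}
        rec["pid"] = pseudonym(record["patient_id"])
        cleaned.append(rec)
    return cleaned


# Made-up records from two hypothetical sources.
prescriptions = [
    {"patient_id": "mrn-001", "drug": "drug A"},
    {"patient_id": "MRN-002", "drug": "drug B"},
]
lab_results = [
    {"patient_id": "MRN-001 ", "test": "ALT", "value": 88},
]

rx = deidentify(prescriptions)
labs = deidentify(lab_results)

# Index lab results by pseudonym, then attach them to each prescription.
labs_by_pid = {}
for lab in labs:
    labs_by_pid.setdefault(lab["pid"], []).append(lab)

linked = [(p, labs_by_pid.get(p["pid"], [])) for p in rx]
```

In this sketch, linkage works only if every source applies the same key and the same normalization, which is itself a governance question: whoever holds the key can re-create the pseudonyms.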
Health care data are also a valuable commodity: often expensive to acquire, and supplied by many different providers in many different
proprietary formats. Such formats are not always suitable for analysis, and the data are often less current than researchers would like.