Rater reliability is the cornerstone of data accuracy for drug trials that depend on clinician-rated subjective instruments
to assess efficacy. Every pharmaceutical company seeks unambiguous evidence that its product is the new best treatment for
a given disorder. Increasing governmental scrutiny of new drug applications, rising product development costs, and growing
industry competition are pushing companies to improve the margin of success for their clinical trials.
Demonstrating efficacy requires a confluence of excellence across multiple areas of clinical trial design and implementation.
Protocol design issues, clinical features of the targeted disorder, the effectiveness of the drug, and study implementation
factors all interact to affect signal detection. Pharmaceutical companies must consider all of these areas with meticulous
attention to effectively demonstrate a significant drug effect.
In particular, measurements of efficacy must be reliable for a trial to achieve statistically significant results. Some clinical
trials employ outcome measures generally considered to be objective in nature, such as blood tests or other laboratory values.
Other types of clinical trials depend on the proficiency of clinician raters to assess the presence and severity of symptoms
based upon subjects' reports during clinical interviews. Clinical trials that rely on subjective assessments, whether reported
by a patient or collected by a clinician, are far more susceptible to interpretation and fluctuation than trials whose efficacy
measurements involve objective changes in laboratory-measured variables.
Researchers involved in trials employing objective efficacy measures might be surprised to learn how heavily some clinical
trials depend on the accuracy of a subject's self-report or a clinician rater's distillation of a subject's reported symptoms.
Currently, subjective ratings are the only practical and effective methods available to assess physical and emotional states
such as depression and pain, despite the almost certain biological origins of these central nervous system disorders. Clinician-rated
quality-of-life assessments also depend on patients' reports of their functional and emotional states. Confounding variables
include subjects' ability to report their symptoms and symptomatic changes accurately, clinician raters' interviewing
skills, and raters' ability to be objective and investigative when collecting, assessing, and reporting data from their
subjects.
This article examines the rater reliability landscape in trials utilizing clinician-administered, clinician-rated outcome
measures. It also addresses challenges to the consistent acquisition of data, guidance considerations, and methods to improve
the outcome.
Measurement challenges
Obtaining reliable results with clinician-administered and clinician-rated subjective measures poses unique challenges to the clinical
trial industry. Variation in administration technique and scoring for this type of subjective scale is widespread.1
Generally, pharmaceutical companies select the trial sites that they believe will be best able to deliver accurate results.
The importance of integrity at the level of the site cannot be overestimated. Previous experience with sites gives pharmaceutical
companies and clinical research organizations useful information regarding the investigators at each site. While financial
and time pressures may generate enrollment bias, sponsors and clinical research organizations work diligently to prevent this
from occurring.
At the designated sites, clinical investigators may interview and rate subjects themselves or delegate one or both of these
tasks to other clinicians at the site. When investigators delegate subject rating to other clinical staff members at their
site, they rarely have time to supervise the scale-specific interviews and scoring on a regular basis. Raters in the United
States, when examined as a group, have diverse educational backgrounds and varied experiences with patients in any given disease
population.2
In addition, endpoint selection should be predicated on both regulatory pathways and scientific precedent;
reliable, valid measurements are a requirement.