* From the Department of Family Practice, Medical College of Virginia, Fairfax, VA 22033.
| Introduction |
|---|
| Definition of Focus |
|---|
Target Condition
The condition of interest was VAP. Pneumonia in patients not
receiving ventilatory support (even those in critical care settings)
was beyond the scope of the panel review. Although this report focuses
on the sensitivity and specificity of tests in detecting VAP, an
equally important outside consideration is the accuracy of these tests
in identifying pathologic organisms of clinical significance. Some
tests that are highly sensitive in detecting VAP lead to overdiagnosis
because they identify pathologic organisms that are unrelated to the
patient's illness, thereby triggering unnecessary antibiotic therapy.
Patient Population
Included in the patient population were immunocompetent adults
receiving ventilatory support in hospital or long-term care settings.
Children, adolescents, and immunocompromised patients, including
patients with AIDS, were excluded.
Providers
The target audience for the guidelines included pulmonary medicine
specialists and other physicians, such as internists, surgeons,
anesthesiologists, and infectious-disease specialists whose critical
care patients require ventilatory assistance. Other providers who care
for such patients, such as critical care nurses and respiratory
technicians, also may find the guidelines useful.
Interventions
The panel limited its review to the following diagnostic areas:
clinical features; chest radiography; culture or Gram's stain;
endotracheal aspiration specimens; antibody coating; elastin fiber
assessment; bronchoscopic BAL specimens; protected-specimen brush (PSB)
specimens; and blinded invasive diagnostic procedures. Other
diagnostic areas were not covered. Although the effectiveness of
treatment for VAP was beyond the scope of the project, it is critically
important in evidence-based guidelines for VAP and cannot be ignored in
evaluating the use of diagnostic tests. The indications for testing and
effectiveness of test procedures are linked to whether the test
information will influence treatment and patient outcomes, and if so,
in what way. These matters and the investigation of treatment failures
in VAP are discussed in this document.
Measures of Effectiveness
The panel could not address whether testing improves health
outcomes or whether the benefits of testing outweigh risks, although
some chapters discuss the potential risks of certain diagnostic
procedures. Ultimately, the health outcomes of diagnostic testing must
be addressed to formulate evidence-based guidelines on treating VAP.
Admissible Evidence
Evidence considered relevant to the review consisted of prospective or
retrospective studies of diagnostic testing in immunocompetent adults
with VAP, published after 1966 in English-language reports. To be
admissible, studies had to provide data on sensitivity or specificity,
or the raw data for calculating them, and a reference standard for how
VAP was defined. Also included were studies of the epidemiology of VAP
and the risk factors involved. Non-English-language publications and
unpublished studies were excluded.
| Review of Evidence |
|---|
Literature Search
The MEDLINE database was searched for articles published from 1966
through 1995 by exploding the term "pneumonia" and the MeSH terms
"cross infection/artificial respiration" or the text words
"ventilator associated pneumonia." Citations in this set were
cross-referenced with articles retrieved by exploding the text word
"diagnosis," MeSH terms "sensitivity and specificity," and text
words "BAL," "bronchoscopy," "protected brush catheters,"
"predictive value," and "likelihood ratio." Results of the
computerized search were supplemented by examining personal files,
other studies known to panel members, and reference lists of all
primary studies and review articles retrieved in locating relevant
studies.
Analysis of Individual Studies
The quality of individual studies was judged using specific
criteria for evaluating internal and external validity. Criteria for
judging internal validity included the following: sample size,
selection bias, definition of interventions and outcomes, and
confounding variables. Criteria for judging external validity related
to how well the results could be generalized to patients and conditions
outside the study settings. Several central principles in evaluating
diagnostic test performance, outlined below in the section
"Principles for Evaluating Diagnostic Test Performance," were
especially important in judging study quality.
Grading systems for judging the quality of evidence typically identify randomized, controlled trials as the "gold standard," followed by controlled observational studies, descriptive epidemiology studies, and case reports. This paradigm is not useful in evaluating studies of test accuracy, because randomized, controlled trials are not necessarily the best setting for evaluating diagnostic test performance. Therefore, this report relies on narrative descriptions of study quality, rather than on rating schemes.
Synthesis of the Results
The evidence was summarized in narrative text and evidence tables.
In addition to presenting the results of the studies, the tables
compare the study designs according to the panel's criteria for
judging quality. Data on the sensitivity and specificity of tests were
not pooled through meta-analysis to obtain an overall estimate of test
performance. The significant variability in research methods, study
populations, and definitions across studies made such a synthesis
invalid.
| Development of Recommendations |
|---|
A: Recommendation based on direct scientific evidence;
B: Recommendation based on scientific evidence, supplemented by expert opinion;
C: Recommendation based on expert opinion alone; and
D: There is no definitive evidence or consensus opinion.
| Outside Review |
|---|
| Principles for Evaluating Diagnostic Test Performance |
|---|
| Validity |
|---|
Two closely related measures are positive and negative predictive value. Positive predictive value (PPV) is the proportion of patients with an abnormal test result who have the condition (the denominator is the number of persons with positive test results). Negative predictive value is the proportion of patients with normal test results who do not have the condition (the denominator is the number of persons with negative test results). The formulas for these measures are presented in Table 1.
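The four measures can be computed directly from the counts in a 2 x 2 table. The following is a minimal sketch in Python; the function name `two_by_two_measures`, the argument names, and the counts in the usage lines are illustrative assumptions, not data from the studies reviewed or a reproduction of Table 1.

```python
def two_by_two_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2 x 2 counts.

    tp: condition present, test positive
    fp: condition absent, test positive
    fn: condition present, test negative
    tn: condition absent, test negative
    """
    return {
        # Denominator: all patients who have the condition.
        "sensitivity": tp / (tp + fn),
        # Denominator: all patients who do not have the condition.
        "specificity": tn / (tn + fp),
        # Denominator: all patients with positive test results.
        "ppv": tp / (tp + fp),
        # Denominator: all patients with negative test results.
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, chosen only for illustration.
measures = two_by_two_measures(tp=90, fp=10, fn=10, tn=90)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

The sketch does not guard against empty rows or columns; a denominator of zero would raise `ZeroDivisionError`, which seems preferable to silently returning a value for a measure that is undefined.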
For example, Sutherland et al12 found that the presence of fever, leukocytosis, asymmetric radiographic infiltrates, or purulent tracheal aspiration specimens had a sensitivity of 100% in detecting VAP (ie, no cases were missed). However, because these criteria occur commonly in other diseases, specificity was only 4%. If two of these criteria were required to make the diagnosis, sensitivity decreased to 69% but specificity increased to 18%. If four criteria were required, sensitivity was only 6% (because few patients with VAP have all four clinical features) but specificity was 96% (patients without VAP are unlikely to have all four features).
The sensitivity and specificity rates reported in this review vary dramatically across studies, in part because of differences and imperfections in study design. Because sensitivity and specificity rates are fractions, underreporting in the numerator or denominator can distort true values. For example, sensitivity can be overestimated if the number of patients with the condition who have negative test results is underrepresented.
Consider a study of the sensitivity of chest radiography in detecting autopsy-confirmed pneumonia in 100 critical care patients. The chest radiographs detect pneumonia in 90 patients, giving a sensitivity of 90%. These patients are obviously drawn from a subset of deceased critical care patients, however, and the question of interest is the sensitivity of the test in all critical care patients, not just those who die. Patients who survive are less likely to have pneumonia or abnormal findings on chest radiographs. Suppose that the 100 patients were drawn from a total of 500 critical care patients, and only 50 of the 400 patients who lived had chest radiograph results that were consistent with pneumonia. The true sensitivity of the test would be only 28% (140/500; Table 2).
Other factors that affect the validity of the numerator include the administration of antibiotic treatment to patients when cultures are taken, temporal separation in measurement (eg, using chest radiographs performed several days before death to measure correlations between radiographic and autopsy findings), and inadequate measurements (eg, not documenting the presence of squamous epithelial cells in BAL fluid or PSB samples to indicate the degree of upper airway contamination).
Sensitivity and specificity are best determined when results are classified in a binary fashion as positive or negative. In some studies reviewed in this report, investigators included a third category, "indeterminate results," thereby confounding sensitivity and specificity calculations. To achieve consistency in our review, we treated indeterminate results as negative and recalculated the sensitivity and specificity accordingly. Therefore, our calculations sometimes differ from those of authors who treated these as positive results or ignored them.
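The effect of reclassifying indeterminate results can be illustrated with a short Python sketch. The function name `sens_spec`, the dictionary layout, and all counts are hypothetical and serve only to show how the same raw data yield different sensitivity and specificity values depending on how the third category is handled.

```python
def sens_spec(diseased, healthy, indeterminate_as="negative"):
    """Compute (sensitivity, specificity) from three-category test results.

    diseased / healthy: dicts with counts under the keys
    'positive', 'negative', and 'indeterminate'.
    indeterminate_as: 'negative', 'positive', or 'exclude'.
    """
    if indeterminate_as not in ("negative", "positive", "exclude"):
        raise ValueError("indeterminate_as must be negative, positive, or exclude")

    def collapse(counts):
        pos, neg = counts["positive"], counts["negative"]
        if indeterminate_as == "negative":
            neg += counts["indeterminate"]
        elif indeterminate_as == "positive":
            pos += counts["indeterminate"]
        # 'exclude' drops the indeterminate results from both totals.
        return pos, neg

    tp, fn = collapse(diseased)
    fp, tn = collapse(healthy)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts, for illustration only.
diseased = {"positive": 40, "negative": 5, "indeterminate": 5}
healthy = {"positive": 10, "negative": 35, "indeterminate": 5}

for rule in ("negative", "positive", "exclude"):
    sensitivity, specificity = sens_spec(diseased, healthy, rule)
    print(f"{rule}: sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Counting indeterminate results as negative, the convention adopted in this review, lowers sensitivity and raises specificity relative to counting them as positive.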
Denominator Error: The Reference Standard
The validity of reported sensitivity and specificity rates is
highly dependent on the quality of the reference standard,
the test or criteria used to define the presence or absence of the
target condition. This is an important problem in pneumonia. If the
presence of pneumonia is defined on the basis of arguable criteria
(eg, air bronchogram signs, high colony count on BAL or PSB
culture, or clinical improvement with antibiotic treatment), 95%
sensitivity of a diagnostic test may have little meaning. The same test
might perform poorly if a more reliable reference standard, such as
autopsy confirmation, were used. In this review, we assume that the
presence of a radiologic infiltrate or purulent sputum captures all
patients with VAP, but we recognize the limitation of this assumption.
The chapters that follow include examples of the limitations of such inferences. Studies of the sensitivity and specificity of clinical criteria for diagnosing VAP (eg, fever, leukocytosis, and purulent secretions) often rely on the presence of one or more of these findings as the reference standard. The fact that patients with autopsy-confirmed VAP often are not treated with antibiotics37,38 underscores the limitations of relying on clinical criteria for defining the presence of disease. Similarly, an abnormal finding on a chest radiograph is an imperfect reference standard. For example, VAP can occur in patients with normal findings on chest radiographs, and findings suggestive of VAP are common in patients without the disease.
Blinding
The interpretation of test results and chest radiographs can be
biased if observers know the clinical circumstances of the case or the
diagnostic suspicions of the treating physician. In most studies of
diagnostic accuracy based on endotracheal specimens or BAL,
interpreters of the test results knew whether the reference standard
was abnormal, which is a potential source of measurement bias. Better
studies of test accuracy, therefore, include blinding, in
which the evaluators are unaware of the patient's identity and
clinical history. Measurement validity also can be improved, especially
for subjective parameters, by having independent observers perform
multiple assessments. Unless stated otherwise, investigators in the
studies in this review were not blinded: ie, those who interpreted
diagnostic test results may have known whether the patient had or did
not have VAP.
PPV
PPV, the proportion of those with a positive test result who have
the disease of interest, is dependent on the prevalence (or pretest
probability) of the disease in the population being tested. A test that
has high PPV (ie, a low proportion of false positives) in
settings where the disease is common may have a low PPV when used to
test patients at low risk for the disease. The dramatic influence of
pretest probability on PPV is illustrated in Table 3.
Thus, unlike sensitivity and specificity, PPV is not a constant value that applies to the test from place to place. It can only be extrapolated to populations with a similar prevalence or pretest probability. PPV values reported without prevalence data, therefore, have little meaning. The incidence of VAP in study populations ranges from 15 to 74%,11,12,20,38-41 so reports of the accuracy of tests must be interpreted with caution. The potential for selection bias deserves special attention. Several studies of the diagnostic accuracy of chest radiography, for example, included in the denominator only cases of suspected VAP,10 introducing a selection bias that would tend to exaggerate the PPV.
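The dependence of PPV on pretest probability follows from Bayes' theorem and can be sketched in Python. The function name `ppv` and the sensitivity and specificity of 0.9 are assumptions made for illustration; the prevalence values of 0.15 and 0.74 are the endpoints of the incidence range cited above, and 0.50 is an arbitrary midpoint. The output is not a reproduction of Table 3.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value for a given pretest probability."""
    # Expected fraction of the tested population that is true positive.
    true_pos = sensitivity * prevalence
    # Expected fraction of the tested population that is false positive.
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical test applied to populations with different prevalence.
for prevalence in (0.15, 0.50, 0.74):
    value = ppv(sensitivity=0.9, specificity=0.9, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={value:.2f}")
```

With sensitivity and specificity held fixed, the PPV rises as prevalence rises, which is why a PPV reported without prevalence data cannot be carried over to another setting.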
Likelihood Ratio
A useful tool for integrating this information is the
likelihood ratio (LR), which is defined as the sensitivity divided by
(1 - specificity). The LR for the example in Table 1 would be
0.9/0.1, or 9.0. This means that an abnormal result on this test would
be nine times more likely in patients with pneumonia than in patients
without pneumonia. Like sensitivity and specificity, the LR is
independent of the prevalence of the disease. It is useful because it
demonstrates the added predictive value of a test as thresholds change,
as is shown in Table 4.41a
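The LR calculation can be sketched in Python as follows. The function names are hypothetical. The first function implements the definition given in the text; the second applies the LR to a pretest probability through odds, a standard use of the LR that the text does not spell out and that is included here as an assumption about how the ratio would be applied.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR for an abnormal result: sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Sensitivity and specificity of 0.9, matching the 0.9/0.1 example in the text.
lr = positive_likelihood_ratio(sensitivity=0.9, specificity=0.9)
print(f"LR = {lr:.1f}")

# Hypothetical pretest probabilities, for illustration only.
for pretest in (0.15, 0.50, 0.74):
    print(f"pretest={pretest:.2f} -> post-test={post_test_probability(pretest, lr):.2f}")
```

The LR itself does not change with prevalence; only the post-test probability does, because it combines the LR with the pretest probability of the population being tested.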
| Reproducibility |
|---|
| Footnotes |
|---|