(Chest. 2005;128:2490-2496.)
© 2005 American College of Chest Physicians
Clinical Prediction Model To Characterize Pulmonary Nodules*
Validation and Added Value of 18F-Fluorodeoxyglucose Positron Emission Tomography
Gerarda J. Herder, MD;
Harm van Tinteren, MSc;
Richard P. Golding, MD;
Piet J. Kostense, PhD;
Emile F. Comans, PhD;
Egbert F. Smit, PhD and
Otto S. Hoekstra, PhD
* From the Departments of Pulmonary Medicine (Drs. Herder and Smit), Nuclear Medicine (Drs. Comans and Hoekstra), Radiology (Dr. Golding), and Clinical Epidemiology & Biostatistics (Dr. Kostense), VU University Medical Centre, Amsterdam, the Netherlands; and the Comprehensive Cancer Centre (Mr. van Tinteren), Amsterdam, the Netherlands.
Correspondence to: Otto S. Hoekstra, PhD, Departments of Nuclear Medicine and Clinical Epidemiology & Biostatistics, VU University Medical Centre, De Boelelaan 1117, 1081 HV Amsterdam, the Netherlands; e-mail: os.hoekstra{at}vumc.nl
Abstract
Background: The added value of 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) scanning as a function of pretest risk assessment in indeterminate pulmonary nodules is still unclear.
Objective: To obtain an external validation of the prediction model according to Swensen and colleagues, and to quantify the potential added value of FDG-PET scanning as a function of its operating characteristics in relation to this prediction model, in a population of patients with radiologically indeterminate pulmonary nodules.
Design, setting, and patients: Between August 1997 and March 2001, all patients with an indeterminate solitary pulmonary nodule who had been referred for FDG-PET scanning were retrospectively identified from the database of the PET center at the VU University Medical Center.
Results: One hundred six patients were eligible for the study, and 61 patients (57%) proved to have malignant nodules. The goodness-of-fit statistic for the model (according to Swensen) indicated that the observed proportion of malignancies did not differ from the predicted proportion (p = 0.46). PET scan results, which were classified using the 4-point intensity scale reading, yielded an area under the receiver operating characteristic curve of 0.88 (95% confidence interval [CI], 0.77 to 0.91). The estimated difference of 0.095 (95% CI, 0.003 to 0.193) between this value and the area under the curve (AUC) of the Swensen prediction model was not significant (p = 0.058). The PET scan results, when added to the predicted probability calculated by the Swensen model, improve the AUC by 13.6% (95% CI, 6 to 21; p = 0.0003).
Conclusion: The clinical prediction model of Swensen et al was proven to have external validity. However, especially in the lower range of its estimates, the model may underestimate the actual probability of malignancy. The combination of visually read FDG-PET scans and pretest factors appears to yield the best accuracy.
Key Words: clinical prediction model; 18F-fluorodeoxyglucose positron emission tomography; solitary pulmonary nodules
Introduction
Radiologically indeterminate solitary pulmonary nodules (SPNs) are a diagnostic challenge in pulmonary medicine. Currently, most SPNs are discovered on plain chest films. With the introduction of CT screening for lung cancer, the number of detected SPNs is expected to increase strongly. Unfortunately, after a full noninvasive evaluation the diagnosis may still be unclear.
One comprehensive cost-effectiveness analysis1 proposed a diagnostic approach that strongly relied on clinical risk assessment. This probability estimation was based on clinical as well as radiologic parameters, and has been developed and preliminarily validated in a US population.2 The cost-effectiveness analysis included the potential role of 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) scanning. However, the criteria for judging test results with PET scanning are heterogeneous,3 and standardization would be desirable. The added value of FDG-PET scanning as a function of pretest risk assessment still needs to be established.
The aims of the present study were twofold, as follows: first, to obtain an external validation of the prediction model; and second, to quantify the potential added value of FDG-PET scanning as a function of its operating characteristics in relation to this prediction model in a population of patients with radiologically indeterminate pulmonary nodules.
Materials and Methods
Between August 1997 and March 2001, all patients with an indeterminate SPN, which had been detected during normal clinical work in both university and community hospital settings, who had been referred for FDG-PET scanning were retrospectively identified from the database of the PET center at the VU University Medical Centre. In our database, the characteristics of all patients are registered using a modified version of the American College of Radiology Index for Radiologic Diagnoses.
An independent experienced radiologist (RPG), who was blinded to clinical pretest data, FDG-PET results, and outcome, reviewed all CT scans. Patients were eligible for the study if the SPN was ≤ 30 mm in diameter on the CT scan and without typically benign calcifications. Patients with a malignancy within the 5 years before PET scanning, patients with an unknown history of malignancy, patients without a definitive clinical diagnosis, and patients lost to follow-up were not eligible.
All medical records were reviewed to obtain the following data: age, gender, smoking status (current or former cigarette smoker, number of pack-years), history of malignancy (date and kind of malignancy), pathology (conclusion and date), and last date of clinical and radiologic follow-up (including disappearance of the SPN or decreased size, no growth, and growth).
Imaging
CT Scan:
Spiral CT scanning was performed with an axial slice thickness of 10 mm in 96 patients, 5 mm in 2 patients, and 4 mm in another patient. In seven patients, a high-resolution CT scan (1 mm axial slice thickness) was performed. IV contrast material was used at some but not all institutions.
Images were analyzed with mediastinal settings as well as pulmonary parenchymal settings. The SPN diameter (ie, the mean of diameters in the transverse plane, in millimeters), its location (ie, upper lobe or elsewhere), and the presence of spicula (ie, < 50% or ≥ 50% of the circumference) were recorded.
FDG-PET Scan:
PET scanning was performed with a dedicated full-ring BGO scanner (ECAT EXACT HR+; CTI/Siemens; Knoxville, TN). Emission scans were acquired in the two-dimensional mode (5 to 7 min per bed position), approximately 60 min after IV injection of 370 MBq of FDG. Patients were asked to fast for at least 6 h prior to undergoing the PET scan. All scans were corrected for decay, scatter, and random artifacts, and were reconstructed using ordered subset expectation maximization with two iterations and 16 subsets followed by postsmoothing of the reconstructed image using a Hanning 0.5 filter, resulting in a transaxial spatial resolution of 7 mm at full-width half-maximum.
One experienced nuclear medicine physician (EFC) reviewed all PET scans. A visual analysis of the FDG-PET scan was performed with the reviewer blinded to patient outcome. To simulate the usual reporting practice, the localization and diameter of the lesion were provided. The intensity of FDG uptake was scored using a 4-point scale (0, absent; 1, faint; 2, moderate; or 3, intense). Interobserver variation of this classification system was assessed by asking seven relatively inexperienced nuclear medicine physicians, who were blinded to all clinical and radiologic information other than the notation "SPN," to score a randomly chosen subset (25% of the present material). Semiquantitative analysis was performed using a tumor/normal lung tissue ratio.
Diagnosis
Final classification was based on histopathologic findings or clinical and radiologic follow-up. The time of follow-up was defined as the time between PET imaging and histologic diagnosis or the date of the last radiologic follow-up. Radiologic follow-up typically consisted of repeat chest CT scans. Lesions were classified as benign in case of benign pathologic findings, the disappearance of the lesion at radiologic follow-up, or no change in size within an observation period of at least 1 year. Lesions were considered to be malignant on the basis of pathology or growth at radiologic follow-up.
Clinical Prediction Model According to Swensen and Colleagues
This model expresses the probability of malignancy as a function of three clinical and three radiographic variables as follows:
$$P(\text{malignancy}) = \frac{e^{x}}{1 + e^{x}}$$
where x = −6.8272 + 0.0391 (age) + 0.7917 (cigarettes) + 1.3388 (cancer) + 0.1274 (diameter) + 1.0407 (spiculation) + 0.7838 (upper); e is the base of natural logarithms; age is the patient's age (in years); cigarettes is 1 if the patient is a current or former smoker (otherwise, 0); cancer is 1 if the patient has a history of extrathoracic cancer that had been diagnosed > 5 years ago (otherwise, 0); diameter is the diameter of the SPN (in millimeters); spiculation is 1 if the edge of the SPN has spicula (otherwise, 0); and upper is 1 if the SPN is located in an upper lobe (otherwise, 0). The model was validated for an American population with a 26.4% prevalence of malignancy.
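As an illustration, the model can be written as a short Python function. This is a minimal sketch, not the authors' software: the function and argument names are hypothetical, and the intercept is taken to be negative (−6.8272), which is an assumption made here because a positive intercept would give probabilities close to 1 for every patient.

```python
import math


def swensen_probability(age, smoker, cancer_history, diameter_mm,
                        spiculated, upper_lobe):
    """Pretest probability of malignancy from the logistic model above.

    Binary predictors are passed as 0/1 or bool.
    Assumption: the intercept is negative (-6.8272).
    """
    x = (-6.8272
         + 0.0391 * age                   # age in years
         + 0.7917 * int(smoker)           # current or former smoker
         + 1.3388 * int(cancer_history)   # extrathoracic cancer > 5 years ago
         + 0.1274 * diameter_mm           # nodule diameter in millimeters
         + 1.0407 * int(spiculated)       # spiculated edge
         + 0.7838 * int(upper_lobe))      # upper-lobe location
    return math.exp(x) / (1.0 + math.exp(x))


# Hypothetical patient: 64-year-old smoker, no prior cancer,
# 20-mm spiculated nodule in an upper lobe.
example = swensen_probability(64, True, False, 20, True, True)
print(f"Pretest probability: {example:.2f}")
```

For this hypothetical patient the estimate is roughly 0.70.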
Statistical Analysis
The model fit was assessed by a goodness-of-fit test for binary logistic regression,4 as implemented by Harrell et al,5 where high p values indicate a well-calibrated model. The predictive ability was expressed by various statistics, among which was the area under the receiver operating characteristic (ROC) curve. The areas under the curve (AUCs) were compared using the method described by DeLong et al,6 and logistic regression models were compared using the Akaike information criterion (AIC). First, the accuracy of the prediction model of Swensen et al2 was determined for the study population. Second, the characteristics of FDG-PET scanning, as a univariate test with four categories, were calculated. Finally, the added value of FDG-PET scanning to the model of Swensen et al2 was explored. A nomogram was constructed using the pretest probability of the model of Swensen et al2 combined with the value of FDG-PET scanning. The interobserver variation of the FDG-PET scan classification was analyzed with intraclass correlation coefficients. Extensive use was made of programs developed (S-plus, version 6.2; Insightful; Seattle, WA) by Harrell et al.5
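For readers unfamiliar with the ROC-AUC of an ordinal test such as the 4-point uptake scale, the following Python sketch shows one common way to compute it: as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one, with ties counted as one half. This is not the S-plus code used in the study and does not implement the DeLong comparison; the function name and the scores are hypothetical toy data.

```python
def roc_auc(scores_malignant, scores_benign):
    """AUC as the fraction of malignant/benign pairs ranked correctly.

    Ties (equal scores) count as half a correctly ranked pair.
    """
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical uptake scores (0 = absent ... 3 = intense), not study data.
toy_malignant = [3, 3, 2, 1]
toy_benign = [0, 0, 1, 2]
toy_auc = roc_auc(toy_malignant, toy_benign)
print(toy_auc)
```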
Results
In total, 106 eligible patients were identified, of whom 61 (57.5%) proved to have malignant nodules. Referring physicians were pulmonologists from university hospitals (n = 25) and community hospitals (n = 81). Fifty-eight percent of the patients were men, and their mean age was 64 years (age range, 32 to 85 years) [Table 1]. The diagnosis of malignancy was based on histopathologic results in 55 patients and on radiologic growth of the lesion in 6 patients. The diagnosis of a benign lesion was based on the stabilization or spontaneous decrease in the size of the lesion on a follow-up CT scan in 40 patients and on the histopathologic results in 5 patients. In patients with radiologically stable SPNs (n = 23), the median follow-up period was 646 days (interquartile range, 413 to 925 days), and only 6 of these patients had a follow-up period of < 365 days (minimum, 203 days). In the 17 patients with shrinking or disappearing lesions, the median follow-up period was 205 days (interquartile range, 143 to 398 days).
Validation of the Model of Swensen and Colleagues
The goodness-of-fit statistic for the model indicated that the observed proportion of malignancies did not differ from the predicted proportion (p = 0.46). The probability of malignancy was calculated using the complete model (ie, variables with specified coefficients) of Swensen et al. The ROC-AUC was 0.79 (95% confidence interval [CI], 0.70 to 0.87). A calibration curve of the model including a series of statistics is shown in Figure 1.
Operating Characteristics of PET Scanning
PET scan results that were classified using the 4-point intensity scale reading (Fig 2) yielded an ROC-AUC value of 0.88 (95% CI, 0.77 to 0.91). A tumor/normal tissue ratio for these data showed a nearly identical AUC-ROC value (AUC, 0.87; 95% CI, 0.80 to 0.94). All other analyses were performed using the 4-point intensity scale reading. The estimated difference of 0.095 (95% CI, 0.003 to 0.193) from the AUC of the prediction model of Swensen et al2 was not significant (p = 0.058). Classifying the seven nodules (6.6%) with faint FDG uptake as negative yielded a sensitivity of 96.7% (95% CI, 87.6 to 99.4; 59 of 61 patients), a specificity of 71.1% (95% CI, 55.5 to 83.2; 32 of 45 patients), and an accuracy of 86% (95% CI, 77.4 to 91.6). Two nodules without enhanced FDG uptake proved to be a papillary adenocarcinoma (diameter, 30 mm) and a carcinoid tumor (diameter, 10 mm) after pathologic investigation. Thirteen nodules with increased FDG uptake were classified as benign, on the basis of histologic diagnoses of fibrosis (one patient), hematoma (two patients), or reactive granulomatosis (two patients), or on the basis of radiologic regression (seven patients) or no growth (one patient). The interobserver correlation of visual analysis of FDG-PET scanning using intensity scales was 0.87 (95% CI, 0.79 to 0.93).
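The point estimates for sensitivity, specificity, and accuracy follow directly from the reported counts (59 of 61 malignant nodules PET positive; 32 of 45 benign nodules PET negative). A minimal Python sketch of that arithmetic is given here; the function name is illustrative, and the confidence intervals are not reproduced because the interval method is not stated.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from a 2 x 2 table."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


# Counts reported above: 61 malignant nodules (59 PET positive) and
# 45 benign nodules (32 PET negative), faint uptake read as negative.
metrics = diagnostic_metrics(tp=59, fn=2, tn=32, fp=13)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```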

Figure 2. ROC curves for the prediction model of Swensen et al2 and for a model combining Swensen pretest probability with FDG-PET scan results. Swensen = logistic probability model of Swensen et al2; FDG-PET (4 categories) = FDG-PET scan result in four categories (ie, no uptake, faint, moderate, and intense); Swensen + PET = the logistic model combined with PET scan information.
PET Scan Result and the Prediction of Swensen et al Combined
PET scanning added to the predicted probability calculated by the model of Swensen et al2 improves the AUC by 13.6% (95% CI, 6 to 21; p = 0.0003). The fitted function to calculate the probability of malignancy based on the model of Swensen et al together with a PET scan is as follows:
$$P(\text{malignancy}) = \frac{e^{x}}{1 + e^{x}}$$
with x = −4.739 + 3.691 (probability of malignancy by the model of Swensen et al2, expressed as a proportion) + 2.322 (faint uptake) + 4.617 (moderate uptake) + 4.771 (intense uptake). A visual representation of the model is given in Figure 3 by means of a nomogram. The corresponding calibration curve in Figure 4 displays the relation between the predicted and the actual probability.
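The fitted function can be sketched in Python as follows. This is an illustration under two assumptions that should be checked against the published article: the intercept is taken to be negative (−4.739), and the Swensen estimate is entered as a proportion between 0 and 1 rather than as a percentage. With a positive intercept or a percentage input, every result would be close to 1. The names used are hypothetical.

```python
import math

# Coefficients for the four uptake categories; "absent" is the
# reference category and therefore contributes 0.
PET_COEFFICIENTS = {
    "absent": 0.0,
    "faint": 2.322,
    "moderate": 4.617,
    "intense": 4.771,
}


def combined_probability(pretest_probability, pet_uptake):
    """Probability of malignancy from the Swensen estimate plus PET reading.

    Assumptions: intercept is -4.739; pretest_probability is in [0, 1].
    """
    if not 0.0 <= pretest_probability <= 1.0:
        raise ValueError("pretest_probability must be between 0 and 1")
    x = -4.739 + 3.691 * pretest_probability + PET_COEFFICIENTS[pet_uptake]
    return math.exp(x) / (1.0 + math.exp(x))


# Hypothetical pretest probability of 0.50 for each PET reading.
for uptake in PET_COEFFICIENTS:
    print(uptake, round(combined_probability(0.50, uptake), 2))
```

Under these assumptions, a pretest probability of 0.50 falls to about 0.05 with absent uptake and rises to about 0.87 with intense uptake.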

Figure 3. Nomogram using information from the clinical prediction model and FDG-PET scan. The probability of malignancy based on the model of Swensen et al2 and the FDG-PET scan result are indicated in the nomogram. First, the patient's position on each predictor variable scale is defined. Each scale position has corresponding prognostic points located on the points scale at the top. These two numbers are then summed to arrive at a total points value on the total points axis. A vertical line is then drawn from the total points axis down to the probability to indicate the probability of malignancy.
Discussion
In 2003, a comprehensive cost-effectiveness decision analysis was published,1 which included the full spectrum of diagnostic and therapeutic options for SPNs. The first stratification of this analysis was based on the result of a clinical risk assessment as provided by a previously developed multivariate logistic regression model.2 It was recognized that this model, which was developed in a North American population with pulmonary nodules discovered between 1984 and 1986 and a prevalence of malignancy of 26.4%,2,7 required additional external validation. The current study provides validation of this clinical prediction model in a sample of patients with radiologically indeterminate nodules that were collected between 1997 and 2001, with a prevalence of malignancy of 57.5%. Our reported prevalence of 57.5% is more in line with other reports in the literature in which approximately one third of pulmonary nodules were radiologically indeterminate, and, of those, one third of the resected pulmonary nodules were benign.8,9 However, despite differences in the prevalence and prevailing local epidemiology of underlying diseases compared to the original data set of Swensen et al,2 the model showed a reasonable fit to our data, indicating that the model is robust. The calibration figure shows that the prediction model tended to underestimate the probability of malignancy, particularly at lower probabilities. Interestingly, in a follow-up study by Swensen et al,7 in which the probability estimation of four experienced clinicians was compared with the results of the prediction model, the clinicians tended to overestimate the pretest probability, particularly at lower values of the predicted probability. Obviously, the clinical intuition and judgment of experienced clinicians are most important, but less experienced clinicians may be less accurate, and more objective diagnostic techniques are still warranted.
In our study, visual analysis of FDG-PET scanning proved to be an accurate method of interpretation. The AUC-ROC of FDG-PET scanning was not significantly higher than that of the model of Swensen et al2 (p = 0.058) [Table 2]. However, the shape of the ROC curve for the FDG-PET scan suggested in particular that the identification of patients who actually had lung cancer (positive predictive value) improved. It could be argued that this difference was on the border of significance and that the nonsignificant result reflected a type II error. On the other hand, we think that the actual difference between the PET scan-alone model and the prediction model of Swensen et al2 is of little clinical importance since the parameters of the model of Swensen et al are always available prior to PET scanning.
Dewan et al10 found that dichotomized results for the PET scan as a single test performed better than the standard criteria developed in a model by Cummings et al,11 including baseline prevalence, size, age, and smoking history. However, the series by Dewan et al10 was smaller, and the comparative model was based on Bayesian analysis combining likelihood ratios of test results that were assumed to be conditionally independent while derived from various sources. Our results suggest that, with respect to diagnostic performance, the best results are to be expected from the combined information of clinical assessment and PET scanning (ie, the AUC-ROC showed a significant improvement [p = 0.0003], as did the AIC of the combination models). The limitations of both PET scan studies were their retrospective design as well as the potential for referral bias, which are further reasons why the results require validation.
Clinical prediction rules and modeling can help to set the indication for PET scanning beyond the almost intuitive reasoning that PET will be most useful in the pretest probability range of 10 to 50%.1,12 However, the results of complex decision models obviously depend on several assumptions. For example, it is not clear that the required strict pursuit of histopathologic diagnoses can or will be obtained in clinical practice. In fact, it has been claimed that low FDG uptake (ie, in the lesions likely to yield false-negative results) in T1 malignant lesions carries a relatively favorable prognosis,13 but this has rightly not been accounted for in the model. Whether patients and clinicians will accept the strategy of watchful waiting in such cases remains to be seen.
Even though we are aware that diagnostic accuracy measures are not directly related to patient outcomes, the information as provided in the present study will at least help to define whether in individual cases the result of PET scanning might affect management. We expect that these limits may not be the same in different clinical situations. Therefore, a logistic model may, apart from calculating posttest probabilities, also help to decide whether PET scanning should be performed in an individual patient. After estimating the pretest probability of cancer, the clinician can assess which (if any) PET scan result will push the diagnostic uncertainty beyond the required limits. Since our analysis was based on patients who were referred for PET scanning, we cannot exclude the possibility of referral bias. Therefore, our model needs validation, but since SPN is a major indication for FDG-PET scanning, this should not be a major problem.
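The reasoning in the preceding paragraph can be made concrete with a small Python sketch: for a given pretest probability, compute the posttest probability for each possible PET reading and check whether any of them crosses a management threshold. The thresholds below (0.10 and 0.80) are purely hypothetical and are not proposed in this study; the negative intercept (−4.739) and the proportion scale of the pretest probability are likewise assumptions of this sketch.

```python
import math

PET_COEFFICIENTS = {"absent": 0.0, "faint": 2.322,
                    "moderate": 4.617, "intense": 4.771}


def posttest_probability(pretest_probability, pet_uptake):
    """Combined-model probability (assumed intercept: -4.739)."""
    x = -4.739 + 3.691 * pretest_probability + PET_COEFFICIENTS[pet_uptake]
    return 1.0 / (1.0 + math.exp(-x))


def classify_pet_outcomes(pretest_probability, lower, upper):
    """Label each possible PET reading relative to two decision limits.

    lower and upper are user-chosen limits (hypothetical here), eg, the
    probability below which observation is acceptable and above which
    invasive diagnosis or treatment is pursued.
    """
    labels = {}
    for uptake in PET_COEFFICIENTS:
        p = posttest_probability(pretest_probability, uptake)
        if p < lower:
            labels[uptake] = "below"
        elif p > upper:
            labels[uptake] = "above"
        else:
            labels[uptake] = "between"
    return labels


# Hypothetical case: pretest probability 0.50, limits 0.10 and 0.80.
outcomes = classify_pet_outcomes(0.50, lower=0.10, upper=0.80)
print(outcomes)
```

In this hypothetical case, only a faint reading would leave the probability between the two limits, so a PET scan would be expected to change management for most possible results.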
It has been pointed out that FDG-PET studies in coin lesions contain a variety of criteria by which a PET scan result can be assigned a positive result. This is of concern when considering the implementation of the technique. For practical purposes, the quantitative potential of PET scanning is often reduced to semiquantitative measures like the standardized uptake value (SUV), which basically expresses the concentration of FDG uptake in a lesion as a function of the total injected dose. In comparison with visual image analysis, this approach has the conceptual advantage of objectiveness. However, the results of SUV measurements are also prone to heterogeneity due to prevailing differences in data acquisition and reconstruction methodology.14 A visual analysis of FDG-PET scanning was performed because it has been shown that SUV methodology and implementation are less straightforward than has often been assumed. There is still debate about the appropriate normalizations to be used, but, more importantly, it has recently been demonstrated15 that the results of SUVs strongly depend on image reconstruction methodology, level of noise, image resolution, and region of interest definition, so that its use is highly questionable for generic diagnostic purposes. Moreover, one systematic review3 failed to show that semiquantitative image interpretation improves the accuracy of FDG-PET scanning. Finally, the lack of standardization in the current PET scan literature and practice strongly compromises the theoretical advantage of so-called objective measurements. The excellent reproducibility of visual scaling is probably explained by its close association with semiquantitative tumor/nontumor ratios. Our data suggest that the visual assessment of FDG uptake intensity is a robust method.
It is controversial whether attenuation correction improves detection. There is general agreement that the localization of abnormalities can be simplified by this correction, but this is not the issue in coin lesion characterization. The downside of attenuation correction is a loss of patient throughput by about 30% due to the time needed for the acquisition of transmission scans necessary to obtain an accurate attenuation map. Even though the calibration of our data with attenuation-corrected scans is required, we do not expect a major impact since our accuracy data nicely fit into the summary ROC curve of the 2001 metaanalysis.3
Conclusion
The clinical prediction model of Swensen et al2 has been proven to have external validity. However, especially in the lower range of its estimates, the model may underestimate the actual probability of malignancy. The visual analysis of FDG-PET scans is a robust and accurate method in radiologically indeterminate SPNs. The combination of visually read FDG-PET scans and pretest factors appears to yield the best accuracy. These results can help to adjust the diagnostic workup in individual situations.
Acknowledgements
We thank V. Bongers, MD; R.A.M.J. Claessens, MD; M.A.L. Edelbroek, MD; D.A.K.C.J.M. Huysmans, MD; R.A. Valdés Olmos, MD; D.E.A. Zanin, MD; and A. Zwijnenburg, MD, who performed a visual analysis of FDG-PET scans using intensity scales to assess the interobserver correlation.
Footnotes
Abbreviations: AIC = Akaike information criterion; AUC = area under the curve; CI = confidence interval; FDG = 18F-fluorodeoxyglucose; PET = positron emission tomography; ROC = receiver operating characteristic; SPN = solitary pulmonary nodule; SUV = standardized uptake value
Received for publication June 15, 2004.
Accepted for publication March 10, 2005.
References
1. Gould, MK, Sanders, GD, Barnett, PG, et al Cost-effectiveness of alternative management strategies for patients with solitary pulmonary nodules. Ann Intern Med 2003;138,724-735
2. Swensen, SJ, Silverstein, MD, Ilstrup, DM, et al The probability of malignancy in solitary pulmonary nodules: application to small radiologically indeterminate nodules. Arch Intern Med 1997;157,849-855
3. Gould, MK, Maclean, CC, Kuschner, WG, et al Accuracy of positron emission tomography for diagnosis of pulmonary nodules and mass lesions: a meta-analysis. JAMA 2001;285,914-924
4. Hosmer, DW, Hosmer, T, Le Cessie, S, et al A comparison of goodness-of-fit tests for the logistic regression model. Stat Med 1997;16,965-980
5. Harrell, FE, Jr, Lee, KL, Mark, DB Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15,361-387
6. DeLong, ER, DeLong, DM, Clarke-Pearson, DL Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44,837-845
7. Swensen, SJ, Silverstein, MD, Edell, ES, et al Solitary pulmonary nodules: clinical prediction model versus physicians. Mayo Clin Proc 1999;74,319-329
8. Huston, J, III, Muhm, JR Solitary pulmonary opacities: plain tomography. Radiology 1987;163,481-485
9. Lillington, GA Pulmonary nodules: solitary and multiple. Clin Chest Med 1982;3,361-367
10. Dewan, NA, Shehan, CJ, Reeb, SD, et al Likelihood of malignancy in a solitary pulmonary nodule: comparison of Bayesian analysis and results of FDG-PET scan. Chest 1997;112,416-422
11. Cummings, SR, Lillington, GA, Richard, RJ Estimating the probability of malignancy in solitary pulmonary nodules: a Bayesian approach. Am Rev Respir Dis 1986;134,449-452
12. Fischer, BM, Mortensen, J, Hojgaard, L Positron emission tomography in the diagnosis and staging of lung cancer: a systematic, quantitative review. Lancet Oncol 2001;2,659-666
13. Marom, EM, Sarvis, S, Herndon, JE, et al T1 lung cancers: sensitivity of diagnosis with fluorodeoxyglucose PET. Radiology 2002;223,453-459
14. Boellaard, R, Buijs, F, Jong, H, et al Characterization of a single LSO crystal layer high resolution research tomograph. Phys Med Biol 2003;48,429-448
15. Boellaard, R, Krak, NC, Hoekstra, OS, et al Effects of noise, image resolution, and ROI definition on the accuracy of standard uptake values: a simulation study. J Nucl Med 2004;45,1519-1527