* From the Department of Medicine (Drs. Roberts, Farber, and Knox), Division of Pulmonary, Critical Care, and Occupational Medicine, Indiana University School of Medicine, Indianapolis, IN; and the Center for Biostatistics (Mr. Phillips) and Department of Medicine (Drs. Bhatt, Mastronarde, and Wood), Division of Pulmonary, Critical Care, and Sleep Medicine, The Ohio State University Medical Center, The Ohio State University, Columbus, OH. The authors have reported to the ACCP that no significant conflicts of interest exist with any companies/organizations whose products or services may be discussed in this article.
Correspondence to: Karen L. Wood, MD, 201 Davis Heart and Lung Institute, 473 W Twelfth Ave, Columbus, OH 43210; e-mail: wood.555{at}osu.edu
Abstract
Background: The American Thoracic Society recommends using the lower limit of normal (LLN) method to diagnose obstructive lung disease. However, few studies have investigated the clinical relevance of these recommendations. We compared the LLN derived from available data sets to a fixed ratio (FEV1/FVC, < 75% or 70%) and also to the FEV1/FVC percent predicted ratio to determine the impact of changing the FEV1/FVC "cutoff" on the spirometric diagnosis of obstructive lung disease.
Methods: FEV1, FVC, FEV1/FVC ratio, age, race, sex, height, and weight were recorded from 1,503 pulmonary function tests. Predicted values were calculated using the Third National Health and Nutrition Examination Study data set (Hankinson), and reference values from studies by Crapo, Knudson, and Morris. In addition, the LLN of the FEV1/FVC ratio was calculated for the Hankinson and Crapo reference values.
Results: The number of studies interpreted as obstructed varied from 37% using the Hankinson data set to 55% using the 75% fixed ratio method. Comparing the LLN method with the 70% fixed ratio method yielded discordant results in 7.5% of studies (Hankinson LLN vs 70% fixed) and 6.9% of studies (Crapo LLN vs 70% fixed). Age was the strongest predictor of discordance, and 16% of subjects > 74 years of age had discordant results comparing Hankinson values to the 70% fixed method.
Conclusion: At the extremes of age and height, a large number of spirometry test results will be interpreted as showing an obstructive defect if a 70% fixed ratio method is used for interpretation compared with the LLN derived from the Hankinson data set.
Key Words: airway obstruction; lower limit of normal; pulmonary function testing; spirometry
The diagnosis of obstruction on the basis of pulmonary function test (PFT) results is determined by the finding of a reduced FEV1/FVC ratio. The traditional teaching has been to define obstruction by a ratio of less than a certain percentage (usually 70 to 75%). Since the FEV1/FVC ratio is inversely related to age and height, using a fixed percentage would be expected to "over call" obstruction in very old or tall subjects and "under call" obstruction in young or very short subjects. As such, the American Thoracic Society (ATS) guidelines recommended1,2 that a statistically derived lower limit of normal (LLN) should be used in lieu of a fixed ratio. The LLN can be defined as a statistically derived level below which a value is considered to be abnormal. Reference data sets usually give the method to calculate the LLN, and it is usually based on confidence intervals (CIs) or the fifth percentile. If it is not given, it can be calculated from the standard error of the estimate. Using this method of PFT result interpretation, a predicted value as well as the LLN for a patient can be calculated, below which an interpretation of obstruction can be made. It has long been recognized that it is more statistically sound to use the LLN over a fixed percentage, especially in the extremes of a population (ie, very old or young subjects, and very tall or short subjects) where the errors become more frequent. Interestingly, few studies3,4 have looked at the clinical relevance of using different methods for the interpretation of airflow obstruction.
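The standard-error route to the LLN mentioned above can be sketched as follows. This is a minimal illustration, not the published equations of any reference set: it assumes residuals around the predicted value are roughly normally distributed, so that the fifth percentile lies 1.645 standard errors of the estimate (SEE) below the prediction. The function name and the patient values are hypothetical.

```python
def lln_from_see(predicted: float, see: float, z: float = 1.645) -> float:
    """Approximate the fifth-percentile LLN as predicted - z * SEE.

    Assumes normally distributed residuals; z = 1.645 is the one-sided
    5% point of the standard normal distribution.
    """
    return predicted - z * see


# Hypothetical older patient: predicted FEV1/FVC of 74% with an SEE of 4
# percentage points (illustrative numbers, not taken from any data set).
predicted_ratio = 74.0
see = 4.0
lln = lln_from_see(predicted_ratio, see)

measured_ratio = 68.0
below_fixed_70 = measured_ratio < 70.0   # obstructed by the fixed 70% rule
below_lln = measured_ratio < lln         # obstructed by the LLN rule

print(f"LLN = {lln:.2f}%, fixed 70%: {below_fixed_70}, LLN: {below_lln}")
```

With these made-up numbers the same measured ratio is called obstructed by the fixed 70% rule but normal by the LLN, which is the kind of disagreement this study sets out to count.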
Additionally, many centers still teach PFT interpretation using a fixed ratio (usually calling a fixed FEV1/FVC ratio of < 70% to 75% as obstructed), and many guidelines, including the National Heart, Lung, and Blood Institute/World Health Organization Global Initiative for Chronic Obstructive Lung Disease (GOLD) Workshop Summary,5 as well as the ATS/European Respiratory Society (ERS) position paper6 on the standards for the diagnosis and treatment of COPD continue to define obstruction based on a fixed FEV1/FVC ratio. Another method of diagnosing airflow obstruction, besides the LLN or the fixed ratio, that is sometimes used is the FEV1/FVC percent predicted. The ERS 1995 consensus statement7 regarding COPD defined obstruction as an FEV1/FVC ratio of < 88% predicted for men and < 89% predicted for women.
Another confounding variable in PFT interpretation is the choice of the reference data set. Newer data sets have become available that have a much larger sample population with a broader age range and ethnic/racial diversity than was previously available.8 Many sources, including the new ATS/ERS consensus statement on PFT interpretation,2 now recommend that the data set reported by Hankinson et al8 be used routinely for spirometry interpretation9,10; however, many academic and private institutions still use older data sets. The classification of the severity of obstruction is usually based on percent predicted FEV1 and therefore would be expected to differ depending on the reference set. It is unclear in clinical practice whether the use of a different data set changes the interpretation in a statistically significant number of PFTs. We performed a retrospective evaluation of PFT interpretation at three hospitals in two academic medical centers to define the potential misclassification of patients as having obstructive lung disease based on the LLN compared to a fixed percentage. In addition, we examined the different reference data sets in order to determine the discordance in the designation of severity.
Materials and Methods
Consecutive spirometry tests from three hospitals at two academic medical centers between December 1, 2003, and February 29, 2004, were reviewed. Results were taken from the University Hospital at The Ohio State University, and from Indiana University Hospital and Wishard Memorial Hospital, both of which were a part of the Indiana University Medical Center. All PFTs were performed and reported with the goal of meeting ATS standards for acceptability and reproducibility.11 Tests were reviewed, and those that were felt to be inaccurate or uninterpretable were excluded from analysis. FEV1, FVC, and FEV1/FVC ratio were entered into a database (ACCESS; Microsoft; Redmond, WA). Demographic features were also recorded, including age, race (self-reported), sex, height, and weight. Because no protected health information was recorded and the data were obtained from an existing database, the institutional review boards at both institutions approved this study as exempt research, and patient consents were not obtained. Predicted and percent predicted values were calculated for FEV1, FVC, and FEV1/FVC ratio using reference values from Hankinson et al (the Hankinson data set),8 Crapo et al (the Crapo data set),12 Knudson et al (the Knudson data set),13 and Morris et al (the Morris data set).14 In addition, the values for LLN of the FEV1/FVC ratio were calculated for the Hankinson and Crapo data sets using their published equations. Airflow obstruction was diagnosed by a fixed FEV1/FVC ratio of < 75% or 70%, or by finding an FEV1/FVC ratio less than the predicted LLN derived by the methods of Hankinson et al8 or Crapo et al,12 or by having an FEV1/FVC ratio of < 88% predicted for men and < 89% predicted for women. For severity interpretation, the results for FEV1 and FVC were race-corrected for African-Americans by 12% for the data sets of Knudson, Morris, and Crapo, based on ATS recommendations.
The Hankinson data set has separate prediction equations for ethnicity/race, which were used in this study, and does not require a separate correction factor for African Americans.
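The four ways of calling obstruction described above can be laid side by side in a short sketch. It assumes the predicted ratio and the LLN have already been computed from a reference set; the class, field names, and example values are hypothetical, and the published prediction equations themselves are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Spirometry:
    fev1_fvc: float         # measured FEV1/FVC, in percent
    predicted_ratio: float  # predicted FEV1/FVC from a reference set, in percent
    lln_ratio: float        # LLN of FEV1/FVC from the same reference set, in percent
    sex: str                # "M" or "F"


def obstruction_calls(t: Spirometry) -> dict:
    """Return whether each criterion would call the test obstructed."""
    pct_predicted = 100.0 * t.fev1_fvc / t.predicted_ratio
    # < 88% predicted for men, < 89% predicted for women
    pct_cutoff = 88.0 if t.sex == "M" else 89.0
    return {
        "fixed_70": t.fev1_fvc < 70.0,
        "fixed_75": t.fev1_fvc < 75.0,
        "lln": t.fev1_fvc < t.lln_ratio,
        "pct_predicted": pct_predicted < pct_cutoff,
    }


# Hypothetical older man whose ratio sits between the LLN and 70%.
example = Spirometry(fev1_fvc=68.0, predicted_ratio=74.0, lln_ratio=67.4, sex="M")
calls = obstruction_calls(example)
print(calls)
```

For this illustrative patient both fixed cutoffs call obstruction, while the LLN and percent predicted criteria do not.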
Statistical analysis was performed using a statistical software package (STATA, version 8; Stata Corporation; College Station, TX). Comparisons of the methods used for defining obstruction were analyzed by independently comparing both types of discordant groups to the concordant group. For example, when comparing the Hankinson method to the fixed method, discordant group A is where the Hankinson method diagnoses obstruction and the fixed method does not, and discordant group B is where the Hankinson method finds a normal result and the fixed method diagnoses obstruction. Differences in severity interpretation were examined by comparing the level of severity of obstruction using the four different reference sets in the 600 PFT results that were classified as obstructed based on the 70% fixed method. Initially, multiple comparisons were made to the concordant group with analysis of variance examining the variables of age, race, gender, height, and weight. Significant variables were then analyzed using multinomial logistic regression analysis15 looking for independent effects of these variables.
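The grouping into concordant and discordant results can be expressed as a small helper. This is only a sketch of the labeling step, with hypothetical function names and made-up input pairs; it does not reproduce the analysis of variance or regression that followed.

```python
from collections import Counter


def discordance_group(lln_obstructed: bool, fixed_obstructed: bool) -> str:
    """Label one test by agreement between the LLN and fixed-ratio methods.

    "A": LLN method diagnoses obstruction, fixed method does not.
    "B": LLN method finds a normal result, fixed method diagnoses obstruction.
    """
    if lln_obstructed == fixed_obstructed:
        return "concordant"
    return "A" if lln_obstructed else "B"


# Made-up (LLN call, fixed-ratio call) pairs for five tests.
pairs = [(True, True), (False, False), (True, False), (False, True), (False, True)]
counts = Counter(discordance_group(lln, fixed) for lln, fixed in pairs)
print(counts)
```

Each discordant group is then compared with the concordant group separately, which is why the label keeps A and B distinct rather than pooling them.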
Results
A total of 1,503 PFT results were collected and analyzed from the 3-month period. A total of 1,003 were from The Ohio State University Medical Center and 500 from the Indiana University Medical Center (Wishard Memorial Hospital, 248 PFT results; Indiana University Medical Center, 252 PFT results). Table 1 shows the distribution of subjects.
31 years of age.
In addition to comparing the LLN to the 70% fixed method for the diagnosis of obstruction, we also looked at the influence of the reference data set on the classification of severity of obstruction. Using the GOLD criteria, we took all PFT results with an FEV1/FVC ratio of < 70% and called them mild obstruction if the FEV1 was > 80% predicted, moderate if the FEV1 was between 30% and 80% predicted, and severe if the FEV1 was < 30% predicted. We analyzed this for four different data sets (the Hankinson, Morris, Knudson, and Crapo data sets). Because the Hankinson data set is the largest one available, we chose to compare the other three data sets to the Hankinson data set. Table 4 shows the percentage of records in which the severity interpretation differed among the data sets. While severity classifications differed < 5% of the time when comparing the Hankinson data set to either the Crapo or Knudson data set, 11% of PFT results that were diagnosed as obstruction using GOLD criteria had differences in the classification of severity if using the Hankinson reference set vs the Morris reference set. Interestingly, all of the discordant results were ones in which the Hankinson data set diagnosed obstruction at a higher level of severity than the Morris data set. A logistic regression analysis of the Hankinson data set vs the Morris data set to identify the predictors of who would be in this group showed only age as a significant predictor. Again, age was entered as a quadratic term in order to satisfy the model assumptions; consequently, the OR was not constant with age. The odds of an 80-year-old subject having the discordant pair rather than the concordant pair are 1.96 times higher than those of a 50-year-old subject (p = 0.010), while the odds that a 30-year-old subject will have the discordant pair are 36% lower (OR, 0.64) than those of a 50-year-old subject (p = 0.010).
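The severity rule described above, and the way a change of reference set can move a patient across a cutoff, can be sketched as follows. The handling of values exactly at 30% or 80% is an assumption, since the text does not specify it, and the predicted FEV1 values attributed to "reference set A" and "reference set B" are invented for illustration.

```python
from typing import Optional


def severity(fev1_fvc: float, fev1_pct_predicted: float) -> Optional[str]:
    """Classify severity using the cutoffs applied in this study.

    Returns None when FEV1/FVC is not < 70%. Boundary handling
    (exactly 30% or 80% counted as moderate) is an assumption.
    """
    if fev1_fvc >= 70.0:
        return None
    if fev1_pct_predicted > 80.0:
        return "mild"
    if fev1_pct_predicted >= 30.0:
        return "moderate"
    return "severe"


# One measured FEV1, two hypothetical predicted values from different reference sets.
measured_fev1_l = 2.0
predicted_a_l = 2.45  # hypothetical reference set A
predicted_b_l = 2.55  # hypothetical reference set B

grade_a = severity(65.0, 100.0 * measured_fev1_l / predicted_a_l)
grade_b = severity(65.0, 100.0 * measured_fev1_l / predicted_b_l)
print(grade_a, grade_b)
```

A higher predicted FEV1 lowers the percent predicted value for the same measurement, so a reference set that predicts larger volumes will grade borderline patients as more severe.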
The ATS recommends1,2 using a statistically based method for the determination of LLN, and the newest ATS guidelines2 recommend the use of the Hankinson data set as the data set to be used in the US population. The National Lung Health Education Program recommends that primary care practitioners use the data set of the Third National Health and Nutrition Examination Study (the Hankinson data set) and define the LLN for the purposes of diagnosis of obstruction in screening office spirometry.9 However, despite these recommendations, the use of a fixed FEV1/FVC ratio of < 70% (or even 75%) for diagnosing obstruction is still often taught as the method of PFT result interpretation. Making things slightly more confusing, other published guidelines use a fixed ratio of < 70% in the diagnosis of obstruction. The National Heart, Lung, and Blood Institute/World Health Organization GOLD criteria5 and the most recent ATS/ERS position paper6 on the standards for the diagnosis and treatment of COPD use a fixed ratio of 70% for the diagnosis of COPD. Concern has been raised by some that the use of this method will lead to the overdiagnosis of disease, as the LLN of the FEV1/FVC ratio is < 70% for many older individuals. This was effectively shown in a 2002 study16 of elderly, nonsmoking, healthy participants. The authors demonstrated that of 71 subjects who were > 70 years old, 35% had an FEV1/FVC ratio of < 70%, and it was approximately 50% in the 34 subjects who were > 80 years of age. A study4 using the Third National Health and Nutrition Examination Study data compared five methods of PFT interpretation (self-report for chronic bronchitis and emphysema, GOLD criteria IIa, LLN, ERS guidelines, and a fixed ratio of 70%) and then weighted the results for the general US population. The authors found large differences in the number of people who would be diagnosed as having COPD by the different methods.
In order to determine the impact of using the 70% fixed method vs the LLN method, we examined the percentage of clinical PFT result interpretations that would change based solely on the method chosen. We are aware of only one other study that looked at a hospital-based PFT laboratory and examined the number of times that using the LLN method influenced the PFT result interpretation. That study by Margolis et al3 was in a Veterans Administration hospital population and compared LLN using the Crapo data set to the fixed 70% method. Six percent of their study population (10 of 166 PFT results) had different interpretations of obstruction that were based on the method of interpretation. In our study, comparing the LLN method using the Hankinson data set with the 70% fixed method revealed that 7.5% of all PFT interpretations (112 of 1,503 PFT interpretations) were discordant. When comparing the Crapo method to the 70% fixed method, we found 6.1% discordance, which is in agreement with the results of the study by Margolis et al.3 While this is a sizeable percentage of patients, the impact becomes even more obvious at the extremes of age and height. When looking at the results of tests performed in the group of patients who were > 74 years of age, the results of 16% of all PFTs performed would have differed in the diagnosis of obstruction.
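The two discordance rates for which raw counts are reported can be checked with simple arithmetic. The sketch below uses only the counts given in the text (112 of 1,503 in this study and 10 of 166 in the study by Margolis et al); the helper name is hypothetical.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts as reported in the text.
this_study = percent(112, 1503)  # Hankinson LLN vs 70% fixed
margolis = percent(10, 166)      # Crapo LLN vs 70% fixed, Margolis et al

print(f"This study: {this_study:.1f}%  Margolis et al: {margolis:.1f}%")
```

Both figures agree with the rounded percentages quoted above.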
As part of this study, we compared other reference sets and methods of PFT interpretation to the Hankinson LLN method. In addition to the National Lung Health Education Program recommendation9 in favor of the Hankinson data set, there is a statistical advantage to using the data set with the largest number of patients containing patients of diverse ages and races. Our findings support the use of the LLN method for PFT interpretation. The Hankinson data set decreases the number of cases diagnosed as obstruction in older individuals. This outcome is predictable as it mirrors the physiologic decline in lung function that is known to occur with age. Conversely, obstructive defects are more commonly diagnosed in younger individuals when using the LLN method as defined by the Hankinson data set, as would be expected when utilizing a statistically supported, population-based reference set instead of a fixed ratio. One limitation to this study is the lack of a "gold standard" for diagnosing obstruction. Most of the patients who had discordant results had an FEV1/FVC ratio close to both 70% and the LLN. To clearly demonstrate which test was diagnosing obstruction correctly, the study would have had to include clinical history, radiographic and physical examination findings, bronchodilator responses, or possibly methacholine challenge testing results and other "less standardized" indicators of obstruction on PFTs (including small airway assessments and flow-volume loop examination). In the absence of these data, we can simply describe the differences between different methods of interpretation.
Another parameter that was examined in this study was the interpretation of severity of obstruction depending on which data set was used. Almost all classification systems base the severity of obstructive lung disease on FEV1 percent predicted. The American Medical Association guidelines17 for evaluating impairment use FEV1 percent predicted as a factor in the level of pulmonary impairment. The diagnosis of obstruction in these guidelines uses LLN based on the Crapo reference set. Prognostic or therapeutic decisions are also often based on FEV1 percent predicted for the disease severity classification. The current asthma guidelines18 use FEV1 percent predicted as one factor in the disease severity classification, and the GOLD criteria5 rely heavily on the FEV1 percent predicted for COPD classification. For this reason, in the current study we used the GOLD criteria cutoff values for FEV1 percent predicted in our diagnosis of mild, moderate, and severe obstruction. We compared three different data sets to the Hankinson data set and found a difference in severity interpretations in 3.5 to 11% of obstructed spirometry tests. The severity differences among the Hankinson, Crapo, and Knudson data sets were small, and no obvious trend could be noted. When comparing the Hankinson data set to the Morris reference data set, all 11% of the differences were because Hankinson et al8 classified the test results as indicating more severe disease. A limitation to our study is that the GOLD criteria recommend the use of a post-bronchodilator therapy FEV1 percent predicted; however, because we did not have these values for all patients, and to keep the data consistent, we used pre-bronchodilator therapy FEV1 percent predicted values. It is possible that post-bronchodilator therapy values would have varied less among these methods.
In summary, we found that 7.5% of PFT results in this study would be reclassified by using the LLN to determine obstruction instead of the traditional 70% fixed ratio method. This difference increased significantly at the extremes of age and height. Updated recommendations endorse the use of the Hankinson data set as the reference for spirometry interpretation. This study defines the potential impact of the next generation of clinical guidelines, as the use of the Hankinson data set will change a significant number of PFT interpretations. For ease of evaluation and comparison between different centers, the standardization of reference values may play as important a role as the standardization of technique or equipment.
Footnotes
Abbreviations: ATS = American Thoracic Society; CI = confidence interval; ERS = European Respiratory Society; GOLD = Global Initiative for Chronic Obstructive Lung Disease; LLN = lower limit of normal; OR = odds ratio; PFT = pulmonary function test
Received for publication July 21, 2005. Accepted for publication January 7, 2006.