Erasmus Medical Center Rotterdam, the Netherlands
Correspondence to: Adrián V. Hernández, MD, MSc, Erasmus MC, Department of Public Health, Room Ee2010, PO Box 1738, 3000 DR Rotterdam, the Netherlands; e-mail: a.hernandez{at}erasmusmc.nl
To the Editor:
Moss et al (March 2003)1 reviewed the deficiencies in reporting multivariable logistic regression analysis in the pulmonary and critical care medicine literature. They also suggested some potential guidelines to improve this reporting in both descriptive and predictive modeling. We would like to make some further suggestions, with emphasis on predictive models.
Recently, guidelines have been developed for the adequate reporting of randomized controlled trials (Consolidated Standards of Reporting Trials2), meta-analyses of randomized controlled trials (Quality of Reporting of Meta-analyses3), and diagnostic research studies (Standards for Reporting of Diagnostic Accuracy4). These guidelines were based on empirical evidence on factors affecting the readers' understanding of the findings and the validity, reliability, and generalizability of those findings. Even though predictive research studies are common in the literature, published guidelines for them are not sufficiently supported by such empirical evidence.5 6
Adequate reporting of predictive research should be based on elements that reflect how valid and precise the analyses were. In predictive studies, overfitting is the key problem, and it is related to several aspects of the modeling process. We suggest reporting the number of candidate variables in addition to the variables in the final logistic model. The risk of overfitting after extensive modeling using many variables is high, especially in small data sets,7 8 and this unfortunately cannot be remedied by standard stepwise selection techniques.5 Also, a description of the choices underlying the coding of variables and, in particular, the selection of variables is of paramount importance.8 The number of outcome events must also be reported in addition to the total number of observations, because further overfitting is likely if the number of events per candidate variable is low (eg, < 10).5 6 7 8 Attempts at internal validation (eg, cross-validation or bootstrapping) can also reduce overfitting through statistical "shrinkage" of coefficients.2 8 Further, predictive performance (calibration and discrimination) and internal/external validation should be described.7
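The events-per-variable check described above can be illustrated with a minimal sketch. The function names and the study numbers are hypothetical and serve only to show the arithmetic; the threshold of 10 is the rule of thumb quoted in the paragraph above, not a value derived here.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Return the number of outcome events per variable considered."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


def low_events_per_variable(n_events: int, n_variables: int,
                            threshold: float = 10.0) -> bool:
    """True when events per variable fall below the rule-of-thumb threshold."""
    return events_per_variable(n_events, n_variables) < threshold


# Hypothetical study: 400 patients, 60 outcome events,
# 15 candidate variables screened, 5 variables kept in the final model.
n_events = 60
epv_candidates = events_per_variable(n_events, 15)  # based on candidate variables
epv_final = events_per_variable(n_events, 5)        # based on final model only

print(f"Events per candidate variable: {epv_candidates:.1f}")
print(f"Events per final-model variable: {epv_final:.1f}")
print("Low EPV (candidates):", low_events_per_variable(n_events, 15))
print("Low EPV (final model):", low_events_per_variable(n_events, 5))
```

In this hypothetical case the ratio looks acceptable when only the final model is counted but falls below 10 when all candidate variables are counted, which is why both numbers are needed to judge the risk of overfitting.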
We further suggest avoiding the reporting of some technicalities. Reporting coefficients, SEs, and p values is not important, since the relevant information can be obtained from the odds ratios and their 95% confidence intervals for the variables in the final model.7 Moreover, we would not stress the reporting of collinearity in a predictive model, because we are primarily interested in the predictive performance of the whole model and not in the regression coefficients of individual variables.7 8 If two variables are strongly correlated, no additional predictive information becomes available once one of them is included in a predictive model.8 Even though we agree that collinearity is important in descriptive modeling, its reporting does not improve the reader's judgment of a predictive model.
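The point that coefficients, SEs, and p values can be obtained from odds ratios and their confidence intervals can be shown with a short sketch. It assumes a Wald-type 95% confidence interval that is symmetric on the log-odds scale, which is common for logistic regression output but not guaranteed for every report; the function name and the example values are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def coefficient_from_odds_ratio(odds_ratio: float, ci_lower: float,
                                ci_upper: float, z: float = Z_95):
    """Recover the coefficient, SE, and Wald p value from an OR and its 95% CI.

    Assumes the interval was computed as exp(beta +/- z * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z)
    wald_z = beta / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(wald_z) / math.sqrt(2.0))
    return beta, se, p_value


# Hypothetical reported result: OR 2.0 (95% CI, 1.2 to 3.3)
beta, se, p = coefficient_from_odds_ratio(2.0, 1.2, 3.3)
print(f"beta = {beta:.3f}, SE = {se:.3f}, p = {p:.4f}")
```

If the reported interval was produced by another method (eg, profile likelihood), the recovered SE is only an approximation.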
Predictive modeling using logistic regression analysis is becoming more important in the medical literature. Although Moss et al1 made an important contribution by noting where some deficiencies in reporting lie, evidence-based recommendations for proper reporting are still lacking and are urgently needed.
References
Emory University School of Medicine Atlanta, GA
Correspondence to: Marc Moss, MD, Emory University School of Medicine, Thomas K. Glenn Memorial Bldg, 69 Jesse Hill Jr Dr SE, Atlanta, GA 30303; e-mail: marc_moss{at}emoryhealthcare.org
To the Editor:
Dr. Hernandez and colleagues recommend additional considerations when reporting multivariable logistic regression analyses for predictive models. In our review of the pulmonary and critical care literature, only 6% of articles used multivariable logistic regression modeling in a predictive manner.1 Therefore, we focused our suggested requirements on the proper reporting of descriptive models, which identify the effect of an individual variable on a specific outcome while adjusting for differences in other factors. With regard to predictive modeling techniques, we agree that reporting collinearity is less important. Like Hernandez and colleagues, we would encourage those interested in a more complete understanding of methodological standards for predictive modeling strategies to read the articles by Laupacis and colleagues2 in JAMA, and Wasson and colleagues3 in the New England Journal of Medicine.
References