* From the Departments of Otolaryngology (Drs. Kirby and George) and Medicine (Drs. Danter, Ferguson, and Francovic), The University of Western Ontario, London, Ontario, Canada.
Correspondence to: Kathleen A. Ferguson BSc, MD, FCCP, University Campus, London Health Sciences Centre, 339 Windermere Rd, London, Ontario, Canada N6A 5A5; e-mail: kafergus{at}julian.uwo.ca
| Abstract |
|---|
Study design: Retrospective review.
Setting: Regional sleep referral center.
Patients: Randomly selected records of patients referred for possible OSA.
Measurements: The neural network was trained using 23 clinical variables from 255 patients, and the predictive performance was evaluated using 150 other patients.
Results: The prevalence of OSA in this series of 405 patients (293 men and 112 women) was 69%. The trained GRNN had an accuracy of 91.3% (95% confidence interval [CI], 86.8 to 95.8). The sensitivity was 98.9% for having OSA (95% CI, 96.7 to 100), and the specificity was 80% (95% CI, 70 to 90). The positive predictive value that the patient would have OSA was 88.1% (95% CI, 81.8 to 94.4), whereas the negative predictive value that the patient would not have OSA (if so classified) was 98% (95% CI, 94 to 100).
Conclusions: Appropriately trained GRNN has the ability to accurately rule in OSA from clinical data, and GRNN did not misclassify patients with moderate to severe OSA. In this study, use of the neural network could have reduced the number of PSG studies performed. Prospective validation of the neural network for the diagnosis of OSA is now required.
Key Words: artificial neural networks; clinical prediction models; obstructive sleep apnea; screening
| Introduction |
|---|
Previous efforts to develop a clinical screening or case-finding instrument for the diagnosis of OSA have been based on patient questionnaires usually combined with anthropometric and physical findings.7 8 9 10 11 12 13 These studies have used a wide variety of statistical approaches and techniques to predict the presence of OSA. These approaches have been limited by poor sensitivity or less-than-optimal specificity (even when sensitivity has been high), by conflicting results between the different studies, and by the lack of prospective validation in some of the models that hinders their generalizability. It is noteworthy that the overall subjective impression of experienced sleep physicians correctly identifies about 50% of the patients with OSA.8 9 Portable home sleep monitoring has also been proposed for the diagnosis of OSA. The potential benefits of home sleep monitoring include reduced costs and the ability to evaluate patients in their usual environment. Potential disadvantages include the problems inherent in unattended monitoring (especially if the patient sets up the equipment) and the time and expense involved if a technician is required to instrument the patient at home. In the latter situation, there may be minimal financial savings over in-laboratory PSG. In addition, many of the home monitoring systems that are available have not been adequately validated. PSG is therefore the diagnostic standard for OSA due to the lack of well-validated clinical or other screening tools. PSG may be expensive and inconvenient for the patient, and it is a labor-intensive, specialized procedure that may have limited accessibility in many jurisdictions. Increasing awareness of the adverse health effects of OSA and the availability of effective therapies have led to increased demand for PSG and, concomitantly, concerns by health insurance providers about increased expenditures for the diagnosis and treatment of OSA.
Artificial neural networks (ANNs) are computer programs modeled after the biological nervous system, and they are capable of recognizing complex patterns in data based on experience. Their application is useful in complex problems because they can analyze a large number of linear and nonlinear variables without the operator knowing or making assumptions about the relationships between the variables. Neural networks are "trained" by presenting a set of data together with the outcomes that the trainer wishes the network to learn. The trained neural network can then be evaluated by inputting similar, but previously unseen, data. This artificial intelligence approach for outcome prediction has been used successfully in other medical applications, including the prediction of acute myocardial infarction in patients presenting to an emergency room physician,14 the diagnosis of pulmonary embolism,15 16 and the predicted length of ICU stay.17 ANNs have been shown to outperform physician impression or prediction18 19 and to equal or exceed traditional statistical modeling in the prediction of outcomes.20 21 This study was conducted to test the hypothesis that a trained generalized regression neural network (GRNN) could accurately classify patients with OSA from clinical data.
| Materials and Methods |
|---|
Overnight PSG
PSG was performed on 1 night in all patients. The sleep study
montage included EEG (C3/A2, C4/A1, O2/A1), electro-oculogram,
submental electromyogram, left and right anterior tibialis
electromyogram, ECG, thoracoabdominal motion, oronasal airflow (expired
CO2), and arterial oxygen saturation with pulse
oximetry using an ear probe sensor. The studies were scored manually,
and the total AHI (number per hour total sleep time) was calculated for
the night. Obstructive apneas were defined as the cessation of airflow
for at least 10 s accompanied by ongoing respiratory effort.
Obstructive hypopneas were defined as a reduction in airflow of at
least 50% for at least 10 s accompanied by a reduction in
respiratory effort and by an arousal or an arterial oxygen desaturation
of at least 3%. OSA was defined as an AHI ≥ 10/h for the purposes of
this study.
Clinical Data
Forty-five clinical variables from nine categories were recorded
in the database. These variables were chosen on the basis of previously
published screening studies7–13 and clinical
experience. The categories included demographics (age, gender, marital
status); nocturnal symptoms (frequent awakenings, choking, gasping);
bed partner observations (snoring, witnessed apnea, observed choking,
restlessness); daytime symptoms (unrefreshing sleep, morning headache,
reported excessive daytime sleepiness, Epworth sleepiness scale,
impaired nasal breathing); past medical history (nasal trauma,
hypertension, airway surgery, allergies); medications
(sedative/hypnotics, antidepressants, antihypertensives); social
history (alcohol consumption, smoking history and pack-years); and
anthropometrics (weight, height, body mass index [BMI] in
kg/m², neck circumference [NC]). Alcohol
consumption was categorized as none, mild (up to 1 drink per day), and
moderate to heavy (> 2 drinks per day). Data from the physical
examination included systolic and diastolic BP, nasal obstruction,
tonsillar enlargement, soft palate and/or uvular enlargement, crowding
of the posterior oral pharynx, and the presence and grade of maxillary
overjet. Data values that were not available in the chart review were
obtained, whenever possible, by calling the patients by telephone or at
a follow-up clinic visit. Less than 1% of the numeric data values were
missing from the database. Missing numerical values (eg, NC)
were recorded as the mean value from the group the subject was in
(eg, women). Missing categorical values were recorded as
negative.
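The handling of missing values described above can be illustrated with a brief sketch. The field names (`gender`, `nc_cm`, `witnessed_apnea`), the function name, and the sample records are hypothetical and are not taken from the study database. The sketch assumes that a missing value is stored as `None`, that the grouping variable is gender (as in the example given in the text), and that a negative categorical value is represented as `False`.

```python
from statistics import mean


def impute(records, numeric_fields, categorical_fields, group_key="gender"):
    """Return copies of the records with missing (None) values filled in.

    Numeric fields receive the mean of the observed values in the patient's
    own group; categorical fields are recorded as negative (False).
    """
    filled = [dict(r) for r in records]  # leave the input records untouched
    for field in numeric_fields:
        # Collect the observed (non-missing) values for each group.
        observed = {}
        for r in records:
            if r[field] is not None:
                observed.setdefault(r[group_key], []).append(r[field])
        group_means = {group: mean(values) for group, values in observed.items()}
        for r in filled:
            if r[field] is None:
                # Raises KeyError if a group has no observed value at all.
                r[field] = group_means[r[group_key]]
    for field in categorical_fields:
        for r in filled:
            if r[field] is None:
                r[field] = False
    return filled


# Hypothetical records: neck circumference in cm, witnessed apnea yes/no.
patients = [
    {"gender": "F", "nc_cm": 36.0, "witnessed_apnea": True},
    {"gender": "F", "nc_cm": 38.0, "witnessed_apnea": None},
    {"gender": "F", "nc_cm": None, "witnessed_apnea": False},
    {"gender": "M", "nc_cm": 43.0, "witnessed_apnea": True},
]
completed = impute(patients, ["nc_cm"], ["witnessed_apnea"])
```

In this toy example, the third patient's missing neck circumference is replaced by the mean of the two observed values for women, and the second patient's missing categorical value is recorded as negative.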
GRNN software was used (NeuralShell Classifier Version 2; Ward Systems
Group; Frederick, MD). Data preprocessing was conducted prior to
neural network training to reduce the number of input variables to an
optimal number of more discriminating factors. The raw data were first
divided into two subsets according to AHI (ie, AHI < 10
and AHI ≥ 10). Differences between the subsets were then assessed
using Student's t test for each of the parametric variables
and the Mann-Whitney U test for the nonparametric variables.
Categorical variables were compared by χ²
testing. All analyses were performed using appropriate software
(Microsoft Excel Version 5.0 for the Macintosh; Microsoft; Redmond, WA
and Systat Version 5.2 for the Macintosh; SPSS; Chicago, IL). A p value
≤ 0.1 was arbitrarily chosen as the cutoff for the continued
inclusion of a given variable in the final
model.22 23 24 25
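The univariate screening step can be sketched for the categorical case as follows. This is an illustration only: the counts and variable names are hypothetical, the Pearson χ² statistic is computed without a continuity correction (the text does not state whether one was applied), and the continuous variables, which the text compares with Student's t test or the Mann-Whitney U test, would normally be handled with a statistics package rather than by hand.

```python
import math

P_CUTOFF = 0.1  # inclusion threshold quoted in the text


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]. Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("table has an empty row or column")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts. Rows: AHI >= 10, AHI < 10. Columns: present, absent.
candidates = {
    "witnessed_apnea": (60, 40, 20, 45),
    "marital_status_married": (55, 45, 35, 30),
}

# Keep only the variables whose p value meets the cutoff.
retained = [
    name
    for name, table in candidates.items()
    if chi_square_2x2(*table)[1] <= P_CUTOFF
]
```

With these invented counts, only the first variable differs enough between the two AHI subsets to be retained.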
The 45 clinical variables were reduced
to 23 variables by data preprocessing. For the purpose of ANN training,
the preprocessed database (n = 405) was randomly divided into a
"training" set of 255 patients and a "test" set of 150
patients. During training of the network, the training set was
repeatedly presented to the network until 99.6% of the patients were
learned. The test set of previously unseen cases was then presented to
the trained network, and predictions were compared with actual
outcomes.
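The random division of the database and the prediction step can be sketched as follows. A generalized regression neural network, in its usual formulation, predicts the outcome for a new case as a Gaussian-kernel-weighted average of the outcomes of the training cases. The sketch below assumes that formulation; the internal details of the commercial software used in the study, including how its smoothing factor was chosen and how the 99.6% training criterion was applied, are not described in the text. The function names, the smoothing value `sigma`, the 0.5 decision threshold, and the toy data are all hypothetical.

```python
import math
import random


def split_patients(records, n_train, seed=0):
    """Randomly divide the records into a training set and a test set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def grnn_predict(train_x, train_y, x, sigma=0.5):
    """Kernel-weighted average of the training outcomes for one new case."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(row, x)) / (2 * sigma ** 2))
        for row in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed; fall back to the training base rate.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Split sizes taken from the text: 405 patients, 255 for training, 150 for testing.
train_ids, test_ids = split_patients(list(range(405)), n_train=255)

# Toy data: two inputs scaled to the range 0 to 1, outcome 1 = OSA, 0 = no OSA.
toy_x = [(0.9, 0.8), (0.8, 0.9), (0.2, 0.1), (0.1, 0.3)]
toy_y = [1, 1, 0, 0]
score_high = grnn_predict(toy_x, toy_y, (0.85, 0.85))
score_low = grnn_predict(toy_x, toy_y, (0.15, 0.20))
predicted_osa = score_high >= 0.5  # hypothetical decision threshold
```

Because the prediction is a weighted average of stored cases, a smaller `sigma` makes the network follow the training cases more closely, and a larger value smooths the prediction toward the overall prevalence.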
The aim of this study was to predict those subjects who had OSA
(ie, AHI ≥ 10 from PSG). The accuracy, sensitivity,
specificity, positive predictive values (PPVs), and negative predictive
values (NPVs) were calculated. Accuracy was defined as the
proportion of patients whom the ANN correctly classified as having or
not having OSA.
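The performance measures defined above can be computed from a 2 × 2 table of predicted vs actual outcomes, as sketched below. The cell counts are not stated in the text; they are inferred here from the reported percentages for the 150-patient test set together with the reported numbers of misclassified cases, and should be read as an assumption. The confidence intervals use the normal approximation for a proportion, truncated at 0 and 100%, which appears to be consistent with the reported intervals, although the method used by the authors is not stated. The function names are hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation 95% CI, clipped to [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def screening_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from a 2 x 2 table."""
    return {
        "accuracy": proportion_ci(tp + tn, tp + fn + fp + tn),
        "sensitivity": proportion_ci(tp, tp + fn),
        "specificity": proportion_ci(tn, tn + fp),
        "ppv": proportion_ci(tp, tp + fp),
        "npv": proportion_ci(tn, tn + fn),
    }


# Inferred (assumed) test-set counts: true positives, false negatives,
# false positives, true negatives.
metrics = screening_metrics(tp=89, fn=1, fp=12, tn=48)
for name, (estimate, low, high) in metrics.items():
    print(f"{name}: {100 * estimate:.1f}% "
          f"(95% CI, {100 * low:.1f} to {100 * high:.1f})")
```

With these inferred counts, the sketch gives an accuracy of 91.3% (86.8 to 95.8), a sensitivity of 98.9%, a specificity of 80%, a PPV of 88.1% (81.8 to 94.4), and an NPV of 98%.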
| Results |
|---|
The prevalence of OSA (AHI ≥ 10) in this series was
69% (53% in women and 76% in men). The demographic and clinical
features of the study population are outlined in Table 1.
There were more men than women in the data set, and on average the
patients were middle aged and overweight. The prevalence of OSA in the
training set and the test set was 72% and 65%, respectively
(p > 0.05). There were small differences between the training
set and the test set. The subjects in the test set had a larger BMI
(p = 0.009) and a higher systolic BP (p = 0.002). There were
proportionately more women in the test set (p = 0.03), and overall,
more subjects were noted to have observed choking in the test set
(p = 0.016).
[…] 0.0001). The patients with OSA were
also more likely to have a large soft palate and crowded
posterior oral pharynx. Patients with OSA were more likely to report
moderate to severe excessive daytime sleepiness (47.7%) than patients
without OSA (27.4%; p = 0.001), although there was no difference in
the Epworth Sleepiness Scale.
| Discussion |
|---|
[…] ≥ 10/h). To our knowledge, this study is the
first use of a neural network to predict the presence or absence of
OSA. The prevalence of OSA in this population was 69%, with OSA
more common in men than in women. In the overall group, the presence of
OSA was associated with a history of witnessed apneas, observed
choking, and increased smoking and alcohol intake. The patients with
OSA were more obese, had a higher BP, and were more likely to have a
large soft palate and a crowded posterior oral pharynx. In contrast to
some previously published studies7 8 11 yet consistent
with others,9 reported excessive daytime sleepiness was
associated with the presence of OSA.
OSA is an ideal condition for the use of screening or case-finding techniques: it is a chronic condition that is highly prevalent in the general population, and many effective therapies are available. If this neural network retains its high level of accuracy with prospective validation, it could be used as a simple clinical screening tool to reduce the number of PSG studies performed in patients without OSA. The novel approach of applying a neural network to the clinically based prediction of OSA allows the incorporation of a much larger number of variables than are conventionally used in linear or logistic regression techniques. Perhaps more important than the variables themselves is the technique of processing that seeks to identify subtle patterns and relationships. The user is shielded from the multitude of calculations, the technique is not cumbersome, and the prediction is free from bias, fatigue, and personal opinion. In addition, a well-trained neural network can easily handle differences between groups (eg, between men and women) within the same model.
Screening tests are generally designed to favor high sensitivity so that cases are not missed. Clinical prediction rules that emphasize sensitivity can be developed by accepting a concomitant decrease in specificity. The financial impact of varying the cutoff points is related to the costs (both financial and human) associated with false-positive vs false-negative results. Most previous clinical screening studies have attempted to rule in the presence of OSA with a high level of sensitivity in order not to miss significant OSA. When Hoffstein and Szalai9 studied 594 patients referred for possible OSA, they found that several variables from the history and physical examination were predictors of the presence of OSA (age, gender, BMI, witnessed apneas, pharyngeal examination). However, physician subjective impression only correctly identified 51% of the patients with OSA and 71% of the patients without OSA (sensitivity, 60%; specificity, 65%). They concluded that clinical impression alone could not reliably identify patients with or without OSA.
Crocker and colleagues7 used a logistic regression model developed with clinical variables to predict the presence of OSA. They found that age, witnessed apneas, BMI, and hypertension were independently associated with the presence of OSA (AHI > 15/h) in a group of 100 patients. The model had a sensitivity of 92% and a specificity of 51% for the diagnosis of OSA. The prevalence of OSA was only 27% in the initial group and 34% in the prospective test group of 105 patients. In the test group, 33 of 36 patients were correctly classified with OSA. The three patients in the test group whom the model misclassified as not having OSA all had an AHI > 40/h. The researchers estimated that they could reduce the number of PSG studies performed by approximately one third, but would have misclassified patients with significant OSA as not having OSA. Viner and coworkers8 developed a logistic regression model in 410 patients; age, gender, BMI, and snoring were retained as significant variables in the model. The prevalence of OSA was 46% in this population. The sensitivity was 94% and the specificity was 28% for the diagnosis of OSA (defined as AHI > 10). The area under the receiver operating characteristic curve was 0.77. The physician's subjective impression of the presence of OSA had a sensitivity of 52% and a specificity of 70% with an accuracy rate of 63%. The application of this model would have allowed them to potentially reduce the number of PSG studies by one third. This model was not prospectively validated. The merits of these two different models are difficult to compare because the prevalences of OSA in the populations studied were different and relatively low, which affects the PPV and NPV of the test. The neural network, however, had better sensitivity and specificity for the diagnosis of OSA than either of these models, and it did not miss cases of significant OSA while potentially reducing the number of PSG studies required.
Flemons and colleagues11 evaluated 180 patients with possible OSA, and they developed a model that included NC, hypertension, habitual snoring, and witnessed gasping or choking. The prevalence of OSA was 45% in the group (AHI > 10/h). Likelihood ratios were determined, and the calculated clinical score from the model was used to provide a posttest probability of OSA. This model was superior to physician impression, and it was comparable or superior to previously published models.7 8 Pradhan et al13 conducted a prospective study of OSA prediction in 150 patients. They adjusted the cutoff values in the clinical model (containing age, gender, loud snoring, and BMI) to have a sensitivity of 100% for the diagnosis of OSA. The prevalence of OSA was 57% (AHI > 10/h). They also developed a model that included overnight home oximetry. They found that it was more cost effective to pursue a clinical screening strategy than a clinical strategy with oximetry. The use of oximetry would have decreased the number of PSG by 13%, compared to an 8% reduction for clinical screening alone, but this savings was eliminated by the additional cost of home monitoring. Sériès and colleagues26 used a qualitative analysis of home oximetry for screening in 240 patients referred for assessment of OSA. They reported a sensitivity of 98% and a specificity of 48% (PPV, 61%; NPV, 97%) in a population with a prevalence of 46%. The two patients who were misclassified as not having OSA had very low AHI (14/h and 16/h, respectively). Oximetry had a high sensitivity for the diagnosis of OSA in that study and could have reduced the number of PSG studies performed by approximately 25%. However, it is more time consuming and expensive to organize and interpret a home study than it is to use a simple clinical prediction model such as a neural network in the office setting.
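The relationship between likelihood ratios, pretest probability (prevalence), and posttest probability referred to above can be made concrete with a short sketch based on the odds form of Bayes' theorem. The function names are hypothetical. The inputs are the rounded sensitivity, specificity, and prevalence quoted in the text for the home oximetry study; the sweep over other prevalence values is illustrative arithmetic, not a result reported by any of the cited studies.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios (requires 0 < specificity < 1)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Rounded values quoted in the text: sensitivity 98%, specificity 48%,
# prevalence 46%.
lr_pos, lr_neg = likelihood_ratios(0.98, 0.48)
ppv = posttest_probability(0.46, lr_pos)      # P(OSA | positive test)
npv = 1 - posttest_probability(0.46, lr_neg)  # P(no OSA | negative test)

# Illustrative sweep: the same test applied at different prevalences.
for prevalence in (0.27, 0.46, 0.69):
    p_pos = posttest_probability(prevalence, lr_pos)
    p_neg = 1 - posttest_probability(prevalence, lr_neg)
    print(f"prevalence {prevalence:.0%}: PPV {p_pos:.0%}, NPV {p_neg:.0%}")
```

From the rounded inputs, the sketch gives a PPV of about 62% and an NPV of about 97%, close to the reported values of 61% and 97%. The sweep shows why models developed in populations with different prevalences of OSA are difficult to compare on PPV and NPV alone.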
This study differs from other screening studies because of the use of
the neural network. If the neural network could accurately rule in or
rule out OSA, then the PSG could be eliminated from the diagnostic
assessment of some non-OSA patients, thereby saving valuable resources,
and potentially from that of some OSA subjects who might proceed to a therapeutic
study (continuous positive airway pressure trial) instead of a
diagnostic study. Given that OSA has significant consequences,
physicians would not want to use a screening tool that missed patients
with significant sleep-disordered breathing. In this study, the
sensitivity for the diagnosis of OSA was 98.9% (95% CI, 96.7 to 100).
The most important measure of the neural network as a screening
instrument is the high sensitivity coupled with the low false-negative
rate. The neural network misclassified only 1 out of 150 cases as not
having OSA when the AHI was ≥ 10/h. This subject had an AHI of 10.5
events per hour, a very low level of OSA. This suggests that the neural
network does not miss serious cases of OSA when it does make a mistake.
Only 12 cases without OSA were classified as having OSA and therefore
underwent potentially unnecessary PSG testing. Overall, 48 patients
(32%) would not have required PSG based on the neural network
prediction. Although a specific cost analysis was not performed in this
study, it is apparent that the number of PSG studies could be reduced
if the network could accurately rule in and rule out OSA. These
patients could have been reassured that they did not have significant
OSA; potentially, they could have proceeded directly to treatment based
on their risk factors and symptoms. Ultimately, this approach may
increase the number of patients evaluated for the possibility of OSA,
while reserving PSG for patients who are more likely to have
significant sleep-disordered breathing. In centers with long waiting
lists, this could also reduce waiting times for evaluation.
The limitations to this study include the retrospective nature of the data review, the lack of prospective validation, and the relatively small numbers. The model performed exceedingly well, even with the small numbers; however, in general, the performance of neural networks improves with increases in the size of the data set. Future work with this prediction technique will include expansion of the training set and validation in a larger set of consecutively and prospectively collected cases. This model was developed in a population with a fairly high prevalence of OSA (referral to a sleep clinic; prevalence, 69%), and it needs to be tested in a population with a lower prevalence. This will help determine if the neural network model that was developed in this study is generalizable to other patient populations. Potentially, a neural network could be developed for children with OSA, as PSG is more difficult to perform in this specific population.
| Conclusion |
|---|
| Footnotes |
|---|
Abbreviations: AHI = apnea plus hypopnea index; ANN = artificial neural network; BMI = body mass index; CI = confidence interval; GRNN = generalized regression neural network; NC = neck circumference; NPV = negative predictive value; OSA = obstructive sleep apnea; PPV = positive predictive value; PSG = polysomnography
Received for publication May 7, 1998. Accepted for publication March 2, 1999.