* From the Barzilai Medical Center (Drs. Bibi and Shoseyov, and Mr. Peled), Ashkelon, Israel; Tel-Aviv Technical College (Mr. A. Nutman and Mr. Shalom), Tel-Aviv University, Tel-Aviv, Israel; Allergy Clinic (Dr. Kivity), Soraski Medical Center, Tel Aviv, Israel; and Department of Pediatrics (Dr. J. Nutman), The Childrens Medical Center of Overlook, Summit, NJ.
Correspondence to: Haim Bibi, MD, Barzilai Medical Center, Ashkelon, Israel 78306; e-mail: haim_76407{at}yahoo.com
| Abstract |
|---|
Design and setting: To predict ED visits, we have created a computer-based model called an artificial neural network (ANN) using a back-propagation training algorithm and genetic algorithm optimization. This ANN was fed meteorologic and air pollution input variables and trained to predict the number of patients admitted to the ED with respiratory symptoms of asthma, COPD, and acute and chronic bronchitis on the corresponding day. One thousand twenty data sets were extracted from an ED admittance database at the Barzilai Medical Center (Ashkelon, Israel), and randomized to a network training set (n = 816) and a test set (n = 204).
Results: The neural network performed best when the predictor variables used were temperature, relative humidity, barometric pressure, SO2, and oxidation products of nitric oxide, and when the data were presented as the peak value during the 24 h prior to ED admission and the average during the 7 days before the ED visit. The neural network was able to predict the test set with an average error of 12%.
Conclusion: Based on meteorologic and pollution data, the use of an ANN can assist in the prediction of ED visits related to respiratory conditions.
Key Words: artificial neural networks; emergency department; respiratory symptoms
| Introduction |
|---|
Because the exacerbations are caused by multiple stimuli, accurate prediction of an exacerbation is difficult. To our knowledge, there is no reliable way to predict these exacerbations. We attempted to develop a mathematical model for predicting respiratory symptom exacerbations using neural network technology.
Artificial neural networks (ANNs) are computer-based algorithms inspired by the structure and behavior of real neurons. Like the brain, they can recognize patterns, reorganize data, and, most appealing, learn from experience. The ANN consists of a set of processing units that simulate neurons and are interconnected via a set of weights in a way that allows signals to travel in parallel as well as serially.9 Of several possible ways to train an ANN, one of the most successful is the back-propagation algorithm.10 11
The purpose of this study was to model the relationship between atmospheric changes, including air pollution, and ED visits for respiratory conditions using an ANN. The ANN was trained to predict the number of ED visits for respiratory symptoms due to asthma, COPD, and acute and chronic bronchitis per day based on the input of meteorologic and air pollution data.
| Materials and Methods |
|---|
The Barzilai Medical Center has a large database of ED admittance records collected prospectively between January 1992 and December 1995. The records include demographic data (age, sex, and address) and medical information, including symptoms and the diagnosis defined by the ED physician and later confirmed by a medical secretary. The diagnosis was based on symptoms, physical examination findings, laboratory data including chest radiography, oxygen saturation measured by pulse oximetry, ECG, and response to treatment.
Group diagnoses can be classified into six main categories: (1) cardiac, including coronary artery disease (myocardial infarction, anginal pains), myocarditis, endocarditis, arterial hypertension, congestive heart failure, arrhythmias, and congenital cardiac malformations; (2) allergy, any kind of allergic response to food, animal, pollen, insect bite, or venom; (3) ear, nose, and throat disease, including tonsillitis, chronic ear nose and throat, otitis media or externa, sinusitis, rhinitis, peritonsillar abscess, and epistaxis; (4) asthma, including known disease, wheezing, and dyspnea responsive to bronchodilators and corticosteroids; (5) other respiratory disease, including pneumonia, COPD, acute and chronic bronchitis, upper respiratory tract infections, bronchiectasis, and pleural disease, and cough not defined as asthma; and (6) other, including all other visits, such as accidents, trauma, gynecology, etc. This database includes > 220,000 patient records (Table 1).
Atmospheric and Air Pollution Data
Data collected on air pollution (CO, oxidation products of nitric oxide [NOx], SO2, total suspended particles [TSP], and O3) and meteorologic conditions (temperature, relative humidity, precipitation, barometric pressure, solar radiation, and wind speed and direction) were recorded by seven monitoring stations in 30-min intervals. The monitoring stations were scattered evenly throughout the Ashkelon area. The input data used for prediction was the average of the recordings of all the stations, after we determined that there were no significant differences among them.
ANNs
ANNs are gross models of actual neurons. Like the brain, they can recognize patterns and organize data, but most striking is their inherent ability to learn. ANNs are typically composed of interconnected objects called units, which represent neurons. The units are connected by links, which act as the axons and dendrites. The link multiplies the output from a unit by a weighting factor, a value analogous to the connection strength at a synapse. The link then passes the weighted output value to another unit, which sums up the values passed to it by all other incoming links (Fig 1, left). If the total input value exceeds a designated threshold value, the unit fires. Modifications in the firing patterns constitute the learning. This occurs as the weighting factors on the links change.
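The behavior of a single unit described above can be illustrated with a minimal Python sketch. The function name `unit_output` and the numeric weights and thresholds are illustrative assumptions, not values from the study, and the study's own network was written in C++.

```python
def unit_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    # Each link multiplies an incoming value by its weighting factor;
    # the unit sums the weighted values passed to it.
    total = sum(x * w for x, w in zip(inputs, weights))
    # The unit "fires" only when the total input exceeds the threshold.
    return 1 if total > threshold else 0


# Illustrative values only: the weighted sum is about 0.7
fires_low = unit_output([1.0, 0.5], [0.4, 0.6], threshold=0.5)
fires_high = unit_output([1.0, 0.5], [0.4, 0.6], threshold=0.9)
print(fires_low, fires_high)
```

With the same inputs and weights, the unit fires for the lower threshold and stays silent for the higher one; changing the weights changes this firing pattern, which is what training adjusts.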
There are several algorithms for ANN training. They can be classified into two groups: supervised and unsupervised learning. In supervised learning, the network is presented with input and output pairs, and the algorithm is set to change the weights on the connections so that the output units give a better approximation of the desired output. In unsupervised learning, the network is presented with input data only, and the algorithm is set to modify weights so that the data are categorized into groups based on similarities. We chose a supervised training algorithm called back propagation for network training.
Back Propagation
One of the most successful supervised training methods is the back-propagation algorithm. The basic concept is to use the derivative of an error function to find the direction that minimizes the error of the network and to update the weights accordingly.10
The error function most commonly used is the sum-squared error of the output units. The algorithm attempts to minimize the mean error over the entire training set, much as statisticians calculate a best-fit line for scattered points on a graph. We chose the back-propagation algorithm after using it successfully in a previous study.12
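As a rough illustration of this idea, the following Python sketch trains a very small network by back propagation on the squared output error. Everything specific here is an assumption made for the example: the network size (two hidden units), the sigmoid activation, the learning rate, the random seed, and the toy data, none of which come from the study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return hidden activations and the output; the last weight of each unit is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def sum_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data)


def train_epoch(data, w_hidden, w_out, rate=0.5):
    """One pass of per-pattern back-propagation updates (weights are changed in place)."""
    for x, target in data:
        hidden, output = forward(x, w_hidden, w_out)
        # Output delta: derivative of the squared error through the sigmoid
        delta_out = (output - target) * output * (1.0 - output)
        # Hidden deltas: the output delta propagated back through the output weights
        delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
        # Move each weight against its error derivative
        for j, v in enumerate(hidden + [1.0]):
            w_out[j] -= rate * delta_out * v
        for j, ws in enumerate(w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= rate * delta_hidden[j] * v


# Toy data (logical OR), used only to show the error falling during training
toy_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(3)]

error_before = sum_squared_error(toy_data, w_hidden, w_out)
for _ in range(2000):
    train_epoch(toy_data, w_hidden, w_out)
error_after = sum_squared_error(toy_data, w_hidden, w_out)
print(error_before, error_after)
```

The sketch updates the weights after each pattern rather than after the whole training set; either variant follows the same derivative-based principle.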
Our neural network application was custom programmed based on the conventional back-propagation equations using an Intel C++ Compiler (Intel; Santa Clara, CA), rather than using commercially available software. This allowed us better control and flexibility during training.
Genetic Algorithms
Construction of an ANN begins with arbitrary decisions concerning the network architecture and other network parameters. These decisions could have substantial effect on network performance. For instance, as more units are added to the hidden layers, the ANN performance on the training data sets will improve; however, if the network is tried on independent data sets, its performance will first improve, then get worse. This phenomenon is called overfitting. It occurs when the ANN models the noise as well as the data in the training data sets. To avoid this and other training dilemmas, the search for the optimal network configuration can be automated using a genetic algorithm (GA),13 a method loosely based on the Darwinian principle of natural selection. The exact number of neurons was decided by the GA, which built the architecture automatically and chose the architecture that worked best.
The algorithm starts with a large random population of data strings called chromosomes representing ANNs. Each locus in the chromosome describes one of the ANN parameters. The population of ANNs undergoes a process of evolution through three steps: selection, crossing over, and mutation. In the selection step, each ANN is ranked according to its performance based on a fitness function. The better performing chromosomes are copied to form a new population, while the weaker ones are eliminated to maintain a constant population size. This equates to survival of the fittest. In the crossing-over step, combining parameters of the fittest ANNs generates new chromosomes. This represents mating between individuals. In the mutation step, random alterations of genes are introduced. This allows new parameter options to be evaluated.
This sequence of events, called a generation, is repeated until the optimal ANN configuration is found. There are many variations on the classical GA. We used a steady-state GA with a population size of 100 chromosomes.
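The selection, crossing-over, and mutation steps can be sketched in Python as follows. This is a simplified steady-state GA under stated assumptions: chromosomes are plain bit strings rather than encoded ANN parameters, parents are drawn from the better half of the population, and the toy fitness function (counting zeros) stands in for the far costlier step of training and scoring an ANN. The function name, mutation rate, step count, and seed are all hypothetical; only the population size of 100 comes from the text.

```python
import random


def steady_state_ga(fitness, n_genes, pop_size=100, steps=2000, mutation_rate=0.05, seed=0):
    """Minimize `fitness` over bit-string chromosomes with a simple steady-state GA."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(steps):
        # Selection: rank by fitness (lower is better), draw two parents from the better half
        population.sort(key=fitness)
        mother, father = rng.sample(population[: pop_size // 2], 2)
        # Crossing over: single-point recombination of the two parents
        cut = rng.randint(1, n_genes - 1)
        child = mother[:cut] + father[cut:]
        # Mutation: flip each gene with a small probability
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        # Steady-state replacement: the child replaces the worst chromosome if it is better,
        # so the population size stays constant
        if fitness(child) < fitness(population[-1]):
            population[-1] = child
    return min(population, key=fitness)


def count_zeros(chromosome):
    """Toy fitness: number of zero genes (the optimum is a chromosome of all ones)."""
    return chromosome.count(0)


best = steady_state_ga(count_zeros, n_genes=12)
print(best, count_zeros(best))
```

In a steady-state GA only one or a few chromosomes are replaced per step, in contrast to a generational GA that rebuilds the whole population at once.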
We chose a fitness function that would take into account the ANN performance on both the training data sets (indicating successful training) and independent test data sets (indicating successful generalization). This was done by calculating the average output error for the training data sets multiplied by a factor of 0.6, and adding the average output error for the test data sets multiplied by a factor of 0.4. The reason for using a factor of 0.6 and 0.4, respectively, was to give the ability to correctly learn the training data sets a slight advantage.
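The weighting described above amounts to a simple weighted sum, sketched here in Python. The function name `ga_fitness` and the example error values are illustrative assumptions; only the factors 0.6 and 0.4 come from the text.

```python
def ga_fitness(train_errors, test_errors, w_train=0.6, w_test=0.4):
    """Weighted sum of the mean training error and the mean test error (lower is better)."""
    mean_train = sum(train_errors) / len(train_errors)
    mean_test = sum(test_errors) / len(test_errors)
    return w_train * mean_train + w_test * mean_test


# Illustrative errors only: mean training error 0.12, mean test error 0.20
score = ga_fitness([0.10, 0.14], [0.15, 0.25])
print(round(score, 3))  # 0.6 * 0.12 + 0.4 * 0.20 = 0.152
```

Because the test-set error enters the score, a configuration that fits the training data closely but generalizes poorly is ranked below one with balanced errors.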
Using the GA also allowed selecting which of the monitored meteorologic and air pollution variables are most predictive of ED visits for respiratory symptoms due to asthma, COPD, and acute and chronic bronchitis. This was done by incorporating the input variables used in ANN training into the chromosomes. The input variables were presented in several possible ways: (1) peak value during the 24 h prior to the ED visit, (2) average of values 1 day before ED admittance, (3) average of values during the 3 days before ED admittance, and (4) average of values during the 7 days before ED admittance.
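A minimal Python sketch of these four representations for a single variable follows. It assumes 48 readings per day (from the 30-min recording interval), readings ordered oldest first and ending at the visit, and that "1 day before" means the final 24 h; the function name and dictionary keys are hypothetical.

```python
def summarize_inputs(readings, per_day=48):
    """Summarize one variable's readings (oldest first) from the 7 days before a visit."""
    if len(readings) < 7 * per_day:
        raise ValueError("need at least 7 days of readings")

    def mean(values):
        return sum(values) / len(values)

    return {
        "peak_24h": max(readings[-per_day:]),       # (1) peak during the last 24 h
        "mean_1d": mean(readings[-per_day:]),       # (2) average over the last day
        "mean_3d": mean(readings[-3 * per_day:]),   # (3) average over the last 3 days
        "mean_7d": mean(readings[-7 * per_day:]),   # (4) average over the last 7 days
    }


# Illustrative series: six days at 10.0 followed by one day at 20.0
example = summarize_inputs([10.0] * (6 * 48) + [20.0] * 48)
print(example)
```

Each representation could then be offered to the GA as a separate candidate input, so that the search selects both the variable and the time window.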
We were interested in predicting the number of ED visits for respiratory symptoms on a particular day. In order to eliminate trends in the total number of visits to the ED, we used a relative index derived by dividing the number of visits for respiratory symptoms by the total number of ED visits for that day.
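The relative index is a simple ratio, shown here as a short Python sketch. The function name and the visit counts are illustrative assumptions, not figures from the database.

```python
def relative_index(respiratory_visits, total_visits):
    """Fraction of a day's ED visits that were for respiratory symptoms."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return respiratory_visits / total_visits


# Illustrative counts only: 12 respiratory visits out of 150 total visits
index = relative_index(12, 150)
print(index)
```

Dividing by the daily total means that a day with more ED visits overall does not by itself raise the predicted quantity.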
| Results |
|---|
No significant correlation between ED visits and atmospheric and air pollution data was found, despite the use of statistical methods such as time-series analysis with trend and seasonal components and a generalized additive model on 3-month cohorts. Seasonal variations do exist; however, no correlation with air pollution was found. Although the ANN does not provide a distribution of visits per day, the long period of observation was intended to even out seasonal variations in the final analysis.
| Discussion |
|---|
NO2 is produced indoors by cooking with gas stoves. Outdoor NO2 is produced mainly by cars.14 Exposure to traffic exhaust has been related to wheezing.15 Exposure to NO2 exacerbated preexisting asthma in a panel of asthmatic volunteers and increased ED admissions for asthma in the acute phase16 and in chronic exposure.17 18
Exposure to SO2 also increases ED visits among patients with COPD.19 Field studies have shown prolonged clinical deterioration during and following episodes of air pollution, especially when patients were exposed to relatively high levels of SO2.20 Other oxidant pollutants such as O3 also increase ED visits due to acute asthma exacerbation.21 Studies have shown that temperature and O3, each separately and combined, worsen asthma symptoms.22 Symptoms become more severe 1 to 3 days after a decline in peak expiratory flow following exposure to changes in temperature.23
Particles, by themselves and in combination with oxidant pollutants, and changes in temperature have been reported to promote an increase in respiratory visits after a 24- to 48-h lag period, especially for asthma attacks.24 Temperature changes and combinations of gaseous pollutants were associated with increased asthma admissions to hospitals.5 25
Weather changes26 have been shown to affect airways, sometimes additionally with an increase of aeroallergens (such as pollens). The effect of acute or chronic exposure to a cold environment has been shown to induce responses in the airways, mainly congestion and bronchoconstriction. Bronchoconstriction, airway congestion, secretions, and decreased mucociliary clearance compromise pulmonary mechanics.27 These responses destabilize the airways, and induce shortness of breath, resulting in the need for medical assistance.
After taking the above-mentioned information into account, we are able to apply it using an ANN. ANNs have been used successfully in medical applications previously.28 29 Moseholm et al30 used neural networks to analyze the effect of weather and air pollution on asthmatic patients. We have already demonstrated that ANN technology can be used for modeling the effect of air pollution on ED visits for asthma, although with limited accuracy.12 Our new study is more accurate compared with the previous study published by our group. The previous study was short term and the analysis was on a smaller group of patients, compared to the longer term, larger population, and more advanced ANN technology with GA optimization of this study.
Most of the above-mentioned studies describing the effect of air pollutants and weather on respiratory symptoms found an effect while accounting for only a few variables. Such limitations are related to the lack of sensitivity of statistical analysis for multiple variables. In an attempt to overcome these limitations, we chose an ANN system.
Our study has investigated the predictive value of meteorologic and air pollution data on ED visits for respiratory symptoms of asthma and cough. The most important variables found were atmospheric temperature, barometric pressure, and relative humidity of the meteorologic data, and the levels of SO2 and NOx.
Changes in precipitation, solar radiation, wind speed and direction, O3 levels, and TSP levels were not found to be predictive by the GA. Input meteorologic and air pollution data were represented as the peak value within 24 h prior to ED admittance, and the average of values during the 7 days before ED admittance. There are many factors, such as indoor pollutants and indoor and outdoor allergens not recorded in our study, that may cause or aggravate the wheezing and shortness of breath and lead to referral to the ED.
The disadvantage in using ANN analysis is that the "black box" nature of neural networks provides little insight into the relative importance of the various input variables used in the model. Also, it is not possible to infer whether the predictors have a positive or negative impact on the output.
Another pitfall of ANNs is overfitting. This occurs when the neural network performs with great accuracy on the training data sets, but poorly on unlearned test data sets. We avoided this problem by including the ANN performance on independent (previously unlearned) data sets in the fitness function of the GA.
The ANN is a nonconventional "flexible" method, and the flexibility can be a disadvantage. However, this method is appealing for the following reasons: (1) it "learns" easily without supervision, (2) the worker does not need computer or mathematical experience, and (3) software is now readily available.
The ANN can assist in detecting trends of meteorologic and air pollutant influence on the airways. These are reflected by ED visits due to respiratory symptoms.
We have shown that an ANN can be used to predict ED visits for respiratory conditions based on meteorologic and air pollution data. The next logical step would be to use neural networks to predict exacerbations in individual patients. Neural network technology is readily available in numerous software packages that physicians could use and custom tailor for their patients.
| Footnotes |
|---|
Received for publication March 7, 2001. Accepted for publication May 21, 2002.
| References |
|---|