Research Article
Austin J Emergency & Crit Care Med. 2015;2(2): 1017.
How Information Technologies Support the Intensive Care Systems: An Application of Mortality Prediction Model with Support Vector Machine
Chien-Lung Chan1, Hsien-Wei Ting1,2 and Chia-Li Chen1,3*
1Department of Information Management, Yuan Ze University, Taiwan
2Department of Neurosurgery, Taipei Hospital, Taiwan
3Department of Information Management, Lung Hwa University, Taiwan
*Corresponding author: Chia-Li Chen, Department of Information Management, Lung Hwa University, No.300, Sec.1, Wanshou Rd., Guishan Shiang, Taoyuan County 33306, Taiwan
Received: January 11, 2015; Accepted: March 06, 2015; Published: March 10, 2015
Abstract
Background and Objective: Intensive care is very important in modern health care. Mortality prediction models are good outcome predictors for intensive care and resource allocation. Many studies have used information technologies to construct new mortality prediction models. This study used the Support Vector Machine (SVM) to construct a better mortality prediction model.
Methods: This study collected data from 695 patients (230 women and 465 men) who were admitted to the surgical intensive care unit of a 600-bed hospital between January 1, 2005 and December 31, 2006 as training data. Of the 695 patients, 538 (77.41%) were alive and 157 (22.59%) were dead. The Gaussian RBF kernel was selected to build a mortality prediction model with these empirical data, and all variables were included in the model.
Results: The precision rate, recall rate and F-measure of the SVM model were 0.899, 0.902 and 0.899, respectively. The areas under the ROC curves (AUR) of the models were calculated. The SVM model (AUR = 0.932) was better than SAPS II (AUR = 0.883) and APACHE II (AUR = 0.885) (p < 0.01).
Conclusion: The SVM can manage the twin peaks phenomenon, which is one of the characteristics of health and medical data.
Keywords: Acute Physiology and Chronic Health Evaluation System, 2nd version (APACHE II); Decision support system; Intensive care; Medical decision making; Mortality prediction; Simplified Acute Physiology System, 2nd version (SAPS II); Support Vector Machine
Introduction
Intensive care is very important in modern health care [1] and the outcome evaluation for intensive care can help to make decisions regarding intensive care facilities [2,3]. Some researchers used mortality prediction models as outcome predictors for intensive care. Popular mortality prediction models include the Acute Physiology and Chronic Health Evaluation System, 2nd version (APACHE II) [4], 3rd version (APACHE III) [5], and 4th version (APACHE IV) [6]; the Simplified Acute Physiology System, 2nd version (SAPS II) [7] and 3rd version [8]; and the Mortality Probability Model, 2nd version (MPM II) [9] and 3rd version (MPM III) [10]. These models are good outcome prediction models for intensive care [11], and are general mortality prediction models for any kind of patient admitted to an intensive care unit. Some mortality prediction models are constructed for a special purpose. For example, the Multiple Organ Dysfunction Score (MODS), Sequential Organ Failure Assessment (SOFA) score and Sepsis-related Organ Failure Assessment are frequently used to assess the outcome of sepsis or multiple organ failure [12-17]. These models are constructed for different purposes.
The first mortality prediction model was constructed by McCabe and Jackson, who collected 173 septicemia patients and divided them into nonfatal, ultimately fatal and fatal groups; this model can express the tendency of mortality [18]. Cullen et al. [19] collected 70 patient variables and scored each of them from 1 to 4 in terms of severity, evaluating a patient's severity by summing the collected scores; this is called the Therapeutic Intervention Scoring System. The Glasgow Coma Scale (GCS) is also a severity tendency model [20,21]. Recently, some researchers have simplified the GCS for outcome evaluation, and the simplified models predict mortality as well as the full GCS [22]. These models focus on the tendency of mortality and are pure scoring systems rather than probability systems. Other models were constructed based on probabilities and statistical methodologies; APACHE II and SAPS II are the two most popular [4,5,7,23]. These models are constructed with logistic regression and use probabilities to describe mortality. The 2nd version of the Mortality Probability Model (MPM II) is also an intensive care unit (ICU) outcome prediction model based on probabilities [9], and new versions of these models have been constructed in recent studies [5,6,8]. Recently, researchers have constructed mortality prediction models using artificial intelligence technologies [1,17,23-25,50]. These models can be added to health information systems as decision support systems for intensive care facilities and improve the quality of medical care and facility allocation.
There are some important characteristics of medical data to consider. One of the most important is the twin peaks phenomenon. For example, the normal range of systolic blood pressure (SBP) is 90 mmHg to 140 mmHg: hypertension is defined as an SBP greater than 140 mmHg, and patients are defined as hypotensive if their SBP is less than 90 mmHg [26,27]. Most laboratory and physiological variables have such a normal range, and a value within that range indicates a good result; patients with extreme values, on either side of the range, fare worse than patients whose values are within the normal range. Therefore, if we want to predict the tendency of mortality using these data, the twin peaks phenomenon needs to be handled. APACHE II and SAPS II handle this problem by ranking the raw data and summing them into one score that expresses the tendency or probability of mortality [4,7]. Although many researchers have attempted to improve the accuracy of these mortality models, the ranking definitions are still constructed subjectively.
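As a purely illustrative sketch of how a ranked scoring system of this kind folds the two peaks into a single severity score, the following function assigns points to systolic blood pressure on both sides of the normal range. The thresholds and point values are invented for illustration and are not the published APACHE II or SAPS II definitions.

```python
def sbp_points(sbp_mmhg: float) -> int:
    """Illustrative two-sided severity points for systolic blood pressure.

    Values far below or far above the normal range (90-140 mmHg) both receive
    higher points, so a single score captures the twin peaks phenomenon.
    The cut-offs below are examples only, not the published APACHE II ranges.
    """
    if sbp_mmhg < 70 or sbp_mmhg >= 200:
        return 4
    if sbp_mmhg < 90 or sbp_mmhg >= 160:
        return 2
    return 0  # within (or near) the normal range

print(sbp_points(65), sbp_points(120), sbp_points(210))  # 4 0 4
```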
Researchers have solved medical problems with artificial intelligence technologies, which are usually referred to as classification methodologies. Among these, logistic regression (LR) is one of the most popular: it is a statistical method that builds a logical classification tool. Many studies have used artificial neural networks (ANNs) as the classification method for medical problems, including mortality prediction for intensive care unit patients [23,28,29]. Muniz et al. [30] evaluated the effect of subthalamic stimulation in Parkinson disease with probabilistic ANNs, Support Vector Machines (SVMs) and logistic regression models, and concluded that the ANNs were better than the other models in their study. The accuracy of ANNs is influenced not only by the numbers of input and hidden nodes: large-scale data are also required to train the models [23]. Unfortunately, the logic of the hidden layer cannot be explained.
Decision trees are another kind of classification technology and present their logic more clearly than ANNs. These methods use statistical and/or entropy differences to find the best decision nodes and trees [31]. The resulting models can be applied to decision support systems and make data easy to manage for problem-solving [31,32]. Ting et al. [31] modified the Alvarado scoring system using C5.0, constructed a better decision model for acute appendicitis diagnosis, and improved the misdiagnosis rate. Abu-Hanna et al. [24] combined decision trees with logistic regression and improved the evaluation power of intensive care prognosis. However, decision tree methods have difficulty managing the twin peaks phenomenon, one of the characteristics of health and medical data.
The Support Vector Machine (SVM) was first proposed by Vapnik in 1995 [33]. It is an information technology used for classification problems. The SVM uses both statistical learning and structural risk minimization (SRM) to find an optimal separating hyperplane that can separate different outcome classes in a multi-dimensional space [33-36]. It has been applied to many problems in different fields, including text categorization, image recognition, face detection, voice recognition, genetic classification and medical diagnostic problems [34,37-40]. Zhu et al. [41] constructed SVM-based classifiers that performed better at evaluating whether pulmonary nodules found on computed tomography are malignant. Yamamoto et al. [40] also used SVM technology to correctly identify possible multiple sclerosis lesions in brain magnetic resonance images. Verplancke et al. [25] constructed a novel mortality prediction model for patients with haematological malignancies using an SVM, which was better than logistic regression. The SVM is a good classification method and has improved many classification problems in medical fields.
Based on the fitness of its kernel distributions, the SVM is one of the classification models that can manage this problem. In this study, a new mortality prediction model using SVM technology was constructed for patients admitted to the ICU.
Methodology
The SVM model proposed by Vapnik [33] is an effective classification method used in many different fields, including text categorization, image recognition, face detection, voice recognition, genetic classification and medical diagnostic problems [34,37-40]. It uses both statistical learning and structural risk minimization (SRM) to find an optimal separating hyperplane that can separate different outcome classes in a multi-dimensional space [34-36]. Proper parameter selection can improve the classification accuracy of the SVM model. Some concepts of support vector machines are described below. Given training data (xi, yi), i = 1, ..., n, with yi ∈ {1, −1}, the SVM requires the solution of the following optimization problem [42]:

min over (w, b, ξ) of (1/2) wᵀw + C Σi ξi, subject to yi(wᵀΦ(xi) + b) ≥ 1 − ξi and ξi ≥ 0,

where the ξi are slack variables used to tolerate training points that cannot be separated linearly (Figure 1).
Figure 1: Sets are not linearly separable.
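To make the objective above concrete, the following short sketch (an illustration only, assuming a plain linear decision function and toy data rather than anything from the study) evaluates (1/2)wᵀw + C Σi ξi for a candidate hyperplane, with each slack computed as ξi = max(0, 1 − yi(wᵀxi + b)).

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Evaluate (1/2)||w||^2 + C * sum(xi_i) for a candidate hyperplane (w, b).

    X: (n, d) training vectors; y: labels in {+1, -1}; C: penalty parameter.
    The slack xi_i = max(0, 1 - y_i * (w.x_i + b)) measures how far point i
    falls on the wrong side of its margin.
    """
    margins = y * (X @ w + b)
    slacks = np.maximum(0.0, 1.0 - margins)
    return 0.5 * float(w @ w) + C * float(slacks.sum())

# Toy one-dimensional, non-separable example: the middle point needs slack.
X = np.array([[0.0], [0.5], [2.0]])
y = np.array([-1.0, -1.0, 1.0])
print(soft_margin_objective(np.array([1.0]), -1.0, X, y, C=1.0))  # 1.0
```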
An important factor in improving the accuracy of an SVM model is the penalty parameter C. The parameter C represents the degree of punishment for misclassified points and adjusts the model [43]; proper tuning of C is crucial [44]. Another important factor for improving SVM accuracy is kernel function mapping. Kernel functions (Φ) are usually used to map xi into a higher dimensional space, in which the SVM finds the linear separating hyperplane with the maximal margin (Figure 2).
Figure 2: Kernel functions mapping to feature space.
The training vectors of the SVM are mapped into a higher dimensional feature space using kernel functions, and a linear separating hyperplane with the maximal margin is obtained there [11,34-36]. Selecting the proper kernel is therefore very important: different kernel functions have different surfaces and distributions, and this influences the accuracy of the SVM. Many kernels can be chosen from. The four most popular SVM kernels, which are used in many fields, are as follows [42,43] (an illustrative sketch of these kernels is given after the list):
- Gaussian radial basis function (RBF): K(xi, xj) = exp(−γ‖xi − xj‖²), γ > 0.
- Linear kernel: K(xi, xj) = xiᵀxj.
- Polynomial of degree d: K(xi, xj) = (γxiᵀxj + r)^d, γ > 0.
- Sigmoid kernel: K(xi, xj) = tanh(γxiᵀxj + r).
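The illustrative sketch below simply writes each of the four kernels as a function of two vectors xi and xj, following the definitions above; γ, r and d are the kernel hyperparameters. This is not code from the original study.

```python
import numpy as np

def rbf_kernel(xi, xj, gamma):
    # Gaussian radial basis function: exp(-gamma * ||xi - xj||^2), gamma > 0
    return float(np.exp(-gamma * np.sum((xi - xj) ** 2)))

def linear_kernel(xi, xj):
    # Linear kernel: xi' xj
    return float(np.dot(xi, xj))

def polynomial_kernel(xi, xj, gamma, r, d):
    # Polynomial of degree d: (gamma * xi' xj + r)^d, gamma > 0
    return (gamma * float(np.dot(xi, xj)) + r) ** d

def sigmoid_kernel(xi, xj, gamma, r):
    # Sigmoid kernel: tanh(gamma * xi' xj + r)
    return float(np.tanh(gamma * float(np.dot(xi, xj)) + r))

xi, xj = np.array([1.0, 2.0]), np.array([2.0, 0.5])
print(rbf_kernel(xi, xj, gamma=0.1), polynomial_kernel(xi, xj, gamma=0.5, r=1.0, d=3))
```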
This study used RapidMiner version 5.1.001, a free data-mining software package [45]. Different kernels were first tried on the mortality prediction data set to choose the best one, since every kernel has its own specialty for classification. Researchers have concluded that the RBF kernel is better than other kernels for classification: it can handle situations in which the relationship between class labels and attributes is nonlinear, it has fewer hyperparameters than the polynomial kernel, and it is simpler and faster than the others [42]. This study therefore used the RBF kernel, for which two factors, C (the cost parameter) and γ (the kernel parameter), must be selected. A 10-fold cross-validation procedure was also used to prevent over-fitting: each subset was tested using the classifier trained on the remaining nine subsets sequentially.
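The original model was built in RapidMiner 5.1.001. As a rough equivalent only, the sketch below shows how the same procedure (an RBF-kernel SVM, a grid of candidate C and γ values, and stratified 10-fold cross-validation) could be expressed with scikit-learn in Python; the data arrays are random placeholders, not the study data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data standing in for the 695-patient training set: X holds the
# patient attributes and y the outcome (1 = dead, 0 = alive).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)

# RBF-kernel SVM; the two factors to select are C (cost) and gamma (kernel width).
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.001, 0.01, 0.1, 1]}

# 10-fold cross-validation: each fold is tested on the classifier trained on
# the remaining nine folds, which guards against over-fitting.
search = GridSearchCV(pipeline, param_grid,
                      cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
                      scoring="roc_auc")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```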
The receiver operating characteristic (ROC) curves and the areas under the ROC curves (AUR) of the new model, SAPS II and APACHE II were calculated and compared. The Wilcoxon signed rank test and the AUR were used to compare the accuracy of the mortality prediction models. The Wilcoxon signed rank test was performed using SPSS version 12.0 (SPSS Inc., Chicago, IL, USA), and the significance of differences between the ROC curves was assessed using MedCalc version 9.3.8.0 (MedCalc Software, Mariakerke, Belgium). Significant differences were defined as p < 0.05.
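For illustration, the AUR values themselves can be computed as in the sketch below, assuming the outcome labels and each model's predicted mortality probabilities are available as arrays (the arrays here are short placeholders, not the study data); in the original study the pairwise comparison of the ROC curves was performed in MedCalc rather than in code like this.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder arrays: y is the outcome (1 = dead, 0 = alive) and each p_*
# holds one model's predicted mortality probability per patient.
y = np.array([0, 0, 1, 1, 0, 1])
p_svm = np.array([0.10, 0.20, 0.85, 0.90, 0.30, 0.70])
p_saps2 = np.array([0.15, 0.40, 0.60, 0.80, 0.35, 0.55])
p_apache2 = np.array([0.20, 0.30, 0.70, 0.75, 0.45, 0.50])

for name, p in [("SVM", p_svm), ("SAPS II", p_saps2), ("APACHE II", p_apache2)]:
    print(name, "AUR =", round(roc_auc_score(y, p), 3))
```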
Experiment Design
The structured and interactive CRISP-DM (Cross Industry Standard Process for Data Mining) approach provides guidelines for planning a data mining project [51]. This approach consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment. The six phases are simplified and illustrated in Figure 3.
Figure 3: Simplified CRISP-DM approach.
Business understanding
Intensive care is very important in modern health care, and mortality prediction models are used as outcome predictors for intensive care and resource allocation. Many researchers continue to use information technologies to construct better mortality prediction models. This study used the Support Vector Machine (SVM) to construct an innovative mortality prediction model.
Data understanding and data preparation
This study retrospectively collected training data from 695 patients (230 women and 465 men) who were admitted to the surgical ICU of a 600-bed hospital (12-bed surgical ICU) from January 1, 2005 to December 31, 2006. All patient data were collected confidentially using randomized codes. The average patient age was 57.26 (SD = 20.41) years. Because these cases were used as training data, the attributes of both APACHE II and SAPS II were collected without missing data. The mean scores of APACHE II and SAPS II were 13.6 (SD = 7.8) and 37.8 (SD = 16.8), and the mean predicted mortality probabilities of APACHE II and SAPS II were 0.2456 (SD = 0.2003) and 0.2698 (SD = 0.2618) (Table 1). Age, type of ICU admission (medical disease, elective or emergency surgery) and the Glasgow coma scale (GCS) were all included in the attributes. Patients were defined as alive if they were discharged alive or stayed in the hospital for at least 30 days, and as dead if they died before being discharged from the hospital. The mean scores and mortality probabilities of the dead group were significantly higher than those of the alive group (p < 0.001) for both the APACHE II and SAPS II prediction systems (Table 1). A new model was constructed containing the recorded data of all of the patients.
Variable | Alive (SD) | Dead (SD) | Total (SD) | p value
APACHE II*** | 11.0 (5.6) | 22.6 (7.2) | 13.6 (7.8) | 0.000
Probability (APACHE II)*** | 0.1763 (0.1313) | 0.4828 (0.2142) | 0.2456 (0.2003) | 0.000
SAPS II*** | 32.2 (12.1) | 57.2 (16.3) | 37.8 (16.8) | 0.000
Probability (SAPS II)*** | 0.1779 (0.1739) | 0.5845 (0.2687) | 0.2698 (0.2618) | 0.000
*** p<0.001
Table 1: Scores and predicted probabilities of APACHE II and SAPS II.
Among the 695 patients from whom training data were collected, 538 (77.41%) were alive and 157 (22.59%) were dead. Of these patients, 238 (34.24%) did not undergo surgery, 153 (22.01%) underwent emergency surgery, and 304 (43.74%) underwent elective surgery. Patients who underwent emergency surgery died more frequently than those who underwent elective surgery, and there was a significant difference in surgical status between the alive group and the dead group (p < 0.001) (Table 2).
Surgical status | Alive | Dead | Sum
No operation | 144 | 94 | 238
Elective surgery | 271 | 33 | 304
Emergency surgery | 123 | 30 | 153
Sum | 538 | 157 | 695
Table 2: Demographic data of surgical status.
Eighteen variables related to the patients' laboratory and physiological data were collected. Twelve variables differed significantly between the dead and alive groups: systolic and diastolic blood pressure, heart rate, urine output, sodium, blood urea nitrogen (BUN), creatinine, blood sugar, haematocrit, arterial pH, bicarbonate, and the Glasgow coma scale (GCS), which comprises three components. The other six variables did not differ significantly between the groups. The standard deviations (SDs) of the variables in the dead patients were all larger than those in the alive patients, with the exception of age, whose SD did not differ between the two groups. This means that the data variation of the dead patients was larger than that of the alive patients (Table 3). The demographic data of the GCS are shown in Table 4; the GCS differed significantly between the alive and dead patients (p < 0.001) (Table 4).
Variable | Alive (SD) | Dead (SD) | Total (SD) | p value
Temperature (°C) | 36.60 (1.46) | 36.21 (2.58) | 36.52 (1.78) | 0.070
Systolic BP (mmHg)* | 140.8 (30.4) | 131.6 (49.0) | 138.7 (35.6) | 0.027
Diastolic BP (mmHg)** | 76.9 (19.6) | 70.6 (24.7) | 75.5 (21.0) | 0.004
Heart rate (per min)*** | 96.8 (20.6) | 113.1 (32.6) | 100.5 (24.8) | 0.000
Breath rate (per min) | 19.7 (5.5) | 19.6 (7.0) | 19.7 (5.8) | 0.839
Urine output (c.c.)*** | 2780 (2833) | 5449 (8151) | 3383 (4732) | 0.000
Bilirubin (mg/dL) | 0.44 (1.47) | 0.48 (1.53) | 0.45 (1.48) | 0.743
Sodium (mEq/L)*** | 140.6 (5.9) | 145.7 (14.9) | 141.8 (9.0) | 0.000
Potassium (mEq/L) | 3.80 (0.64) | 3.86 (1.00) | 3.81 (0.74) | 0.487
BUN (mg/dL)*** | 15.68 (12.28) | 26.34 (27.82) | 18.09 (17.62) | 0.000
Creatinine (mg/dL)*** | 1.1 (0.9) | 1.6 (1.5) | 1.2 (1.1) | 0.000
Sugar (mg/dL)*** | 142.0 (63.8) | 168.6 (96.3) | 148.0 (73.2) | 0.001
Hematocrit (%)*** | 34.0 (6.0) | 30.8 (7.4) | 33.3 (6.5) | 0.000
WBC (/µL) | 11126 (4961) | 11757 (6138) | 11268 (5252) | 0.239
Arterial pH** | 7.408 (0.070) | 7.367 (0.156) | 7.398 (0.098) | 0.002
HCO3 (mEq/L)*** | 23.37 (4.07) | 20.52 (5.70) | 22.73 (4.64) | 0.000
PaO2 (mmHg) | 144.3 (79.6) | 153.4 (103.9) | 146.3 (85.7) | 0.311
Age (years)* | 56.43 (20.62) | 60.10 (19.46) | 57.26 (20.41) | 0.042
* p<0.05; ** p<0.01; *** p<0.001
BUN: blood urea nitrogen; WBC: white blood cell; HCO3: bicarbonate anion; PaO2: partial pressure of oxygen in arterial blood
Table 3: Attributes of the training data.
Score | GCSE Alive | GCSE Dead | GCSV Alive | GCSV Dead | GCSM Alive | GCSM Dead
1 | 69 | 103 | 214 | 133 | 34 | 65
2 | 39 | 19 | 16 | 4 | 16 | 29
3 | 132 | 10 | 15 | 1 | 13 | 7
4 | 298 | 25 | 37 | 3 | 32 | 13
5 | X | X | 256 | 16 | 115 | 21
6 | X | X | X | X | 328 | 22
GCSE: the eye component of the GCS; GCSV: the verbal component of the GCS; GCSM: the motor component of the GCS
Table 4: Demographic data of the GCS.
Modeling, evaluation and deployment
This study selected the Gaussian RBF kernel with LibSVM in RapidMiner 5.1.001 to build a mortality prediction model with the empirical data. All variables were included in this model. The most important factor was BUN (13.08%); the second most important factor was the motor component of the GCS (GCSM) (11.18%); the third was surgical status (10.77%); and the fourth was sodium (10.27%). Each of these four factors had a weight of more than 10%. Temperature, WBC and gender were not as important in this model; their weights were all less than 0.005 (Figure 4).
Figure 4: Weights of the SVM model with the RBF kernel.
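The attribute weights reported above were produced in RapidMiner. As a hedged, substitute illustration only (permutation importance is not the procedure used in the study), the sketch below shows one way to obtain a comparable ranking of attributes from a fitted RBF-kernel SVM; the data are random placeholders.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: X holds encoded patient attributes, y the outcome
# (1 = dead, 0 = alive). These stand in for the study variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 6))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=120) > 0).astype(int)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X, y)

# Permutation importance: how much accuracy drops when one attribute is shuffled.
result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
print(result.importances_mean)  # higher values = attribute matters more to the model
```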
The precision rate, recall rate and F-measure of the SVM model were 0.899, 0.902 and 0.899, respectively. The area under the ROC curve (AUR) of the SVM model was 0.932, while the AURs of SAPS II and APACHE II were 0.885 and 0.883, respectively. The new model was better than SAPS II and APACHE II (p < 0.01) (Figure 5).
Figure 5: ROC curves of the APACHE II, SAPS II and SVM models.
Conclusion
The meaning of "within normal range" is that the data are normal and indicate good health. Patients with extreme laboratory or physiological values are in worse condition than patients whose values are within the normal range. Medical data therefore have a special characteristic, the twin peaks phenomenon. For example, the normal range of systolic blood pressure (SBP) is 90 mmHg to 140 mmHg: hypertension is defined as an SBP greater than 140 mmHg, and patients with an SBP below 90 mmHg are defined as hypotensive [26,27]. These definitions reflect the twin peaks phenomenon, and most laboratory and physiological data have such a normal range. This phenomenon is very important when managing and mining medical data. The APACHE and SAPS systems consider the twin peaks phenomenon of medical data, and new versions of these models have been constructed [4-6,8]. Unfortunately, they are built on subjective experience and classifications, and these methods may lose some information after transformation. Chan and Ting [1] constructed a novel mortality prediction model using linear regression, a genetic algorithm and Bayesian theory; their model handled medical data with the twin peaks phenomenon in mind, but the cut-off points between normal and abnormal data were still obtained manually. The present research used the Support Vector Machine (SVM) to construct a new mortality prediction model. The SVM manages raw medical data with their own characteristics automatically and easily, and may find the optimal cut-off points and weightings in these medical data automatically.
Most researchers have solved medical problems with computer aids and decision support systems (DSS). The electronic medical record offers good resources of medical data, which is an advantage for DSS construction [46]. If the information infrastructure is available, much of the effort of collecting these data can be saved. In turn, these new DSS are crucial components of electronic medical record or electronic chart systems and add further value to them. Some models can also prevent medication errors and improve the quality of health care [47]. Gupta [48] constructed an electronic clinical decision support system to reduce the risk of epidural hematoma in patients with a bleeding tendency. This study constructed a novel mortality prediction model with medical records. In modern research, the new versions of prediction models are more accurate but more complicated; the APACHE mortality prediction model is a good example. APACHE IV collects 142 variables (including 115 disease groups), whereas APACHE II collected only 12 variables and some chronic disease items [6]. The cost of information collection has therefore increased for the newer prediction models. Instead of using the variables of a new version such as APACHE IV, this study used the variables of the older version and constructed the model with an SVM. Although the algorithm is complicated, the model is more accurate than the previous models and is much more convenient for medical data collection than the newer versions. Chan and Ting [1] constructed a new mortality prediction model with a genetic algorithm. The SVM model also finds optimal weights by re-ranking the importance of the variables. Unlike the GA model, the SVM presents a better way of finding the optimal weights and may explain the relationship between the model and the variables.
The standard deviations of the variables in the dead patients were all larger than those in the alive patients, with the exception of the Glasgow coma scale (GCS) and age. The characteristics of the GCS and age are different because these variables are correlated with the probability of mortality in one direction only, rather than showing a twin peaks distribution. The twin peaks distribution in the dead group causes the SDs of the variables to be larger than those in the alive group, so the standard deviation is an indicator of whether a variable has a twin peaks distribution. Kernel fitness influences the results of classification, so selecting an optimal kernel for medical problems is very important. Many researchers have modified kernels to improve the classification power of the SVM [43,49]. Some SVM classification systems solve problems using the Gaussian RBF kernel, which can address many problems, including medical problems [38]. This study found that the Gaussian RBF kernel is better than the other kernels examined, owing to the fit between the distribution of the Gaussian RBF kernel and the medical data. The Gaussian RBF kernel may be chosen if the data show the twin peaks phenomenon. INTcare is an Intelligent Decision Support System (IDSS) for intensive care medicine [52]; we will apply our model to develop a system like INTcare in the future.
References
- Chan CL, Ting HW. Constructing a novel mortality prediction model with Bayes theorem and genetic algorithm. Expert Syst Appl. 2011; 38: 7924-7928.
- Fueglistaler P, Amsler F, Schuepp M, Fueglistaler-Montali I, Attenberger C, Pargger H, et al. Prognostic value of Sequential Organ Failure Assessment and Simplified Acute Physiology II Score compared with trauma scores in the outcome of multiple-trauma patients. Am J Surg. 2010; 200: 204-214.
- Abu-Hanna A, Lucas PJ. Prognostic models in medicine. AI and statistical approaches. Methods Inf Med. 2001; 40: 1-5.
- Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985; 13: 818-829.
- Knaus WA. APACHE 1978-2001: the development of a quality assurance system based on prognosis: milestones and personal reflections. Arch Surg. 2002; 137: 37-41.
- Zimmerman JE, Kramer AA, McNair DS, Malila FM. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med. 2006; 34: 1297-1310.
- Le Gall JR, Lemeshow S, Saulnier F. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. JAMA. 1993; 270: 2957-2963.
- Ledoux D, Canivet JL, Preiser JC, Lefrancq J, Damas P. SAPS 3 admission score: an external validation in a general intensive care population. Intensive Care Med. 2008; 34: 1873-1877.
- Lemeshow S, Teres D, Klar J, Avrunin JS, Gehlbach SH, Rapoport J, et al. Mortality Probability Models (MPM II) based on an international cohort of intensive care unit patients. JAMA. 1993; 270: 2478-2486.
- Higgins T, Teres D, Copes W, Nathanson B, Stark M, Kramer A. Assessing contemporary intensive care unit outcome: An updated Mortality Probability Admission Model (MPM0-III). Critical Care Medicine. 2007; 35: 827-835.
- Ohno-Machado L, Resnic FS, Matheny ME. Prognosis in critical care. Annu Rev Biomed Eng. 2006; 8: 567-599.
- Peres Bota D, Melot C, Lopes Ferreira F, Nguyen Ba V, Vincent JL. The Multiple Organ Dysfunction Score (MODS) versus the Sequential Organ Failure Assessment (SOFA) score in outcome prediction. Intensive Care Med. 2002; 28: 1619-1624.
- Troskot R, Šimurina T, Žižak M, Majstorović K, Marinac I, Mrakovćić-Šutić I. Prognostic value of venoarterial carbon dioxide gradient in patients with severe sepsis and septic shock. Croat Med J. 2010; 51: 501-508.
- Wang H, Ye L, Yu L, Xie G, Cheng B, Liu X, et al. Performance of sequential organ failure assessment, logistic organ dysfunction and multiple organ dysfunction score in severe sepsis within Chinese intensive care units. Anaesth Intensive Care. 2011; 39: 55-60.
- Gortzis LG, Sakellaropoulos F, Ilias I, Stamoulis K, Dimopoulou I. Predicting ICU survival: a meta-level approach. BMC Health Serv Res. 2008; 8: 157.
- Strand K, Flaatten H. Severity scoring in the ICU: a review. Acta Anaesthesiol Scand. 2008; 52: 467-478.
- Toma T, Abu-Hanna A, Bosman RJ. Discovery and integration of univariate patterns from daily individual organ-failure scores for intensive care mortality prediction. Artif Intell Med. 2008; 43: 47-60.
- McCabe W, Jackson G. Gram negative bacteremia: I. etiology and ecology. Arch Internal Medicine. 1962; 110: 847-855.
- Cullen D, Cirvetta J, Briggs B, Ferrara L. Therapeutic intervention scoring system: A method for quantitative comparison of patient care. Crit Care Med. 1974; 2: 57-60.
- Ting HW, Chen MS, Hsieh YC, Chan CL. Good mortality prediction by Glasgow Coma Scale for neurosurgical patients. J Chin Med Assoc. 2010; 73: 139-143.
- Jennet B, Teasdale G, Braakman R, Minderhoud J, Heiden J, Kurze T. Prognosis of patients with severe head injury. Neurosurgery. 1979; 4: 283-289.
- Gill M, Windemuth R, Steele R, Green SM. A comparison of the Glasgow Coma Scale score to simplified alternative scores for the prediction of traumatic brain injury outcomes. Ann Emerg Med. 2005; 45: 37-42.
- Silva A, Cortez P, Santos MF, Gomes L, Neves J. Mortality assessment in intensive care units via adverse events using artificial neural networks. Artif Intell Med. 2006; 36: 223-234.
- Abu-Hanna A, de Keizer N. Integrating classification trees with local logistic regression in Intensive Care prognosis. Artif Intell Med. 2003; 29: 5-23.
- Verplancke T, Van Looy S, Benoit D, Vansteelandt S, Depuydt P, De Turck F, et al. Support vector machine versus logistic regression modeling for prediction of hospital mortality in critically ill patients with haematological malignancies. BMC Med Inform Decis Mak. 2008; 8: 56.
- Brott T, Thalinger K, Hertzberg V. Hypertension as a risk factor for spontaneous intracerebral hemorrhage. Stroke. 1986; 17: 1078-1083.
- Bruns B, Gentilello L, Elliott A, Shafi S. Prehospital hypotension redefined. J Trauma. 2008; 65: 1217-1221.
- Pearl A, Bar-Or R, Bar-Or D. An artificial neural network derived trauma outcome prediction score as an aid to triage for non-clinicians. Stud Health Technol Inform. 2008; 136: 253-258.
- Pearl A, Caspi R, Bar-Or D. Artificial neural network versus subjective scoring in predicting mortality in trauma patients. Stud Health Technol Inform. 2006; 124: 1019-1024.
- Muniz AM, Liu H, Lyons KE, Pahwa R, Liu W, Nobre FF, et al. Comparison among probabilistic neural network, support vector machine and logistic regression for evaluating the effect of subthalamic stimulation in Parkinson disease on ground reaction force during gait. J Biomech. 2010; 43: 720-726.
- Ting HW, Wu JT, Chan CL, Lin SL, Chen MH. Decision model for acute appendicitis treatment with decision tree technology--a modification of the Alvarado scoring system. J Chin Med Assoc. 2010; 73: 401-406.
- Trujillano J, Badia M, Serviá L, March J, Rodriguez-Pozo A. Stratification of the severity of critically ill patients with classification trees. BMC Med Res Methodol. 2009; 9: 83.
- Vapnik VN. The nature of statistical learning theory, 2 edn. Springer. New York. 1999.
- Matheny ME, Resnic FS, Arora N, Ohno-Machado L. Effects of SVM parameter optimization on discrimination and calibration for post-procedural PCI mortality. J Biomed Inform. 2007; 40: 688-697.
- Mavroforakis ME, Georgiou HV, Dimitropoulos N, Cavouras D, Theodoridis S. Mammographic masses characterization based on localized texture and dataset fractal analysis using linear, neural and support vector machine classifiers. Artif Intell Med. 2006; 37: 145-162.
- Nilsson J, Ohlsson M, Thulin L, Höglund P, Nashef SA, Brandt J. Risk factor identification and mortality prediction in cardiac surgery using artificial neural networks. J Thorac Cardiovasc Surg. 2006; 132: 12-19.
- Takeuchi K, Collier N. Bio-medical entity extraction using support vector machines. Artif Intell Med. 2005; 33: 125-137.
- Zhang XP, Wang ZL, Tang L, Sun YS, Cao K, Gao Y. Support vector machine model for diagnosis of lymph node metastasis in gastric cancer with multidetector computed tomography: a preliminary study. BMC Cancer. 2011; 11: 10.
- Roshan U, Chikkagoudar S, Wei Z, Wang K, Hakonarson H. Ranking causal variants and associated regions in genome-wide association studies by the support vector machine and random forest. Nucleic Acids Res. 2011; 39: e62.
- Yamamoto D, Arimura H, Kakeda S, Magome T, Yamashita Y, Toyofuku F, et al. Computer-aided detection of multiple sclerosis lesions in brain magnetic resonance images: False positive reduction scheme consisted of rule-based, level set method, and support vector machine. Comput Med Imaging Graph. 2010; 34: 404-413.
- Zhu Y, Tan Y, Hua Y, Wang M, Zhang G, Zhang J. Feature selection and performance evaluation of support vector machine (SVM)-based classifier for differentiating benign and malignant pulmonary nodules by computed tomography. J Digit Imaging. 2010; 23: 51-65.
- Hsu CW, Chang CC, Lin CJ. A Practical Guide to Support Vector Classification. 2014.
- Huang CL, Wang CJ. A GA-based feature selection and parameters optimization for support vector machines. Expert Syst Appl. 2006; 31: 231-240.
- Yuan R, Li Z, Guan X, Xu L. An SVM-based machine learning method for accurate internet traffic classification. Inf Syst Front. 2008; 12: 149-156.
- Mierswa I, Klinkenberg R. RapidMiner. 2014.
- Lakshminarayan K, Rostambeigi N, Fuller CC, Peacock JM, Tsai AW. Impact of an electronic medical record-based clinical decision support tool for dysphagia screening on care quality. Stroke. 2012; 43: 3399-3401.
- Parsons A, McCullough C, Wang J, Shih S. Validity of electronic health record-derived quality measurement for performance monitoring. J Am Med Inform Assoc. 2012; 19: 604-609.
- Gupta RK. Using an electronic clinical decision support system to reduce the risk of epidural hematoma. Am J Ther. 2014; 21: 327-330.
- Smith GF, Jordan EM. Improved SVM Regression using Mixtures of Kernels. International Joint Conference on Neural Networks. 2002; 2785-2790.
- Kim S, Kim W, Park RW. A Comparison of Intensive Care Unit Mortality Prediction Models through the Use of Data Mining Techniques. Healthc Inform Res. 2011; 17: 232-243.
- Portela F, Santos MF, Silva A, Rua F, Abelha A, Machado J. Preventing patient cardiac arrhythmias by using Data Mining Techniques. 2014.
- Gago P, Santos MF, Silva A, Cortez P, Neves J, Gomes L. INTCare: a knowledge discovery based intelligent decision support system for intensive care medicine. J Decision Systems. 2005; 14: 241-259.