Seongbin Lee, Seunghee Lee, Duhyeuk Chang, Mi-Hwa Song, Jong-Yeup Kim and Suehyun Lee
Explainable Machine Learning Based a Packed Red Blood Cell Transfusion Prediction and Evaluation for Major Internal Medical Condition
Abstract: Efficient use of limited blood products is becoming very important in terms of socioeconomic status and patient recovery. To predict the appropriateness of patient-specific transfusions for intensive care unit (ICU) patients who require real-time monitoring, we evaluated a model that dynamically predicts the possibility of transfusion using the Medical Information Mart for Intensive Care III (MIMIC-III), an open database of ICU admission records from the Beth Israel Deaconess Medical Center. In this study, we developed an explainable machine learning model to predict the possibility of red blood cell transfusion for major medical diseases in the ICU. Target disease groups that received packed red blood cell transfusions at high frequency were selected, and 16,222 patients were finally extracted. The prediction model achieved an area under the ROC curve of 0.9070 and an F1-score of 0.8166 (LightGBM). To explain the predictions of the machine learning model, feature importance analysis and a partial dependence plot were used. The results of our study can be used as basic data for recommendations related to the adequacy of blood transfusions and are expected to ultimately contribute to the recovery of patients and prevention of excessive consumption of blood products.
Keywords: Explainable AI, Feature Importance Analysis, LightGBM, Partial Dependence Plot, Prediction, Transfusion
Blood products remain an important part of modern medicine and are valuable resources for saving the lives of patients. However, unlike industrial products manufactured in factories, the supply of blood products is solely reliant on blood donations, and the main challenge associated with the use of these products is an imbalance in blood supply and demand. In South Korea, blood management is changing rapidly because of the combination of a low birth rate, rapid aging, and an increase in the number of severely ill patients who require most of the blood supply. Therefore, the efficient use of limited blood products is becoming very important in terms of socioeconomic status and patient recovery.
Although objective standards and indicators are required to increase the adequacy of transfusions, evaluation of the adequacy of a patient-specific blood transfusion cannot rely entirely on test results and should take the complex clinical situation of the patient into consideration. However, the existing calculation methods, such as the maximum surgical blood order schedule, show limitations in properly reflecting the patient’s clinical situation, and appropriate predictions in the intensive care unit (ICU), where the clinical condition of patients changes every second, are especially difficult to make. For example, gastrointestinal bleeding is a major medical problem requiring transfusion in the ICU. However, the existing static methods tend to evaluate the need for transfusion solely based on the examination at admission. For this reason, in recent years, dynamic prediction models using machine learning have shown better performance than static models.
Previous studies in this regard have mainly dealt with the prediction of red blood cell transfusions in certain surgeries [5-7], and research on blood transfusion during hospitalization, apart from that during surgery, is limited. Moreover, ICU transfusion studies are limited to specific diseases such as gastrointestinal bleeding [8,9]. However, many other diseases require red blood cell transfusion in the ICU. The ICU contains patients prone to the worst clinical conditions, including end-organ damage. Timely transfusion is important to improve patients’ status since it generally increases blood perfusion to the organs [11,12].
In this study, we developed an explainable machine learning model to predict the possibility of red blood cell transfusion for major medical diseases in the ICU. Determining whether a patient needs a blood transfusion can be difficult, because the decision depends on various clinical characteristics. We consider a patient’s diverse clinical attributes to provide a basis to support a physician’s decision on whether to transfuse, and expect this to have a beneficial socioeconomic impact on patient health and blood supply. These findings are expected to help patients recover and to support the healthcare system in managing blood products more efficiently.
In this study, we used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to extract transfusion prescription-related data for the major medical disease groups. Through this approach, we analyzed patterns by disease group and evaluated the transfusion possibility based on several machine learning models (Fig. 1).
MIMIC-III is an open dataset of ICU records accumulated from 2001 to 2012 at the Beth Israel Deaconess Medical Center. It is publicly available as a large open-source medical record database in PhysioNet [13,14]. It contains de-identified data for over 40,000 patients, including demographics, clinical data, laboratory data, and medications. The data include information for many patients over multiple hospital visits. The target disease group consisted of patients who received packed red blood cell transfusions. Patients who received a transfusion during the first 24 hours were labeled as target 1, and those who did not were labeled as target 0.
2.2.1 Extraction of the target disease group and feature set
To consider the transfusion patterns for major internal medical diseases from a broader perspective rather than for a specific disease or treatment, we first selected major disease groups for transfusion. The target disease group was selected based on the patients who were most frequently transfused. We considered patient demographics, laboratory data, and clinical data as features for model development, and the laboratory variables were selected according to the evidence considered when deciding on transfusion.
2.2.2 Redefining the recorded data time
To predict the possibility of transfusion during the first 24 hours after admission to the ICU, the timestamps of the clinical and laboratory data, as well as the date of birth, were re-expressed relative to the time when the patient was admitted to the ICU and monitoring began. This converted time is represented on a 0-24-hour scale composed of a total of six 4-hour time windows determined using the moving average method (Fig. 2).
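The conversion described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, `icu_intime` and `charttime` are assumed timestamp fields, and averaging the values that fall inside each window is only one possible reading of the moving average step.

```python
from datetime import datetime

WINDOW_HOURS = 4  # width of each window, as stated in the text
N_WINDOWS = 6     # six windows cover the first 24 hours


def window_index(icu_intime, charttime):
    """Return the 0-based 4-hour window of a measurement, or None if it
    falls outside the first 24 hours after ICU admission."""
    hours = (charttime - icu_intime).total_seconds() / 3600.0
    if hours < 0 or hours >= WINDOW_HOURS * N_WINDOWS:
        return None
    return int(hours // WINDOW_HOURS)


def window_means(icu_intime, measurements):
    """Average the values in each window (assumed aggregation).

    `measurements` is a list of (charttime, value) pairs. Windows without
    any measurement are reported as None, i.e. a missing value."""
    buckets = [[] for _ in range(N_WINDOWS)]
    for charttime, value in measurements:
        idx = window_index(icu_intime, charttime)
        if idx is not None:
            buckets[idx].append(value)
    return [sum(b) / len(b) if b else None for b in buckets]


# Made-up example values for one patient.
admit = datetime(2101, 3, 1, 8, 0)
hemoglobin = [
    (datetime(2101, 3, 1, 9, 30), 8.0),   # 1.5 h after admission -> window 0
    (datetime(2101, 3, 1, 11, 0), 9.0),   # 3.0 h -> window 0
    (datetime(2101, 3, 1, 13, 0), 10.0),  # 5.0 h -> window 1
    (datetime(2101, 3, 2, 9, 0), 11.0),   # 25 h -> outside the 24-hour scale
]
print(window_means(admit, hemoglobin))
```

Windows without a measurement stay empty here, which is one way missing values can arise before the handling described in the next subsection.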
2.2.3 Processing missing values and imbalanced data
Rows with ≥70% missing values accounted for 65% of the total data. To solve this problem, these rows were removed, and for rows with a smaller proportion of missing values, the average value for each patient was used to fill the empty values. After preprocessing, missing values accounted for an average of 10% of the data. The major features of our model, such as the hemoglobin and hematocrit values, had few missing values (7%) (Fig. 3).
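The two preprocessing steps above (dropping sparse rows, then filling the remaining gaps with a per-patient average) might look like the following sketch. The row layout, the feature names, and the helper functions are hypothetical, and filling each feature with that patient's own mean over the retained rows is an assumed interpretation of "the average value for each patient".

```python
MISSING_THRESHOLD = 0.70  # rows at or above this missing fraction are dropped
FEATURES = ["hemoglobin", "hematocrit", "heart_rate", "creatinine"]  # illustrative subset


def missing_fraction(row, features):
    """Fraction of the listed features that are None in this row."""
    return sum(row[f] is None for f in features) / len(features)


def drop_sparse_rows(rows, features):
    """Keep only rows whose missing fraction is below the threshold."""
    return [r for r in rows if missing_fraction(r, features) < MISSING_THRESHOLD]


def fill_with_patient_mean(rows, features):
    """Replace None with the patient's mean for that feature, when the
    patient has at least one observed value; otherwise leave it as None."""
    sums = {}
    for r in rows:
        for f in features:
            if r[f] is not None:
                total, count = sums.get((r["patient_id"], f), (0.0, 0))
                sums[(r["patient_id"], f)] = (total + r[f], count + 1)
    filled = []
    for r in rows:
        new = dict(r)
        for f in features:
            key = (r["patient_id"], f)
            if new[f] is None and key in sums:
                total, count = sums[key]
                new[f] = total / count
        filled.append(new)
    return filled


# Made-up rows: one dict per patient and time window.
rows = [
    {"patient_id": 1, "hemoglobin": 8.0, "hematocrit": 24.0, "heart_rate": None, "creatinine": 1.0},
    {"patient_id": 1, "hemoglobin": None, "hematocrit": 26.0, "heart_rate": 90.0, "creatinine": None},
    {"patient_id": 1, "hemoglobin": None, "hematocrit": None, "heart_rate": None, "creatinine": 1.2},
    {"patient_id": 2, "hemoglobin": 11.0, "hematocrit": None, "heart_rate": 80.0, "creatinine": 0.9},
]
clean = fill_with_patient_mean(drop_sparse_rows(rows, FEATURES), FEATURES)
for r in clean:
    print(r)
```

In this toy input the third row is 75% missing and is dropped before the means are computed, and the hematocrit of patient 2 stays empty because that patient has no observed value to average.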
After data preprocessing, the ratio of non-transfused (0) to transfused (1) records was approximately 11:1. Upsampling was applied to address this class imbalance and obtain a ratio of 1:1.
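As an illustration of the balancing step, the sketch below performs random oversampling with replacement until both classes have the same size. The text only states that an upsampling method was used, so the specific technique, the `transfused` label key, and the function name are assumptions.

```python
import random


def upsample_minority(rows, label_key="transfused", seed=0):
    """Randomly duplicate minority-class rows until the classes are 1:1.

    Random oversampling with replacement is an assumed choice; the study
    does not specify which upsampling variant was applied."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("cannot upsample an empty class")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Made-up data with the roughly 11:1 imbalance reported above.
rows = [{"hemoglobin": 12.0, "transfused": 0} for _ in range(11)]
rows.append({"hemoglobin": 7.5, "transfused": 1})

balanced = upsample_minority(rows)
n_pos = sum(r["transfused"] for r in balanced)
print(len(balanced), n_pos, len(balanced) - n_pos)
```

A common precaution with this kind of resampling is to apply it to the training split only, so that duplicated records do not appear in the test set.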
2.3 Model and Evaluation
In this study, the learning model was implemented using the XGBoost (max_depth=5, learning_rate=0.1, num_estimators=1000), LightGBM (max_depth=7, num_leaves=128, num_estimators=1000), RandomForest (max_depth=12, n_estimators=750, min_samples_split=2, min_samples_leaf=1), and LSTM (number of layers=2) models, and the area under the ROC curve (AUC) was used to evaluate the performance of the models. In addition, to secure the explainability of the machine learning model, we used feature importance analysis and partial dependence plots (PDPs). Feature importance assigns a score to the input features used in the model based on their usefulness in predicting a target variable. A PDP shows the marginal effect one or two features have on the predicted outcome of a machine learning model. It can show whether the relationship between the target and a feature is linear, monotonic, or more complex.
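To make the idea of a partial dependence curve concrete, the following sketch computes one by hand: for each grid value, the chosen feature is overwritten in every row and the model's predictions are averaged. The `toy_model` function is a hypothetical stand-in for a fitted classifier and is not the LightGBM model of this study; the data and coefficients are made up.

```python
import math


def toy_model(row):
    """Hypothetical stand-in for a fitted classifier.

    Returns a pseudo-probability of transfusion from two features using
    an arbitrary logistic function."""
    score = 0.9 * (9.0 - row["hemoglobin"]) + 0.02 * (row["heart_rate"] - 80.0)
    return 1.0 / (1.0 + math.exp(-score))


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


rows = [
    {"hemoglobin": 7.2, "heart_rate": 110.0},
    {"hemoglobin": 9.5, "heart_rate": 85.0},
    {"hemoglobin": 11.0, "heart_rate": 72.0},
]
grid = [6.0, 8.0, 10.0, 12.0]
curve = partial_dependence(toy_model, rows, "hemoglobin", grid)
for value, avg in zip(grid, curve):
    print(f"hemoglobin={value:4.1f}  mean predicted probability={avg:.3f}")
```

Because the other features keep their observed values while one feature is varied, the curve reflects the marginal effect of that feature averaged over the data, which is what a PDP visualizes.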
3.1 Results of Data Extraction
A total of 29 input features were used, namely three demographic characteristics (age, race, and sex), six clinical monitoring values (systolic and diastolic blood pressure, heart rate, respiratory rate, temperature, and oxygen saturation), and 20 laboratory values. The output class was coded as 1 if the patient received a blood transfusion and 0 otherwise.
The most frequent diagnoses among patients transfused with red blood cells were circulatory diseases such as cardiac dysfunction, infectious diseases such as sepsis, gastrointestinal diseases such as gastrointestinal bleeding and gastric ulcers, and respiratory conditions such as tracheostomy and ventilator-related treatments.
Data from 16,222 patients were finally extracted from the disease groups, and records with ≥70% missing values were excluded. In addition, outliers were removed using the interquartile range (IQR) method. As a result, 68,460 data points were retained. The number of data points for red blood cell transfusion during the first 24 hours was 5,818.
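The IQR-based outlier removal can be sketched as follows. The 1.5 × IQR fences are the conventional choice and are an assumption here, since the multiplier is not stated in the text; the function name and sample values are illustrative.

```python
import statistics


def remove_iqr_outliers(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the conventional multiplier and is assumed, not taken from
    the study."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up hemoglobin readings with one implausible value.
readings = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 40.0]
kept = remove_iqr_outliers(readings)
print(kept)
```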
The distribution of the 16,222 patients in selected disease groups is shown in Table 1. Among these groups, more transfusions were received by the circulatory disease group (e.g., cardiac dysfunction or cardiac procedure) than any other group.
3.2 Results of Machine Learning
Table 2 lists the classification performance of the models evaluated on the test set. The LightGBM model exhibited the best prediction performance and was therefore chosen as the final prediction model. The AUC of LightGBM was 0.9070, and the F1-score was 0.8166. The results indicate that LightGBM is a suitable model for predicting blood transfusion.
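For reference, the two reported metrics can be computed from labels and predicted scores as in the sketch below. The labels and scores are made up, and the 0.5 decision threshold used for the F1-score is an assumption, since the threshold is not stated in the text.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 of the positive class after thresholding the scores (threshold assumed)."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up test labels (1 = transfused) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
auc = roc_auc(labels, scores)
f1 = f1_score(labels, scores)
print(f"AUC={auc:.4f}  F1={f1:.4f}")
```

The AUC is threshold-free, whereas the F1-score depends on the chosen cutoff, so the two numbers capture different aspects of a classifier on imbalanced data.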
3.3 Explainable Analysis
Feature importance analysis suggested that hemoglobin, creatinine, hematocrit, arterial diastolic blood pressure, and heart rate were highly important (Fig. 4). This finding is consistent with clinical evidence showing that hemoglobin is the most important attribute in the indication of transfusion and also supports that creatinine can be an important predictor. In addition, blood pressure, respiration rate, and heart rate attributes are important vital signs.
We found that patients with hemoglobin values ranging from 7 to 9 g/dL required the most transfusion, and that these values had the greatest effect on transfusion possibility. Moreover, the smaller the hemoglobin value, the greater the predictive value of the model (Fig. 5(a)). In contrast, most patients had hemoglobin values ranging from 10.53 to 10.7 g/dL, and they required little transfusion. In addition, in patients with hemoglobin values >10.7 g/dL, almost no transfusion was received. In the plot of transfusion predictability by hemoglobin distribution range (Fig. 5(b)), the range of hemoglobin values from 7 to 9 g/dL had the most powerful influence on the prediction. However, this possibility is not notably greater than 50%. Therefore, hemoglobin features alone cannot determine the need for transfusion.
From the partial dependence plot, we determined the extent to which the hemoglobin value affects the prediction of transfusion possibility as the value increases based on the smallest hemoglobin value (Fig. 6). If the hemoglobin value exceeds 10 g/dL, the influence on blood transfusion prediction is greatly reduced. That is, the smaller the hemoglobin value, the greater the predictive power of the model.
Unlike previous studies that focused on a specific disease group, we expanded the disease group and broadened the universality of the target disease group by selecting items in which packed red blood cell transfusion occurs at high frequency. In particular, it is important that the key features used in the machine learning model to predict blood transfusions are consistent with the actual clinical situation. Our findings confirmed that the features corresponding to the highest feature importance in the model with the highest performance were applied as important grounds for clinical judgment.
In this study, various methods for missing values could not be applied. Improvements will be made in various directions, such as using weighted average values, using the amount of mutual information for each feature, and by replacing missing values using forward fill and interpolation methods. In addition, research from various perspectives, such as securing additional data, will be attempted. We are planning to use data from Konyang University Hospital in future work.
The results of our study can be used as basic data for recommendations related to the adequacy of blood transfusions and are expected to ultimately contribute to the recovery of patients and prevention of excessive consumption of blood products.
She received a B.S. degree in Mathematics from Chungnam National University (CNU) in 2004, and a Ph.D. in dynamical systems from CNU in 2013. She worked as a postdoctoral researcher for MathVision2020 at CNU from 2013 to 2016, and for the Medical Data Analytics Team at the National Institute for Mathematical Sciences (NIMS) from 2016 to 2019. She is currently a research professor in the Healthcare Data Science Center, Konyang Univ. Hospital, Daejeon, Korea. Her research interests include dynamical systems, mathematical data analytics, and medical AI.
He received a B.S. degree in computer engineering from Hansung University in 2020, and plans to complete an M.S. degree in computer engineering at Hansung University in 2022. He works as an intern at Konyang University and is a logistics consultant. His research interests include embedded systems, quantization, machine learning, deep learning, and logistics automation systems.
She received a B.S. degree from the Department of Computer Science, EWHA Womans University, M.S. degree from the Department of Computer Science and Engineering, Seoul National University, and Ph.D. from the Department of Computer Science, Suwon University, South Korea, in 2002, 2005, and 2010, respectively. She worked on a proposal for building fault-tolerant medical information systems for DoD, U.S. as a Visiting Fellow at the Department of Radiology, Imaging Science and Information Systems (ISIS) Center at Georgetown University, USA. She also worked as an instructor and senior researcher in the u-Healthcare Institute at Gachon University, South Korea. Since 2013, she has been working in the School of Information and Communication Science at the Semyung University, South Korea. Her research interests include computational intelligence, big data analysis, and intelligent healthcare systems.
He received M.D., M.S. and Ph.D. degrees from the Konyang University School of Medicine in 2003, 2006, and 2018, respectively. He is currently the head of the Healthcare Data Science Center at Konyang University Hospital, the head professor at the Department of Information Medicine, and an associate professor at the Department of Otolaryngology, Konyang University College of Medicine, Daejeon, Korea. His current research interests include health and medical big data, medical artificial intelligence, and clinical decision support systems.
She received a B.S. degree in Biomedical Engineering from Gachon University in 2010, and M.S. and Ph.D. degrees in Medical Informatics from Seoul National University in 2012 and 2017. She is currently the Vice Director of Healthcare Data Science Center at Konyang University Hospital and an assistant professor at the Department of Biomedical Informatics, Konyang University College of Medicine, Daejeon, Korea. Her research interests include pharmacovigilance and PGHD.