Development of Predictive Models Based on Biochemical Parameters to Triage COVID-19 Patients: A Study Conducted in a Tertiary Care Hospital

Background: The COVID-19 disease continues to cause severe mortality and morbidity. Biochemical parameters are being used to predict the severity of the infection. This study aims to predict disease severity and mortality to help reduce mortality through timely intervention in a cost-effective way.
Methods: A total of 324 COVID-19 cases admitted at our hospital (All India Institute of Medical Sciences, Patna, BR, India) between June 2020 and December 2020 (phase 1: 190 patients) and between April 2021 and May 2021 (phase 2: 134 patients) were recruited for this study. Statistical analysis was done using SPSS Statistics version 23 (IBM Corp., Armonk, NY, USA) and model prediction using Python (The Python Software Foundation, Wilmington, DE, USA).
Results: There were significant differences in biochemical parameters at the time of admission among COVID-19 patients between phases 1 and 2, ICU and non-ICU admissions, and expired and discharged patients. The receiver operating characteristic (ROC) curves predicted mortality solely based on biochemical parameters. Using multiple logistic regression in Python, a total of four models (two each) were developed to predict ICU admission and mortality. A total of 92 out of 96 patients were placed into the correct management category by our model. This model might have allowed us to save 17 of the 21 patients we lost.
Conclusions: We developed predictive models for admission (ICU or non-ICU) and mortality based on biochemical parameters at the time of admission. A model with significant predictive capability for IL-6 and procalcitonin values using routine biochemical parameters was also proposed. Both can be used as machine learning tools to prognosticate the severity of COVID-19 infections. This study is probably the first of its kind to propose a predictive model for triage to ICU or non-ICU admission at first presentation to the medical emergency department, so that COVID-19 patients receive the optimal treatment.


Introduction
The devastating effects of COVID-19, which emerged from the newly discovered SARS-CoV-2, have been felt all over the world. The virus emerged in Wuhan, China, in December 2019 and spread worldwide, causing severe morbidity and mortality [1,2]. The disease causes mild to severe lower respiratory tract symptoms [3]. This condition was initially poorly understood, making diagnosis and treatment difficult. Based on clinical signs and saturation of peripheral oxygen (SpO2), the WHO classified COVID-19 as mild, moderate, or severe [4]. The WHO's severity classification is simple but does not convincingly predict mortality among COVID-19 patients. Biochemical parameters such as IL-6, ferritin, D-dimer, and procalcitonin are used to assess the severity of the COVID-19 infection. Our study aims to develop a model based on these laboratory biochemical parameters to predict admission criteria (ICU or non-ICU) as well as mortality at the initial presentation of COVID-19-infected patients in the ED. Both IL-6 and procalcitonin are established biochemical markers of COVID-19 severity [5][6][7][8][9][10][11]. However, the primary health centers (PHCs) of developing countries lack equipment for measuring IL-6 and procalcitonin. Therefore, we tried to create a model to predict the values of IL-6 and procalcitonin from routine biochemical investigations such as serum urea, neutrophil-to-lymphocyte ratio, albumin, alanine transaminase (ALT), and lactate dehydrogenase (LDH), among others. This research will help in the triage and management of future COVID-19 cases. These models can be used in machine learning to predict the severity and mortality of future patients in pandemics related to COVID-19.

Study design and setting
A retrospective observational study was conducted in a tertiary care hospital (All India Institute of Medical Sciences, Patna, BR, India). Levels of various analytes, i.e., the serum electrolytes sodium (Na) and calcium (Ca), parameters of liver function tests (LFT), kidney function tests (KFT), LDH, IL-6, procalcitonin, ferritin, C-reactive protein (CRP), and neutrophil-to-lymphocyte (NL) ratio, were collected from COVID-19-positive patients who were admitted during the first wave (June 2020 to December 2020; referred to in this study as phase 1) and the second wave of the pandemic (April 2021 to May 2021; referred to as phase 2). The LFT, KFT, CRP, and LDH were measured in the Beckman Coulter AU680 (Beckman Coulter Inc., Pasadena, CA, USA), and IL-6 and ferritin in the Siemens Advia Centaur (Siemens Healthineers, Erlangen, BY, DEU) per the manufacturer's protocol. Procalcitonin was measured in the Ortho Clinical Diagnostics Vitros 5600 (QuidelOrtho, San Diego, CA, USA). This study examined 14 biochemical parameters, namely NL ratio, D-dimer, international normalized ratio (INR), Na, Ca, creatinine, urea, albumin, LDH, CRP, ALT, ferritin, procalcitonin, and IL-6. All patient-related data were collected from the hospital's information system.

Inclusion and exclusion criteria
Patients who tested positive for COVID-19 via reverse transcription-polymerase chain reaction (RT-PCR) and were admitted to our hospital (both ICU and non-ICU) during our study period (phase 1 and phase 2) were included in our study. During phases 1 and 2, COVID-19-positive patients were admitted per their severity classification. The mild category included patients with symptoms such as fever, cough, and sore throat with a respiratory rate ≤ 24 per minute and SpO2 > 94% (range: 94% to 100%) on room air or 200 < partial pressure of oxygen (PaO2) in arterial blood/fraction of inspired oxygen (FiO2) < 300 mmHg. The moderate category included patients with clinical features of dyspnea and/or hypoxia with a respiratory rate > 24 per minute and SpO2 ≥ 90% on room air, or PaO2/FiO2 = 100 to 200 mmHg. The severe category was characterized by clinical signs of pneumonia and any one of the following: respiratory rate > 30 breaths/min, severe respiratory distress, SpO2 < 90% on room air, or PaO2/FiO2 ≤ 100 mmHg. Patients in the severe category were admitted to the ICU, and patients in the mild and moderate categories were admitted to the non-ICU (ward/high dependency unit (HDU)) for treatment [4]. The COVID-19-negative patients admitted during the study period due to any other cause were not included in this study.

Statistical analysis
For data analysis, SPSS Statistics version 23 (IBM Corp., Armonk, NY, USA) was used. The prediction model was built using Python version 3.6 (The Python Software Foundation, Wilmington, DE, USA). We used logistic regression for ICU admission and mortality prediction. Categorical variables were presented as proportions. Continuous variables were presented as median and interquartile range values. All parameters were tested for normality using the Shapiro-Wilk test. Since all biochemical variables were non-normally distributed, Mann-Whitney U-tests were used for continuous variables and chi-square tests for categorical variables. Spearman's correlation was used to assess the relationship of ICU admission and mortality with the biochemical variables taken at admission. Receiver operating characteristic (ROC) curve analysis determined the optimum cut-off points of the biochemical variables that correlated significantly with disease severity, along with the area under the curve, its 95% confidence interval, and the significance value for each variable. The cut-off point was determined using the index of union. A two-tailed p-value < 0.05 was considered significant.
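The cut-off selection described above was carried out in SPSS Statistics. As a rough illustration only, the sketch below assumes the usual definition of the index of union, in which the chosen cut-off minimizes |sensitivity − AUC| + |specificity − AUC|. The function names and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def roc_points(y_true, scores):
    """Sensitivity and specificity at every observed score used as a threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    thresholds = np.unique(scores)  # sorted candidate cut-offs
    # A case is called positive when its score is >= the threshold.
    sens = np.array([(scores[y_true] >= t).mean() for t in thresholds])
    spec = np.array([(scores[~y_true] < t).mean() for t in thresholds])
    return thresholds, sens, spec

def auc_rank(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

def index_of_union_cutoff(y_true, scores):
    """Cut-off minimizing |sensitivity - AUC| + |specificity - AUC| (assumed definition)."""
    thresholds, sens, spec = roc_points(y_true, scores)
    auc = auc_rank(y_true, scores)
    iu = np.abs(sens - auc) + np.abs(spec - auc)
    best = int(np.argmin(iu))
    return float(thresholds[best]), float(sens[best]), float(spec[best]), float(auc)

# Hypothetical toy data: 1 = died, 0 = discharged; scores = an admission biomarker.
outcome = [0, 0, 0, 1, 1, 1]
marker = [1, 2, 3, 4, 5, 6]
cutoff, sensitivity, specificity, auc = index_of_union_cutoff(outcome, marker)
print(cutoff, sensitivity, specificity, auc)
```

When several thresholds tie on the index, this sketch returns the lowest one; a real analysis would also need to handle that choice and the confidence interval of the AUC explicitly.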
For designing the prediction model using Python, we predicted ICU admission and mortality status using input variables. The dichotomous variable was predicted using logistic regression. Two criteria, the feature importance coefficient and the correlation coefficient, were used to select the input variables for model prediction. Using feature importance selection, we selected only input variables that predicted ICU admission and mortality. Using the correlation coefficient, we chose input variables highly correlated with ICU admission. We calculated all performance metrics to evaluate our model. Our input variable model covers both groups (ICU and non-ICU). Our model produced Z, the log odds of ICU admission or mortality. The sigmoid function calculates the probability from the log odds Z: ICU probability = 1 / (1 + e^(-Z)). In our dataset, if the probability of ICU admission was greater than 50%, the patient should be admitted to the ICU for better outcomes, and if the mortality probability was greater than 50%, the patient had a very high mortality risk.
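The step from log odds to a decision can be sketched as follows. This is a minimal illustration, not the fitted model: the intercept, coefficients, and input values are hypothetical placeholders, and only the sigmoid transformation and the 50% threshold come from the text.

```python
import math

def log_odds(intercept, coefficients, values):
    """Z = B0 + B1*X1 + ... + Bk*Xk for one patient."""
    return intercept + sum(b * x for b, x in zip(coefficients, values))

def sigmoid(z):
    """Probability from log odds, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def exceeds_cutoff(probability, cutoff=0.5):
    """True when the predicted probability is greater than the cutoff (50% in the text)."""
    return probability > cutoff

# Hypothetical coefficients and inputs, for illustration only.
z = log_odds(-1.0, [0.5, 0.25], [2.0, 4.0])  # -1.0 + 1.0 + 1.0 = 1.0
p = sigmoid(z)
print(round(p, 4), exceeds_cutoff(p))
```

A probability of exactly 50% (Z = 0) is not treated as exceeding the cutoff here, which matches the "greater than 50%" wording.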
We constructed ROC curves in SPSS Statistics based on the probabilities we got from the ICU and death prediction models (based on the correlation coefficient). From those ROC curves, we got the cutoff values for ICU admission and mortality. For ICU admission, the probability cutoff value was 40.64%, and for mortality, it was 51.14%, but in our dataset, we have taken the cutoff probabilities as 50% for both models. As we chose 50% probability, the Brier score is (0.50 − 1)^2 = (0.50 − 0)^2 = 0.25. By increasing the ICU admission cutoff value from 40.64% to 50%, we increased the specificity of our model so that fewer ICU beds would be wasted, considering their scarcity. By decreasing the mortality cutoff value from 51.14% to 50%, we increased the sensitivity of our model to prevent mortality as much as possible.
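The two ideas in this paragraph can be checked numerically with a small sketch. It assumes the standard definition of the Brier score (mean squared difference between predicted probability and the 0/1 outcome), under which a constant prediction of 0.50 scores 0.25 whatever the outcome. The probabilities and outcomes below are invented for illustration and are not study data.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

def sensitivity_specificity(probabilities, outcomes, cutoff):
    """Sensitivity and specificity when 'positive' means probability > cutoff."""
    tp = fn = tn = fp = 0
    for p, o in zip(probabilities, outcomes):
        predicted_positive = p > cutoff
        if o == 1 and predicted_positive:
            tp += 1
        elif o == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# A prediction of 0.50 contributes (0.50 - 1)^2 = (0.50 - 0)^2 = 0.25 either way.
print(brier_score([0.5, 0.5, 0.5], [1, 0, 1]))

# Hypothetical probabilities: raising the cutoff removes a false positive.
probs = [0.2, 0.45, 0.55, 0.9]
truth = [0, 0, 1, 1]
print(sensitivity_specificity(probs, truth, 0.40))
print(sensitivity_specificity(probs, truth, 0.50))
```

In this toy example, raising the cutoff from 0.40 to 0.50 raises specificity without changing sensitivity; on real data a higher cutoff can never lower specificity but may lower sensitivity.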
We also predicted IL-6 and procalcitonin using input variables that are available in low-resource settings. Because both output variables are continuous, multiple linear regression was used to predict them. Using the correlation coefficient, we chose only highly correlated input variables for each output variable. We then calculated all performance metrics to see how our model performed for each case.
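A minimal sketch of multiple linear regression by ordinary least squares is given below. The data are synthetic and noise-free, the two predictors stand in for unspecified routine parameters, and the helper names are hypothetical; the study's actual coefficients and input variables are not reproduced here.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares; returns [intercept, coefficient_1, ..., coefficient_k]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predicted outcome for each row of X."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Coefficient of determination, one common performance metric."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic example: outcome = 2 + 3*x1 - 1*x2 (no noise).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [2, 5, 1, 4, 7]
coef = fit_linear(X, y)
print(np.round(coef, 6), r_squared(y, predict_linear(coef, X)))
```

With real measurements the fit would not be exact, and the performance metrics should be computed on data held out from fitting.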

Results
Out of 324 patients, 190 in phase 1 and 134 in phase 2 were included in our study.

TABLE 2: Demographic details and comparison between the biochemical parameters of phases 1 and 2
The 'n' denotes the number of samples taken into consideration for analysis of that particular parameter. All values except age are median and range.
NL ratio: Neutrophil to lymphocyte ratio, Na: Sodium, Ca: Calcium, CRP: C-reactive protein, ALT: Alanine transaminase, INR: International normalized ratio
Biochemical parameters between phases 1 and 2 taken at the time of discharge were compared. Three parameters showed significant differences: NL ratio, Na, and ALT. Both phases had normal Na levels, but phase 2 discharges had a higher NL ratio and ALT than phase 1 discharges.
We compared biochemical parameters at the time of admission in phases 1 and 2 among the deceased patients. Between the two phases, Na, ferritin, and procalcitonin showed significant differences. Phase 2 had higher median values for all three parameters.
A comparison of phase 1 and phase 2 biochemical parameters in the last drawn blood samples among deceased patients was made. The INR, creatinine, and urea were the only three parameters out of 14 that showed statistically significant differences. Both phases exhibited a normal INR. In phase 1 fatalities, the levels of creatinine and urea were higher than in phase 2 deaths.

TABLE 3: Comparison of biochemical parameters between ICU and non-ICU patients at the time of admission
The 'n' denotes the number of samples taken into consideration for analysis of that particular parameter. All values are median and range.
NL ratio: Neutrophil to lymphocyte ratio, Ca: Calcium, CRP: C-reactive protein, INR: International normalized ratio
Four biochemical parameters showed a significant difference between ICU and non-ICU patients at the time of discharge, although INR, Ca, and procalcitonin were within the normal range. The ICU patients had higher ferritin levels than non-ICU patients.
Table 4 shows the comparison between the biochemical parameters taken at the time of admission in COVID-19 patients who succumbed to the disease and those who were discharged. Eleven of the 14 parameters showed a significant difference between those discharged and those who expired. The NL ratio, D-dimer, urea, LDH, IL-6, CRP, and ferritin were significantly higher in deceased patients. Though within normal limits, Ca, creatinine, albumin, and procalcitonin differed significantly between discharged and deceased patients. Deceased patients had higher creatinine and procalcitonin and lower Ca and albumin when compared with the discharged patients.
The 'n' denotes the number of samples taken into consideration for analysis of that particular parameter. All values are median and range.

Biochemical parameters at the time of admission (normal range)
Among the COVID-19 patients admitted to the non-ICU, biochemical parameters taken at the time of admission were compared between discharged and deceased patients. Eight of the 14 parameters showed a significant difference between discharged and deceased patients. Deceased patients had a higher NL ratio, urea, LDH, CRP, and ferritin. Though within the normal range, Ca, albumin, and procalcitonin values differed between the two groups. Deceased patients had higher procalcitonin and lower Ca and albumin.
We compared biochemical parameters taken at the time of discharge or death between discharged and deceased patients admitted to the non-ICU. Twelve parameters showed a significant difference. Deceased patients had a higher NL ratio, D-dimer, creatinine, urea, LDH, CRP, ferritin, and procalcitonin than discharged patients. Calcium and albumin were lower in deceased patients, but INR and Na were higher.
We contrasted the biochemical parameters measured at the time of ICU admission between the deceased and the discharged patients. Three biochemical parameters, Na, urea, and LDH, differed significantly between patients who were discharged and those who died. The Na levels of both discharged and deceased patients were within the normal range. Patients who died had substantially higher levels of urea and LDH than patients who were discharged.
We also compared biochemical parameters taken at the time of discharge or death between discharged and deceased ICU patients. Ten of the 14 parameters showed a significant difference between those who were discharged and those who died. Deceased patients had higher NL ratio, D-dimer, creatinine, urea, LDH, IL-6, and CRP values. Deceased patients had lower Ca and albumin levels and higher Na levels.
Many hospitals lack specialized investigations. Thus, using routinely tested parameters, we attempted to predict baseline IL-6 and procalcitonin. We calculated IL-6 and procalcitonin values using simple variables:
Next, we wanted to determine a specific cutoff value for the biochemical parameters that can be used to predict the severity of the disease category, independent of the clinical categorization. Therefore, the ROC curves of significantly associated biochemical variables were determined (Table 5, Figure 1).
The first ICU prediction model (M1 ICU) selected its parameters by feature importance. Its accuracy was 82 ± 5%.
Our logistic regression equation is Z = B0 + B1X1 + B2X2 + B3X3 + B4X4, where B0 is the intercept and B1 to B4 are the coefficients of the input variables X1 to X4. The details of these models are given in Table 6.

Discussion
In the fight against COVID-19, knowledge of biochemical parameters and their correlation with disease severity is crucial to early intervention. A scientific model to triage patients by severity can help reduce morbidity and mortality. We compared biochemical parameters in Indian COVID-19 patients during the first and second waves of the pandemic, between those admitted to the ICU and non-ICU, and among discharged and deceased patients. These biochemical parameters were noted at the time of admission and also before discharge or death. This was done to identify the best set of biochemical parameters, which were then used to construct the predictive model.
At admission, patients in phase 2 had higher NL ratios and lower CRP levels than those in phase 1. Patients who were discharged in phase 2 had a higher NL ratio and ALT than those in phase 1. In developing countries like India, the overuse of corticosteroids may be due to the benefits observed during phase 1.
Corticosteroids cause lymphocytopenia and thus increase the NL ratio [12]. The CRP is a marker of inflammation and was higher in phase 1 than phase 2, possibly due to COVID-19 patients' pre-hospital use of steroids [13]. Among the patients who succumbed to COVID-19, phase 1 showed higher creatinine and urea values before death than phase 2. The COVID-19 infection affects the kidneys and lungs. Acute kidney injury due to COVID-19 increases urea and creatinine [14]. In phase 2, the use of corticosteroids possibly reduced COVID-19-related inflammation and kidney injury, thus lowering urea and creatinine when compared to phase 1. Among the deceased patients, phase 2 had higher ferritin and procalcitonin levels than phase 1. These findings corroborate reports that the second wave of COVID-19 was more severe [15,16].
Next, we analyzed the difference in biochemical parameters between ICU and non-ICU admissions. Ten parameters out of 14 differed significantly between ICU and non-ICU admissions. The ICU patients had severe COVID-19 infections, so their biochemical parameters were higher than those of non-ICU patients except for Ca and albumin, which correlated negatively with the severity of the infection. The levels went down in both groups at the time of discharge but were still higher in ICU patients, especially ferritin, due to the very high levels present at admission. Further, we wanted to see the differences in the biochemical parameters among the patients who survived COVID-19 and those who succumbed to it. Eleven biochemical parameters out of 14, measured at admission, were significantly different between COVID-19 survivors and the deceased. The NL ratio, D-dimer, urea, LDH, IL-6, CRP, and ferritin were significantly higher in patients who succumbed. Among non-ICU patients upon admission, we observed significant differences in eight out of 14 biochemical parameters between those who survived COVID-19 and those who succumbed to it, indicating severe illness in the affected patients. The differences expanded to encompass 12 out of 14 parameters when comparing the values of these parameters before discharge or death, highlighting a notable shift in health status.
Among the ICU patients at the time of admission, we found significant differences in three of the 14 biochemical parameters between those who survived COVID-19 and those who expired, namely Na, urea, and LDH. Urea and LDH were significantly higher in those who succumbed, whereas Na, although significantly different, was within the normal range in both groups. Studies have shown that COVID-19 patients with higher LDH levels are at increased risk of death [17,18]. The significant differences extended to 10 of the 14 parameters when comparing the values obtained before discharge or death.
Among all these biochemical parameters compared between ICU/non-ICU and discharge and death, nine biochemical parameters were common and had significantly higher values among the deceased patients when compared with the survivors, taken just before death or discharge. This indicates that the WHO classification [4] of the patients according to severity is robust in categorizing patients for the ICU but does not hold for the non-ICU category or for predicting mortality. Therefore, patient categorization should also consider these eight biochemical parameters. The significant differences in 12 biochemical parameters measured just before discharge or death further reinforce our observation. During their non-ICU stay, the condition of some patients worsened. The deterioration might have been mitigated by ICU admission. Those deceased non-ICU patients who were admitted according to the clinical admission criteria should have been admitted to the ICU according to our model. In a pandemic, once a patient's condition worsens, they should be shifted to the ICU. However, this is subject to the availability of beds in the ICU. Our model predicted ICU admission at the time of admission for these deceased non-ICU patients.
In a study by Kumari et al., high levels of laboratory parameters such as IL-6, LDH, prothrombin time (PT), INR, activated partial thromboplastin time (aPTT), ferritin, WBC count, and D-dimer were significantly associated with poor outcomes [19]. Similar findings were seen in our study. Age, high-sensitivity CRP level, lymphocyte count, and D-dimer level among COVID-19 patients at admission were found to be informative of the outcomes and aided our model building [20]. In another study, it was found that the variables that exerted the greatest influence on mortality prediction were ferritin, fibrinogen, D-dimer, platelet count, CRP, PT, invasive mechanical ventilation (IMV), PaFi (PaO2/FiO2), LDH, lymphocyte levels, aPTT, BMI, creatinine, and age [21].
The analysis of our data in this study helped formulate the prediction models. To further establish our observation, we tried to predict the cut-off values of these biochemical parameters to predict ICU admission and also developed a predictive model for triage. We wanted to find the cut-off values of biochemical parameters at the time of admission to be able to predict COVID-19 mortality. In our data set, ICU admission and mortality were highly correlated, so these cut-offs can be used for ICU admission. Table 5 and Figure 1 show biochemical variable ROC curves. The NL ratio, D-dimer, urea, LDH, IL-6, and procalcitonin were highly significant. We wanted to know, for patients admitted to the ICU, if the biochemical parameters at the time of admission and mortality are correlated. Table 1 shows the correlation coefficients of combined biochemical parameters measured at admission in phases 1 and 2. The NL ratio, D-dimer, urea, LDH, IL-6, CRP, ferritin, and procalcitonin correlated positively and significantly with ICU admission. The ICU admission was linked to higher parameters. Albumin and Ca were negatively and significantly correlated with ICU admission. The NL ratio, urea, D-dimer, creatinine, LDH, IL-6, CRP, ferritin, and procalcitonin were positively and significantly correlated with disease mortality. Higher values of these parameters were associated with COVID-19 mortality. Disease mortality due to COVID-19 showed a significant inverse correlation with albumin and Ca levels in our study. There were many parameters with statistically significant differences, and though they were in the normal range, their clinical significance is debatable. We statistically analyzed various biochemical parameters between various groups to ensure the best possible parameters were chosen for model prediction.
We developed four biochemical prediction models, two for ICU admission and two for mortality, to triage patients by severity, and we can use these probabilities to decide whether a patient should go to the ICU or not based on bed availability. Twenty-one clinically admitted (per WHO categorization) non-ICU patients succumbed to COVID-19 in our dataset. On applying our ICU prediction model, 11 patients had a high probability of ICU admission. Among the remaining 10, seven had a high probability of mortality. Had we used both models simultaneously, we could have sent a total of 18 of those 21 patients to the ICU for better management.
We applied our ICU prediction model to our study population of 324 patients. Our model predicted 117 (36%) ICU admissions and 207 (64%) non-ICU admissions. Of these 117 patients, 82 (70%) succumbed to COVID-19. Of the 82 that expired, 65 (80%) patients had a high mortality risk, while 17 had a low mortality risk. However, these 17 patients already had a high ICU admission probability. This implies that both models need to be used simultaneously for better prediction. Among those 207 non-ICU admissions, 138 (67%) patients had a low mortality probability, and 69 (33%) patients had a high mortality probability. Twenty-one (30%) of those 69 patients died, and 18 (85%) of the deceased 21 were admitted to the non-ICU. Of the 138 patients with low mortality, 18 (13%) died, 15 of the deceased 18 patients were admitted to the ICU, and the remaining three were in the non-ICU.
Twenty-six patients were discharged from the ICU. Our model predicted 17 ICU admissions for these 26 discharged patients from among 126 ICU admissions. Seven of the remaining nine patients had low mortality probabilities. Effectively, seven of these 26 patients had a low probability of both ICU admission and mortality. Hence, these seven patients could have been admitted to the non-ICU, and in a resource-poor setting, those seven ICU beds could have been utilized in a much better way.
In conclusion, the predictive capability of our model is superior for non-ICU admission compared to ICU admission of COVID-19 patients. The algorithm that we suggest in this paper is: Those with a probability of more than 50% will be admitted to the ICU (d). Next, we apply our mortality model to (c). Those with a predicted mortality probability > 50% (n = 43) will be admitted to the ICU, and those with < 50% (n = 115) will be non-ICU admissions (three deaths).
Patients can be initially classified as per WHO criteria into mild, moderate, and severe based on the clinical signs and symptoms at presentation. The severe patients are directly admitted to the ICU. When the clinical criteria suggest non-ICU admission, the ICU admission model should be applied, and if the probability is greater than 50%, the patient should be sent to the ICU. If the ICU admission probability is less than 50%, then the mortality model should be applied, and if the mortality probability is more than 50%, then the patient should be sent to the ICU. But when the probability is less than 50% in both models, the patient should be managed in the non-ICU. This suggests that the predictive power of our model is superior for non-ICU admissions as compared to ICU admissions. This is especially useful in resource-poor countries such as India and enables better utilization of ICU beds.
Our model analyzes parameters mostly available in tertiary healthcare settings and provides straightforward formulas for complex variables so they can be calculated from simple variables in PHCs. Thus, if a patient goes to a PHC and both ICU and mortality probability are over 50%, they will be transferred to a nearby tertiary healthcare center without delay, thus potentially saving lives.
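The decision sequence described above can be written as a short function. This is a sketch of the stated rule only: the function name and the string labels are hypothetical, and the probabilities are assumed to come from the ICU admission and mortality models applied at presentation.

```python
def triage(who_category, icu_probability, mortality_probability, cutoff=0.5):
    """Return "ICU" or "non-ICU" following the stepwise rule described in the text.

    who_category: "mild", "moderate", or "severe" (WHO clinical classification).
    icu_probability, mortality_probability: model outputs between 0 and 1.
    """
    # Step 1: clinically severe patients go directly to the ICU.
    if who_category == "severe":
        return "ICU"
    # Step 2: apply the ICU admission model to clinically non-ICU patients.
    if icu_probability > cutoff:
        return "ICU"
    # Step 3: apply the mortality model to those not flagged in step 2.
    if mortality_probability > cutoff:
        return "ICU"
    # Step 4: below the cutoff in both models, manage in the non-ICU.
    return "non-ICU"

# Hypothetical patients, for illustration only.
print(triage("severe", 0.10, 0.10))    # ICU (clinical criteria)
print(triage("moderate", 0.62, 0.30))  # ICU (ICU admission model)
print(triage("mild", 0.35, 0.71))      # ICU (mortality model)
print(triage("mild", 0.20, 0.15))      # non-ICU
```

The text does not say how a probability of exactly 50% should be handled; this sketch treats it as below the cutoff.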

Limitations
The development of this model is based on the collection of biochemical parameters from a single tertiary health center. A larger sample size would have been better to form a predictive model with even higher accuracy. Also, we have not looked into the individual comorbidities of the patients.

Conclusions
The COVID-19 pandemic caused a severe and devastating effect around the world, and assessing its severity and management has been a dilemma for physicians. Biochemical parameters can be used to predict COVID-19 disease severity. We have constructed a formula to predict IL-6 and procalcitonin values from simple biochemical parameters. We have also tried to develop an ICU admission and mortality prediction model using these biochemical parameters. By using our model, we have placed 92 out of 96 patients in the study into the correct management category. Furthermore, had our model been applied, it is assumed that we could have potentially saved 17 of the 21 patients we lost. This approach will pave the way for the development of similar models for other diseases and promote the use of machine learning to prognosticate patient outcomes. This study is probably the first of its kind to predict the triage of infective patients based on biochemical parameters.

TABLE 1: Correlation coefficients and feature importance coefficients of biochemical parameters

TABLE 4: Comparison of biochemical parameters between patients who survived and patients who succumbed to the disease
NL ratio: Neutrophil to lymphocyte ratio, Ca: Calcium, LDH: Lactate dehydrogenase, CRP: C-reactive protein, INR: International normalized ratio, Na: Sodium