脓毒症相关肝损伤预后分析及基于机器学习方法的预测模型建立

Prognostic analysis of sepsis-related liver injury and development of a prediction model based on machine learning methods

  • 摘要:
    目的 分析脓毒症相关肝损伤(SRLI)患者的预后, 并使用8种机器学习方法建立脓毒症患者入住ICU后发生SRLI的预测模型。
    方法 纳入MIMIC-IV数据库中满足脓毒症诊断标准且无肝脏、胆系基础疾病的患者。根据肝酶≥5倍正常值上限(ULN)或胆红素≥2.0 mg/dL将患者分为SRLI组和非SRLI组。采用卡方检验、多因素Logistic回归分析及倾向性评分匹配法分析2组患者死亡风险。采用8种机器学习算法Logistic回归、分类回归树(CART)、随机森林(RF)、支持向量机(SVM)、K-近邻(K-NN)、朴素贝叶斯(NBM)、极端梯度提升(XGBoost)、梯度提升树(GBDT)构建SRLI预测模型并进行验证。
    结果 卡方检验(P < 0.001)、多因素Logistic回归分析(P < 0.05)、倾向性评分匹配分析后Log-rank (P < 0.05)均显示SRLI增加患者死亡风险。SRLI预测模型中, RF算法的曲线下面积(AUC)最高为0.866, 其后依次是GBDT (AUC=0.862)、Logistic回归(AUC=0.859)、SVM (AUC=0.837)、NBM (AUC=0.830)、CART (AUC=0.771)、XGBoost (AUC=0.764)、K-NN (AUC=0.722)。
    结论 SRLI增加患者死亡风险。RF算法构建预测模型有较高的诊断价值。

     

    Abstract:
    Objective To analyze the prognosis of patients with sepsis-related liver injury (SRLI) and establish a prediction model for the occurrence of SRLI after ICU admission in sepsis patients using eight machine learning methods.
    Methods Patients who met the diagnostic criteria for sepsis and had no underlying liver or biliary disease were selected from the MIMIC-IV database and classified into SRLI and non-SRLI groups based on liver enzymes ≥5 times the upper limit of normal (ULN) or bilirubin ≥2.0 mg/dL. The chi-square test, multivariate logistic regression analysis, and propensity score matching were used to compare mortality risk between the two groups. Eight machine learning algorithms, namely logistic regression, classification and regression tree (CART), random forest (RF), support vector machine (SVM), K-nearest neighbors (K-NN), naive Bayes method (NBM), extreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT), were used to construct and validate the SRLI prediction model.
    Results The chi-square test (P < 0.001), multivariate logistic regression analysis (P < 0.05), and the log-rank test after propensity score matching (P < 0.05) all indicated that SRLI increased the mortality risk of patients. Among the SRLI prediction models, the RF algorithm had the highest area under the curve (AUC) at 0.866, followed by GBDT (AUC=0.862), logistic regression (AUC=0.859), SVM (AUC=0.837), NBM (AUC=0.830), CART (AUC=0.771), XGBoost (AUC=0.764), and K-NN (AUC=0.722).
    Conclusion SRLI increases the mortality risk of patients. The prediction model constructed using the RF algorithm has relatively high diagnostic value.
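
    The grouping rule in the Methods and the AUC comparison in the Results can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say which liver enzymes were used or what ULN values applied, so ALT and AST with a ULN of 40 U/L are assumptions made only for illustration, and the function names `is_srli` and `auc` are hypothetical. The AUC values in `REPORTED_AUC` are copied from the Results.

```python
from typing import Sequence

# ASSUMPTION: the abstract does not name the enzymes or their ULN values.
# ALT/AST and 40 U/L are placeholders chosen for illustration only.
ALT_ULN = 40.0
AST_ULN = 40.0


def is_srli(alt: float, ast: float, bilirubin_mg_dl: float,
            alt_uln: float = ALT_ULN, ast_uln: float = AST_ULN) -> bool:
    """Apply the stated rule: enzymes >= 5x ULN or bilirubin >= 2.0 mg/dL."""
    enzyme_hit = alt >= 5 * alt_uln or ast >= 5 * ast_uln
    return enzyme_hit or bilirubin_mg_dl >= 2.0


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# AUC values as reported in the Results.
REPORTED_AUC = {
    "RF": 0.866, "GBDT": 0.862, "Logistic regression": 0.859,
    "SVM": 0.837, "NBM": 0.830, "CART": 0.771,
    "XGBoost": 0.764, "K-NN": 0.722,
}

# Rank the eight models from highest to lowest reported AUC.
ranking = sorted(REPORTED_AUC, key=REPORTED_AUC.get, reverse=True)

if __name__ == "__main__":
    print(is_srli(alt=250, ast=30, bilirubin_mg_dl=0.8))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(ranking)
```

    The pairwise form of the AUC is quadratic in sample size and is used here only because it makes the definition explicit. A real analysis on MIMIC-IV would more likely rely on a library routine.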

     
