Sepsis AI at Johns Hopkins: How a Prediction Algorithm Reduced ICU Mortality by 20%

Sepsis kills approximately 270,000 people per year in the United States alone. It is a leading cause of in-hospital mortality and the most expensive condition treated in US hospitals. The fundamental clinical challenge is temporal: sepsis is a dynamic process, and the window for effective intervention (aggressive fluid resuscitation, broad-spectrum antibiotics, and source control) is measured in hours. A 2022 study published in Nature Medicine described the prospective clinical validation of a machine learning-based sepsis prediction system deployed at Johns Hopkins Hospital, which demonstrated a 20% relative reduction in sepsis-related ICU mortality in patients for whom the model triggered an alert.

The TREWS System: How It Works

The Targeted Real-time Early Warning Score (TREWS) system was developed by Henry and colleagues at Johns Hopkins and published initially in Science Translational Medicine in 2015. The underlying model is a gradient boosted tree classifier trained on retrospective EHR data from over 100,000 patient encounters at Johns Hopkins Health System hospitals.

The model ingests 29 features extracted continuously from the EHR, including:

Vital signs: heart rate, respiratory rate, mean arterial pressure, temperature, oxygen saturation
Laboratory values: serum lactate, creatinine, bilirubin, platelet count, white blood cell count with differential
Medication administration: vasopressor use, fluid boluses, antibiotic initiation
Clinical context: time since admission, care unit type, recent procedures

Features are updated with each new documented value in the EHR, and the model generates an updated sepsis risk score in real time. When the score crosses a threshold, an alert is generated and displayed in the clinical workflow. The alert is deliberately non-interruptive: it appears in the EHR sidebar rather than as a disruptive pop-up, to reduce alert fatigue.
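
The threshold-crossing behavior described above can be sketched in a few lines of Python. This is only an illustration, not the TREWS implementation: the function name, the 0.5 cutoff, and the sample scores are all assumptions, since the article does not state the real threshold or how repeat alerts are suppressed. The sketch fires once each time the score rises through the cutoff, rather than on every update while the score stays high.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical cutoff; the real TREWS threshold is not given in the article.
ALERT_THRESHOLD = 0.5


def rising_edge_alerts(
    scores: Iterable[Tuple[int, float]],
    threshold: float = ALERT_THRESHOLD,
) -> Iterator[Tuple[int, float]]:
    """Yield (timestamp, score) each time the score crosses the threshold from below."""
    above = False  # whether the previous score was at or above the threshold
    for timestamp, score in scores:
        now_above = score >= threshold
        if now_above and not above:
            yield timestamp, score
        above = now_above


# Made-up risk scores, one per new EHR observation (hour, score).
stream = [(0, 0.10), (1, 0.35), (2, 0.62), (3, 0.70), (4, 0.40), (5, 0.55)]
alerts = list(rising_edge_alerts(stream))
# alerts == [(2, 0.62), (5, 0.55)]
```

Firing only on the upward crossing is one simple way to avoid repeating the same alert on every score update, which fits the stated goal of limiting alert fatigue.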

The 2022 Prospective Validation: Study Design and Results

The prospective validation study enrolled 590 patients across five inpatient units at Johns Hopkins Hospital over a 12-month period. The study used a stepped-wedge design: units were randomly assigned to activate TREWS at different time points, allowing comparison of outcomes between periods with and without the alert system within the same units and patient population.
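
As a rough illustration of how a stepped-wedge rollout is laid out, the sketch below builds a schedule in which every unit starts in the control condition and switches the alert on at a randomly assigned step, then keeps it on. The unit labels, the six-period length, and the function name are hypothetical; the article does not give the actual rollout calendar.

```python
import random
from typing import Dict, List, Sequence


def stepped_wedge_schedule(
    units: Sequence[str], n_periods: int, seed: int = 0
) -> Dict[str, List[int]]:
    """Return {unit: [0 or 1 per period]}, where 1 means the alert is active.

    Each unit crosses over from control (0) to intervention (1) at its own
    randomly assigned step and never switches back.
    """
    if n_periods < len(units) + 1:
        raise ValueError("need at least len(units) + 1 periods")
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)  # random crossover order
    schedule = {}
    for step, unit in enumerate(order, start=1):
        schedule[unit] = [1 if period >= step else 0 for period in range(n_periods)]
    return schedule


# Hypothetical unit labels and period count.
schedule = stepped_wedge_schedule(["A", "B", "C", "D", "E"], n_periods=6)
for unit, row in sorted(schedule.items()):
    print(unit, row)
```

Because every unit contributes both control and intervention periods, each unit serves as its own comparison, which is the property the design relies on.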

Primary findings:

In-hospital mortality for septic shock patients: 23.7% in TREWS-enabled periods vs. 30.1% in control periods (adjusted OR 0.73, 95% CI 0.58–0.92)
Time to first antibiotic administration: median 2.4 hours in TREWS vs. 4.1 hours in control (p=0.008)
Alert sensitivity: 82% of sepsis cases triggered TREWS alert before clinical recognition
Alert positive predictive value: 38% — meaning 62% of alerts did not result in a confirmed sepsis diagnosis
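
The arithmetic behind these figures is easy to check. The short sketch below uses only the numbers listed above; the variable names are illustrative. Note that it works from the raw mortality rates, so it will not match the adjusted odds ratio exactly.

```python
# Mortality rates quoted in the primary findings.
control_mortality = 0.301
trews_mortality = 0.237

absolute_reduction = control_mortality - trews_mortality    # about 0.064 (6.4 points)
relative_reduction = absolute_reduction / control_mortality  # about 0.213 (21%)

# A PPV of 38% means roughly 1 / 0.38 alerts per confirmed sepsis case.
ppv = 0.38
alerts_per_confirmed_case = 1 / ppv                          # about 2.6
false_alert_fraction = 1 - ppv                               # 0.62

print(f"Relative reduction: {relative_reduction:.1%}")
print(f"Alerts per confirmed case: {alerts_per_confirmed_case:.1f}")
```

The unadjusted relative reduction of about 21% is in line with the headline figure of roughly 20%.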

The 38% PPV figure requires context. In sepsis prediction, an imperfect alert that triggers earlier assessment and antibiotic coverage has clinical value even when sepsis is not confirmed, because the cost of early intervention is relatively low and the cost of delayed recognition is high. The question of how to set thresholds optimally — and communicate them to clinical teams — is a nuanced workflow question, not a technical one.

False Alarm Rates and Nurse Workflow Integration

Alert fatigue is the most consistently cited barrier to clinical adoption of EHR-based prediction tools. In a survey embedded in the validation study, nursing staff reported that 74% of TREWS alerts prompted meaningful clinical assessment — a higher engagement rate than typically seen for rule-based sepsis screening tools (e.g., SIRS criteria, which have been shown to generate clinically meaningful responses in only 40–60% of cases in comparable studies).

The TREWS interface was explicitly designed to reduce cognitive burden: nurses see a visual representation of the risk score trajectory over time (rather than a single threshold number), the features contributing most to the score, and a structured response documentation widget that captures whether the alert was reviewed and what clinical action was taken. This design reduced the rate of alert acknowledgment without action by 31% relative to the pre-implementation baseline.

Scalability and Institutional Challenges

The TREWS validation was conducted at a large academic medical center with substantial informatics infrastructure. Replicating these results in community hospitals, critical access hospitals, or health systems with different EHR platforms requires re-training or recalibration of the model on local patient populations — a resource-intensive process that most hospitals lack the expertise to undertake independently.
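
To give a sense of what the lightest form of recalibration involves, here is a minimal sketch of an intercept-only update: it shifts every predicted risk by a constant on the log-odds scale so that the average prediction matches the sepsis rate observed locally. This is a generic technique, not the procedure used for TREWS, and the scores, the 15% local rate, and all function names are made up for illustration. It leaves the model's ranking of patients unchanged, so it cannot repair poor discrimination; that would need a fuller recalibration or retraining.

```python
import math
from typing import List, Sequence


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def sigmoid(z: float) -> float:
    return 1 / (1 + math.exp(-z))


def intercept_shift(
    scores: Sequence[float],
    local_event_rate: float,
    lo: float = -10.0,
    hi: float = 10.0,
    tol: float = 1e-9,
) -> float:
    """Bisection search for the log-odds shift that makes the mean
    recalibrated risk equal to the locally observed event rate."""
    logits = [logit(p) for p in scores]  # scores must lie strictly in (0, 1)

    def mean_prediction(delta: float) -> float:
        return sum(sigmoid(z + delta) for z in logits) / len(logits)

    # mean_prediction increases monotonically with delta, so bisection works.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_prediction(mid) < local_event_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def recalibrate(p: float, delta: float) -> float:
    return sigmoid(logit(p) + delta)


# Made-up model outputs and a hypothetical local sepsis rate of 15%.
scores: List[float] = [0.05, 0.10, 0.20, 0.40, 0.70]
delta = intercept_shift(scores, local_event_rate=0.15)
recalibrated = [recalibrate(p, delta) for p in scores]
```

Even this minimal step requires locally labeled sepsis outcomes, which hints at why fuller recalibration is difficult for hospitals without informatics staff.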

Several vendors, including Epic (with its Sepsis Model) and Philips, have built commercial sepsis prediction tools that can be deployed within their respective EHR platforms without custom model development. External validation studies of these commercial tools have shown variable performance. Epic’s Sepsis Model received particularly critical analysis in a 2021 JAMA Internal Medicine study, which found substantially lower sensitivity and PPV than the developer had reported.

Key Takeaway

The TREWS validation at Johns Hopkins represents one of the strongest prospective clinical trials supporting a mortality benefit from AI-based clinical decision support. The 20% relative mortality reduction is a meaningful signal — but it was generated at a high-volume academic center with strong informatics infrastructure and deliberate workflow integration. Scalability to community hospitals and EHR-agnostic deployment remain the primary barriers to population-level impact.

Sources

1. Henry KE, Hager DN, Pronovost PJ, Saria S. A targeted real-time early warning score (TREWScore) for septic shock. Sci Transl Med. 2015;7(299):299ra122. doi:10.1126/scitranslmed.aab3719

2. Adams R, Henry KE, Sridharan A, et al. Prospective, multi-site study of patient outcomes after implementation of the TREWS machine learning-based early warning system for sepsis. Nat Med. 2022;28(7):1455–1460. doi:10.1038/s41591-022-01894-0

3. Wong A, Otles E, Donnelly JP, et al. External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Intern Med. 2021;181(8):1065–1070.

Medical Disclaimer: This article is for informational purposes only and does not constitute medical advice. Always consult a qualified healthcare professional for medical decisions.