Time-slip in AI sepsis models may inflate results, risking under- or overtreatment

Causal relationships in discrete-time reinforcement learning trajectories and the effect of temporal alignment. Credit: npj Digital Medicine (2026). DOI: 10.1038/s41746-026-02625-2

AI is already boosting positive outcomes in health care and holds promise for delivering many more. It is important, however, that deployment of AI tools—especially in a life-or-death health care setting—proceeds at a thoughtful and measured pace, cautions Shengpu Tang, assistant professor of computer science at Emory University.

Tang and colleagues found a flaw in many peer-reviewed studies using the AI method known as reinforcement learning as a theoretical guide for the treatment of sepsis. The journal npj Digital Medicine published their findings.

Through simulation experiments, they demonstrated a problem with a commonly used technique for preprocessing and indexing sepsis treatment data. The technique introduces a slight time misalignment between a patient's recorded state and the treatment paired with it, so the AI agent sometimes uses a future event to predict the past.

If the testing data for a model is misaligned in the same way, the problem remains hidden, the researchers warn. “The flaw is masked behind ‘inflated’ performance metrics that look great on paper but will fail in practice,” Tang says.

If these flawed systems for sepsis treatment were deployed in a health care setting, they would recommend either overtreatment or undertreatment in nearly half of patient states, the researchers showed.

“We found that the large majority of the papers that use reinforcement learning to analyze sepsis treatment over the last decade made this time-misalignment mistake—including our own work,” says Tang, first author of the current paper.

Tang and colleagues developed a simple workaround to avoid the flaw, representing a fundamental shift in how problems are formulated in reinforcement learning for health care.

Their simulation experiments, based on real-world clinical data, showed that when the flaw is not addressed, a reinforcement-learning algorithm to guide sepsis treatment neither decreases nor increases the mortality rate of patients.

Eliminating the time-shift flaw, however, results in an 8%–10% decrease in patient mortality, simulations showed. “We hope this work serves as a wake-up call and a roadmap for building safer, more reliable reinforcement-learning models for the clinical bedside,” Tang says.

Co-authors of the paper include Sonali Parbhoo, assistant professor of electrical and electronic engineering at Imperial College London; Jenna Wiens, professor of computer science and engineering at the University of Michigan; and Jiayu Yao, who worked on the paper while a postdoctoral fellow at Columbia University.

The high toll of sepsis

Sepsis is a serious medical condition caused when an infection triggers a life-threatening chain reaction in the body. Hospital patients are often especially vulnerable due to compromised immune systems. One in three adults who die in a hospital had sepsis during their stay, according to the Centers for Disease Control and Prevention.

Some health care systems already use AI tools to help monitor a patient’s risk for developing sepsis. The algorithms for these predictive tools are often developed using a machine-learning method known as supervised learning. Large data sets of vital signs and other statistics for patients who either did, or did not, develop sepsis are fed into the model during training. The AI model can then be deployed in real-world situations, alerting health care workers to patients with an elevated risk.

The effectiveness of risk-prediction tools prompted computer scientists to take AI a step further and use it to help guide treatment protocols. Rapid treatment following a diagnosis is critical to prevent tissue damage, organ failure or death.

Unlike risk assessment, however, predicting a treatment protocol requires synchronizing an array of datasets in a dynamic environment—different types of treatments, such as intravenous fluids, antibiotics, blood-pressure medications and surgery; the dosage or intensity of a treatment; the duration of a treatment; a patient’s vital signs before and after treatment; and the survival/mortality rate.

A different learning framework

Reinforcement learning is needed to handle this dynamic environment, in which a sequence of decisions must be made over time and there is no single, predefined “yes” or “no” answer. For example, reinforcement learning is used to train AI algorithms to compete in turn-based games, such as chess: the AI agent observes the board and selects an action, or move, and then the competitor makes a move. The configuration of the board keeps changing and the process repeats in discrete rounds.

In recent years, reinforcement-learning algorithms have been applied to a range of sequential decision-making tasks in health care. The algorithms analyze historical treatment sequences to identify patterns associated with favorable outcomes. A learned decision rule maps these patterns to recommended treatments based on evolving patient conditions. At each step of discrete time, the agent observes the patient’s physiological status and selects a treatment. The situation then evolves to a new state.

As a graduate student, Tang worked on a 2020 paper that used reinforcement learning to study best practices for sepsis treatment.

After completing that work, Tang began to suspect that the data preprocessing method often used in reinforcement learning may not deliver the most accurate result in a health care setting. He and his colleagues began digging into the issue and discovered the flaw.

A startling insight

Unlike standard reinforcement learning benchmarks that work on well-defined trajectories, health care applications often involve irregularly sampled events across time. Data entry for electronic health records, for example, may or may not occur in real time.

The data for the state of the patient and the action taken for treatment are preprocessed for reinforcement learning by slicing them into windows of equal time length, indexing them into units of discrete time. These indexes are then aligned to form a state-action pair.
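
The windowing step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the four-hour window, the use of a mean as the state summary, the use of a total dose as the action, and all function names and sample values are assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_HOURS = 4  # hypothetical window length; the article does not specify one


def bin_events(events, window_hours=WINDOW_HOURS):
    """Group (time_in_hours, value) events by discrete window index."""
    bins = defaultdict(list)
    for t, value in events:
        bins[int(t // window_hours)].append(value)
    return dict(bins)


def summarize_states(vitals, window_hours=WINDOW_HOURS):
    """State for window k: mean of the vitals recorded in that window.

    The mean is an assumed summary; it is only known once the window has ended.
    """
    return {k: sum(v) / len(v) for k, v in bin_events(vitals, window_hours).items()}


def summarize_actions(doses, window_hours=WINDOW_HOURS):
    """Action for window k: total dose given during that window (assumed summary)."""
    return {k: sum(v) for k, v in bin_events(doses, window_hours).items()}


# Made-up, irregularly timed records: (hours since admission, value).
heart_rate = [(0.5, 110), (2.0, 120), (5.0, 100), (7.5, 90)]
fluid_ml = [(1.0, 250), (6.0, 500)]

states = summarize_states(heart_rate)
actions = summarize_actions(fluid_ml)

# Pairing state and action that share the same window index.
same_index_pairs = [(k, states[k], actions.get(k, 0)) for k in sorted(states)]
print(same_index_pairs)
```

In this toy data, the 250 ml of fluid given at hour 1.0 ends up paired with a heart-rate summary that includes a reading taken at hour 2.0, after the treatment was given.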

The problem occurs because the AI agent views the state of the patient as a summary of vital signs, which can only be calculated at the end of the time window. An action, however, needs to be determined at the beginning of that window.

“A patient may have been given a pill midway through the window of time,” Tang explains, “or may have started an infusion much earlier in the window, but the AI agent assumes that the decision for giving these treatments was caused by the summary of the patient’s state, which is only determined at the end of the window of time.”

Tang and colleagues began investigating other papers using reinforcement learning to train a model for sepsis treatment and found that 80% of them used the flawed method.

They also identified a simple fix to the flaw: shifting the action index backward by one time step results in the correct temporal alignment.
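
A short Python sketch of what such a shift might look like on already-windowed data follows. The function names and the string placeholders are hypothetical, and dropping the unmatched first action and last state is an assumption of this sketch, not a detail reported in the article.

```python
def pair_same_index(states, actions):
    """Misaligned pairing: state and action share the same window index t."""
    return list(zip(states, actions))


def pair_shifted(states, actions):
    """Shifted pairing: the state summarizing window t is paired with the
    action taken in window t + 1, so the state always precedes the action.

    Equivalent to moving every action index back by one step. The first
    action and the last state have no partner and are dropped (assumption).
    """
    return list(zip(states[:-1], actions[1:]))


# Placeholders: s_k summarizes window k, a_k is the treatment given in window k.
states = ["s0", "s1", "s2", "s3"]
actions = ["a0", "a1", "a2", "a3"]

misaligned = pair_same_index(states, actions)
aligned = pair_shifted(states, actions)

print(misaligned)
print(aligned)
```

With the shifted pairing, each recommended action depends only on information that was available before the action was taken.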

Getting the word out

Developers appear to have assumed that the data-management techniques used to train supervised-learning models would also apply to reinforcement-learning models.

“Many people never pause to think about how the indexes work in different situations,” Tang says. “It’s important to put in careful thought and not just work in ‘autopilot’ so that mistakes are not made in the preprocessing of data and indexing.”

As a computer scientist devoted to developing AI tools to effectively support health care workers in decision-making processes, Tang advocates for moving at a measured pace regarding the deployment of these tools.

“I’m an old-school person,” he says. “I do think that AI is moving too fast in some cases and that more scrutiny is needed.”

While the current paper focused on sepsis treatment, Tang is concerned that the flawed technique may occur in a range of reinforcement-learning models.

“People seem to keep making the same mistake again and again,” Tang says. “We want to convey the problem to more AI researchers and developers—both those focused on health care and on broader applications—to ensure that they are aware of this issue.”

Publication details

Shengpu Tang et al, Off by a beat: the effects of temporal misalignment in reinforcement learning for sepsis treatment, npj Digital Medicine (2026). DOI: 10.1038/s41746-026-02625-2

Journal information:
npj Digital Medicine

Provided by
Emory University


Who’s behind this story?

Lisa Lock

BA art history, MA material culture. Former museum editor, paramedic, and transplant coordinator. Editing for Science X since 2021.

Robert Egan

Bachelor’s in mathematical biology, Master’s in creative writing. Well-traveled with unique perspectives on science and language.

Citation:
Time-slip in AI sepsis models may inflate results, risking under- or overtreatment (2026, June 5)
retrieved 5 June 2026
from https://medicalxpress.com/news/2026-06-ai-sepsis-inflate-results-overtreatment.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
