
Improving Treatment Decisions for Sepsis Patients by Reinforcement Learning

Lyu, Ruishen (2020) Improving Treatment Decisions for Sepsis Patients by Reinforcement Learning. Master's Thesis, University of Pittsburgh. (Unpublished)

This is the latest version of this item.



Sepsis is defined as a dysregulated immune response to infection leading to acute life-threatening organ dysfunction. Patients with sepsis have 25.8% intensive care unit (ICU) mortality, which is significantly higher than that of the general ICU population. Making optimal medication decisions is therefore an urgent and important task. The purpose of this study is to develop a data-driven decision-making tool that can dynamically suggest optimal treatments for each individual ICU patient with sepsis and help clinicians make better treatment decisions to improve patients’ long-term survival outcomes.
Model-free Q-learning was applied to data extracted from the eICU Research Institute (eRI) database. We selected 3,800 patients admitted to ICUs with septic shock and summarized their first 7 days of lab results and vitals into 18,014 daily records. To identify the best treatment decisions for vasopressor use, we first clustered patients’ demographics and daily medical conditions into 100 distinct states. We then mapped ICU survival to time-dependent rewards and estimated the Q-value of each action taken in each state using temporal-difference learning. Finally, we obtained the optimal policy that maximizes the action-value function by policy iteration. An off-policy evaluation method was implemented to evaluate the performance of several treatment policies.
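The temporal-difference step described above can be illustrated with a minimal tabular Q-learning sketch in Python. This is not the thesis code: it assumes the states have already been discretized into integer indices, the function names, learning rate, discount factor, and toy transitions are all hypothetical, and it reads off a greedy policy directly from the Q-table rather than running the policy iteration the abstract mentions.

```python
import numpy as np

def fit_q_table(transitions, n_states, n_actions,
                alpha=0.1, gamma=0.99, n_epochs=50):
    """Batch tabular Q-learning over logged transitions.

    transitions: iterable of (state, action, reward, next_state, done).
    alpha, gamma and n_epochs are illustrative defaults, not thesis values.
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(n_epochs):
        for s, a, r, s_next, done in transitions:
            # TD target: reward plus the discounted best value of the next
            # state; terminal transitions contribute the reward only.
            target = r if done else r + gamma * q[s_next].max()
            q[s, a] += alpha * (target - q[s, a])
    return q

def greedy_policy(q):
    """Pick the action with the highest estimated Q-value in each state."""
    return q.argmax(axis=1)

# Hypothetical toy log: 3 states, 2 actions (0 = no vasopressor, 1 = vasopressor).
toy_transitions = [
    (0, 1, 1.0, 2, True),    # action 1 in state 0 ends with a positive reward
    (0, 0, -1.0, 2, True),   # action 0 in state 0 ends with a negative reward
    (1, 0, 0.0, 0, False),   # action 0 in state 1 moves the patient to state 0
]
q_table = fit_q_table(toy_transitions, n_states=3, n_actions=2)
policy = greedy_policy(q_table)
```

On this toy log the learned policy prefers action 1 in state 0 and action 0 in state 1, because those are the choices whose logged outcomes carry the higher estimated return.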
The results showed that the Q-learning policy had a significantly higher long-term average reward than either the clinician policy or the random policy, meaning that patients who received treatments matching those suggested by the Q-learning policy had a better survival outlook than those who did not.
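Comparing policies on logged data requires an off-policy estimator. The abstract does not say which one was used, so the sketch below assumes weighted importance sampling purely for illustration; the function name, argument layout, and toy episodes are hypothetical.

```python
import numpy as np

def wis_value(trajectories, target_probs, behavior_probs, gamma=0.99):
    """Weighted importance sampling estimate of a target policy's value.

    trajectories: list of episodes, each a list of (state, action, reward).
    target_probs, behavior_probs: arrays of shape [n_states, n_actions]
    holding each policy's action probabilities (assumed known or estimated).
    """
    weights, returns = [], []
    for episode in trajectories:
        rho, g = 1.0, 0.0
        for t, (s, a, r) in enumerate(episode):
            # Cumulative likelihood ratio of the target vs. behavior policy.
            rho *= target_probs[s, a] / behavior_probs[s, a]
            # Discounted return actually observed in the log.
            g += (gamma ** t) * r
        weights.append(rho)
        returns.append(g)
    weights = np.asarray(weights)
    returns = np.asarray(returns)
    total = weights.sum()
    # If no logged episode is consistent with the target policy, the
    # estimate is undefined.
    return float((weights * returns).sum() / total) if total > 0 else float("nan")

# Hypothetical toy log: one state, two actions, behavior policy is uniform.
behavior = np.array([[0.5, 0.5]])
always_treat = np.array([[0.0, 1.0]])
episodes = [[(0, 1, 1.0)], [(0, 0, -1.0)]]

value_treat = wis_value(episodes, always_treat, behavior)
value_behavior = wis_value(episodes, behavior, behavior)
```

Episodes whose actions the target policy would never take receive zero weight, so the estimate for a deterministic policy rests only on the logged episodes that agree with it.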
In conclusion, we showed that the prospect of long-term survival may be improved by using modern reinforcement learning methods that optimize rewards against the dynamics of the environment.
Public Health Significance: We developed a data-driven, automated reinforcement learning tool and applied it to an electronic health database of sepsis patients. The results showed that medical decisions matching our Q-learning policy led to a better survival outlook; this suggests that machine learning can help clinicians choose the most effective treatments and reduce the burden on medical and economic resources.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
  • Lyu, Ruishen (Email: rul40@pitt.edu; Pitt Username: rul40)
ETD Committee:
  • Thesis Advisor: Tang,
  • Thesis Advisor: Chang, Chung-Chou
  • Committee Member: Mayr,
Centers: Other Centers, Institutes, Offices, or Units > Center for Bioengineering
Date: 31 March 2020
Date Type: Submission
Defense Date: 17 April 2020
Approval Date: 30 July 2020
Submission Date: 12 May 2020
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 46
Institution: University of Pittsburgh
Schools and Programs: School of Public Health > Biostatistics
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Dynamic treatment regime, Electronic Health Records, Machine Learning
Date Deposited: 30 Jul 2020 19:08
Last Modified: 30 Jul 2020 19:08

Available Versions of this Item

  • Improving Treatment Decisions for Sepsis Patients by Reinforcement Learning. (deposited 30 Jul 2020 19:08) [Currently Displayed]

