Continuous-time Markov decision processes with exponential utility

Research output: Contribution to journal (Article, peer-reviewed)

12 Citations (Scopus)

Abstract

In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process with the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We essentially do not need to impose a condition on the growth of the transition and cost rates in the state, and the controlled process could be explosive.
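
The reduction and value iteration mentioned in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it assumes finite state and action sets and a risk-sensitivity coefficient of 1, and it uses a one-step operator obtained from the elementary fact that an exponential sojourn time with rate q and cost rate c < q contributes the factor q / (q - c) to the expected exponential cost. The function name `risk_sensitive_value_iteration`, the array layout of `q` and `c`, and the treatment of cost-free absorbing states (value 1) are illustrative assumptions.

```python
import math

def risk_sensitive_value_iteration(q, c, n_iter=1000, tol=1e-12):
    """Hypothetical finite-state sketch.

    q[i][a][j]: transition rate from state i to state j != i under action a.
    c[i][a]:    nonnegative cost rate in state i under action a.
    Returns V with V[i] approximating the minimal expected value of
    exp(total cost) started from state i.
    """
    n = len(q)
    V = [1.0] * n  # exp(0) = 1: start from "no cost incurred"
    for _ in range(n_iter):
        V_new = []
        for i in range(n):
            best = math.inf
            for a in range(len(q[i])):
                rate = sum(q[i][a][j] for j in range(n) if j != i)
                cost = c[i][a]
                if rate == 0.0 and cost == 0.0:
                    val = 1.0       # assumption: cost-free absorbing state
                elif rate <= cost:
                    val = math.inf  # E[exp(cost * sojourn time)] diverges
                else:
                    # Skip zero rates so that 0 * inf never produces NaN.
                    val = sum(q[i][a][j] * V[j]
                              for j in range(n)
                              if j != i and q[i][a][j] > 0.0) / (rate - cost)
                best = min(best, val)
            V_new.append(best)
        converged = all(x == y or abs(x - y) <= tol
                        for x, y in zip(V, V_new))
        V = V_new
        if converged:
            break
    return V

# Toy example: state 0 has two actions, state 1 is absorbing and cost-free.
q_example = [
    [[0.0, 2.0], [0.0, 4.0]],  # state 0: action 0 jumps at rate 2, action 1 at rate 4
    [[0.0, 0.0]],              # state 1: absorbing
]
c_example = [
    [1.0, 3.0],                # state 0: cost rates of the two actions
    [0.0],                     # state 1: no cost
]
V_example = risk_sensitive_value_iteration(q_example, c_example)
```

In this toy example, action 0 in state 0 gives 2 / (2 - 1) = 2 and action 1 gives 4 / (4 - 3) = 4, so the sketch returns the value 2 for state 0; the corresponding certainty equivalent is its natural logarithm.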

Original language: English
Pages (from-to): 2636-2660
Number of pages: 25
Journal: SIAM Journal on Control and Optimization
Volume: 55
Issue number: 4
Publication status: Published - 2017

Bibliographical note

Funding Information:
Received by the editors July 25, 2016; accepted for publication (in revised form) April 3, 2017; published electronically August 30, 2017. http://www.siam.org/journals/sicon/55-4/M108626.html
Funding: This work was supported by a financial grant from the Research Fund for Coal and Steel of the European Commission, within the INDUSE-2-SAFETY project (grant RFSR-CT-2014-00025).
Author affiliation: Department of Mathematical Sciences, University of Liverpool, Liverpool, L69 7ZL, UK.

Publisher Copyright:
© 2017 Society for Industrial and Applied Mathematics.

Keywords

  • Continuous-time Markov decision processes
  • Exponential utility
  • Optimality equation
  • Risk-sensitive criterion
  • Total undiscounted criteria

ASJC Scopus subject areas

  • Control and Optimization
  • Applied Mathematics
