arxiv:2201.12569

Bellman Meets Hawkes: Model-Based Reinforcement Learning via Temporal Point Processes

Published on Jan 29, 2022

Abstract

A model-based reinforcement learning framework is developed for sequential decision-making problems with asynchronous stochastic discrete events in continuous time, using a Hawkes process to model the environment dynamics and embedding it in the Bellman equation.
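To make "embedding the process in the Bellman equation" concrete, one generic way to write such a recursion for event-driven decisions is sketched below. The symbols (event history h_t, next event time tau, discount rate gamma) are illustrative assumptions, not necessarily the paper's exact formulation.

```latex
% A generic event-driven Bellman recursion for a controlled temporal point
% process (an illustrative form; the paper's exact derivation may differ).
% h_t is the event history, a the intervention, \tau the next event time
% drawn from the controlled intensity \lambda(\cdot \mid h_t, a), and
% \gamma a discount rate.
Q^{\pi}(h_t, a) = \mathbb{E}_{\tau \sim \lambda(\cdot \mid h_t, a)}
  \left[ e^{-\gamma(\tau - t)} \left( r(h_\tau)
  + Q^{\pi}\!\left(h_\tau, \pi(h_\tau)\right) \right) \right]
```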

AI-generated summary

We consider a sequential decision-making problem in which the agent faces an environment characterized by stochastic discrete events and seeks an optimal intervention policy that maximizes its long-term reward. Such problems arise ubiquitously in social media, finance, and health informatics, yet they are rarely investigated in conventional reinforcement learning research. To this end, we present a novel model-based reinforcement learning framework in which the agent's actions and observations are asynchronous stochastic discrete events occurring in continuous time. We model the environment dynamics with a Hawkes process augmented by an external intervention control term and develop an algorithm that embeds this process in the Bellman equation, which guides the direction of the value gradient. We demonstrate the superiority of our method on both a synthetic simulator and a real-world problem.
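The summary's core object, a Hawkes intensity with an external intervention control term, can be sketched in a few lines. The following minimal Python example assumes an exponential excitation kernel and a constant additive control u; the parameter names (mu, alpha, beta) and the additive control form are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.0, u=0.0):
    # Right-continuous conditional intensity: baseline mu, exponential-kernel
    # self-excitation, plus an additive intervention term u (an assumption).
    past = np.asarray(event_times, dtype=float)
    past = past[past <= t]
    excitation = alpha * np.exp(-beta * (t - past)).sum()
    return max(mu + excitation + u, 1e-10)  # floor keeps the rate positive

def simulate_hawkes(T, u=0.0, mu=0.5, alpha=0.8, beta=1.0, seed=0):
    # Ogata's thinning algorithm. With an exponential kernel and constant u,
    # the intensity decays between events, so its value just after the last
    # event upper-bounds it until the next candidate arrives.
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while True:
        lam_bar = intensity(t, events, mu, alpha, beta, u)   # local upper bound
        t += rng.exponential(1.0 / lam_bar)                  # candidate time
        if t >= T:
            return np.array(events)
        if rng.uniform(0.0, lam_bar) <= intensity(t, events, mu, alpha, beta, u):
            events.append(t)  # accept with probability lambda(t) / lam_bar

# Example: a positive control u raises the event rate over the horizon [0, 10].
print(len(simulate_hawkes(T=10.0, u=0.0)), len(simulate_hawkes(T=10.0, u=1.0)))
```

In a model-based loop of the kind the summary describes, the agent would fit (mu, alpha, beta) to logged event streams and then search over u (or a policy producing it) to maximize discounted reward on simulated rollouts such as these.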
