Stochastic enforced hill-climbing

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Scopus citations

Abstract

Enforced hill-climbing is an effective deterministic hillclimbing technique that deals with local optima using breadth-first search (a process called "basin flooding"). We propose and evaluate a stochastic generalization of enforced hill-climbing for online use in goal-oriented probabilistic planning problems. We assume a provided heuristic function estimating expected cost to the goal with flaws such as local optima and plateaus that thwart straightforward greedy action choice. While breadth-first search is effective in exploring basins around local optima in deterministic problems, for stochastic problems we dynamically build and solve a local Markov-decision process model of the basin in order to find a good escape policy exiting the local optimum. We evaluate our proposal in a wide range of recent probabilistic planning-competition benchmark domains. For evaluation, we show that stochastic enforced hill-climbing produces better policies than greedy action choice for value functions derived in two very different ways. First, we propose a novel heuristic function derived from the ideas in the effective re-planner FF-Replan. This new "controlled-randomness FF heuristic" is the deterministic FF heuristic computed on the simple determinization of the probabilistic problem that makes available a deterministic transition wherever a probabilistic transition was possible. Our results show that stochastic enforced hill-climbing with this heuristic significantly outperforms simply greedily following the heuristic, and also substantially outperforms FF-Replan. We additionally evaluate our technique on automatically learned value functions that on their own perform at the state-of-the-art when used to construct a greedy policy, and again show significant improvement over greedy action selection.

Original languageEnglish
Title of host publicationICAPS 2008 - Proceedings of the 18th International Conference on Automated Planning and Scheduling
Pages396-403
Number of pages8
StatePublished - 2008
Externally publishedYes
Event18th International Conference on Automated Planning and Scheduling, ICAPS 2008 - Sydney, NSW, Australia
Duration: Sep 14 2008Sep 18 2008

Publication series

NameICAPS 2008 - Proceedings of the 18th International Conference on Automated Planning and Scheduling

Conference

Conference18th International Conference on Automated Planning and Scheduling, ICAPS 2008
Country/TerritoryAustralia
CitySydney, NSW
Period09/14/0809/18/08

Fingerprint

Dive into the research topics of 'Stochastic enforced hill-climbing'. Together they form a unique fingerprint.

Cite this