Marginal Utility for Planning in Continuous or Large Discrete Action Spaces. (arXiv:2006.06054v1 [cs.AI])

Sample-based planning is a powerful family of algorithms for generating
intelligent behavior from a model of the environment. Generating good candidate
actions is critical to the success of sample-based planners, particularly in
continuous or large action spaces. Typically, candidate action generation
exhausts the action space, uses domain knowledge, or, more recently, involves
learning a stochastic policy to provide such search guidance. In this paper we
explore explicitly learning a candidate action generator by optimizing a novel
objective, marginal utility. The marginal utility of an action generator
measures the increase in value of an action over previously generated actions.
We validate our approach in both curling, a challenging stochastic domain with
continuous state and action spaces, and a location game with a discrete but
large action space. We show that a generator trained with the marginal utility
objective outperforms hand-coded schemes built on substantial domain knowledge,
trained stochastic policies, and other natural objectives for generating
actions for sample-based planners.
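
As a rough illustration of the marginal utility idea described above, the sketch below scores each candidate action by how much it raises the best value found so far. This is one possible reading of the abstract, not the paper's training objective: the function name `marginal_utilities`, the `value` callable, and the `baseline` parameter (the value assumed when no candidate has been generated yet) are all hypothetical.

```python
from typing import Callable, List, Sequence


def marginal_utilities(
    actions: Sequence[float],
    value: Callable[[float], float],
    baseline: float = 0.0,  # assumption: value of having no candidates at all
) -> List[float]:
    """Return, for each action in order, its gain over the best earlier action."""
    gains: List[float] = []
    best = baseline
    for action in actions:
        v = value(action)
        # An action only contributes if it beats everything generated before it.
        gains.append(max(0.0, v - best))
        best = max(best, v)
    return gains


# Toy usage with a hypothetical value function (the identity).
print(marginal_utilities([1.0, 3.0, 2.0, 5.0], lambda a: a))
```

Under this reading, a candidate that duplicates or underperforms an earlier one earns zero, so a generator scored this way is pushed toward candidates that add something new to the set.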
