CMU-CS-98-152 Computer Science Department School of Computer Science, Carnegie Mellon University
Learning Evaluation Functions for Global Optimization Justin Andrew Boyan August 1998 Ph.D. Thesis
First, it presents three new VFA algorithms, which apply to three different
restricted classes of sequential decision problems: Grow-Support for
deterministric problems, ROUT for acyclic stochastic problems, and
Least-Square TD
Second, it introduces STAGE, a new search algorithm for general combinatorial
optimization tasks. STAGE learns a problem-specific heuristic evaluation
function as it searches. The heuristic is trained by supervised linear
regression or Least-Squares TF (lambda) to predict, from features of states
along the search trajectory, how well a fast local search method such as
hillclimbing will perform starting from each state. Search proceeds by
alternating between two stages: performing the fast search to gather new
training data, and following the learned heuristic to identify a promising
new state state.
STAGE has produced good results (in some cases, the best results known)
on a variety of combinatorial optimization domains, including VLSI channel
routing, Bayes net structure-finding, bin-packing, Boolean satisfiability,
radiotherapy treatment planning, and geographic cartogram design. This
thesis describes the results in detail, analyzes the reasons for and
conditions of STAGE's success, and places STAGE in the context of four
decades of research in local search and evaluation function learning. It
provides strong evidence that reinforcement learning methods can be
efficient and effective on large-scale decision problems.
218 pages
| |
Return to:
SCS Technical Report Collection This page maintained by [email protected] |