Statistical estimation problems arise naturally in many
areas of machine intelligence. In this talk, we consider
problems of this kind from two areas of machine learning:
supervised learning, where the goal is to learn to make
decisions in an i.i.d. setting, and reinforcement learning,
where the goal is to learn to make a sequence of related
decisions.
In prediction problems, such as pattern classification and
regression, estimates of prediction error are important
for the analysis and design of learning algorithms.
We first review classical error estimates, and then describe
more recent 'large margin' estimates, which give a
better explanation of the success of some of the most
popular pattern classification techniques. All of these
estimates measure the complexity of a function class
without exploiting any information about the process
that generated the data. We describe recent work on
data-dependent error estimates, which can be much more
accurate because they use the training data to capture
important properties of this process.
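
As a rough illustration of what a data-dependent complexity estimate can look like, the sketch below computes a Monte Carlo estimate of the empirical Rademacher complexity of a small function class on a fixed sample. The abstract does not say which data-dependent estimates the talk covers, so the choice of this measure is an assumption, as are the threshold classifiers, the sample, and all function names. The quantity measures how well the class can correlate with random sign labels on the observed points, so it depends on the training data and not only on the class.

```python
import random

def empirical_rademacher(sample, functions, n_draws=2000, seed=0):
    """Monte Carlo estimate of E_sigma[ max_f (1/n) * sum_i sigma_i * f(x_i) ]."""
    rng = random.Random(seed)
    n = len(sample)
    # Evaluate every function on the sample once, up front.
    outputs = [[f(x) for x in sample] for f in functions]
    total = 0.0
    for _ in range(n_draws):
        # Draw independent random signs, one per sample point.
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(sum(s * o for s, o in zip(sigma, out)) / n for out in outputs)
    return total / n_draws

def threshold(t):
    # Hypothetical classifier for illustration: +1 at or above t, -1 below.
    return lambda x: 1 if x >= t else -1

data_rng = random.Random(1)
sample = [data_rng.random() for _ in range(50)]

small_class = [threshold(0.5)]
large_class = [threshold(k / 20) for k in range(21)]

r_small = empirical_rademacher(sample, small_class)
r_large = empirical_rademacher(sample, large_class)
print(f"single threshold: {r_small:.3f}, 21 thresholds: {r_large:.3f}")
```

A single fixed function cannot fit random signs, so its estimate is close to zero, while the richer class of thresholds gives a larger value on the same sample.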
In reinforcement learning problems, an agent chooses
actions to take in some environment, aiming to
maximize a reward function. Many control, scheduling,
optimization and game-playing tasks can be formulated in
this way. Policy gradient methods consider agents that
are restricted to some set of policies, and aim to move
through policy space so as to improve the performance of
the agent. The central problem for such methods is that
of estimating performance gradients. We present algorithms
for this problem, and show how their performance depends on
properties of the controlled system, such as mixing times.
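
To make the idea of estimating a performance gradient concrete, the sketch below applies the standard score-function (likelihood-ratio) estimator to a one-step problem with two actions. It is not the algorithm presented in the talk: the policy parameterization, the rewards, and the function names are assumptions made for illustration, and because each decision is independent, the mixing-time effects mentioned above do not arise here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def estimate_gradient(theta, rewards=(0.0, 1.0), n_samples=20000, seed=0):
    """Score-function estimate of d/dtheta E[reward] for a two-action policy.

    The policy picks action 1 with probability sigmoid(theta), else action 0.
    """
    rng = random.Random(seed)
    p = sigmoid(theta)
    total = 0.0
    for _ in range(n_samples):
        action = 1 if rng.random() < p else 0
        # For this policy, d/dtheta log pi(action) = action - p.
        score = action - p
        total += rewards[action] * score
    return total / n_samples

def exact_gradient(theta, rewards=(0.0, 1.0)):
    # Closed form: E[reward] = p * r1 + (1 - p) * r0, with dp/dtheta = p * (1 - p).
    p = sigmoid(theta)
    return p * (1.0 - p) * (rewards[1] - rewards[0])

theta = 0.0
estimate = estimate_gradient(theta)
exact = exact_gradient(theta)
print(f"estimated gradient: {estimate:.4f}, exact gradient: {exact:.4f}")
```

The estimate uses only sampled actions and rewards, never the reward function's derivative, which is what allows this kind of estimator to be applied when the environment is unknown.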
Date and Time
Wednesday, April 10, 2002, 4:00pm - 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Event Type
Speaker
Peter Bartlett, from BIOwulf Technologies and Australian National University
Host
Bernard Chazelle