02-11
Google and the Vapnik-Chervonenkis Dimension

Google engineers routinely train query classifiers, for ranking advertisements or search results, on more words than any human being sees or hears in a lifetime. A human being who sees a meaningfully new image every second for one hundred years will not see as many images as Google has in its libraries, all of which are available for training object detectors and image classifiers. Yet by human standards the state of the art in computer understanding of language and computer-generated image analysis is primitive. What explains the gap? Why can’t learning theory tell us how to make machines that learn as efficiently as humans? Upper bounds on the number of training samples needed to learn a classifier as rich and competent as the human visual system can be derived using the Vapnik-Chervonenkis dimension, or the metric entropy, but these suggest that not only does Google need more examples, but all of evolution might fall short. I will make some proposals for efficient learning and offer some mathematics to support them.

Date and Time

Wednesday February 11, 2009 4:15pm - 5:45pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

Colloquium

Speaker

Stuart Geman, from Brown University

Host

Fei-Fei Li

Website

http://www.dam.brown.edu/people/geman/index.shtml

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

