Teaching Machines to Learn by Themselves

February 1, 2018

By Doug Hulette

Picture yourself trying to build a machine to detect email spam. You might start with simple rules that flag key words such as “drugs.” Of course, some legitimate emails could contain those key words, so you add rules to account for those cases. But in an adaptive world, advertisers may quickly learn the rules of your system and develop ways to fool your spam detector.

Painstakingly identifying patterns and continuously updating your system by hand is a daunting challenge — even more so for tasks such as identifying one human face out of more than 7 billion, or driving a car through chaotic Manhattan traffic.

But what if you could set up an environment and ground rules for machines to teach themselves and constantly improve their performance as more data rolls in? Now you’ve entered the world of Elad Hazan.

Professor Elad Hazan — *Photo by David Kelly Crow*

Hazan, who joined the Computer Science Department in 2015 and became a full professor in 2016, focuses on enabling machines to learn—to teach themselves, so to speak—a core aspect of artificial intelligence. Doing so is a tall order in a world where the rules of cause and effect can be overwhelmingly intricate. Success requires developing algorithms and implementing them efficiently, rigorously proving mathematical performance guarantees, and incorporating techniques from related fields such as game theory, statistics and computational complexity.

“In contrast to neuroscientists, who attempt to understand the brain, we try to build learning machines without restricting ourselves to human biological considerations,” Hazan explains. “Perhaps a good analogy would be aviation scientists, who try to build aircraft that are potentially very different from birds.”

One product of his work is the optimization algorithm AdaGrad, which is used in training the deep neural networks behind image classification and text translation. (The deeper a network is, the more difficult the problems it can tackle, but the harder it is to train, or optimize.) Think facial and fingerprint recognition systems, language translation services like Google Translate, recommendation systems that find relevant music or films based on prior preferences, and programs that categorize news articles into groups.

Hazan explains: “In machine learning we take data, such as images taken by a cellphone camera, and build an automatic program that can identify future images of the same objects versus others. This ‘machine’ can be thought of as an electronic circuit, and one needs to set the connections and/or weights to achieve the desired functionality, whether it be to identify images, translate between languages, or other applications.

“An optimization algorithm can be thought of as a computer program that takes input data, and, based upon the data, sets the weights/connections of the circuit to achieve the desired functionality. It is a crucial component of machine learning.

“AdaGrad, which we invented about a decade ago, is the basis for a common optimizer of the Google TensorFlow platform. Every day, millions of machines use the algorithm to perform computations ranging from training translation engines to classifying images.”
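To make the idea of an optimizer that "sets the weights" concrete, here is a minimal sketch of the commonly published diagonal form of AdaGrad in plain Python. It is an illustration only, not code from Hazan's group or from TensorFlow; the function names, the toy objective, and the learning-rate and step-count values are all assumptions chosen for the example. The key idea is that each weight gets its own step size, which shrinks as that weight's squared gradients accumulate.

```python
import math


def adagrad(grad_fn, w0, lr=0.5, eps=1e-8, steps=500):
    """Minimize a function with diagonal AdaGrad (illustrative sketch).

    grad_fn: hypothetical callable returning the gradient at weights w.
    w0:      initial weights (list of floats).
    """
    w = list(w0)
    g2_sum = [0.0] * len(w)  # running sum of squared gradients, per weight
    for _ in range(steps):
        g = grad_fn(w)
        for i in range(len(w)):
            g2_sum[i] += g[i] * g[i]
            # Per-weight step: weights with large past gradients move less.
            w[i] -= lr * g[i] / (math.sqrt(g2_sum[i]) + eps)
    return w


def toy_gradient(w):
    # Gradient of the assumed toy objective
    # f(w) = (w[0] - 3)^2 + 10 * (w[1] + 1)^2, minimized at (3, -1).
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


weights = adagrad(toy_gradient, [0.0, 0.0])
print(weights)
```

In this toy problem the second weight's gradients are ten times steeper than the first's, yet the same learning rate works for both because each coordinate is rescaled by its own gradient history. Real neural-network training applies the same update to millions of weights at once.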

Hazan earned his Ph.D. from Princeton in 2006. Before joining the CS faculty, he was an associate professor of operations research at Technion—Israel Institute of Technology. In mid-2017, he founded a company called In8 based on research he did at Princeton and in partnership with Jacob Abernethy, a computer science professor at the University of Michigan. Hazan and members of the In8 team recently struck a deal to join Google.

“In8 centers on two focus areas,” he says. “The first is a method for time-series prediction, such as predicting the location of people in a street scene. This has applications in automotive research, among other places. The second involves efficient optimization methods that aid in faster training of deep neural networks.”

The In8 team, summer 2017 (left to right): Professor Jacob Abernethy of Georgia Tech; 2017 Princeton alumna Xinyi Chen; University of Michigan graduate student Chengyu Dai; Princeton graduate students Yi Zhang, Brian Bullins, Karan Singh, and Cyril Zhang; Professor Elad Hazan; and Princeton senior Alexandra Vogelsang.

In an email exchange, Hazan discussed his research and his passion for machine learning:

What will the focus of the In8 team be within Google?

We’ll continue to work on our focus areas. The first is optimization of the non-convex models required for deep learning, for which optimization is far more challenging because we don't have provably efficient methods. The second is predictive control for robotics systems: for instance, using signals such as velocity, momentum, acceleration and position to tell a vehicle’s steering wheel, gas pedal and brakes what to do. We’re excited about the opportunity to continue collaborating with the CS department and to leverage the additional resources that both Princeton and Google have to offer.

Will you continue to be involved with Princeton?

Yes. During the coming year I’ll take a sabbatical to work on our focus areas. During this year I’ll continue my academic duties, including advising graduate and undergraduate students. I’ll resume teaching my favorite course, “Theoretical machine learning,” in the spring of 2019.

Was this your first company?

This was my first entrepreneurial experience. It was very humbling. I have tremendous respect for entrepreneurs who succeed in building a company that impacts the world.

Do you consider yourself an entrepreneur, a scientist, or both?

Definitely a scientist first. The venture into the business world came out of a desire to make a broader impact and apply our theoretical inventions. I believe that together with Google we will have the best opportunity to do exactly that.

Do you plan to launch another company?

I don't have such plans for the near future. I believe that I can contribute the most in “theory that matters” — developing mathematical ideas and algorithms that can be used in applications and advance our society.

What do you see as the future of machine learning, from society's perspective?

It is a very exciting time. In the short term, we are going to see self-driving cars and delivery drones that will change society at its core. It is risky to predict beyond five years, but at the rate at which academia, industry and government are investing in AI, we will go beyond machine learning to more intelligent systems that can fully complement and enhance human performance in all fields. It is only a question of when.

Elad Hazan's book, Introduction to Online Convex Optimization, covers AdaGrad.
Information about Elad's research groups: the Theoretical Machine Learning group and the Optimization and Machine Learning group.