By Julia Schwarz
Engineers at Princeton have built a robot to help with that most essential household task: tidying up.
Robots excel at simple tasks like moving a pile of objects from the floor into a bin. But real-world household cleanup requires more complicated skills. Robots must distinguish among different objects, place them at designated locations, and avoid breaking them.
“You might throw a t-shirt into the laundry pile, but there are some things you really shouldn’t throw, like dishes into a sink,” said Szymon Rusinkiewicz, the David M. Siegel '83 Professor of Computer Science and one of three senior authors on a research paper about the new robot, called TidyBot.
Tidying up requires a robot with three distinct skills: physical dexterity, visual recognition, and some amount of common sense. The researchers achieved this with TidyBot by combining a mobile robotic arm with a vision model and a large language model.
TidyBot’s arm is attached to a square base that scoots around on four wheels, and its hand is a pincer that can pick up objects and open drawers. This gives it the physical ability to carefully place dishes in the sink or throw t-shirts on the laundry pile. A camera paired with a vision model lets it distinguish between types of objects, like a pair of pants and a banana peel. Finally, a large language model lets TidyBot learn to gently place a wine glass on a shelf while casually tossing a stuffed animal into the toy bin.
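As a rough illustration of how those pieces might fit together, the short Python sketch below pairs a stand-in for the vision model (a simple lookup from object to category) with a table of rules of the kind a language model might produce. Every name, category, and rule in it is a hypothetical assumption made for illustration; it is not TidyBot's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Where an object category goes and how gently to handle it."""
    receptacle: str
    action: str  # "place" for fragile items, "toss" for soft ones


# Hypothetical rules, standing in for what a language model might
# summarize from a person's preferences (assumed, not from the paper).
RULES = {
    "clothing": Rule("laundry pile", "toss"),
    "toy": Rule("toy bin", "toss"),
    "dish": Rule("sink", "place"),
    "glassware": Rule("shelf", "place"),
}

# Stand-in for the vision model: maps a seen object to a category.
OBJECT_CATEGORIES = {
    "t-shirt": "clothing",
    "stuffed animal": "toy",
    "plate": "dish",
    "wine glass": "glassware",
}


def plan(object_name: str) -> str:
    """Return a one-line plan for a single object."""
    category = OBJECT_CATEGORIES.get(object_name)
    rule = RULES.get(category)
    if rule is None:
        # Unknown object or no matching rule: do nothing rather than guess.
        return f"leave {object_name} (no rule)"
    return f"{rule.action} {object_name} -> {rule.receptacle}"


if __name__ == "__main__":
    for item in ["wine glass", "stuffed animal", "banana peel"]:
        print(plan(item))
```

In this sketch a wine glass is placed on the shelf, a stuffed animal is tossed into the toy bin, and an object with no matching rule is left alone.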
In testing, TidyBot correctly determined the category of objects and where to put them (for example, toys in the box and clothes on the floor) for 85% of objects. “We were quite surprised by the results,” said Jimmy Wu, the paper’s first author, “because the language and vision models we used are very general and not specifically trained for this type of task.”
TidyBot’s success is especially noteworthy, Rusinkiewicz added, because it can infer and generalize from so few instructions. “This is in contrast to general machine learning,” he said, “where you need lots of examples for everything.” Creating an exhaustive list of instructions for household cleanup would require significant effort and constant maintenance, limiting the robot’s real-world usefulness.
While very good at distinguishing between categories of things, like toys and clothes, TidyBot is not yet as good at sorting objects based on attributes, like metal versus plastic. It also has trouble with subcategories, like shirts versus other clothes. Improving these finer distinctions is one possible area of future research, Wu said.
Other areas of future research could involve building specialized robots for tasks that require particularly complex skills, like washing dishes, or using multiple robots to work together. “You can imagine a task where one robot picks up an object, another one opens the drawer, another one makes space in the drawer, and then another one drops the thing in,” Rusinkiewicz said. “The type of task that really requires cooperation.”
The paper, “TidyBot: Personalized Robot Assistance with Large Language Models,” will be presented on October 2, 2023, at the International Conference on Intelligent Robots and Systems, held in Detroit, Michigan. In addition to Rusinkiewicz, the two other senior authors on the paper are Thomas Funkhouser, emeritus professor of computer science at Princeton and research scientist at Google, and Jeannette Bohg, assistant professor for robotics at Stanford University. Other co-authors include Rika Antonova and Marion Lepert at Stanford University, Andy Zeng at Google, Shuran Song at Columbia University, and Adam Kan at the Nueva School in California. The work was supported by the Princeton School of Engineering and Applied Science, Toyota Research Institute, and the National Science Foundation.