July 27, 2016
Ever taken a selfie? Around the world, people snap tens of millions of these self-portraits every day, usually with a mobile device held at arm’s length. For all their raging popularity, though, selfies are often unflattering and can misrepresent their subjects. Because the camera is so close, selfies render subjects’ noses larger, ears smaller and foreheads more sloping.
To tackle this issue, as well as explore the basic science of digital photo manipulation, Princeton researchers have unveiled a new method for transforming individual selfies. The method can modify a person’s face to look as though it were photographed from farther away, such as the distances professional photographers typically prefer. The editing tool can also alter someone’s apparent pose, as if the camera were placed higher, lower or at an angle. When superimposed, images adjusted in this manner can further be used to generate 3-D head shots. Down the road, the researchers said, it may even be possible to make “live” photos that seem to move uncannily, like the portraits hanging in the Hogwarts School from the Harry Potter franchise.
“Although it is the age of the selfie, many people are unaware of how much these self-portraits do not really look like the person being photographed because the camera is way too close,” said Ohad Fried, lead developer of the new method and a Ph.D. candidate in the Department of Computer Science at Princeton University. “Now that people can edit so many aspects of a photo right on their phones, we wanted to provide a quick way to edit faces that maintains realism.”
The project is the first of its kind to correct self-portrait distortions caused by camera distance, the researchers said.
“As humans, we have evolved to be very sensitive to subtle cues in other people’s faces, so any artifacts or glitches in synthesized imagery tend to really jump out,” said Adam Finkelstein, senior author of the paper and a professor of computer science. “With this new method, we therefore had to make sure the photo modifications looked extremely realistic, and we were frankly surprised at the fidelity of the results we were able to obtain starting from just a single photo.”
Fried, who is supported by a Google Graduate Fellowship, is presenting a paper describing the latest progress in the photo editing software technique July 28 at the SIGGRAPH 2016 conference in Anaheim, Calif., held by the Association for Computing Machinery (ACM). The paper will also appear in the journal ACM Transactions on Graphics.
In developing the method, Fried and colleagues began with a model for generating digital, 3-D human heads. The model came from FaceWarehouse, a database of 150 people photographed in 20 different poses, compiled by Zhejiang University researchers. The next ingredient, a program made available by researchers at Carnegie Mellon University, takes a selfie and identifies nearly six dozen reference points across the face, such as the corners of the eyes, the top of the head and the chin.
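For illustration, the short sketch below shows what such a landmark-detection step can look like. The article does not describe the Carnegie Mellon tool’s interface, so dlib’s freely available 68-point predictor, which finds a comparable set of reference points, stands in here; the helper name detect_landmarks and the model file path are assumptions for this example.

```python
# Illustrative sketch only: dlib's 68-point predictor stands in for the
# landmark detector described above; detect_landmarks is a hypothetical helper.
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Pretrained model file, downloaded separately from dlib.net
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(image):
    """Return an (N, 2) array of (x, y) facial reference points for the first face found."""
    faces = detector(image, 1)  # upsample once so smaller faces are not missed
    if not faces:
        raise ValueError("no face detected")
    shape = predictor(image, faces[0])
    return np.array([(p.x, p.y) for p in shape.parts()], dtype=np.float64)

# Example: landmarks = detect_landmarks(dlib.load_rgb_image("selfie.jpg"))
```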
The Princeton-led photo-editing method then adjusts the 3-D head model so that it best corresponds to the points detected on the face. In other words, the eyes of the 3-D model are aligned to where the subject’s eyes appear in the selfie. “Now we had an underlying 3-D model of the 2-D selfie image,” said Fried.
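To give a rough sense of that fitting step, the sketch below uses OpenCV’s solvePnP to recover a head pose from corresponding 2-D and 3-D points. It is a simplification: the researchers’ method also adjusts the head model’s shape, and the focal-length guess and the helper name fit_head_pose are assumptions made for this example.

```python
# Simplified illustration of aligning a 3-D head model to detected 2-D landmarks.
# The actual method also fits the model's shape; this only recovers a rigid pose.
import cv2
import numpy as np

def fit_head_pose(model_points_3d, landmarks_2d, image_size):
    """Estimate the rotation and translation that project corresponding
    3-D model points onto the detected 2-D landmarks."""
    h, w = image_size
    focal = w  # crude focal-length guess; the real method reasons about camera distance
    camera_matrix = np.array([[focal, 0, w / 2],
                              [0, focal, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(model_points_3d.astype(np.float64),
                                  landmarks_2d.astype(np.float64),
                                  camera_matrix, None)
    if not ok:
        raise RuntimeError("pose estimation failed")
    return rvec, tvec  # head rotation (as a Rodrigues vector) and translation
```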
Modifying the selfie then proved straightforward. The selfie’s facial reference points are shifted to match where those points fall when the 3-D model is rendered in a different pose or from a more distant camera. The 2-D image is then warped to approximate the desired change in its virtual 3-D orientation, all within just a handful of seconds.
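The article does not spell out the warp itself. One common way to move landmark positions while leaving pixel colors untouched is a piecewise-affine warp, sketched below with scikit-image; the helper name warp_to_targets and the use of image corners as fixed anchors are assumptions for this illustration, not the paper’s exact procedure.

```python
# Sketch of a landmark-driven warp using scikit-image; not the paper's exact warp.
import numpy as np
from skimage.transform import PiecewiseAffineTransform, warp

def warp_to_targets(image, src_points, dst_points):
    """Warp `image` so pixels near `src_points` move toward `dst_points`.
    Image corners are pinned so the border stays in place."""
    h, w = image.shape[:2]
    corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]], dtype=np.float64)
    src = np.vstack([src_points, corners])
    dst = np.vstack([dst_points, corners])
    tform = PiecewiseAffineTransform()
    tform.estimate(dst, src)  # warp() expects the output-to-input (inverse) mapping
    return warp(image, tform)  # float image with the same pixel colors, rearranged
```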
“I believe the reason the synthetic image looks so good is that it has exactly the same pixel colors as in the original photo—it's just that they have been moved around a little bit to provide the illusion that the camera had been in a different location,” said Finkelstein.
Ira Kemelmacher-Shlizerman, an assistant professor of computer science at the University of Washington who was not involved in the research, said “I love this paper” because it presents a “fantastic idea . . . the paper shows that head geometry can be 3-D manipulated for perspective and pose without an actual 3-D reconstruction of the input person.” She noted that adjusting apparent camera distances could also let people forgo the awkward, hand-held poles, known as selfie sticks, when trying to capture snazzier self-portraits. “The selfies application is very fun, which could bring an end to the selfie stick!” Kemelmacher-Shlizerman said.
Beyond correcting the perspective distortion and pose of selfies, other applications include creating 3-D anaglyphs, Harry Potter-style “live” portraits and editing frames of video.
Before potentially pursuing commercial development or release, Fried and colleagues want to focus first on honing their photo-editing tool. One remaining challenge is hair. When warped in the same manner as other facial features, hair can look distorted because of its varied texture, styling and color. “Hair is tricky,” said Fried. Another problem is automatically synthesizing, or “hallucinating,” a feature such as an ear that is hidden by the subject’s pose in the original picture but should become visible when the pose is altered in the modified image.
“We still have a lot of research to do,” said Fried. “We are happy with what we achieved so far, but we look forward to learning how we can make these selfie transformations appear even more realistic.”
The other collaborators on this project are Eli Shechtman of Adobe Research and Dan Goldman, now with Google but with Adobe Research during the bulk of the research. Princeton undergraduates Brian McSwiggen and John Morone, both computer science majors, have built an online demonstration of the new portrait manipulation method at the project’s web page, where additional information and a video may also be found: http://faces.cs.princeton.edu/.
This research was funded in part by Adobe and by a Google Graduate Fellowship.