While other smartphone OEMs are pushing for more camera sensors, with the upcoming Samsung Galaxy S10 series tipped to come with as many as four rear and two front sensors, Google is keeping it simple with one sensor up front and one at the back. However, this doesn't diminish the phone's photo-taking skills, with reviewers applauding the camera, especially the AI-based software work that has gone into it. The camera app comes with many features like Night Sight, Top Shot, Photobooth, and more, and Google has now detailed how it makes the Top Shot feature work and all the processing that goes on behind the scenes to bring you that one good-looking motion photo.
For those unaware, Top Shot is an upgraded version of the Motion Photo feature found on older Pixel devices. When toggled to Auto or On, the camera captures a short video clip before and after you hit the shutter button, so you can go back into the Photos app, scrub through the clip, and pick a better candid moment. This is similar to Live Photos on the iPhone. When set to Auto, the camera captures the video clip only if it detects motion or if some elements in the frame change.
Google explains that Top Shot captures 90 images from the 1.5 seconds before and after the shutter press. It saves two alternative shots alongside the original shutter frame in high resolution; the shutter frame is saved first, and the alternatives afterwards. The user can then review these shots and decide which to keep and which to discard. “Google’s Visual Core on Pixel 3 is used to process these top alternative shots as HDR+ images with a very small amount of extra latency, and are embedded into the file of the Motion Photo,” the company adds on its AI blog.
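To make the rolling-capture idea concrete, here is a minimal sketch of how a fixed-size frame buffer around the shutter press might work. The class and method names are illustrative assumptions for this example, not Google's actual implementation, and real frame selection is done by a learned model rather than recency.

```python
from collections import deque

class TopShotBuffer:
    """Illustrative rolling buffer: keeps the most recent `capacity` frames
    so moments captured just before the shutter press can be recovered."""

    def __init__(self, capacity=90):
        # Oldest frames are dropped automatically once the buffer is full.
        self.frames = deque(maxlen=capacity)

    def add_frame(self, frame):
        self.frames.append(frame)

    def on_shutter(self, num_alternates=2):
        # Stand-in selection: just take the most recent buffered frames.
        # The real feature scores frames with a neural network instead.
        shots = list(self.frames)
        return shots[-num_alternates:]

buf = TopShotBuffer(capacity=90)
for i in range(120):              # simulate 120 incoming preview frames
    buf.add_frame(f"frame_{i}")
alternates = buf.on_shutter()
print(alternates)
```

Because `deque(maxlen=90)` silently evicts old entries, the buffer never grows beyond the 90-frame window the article describes, regardless of how long the camera runs.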
The company developed a computer vision model to let the Pixel 3 series decide which alternative shots to pick for Top Shot. It narrowed the criteria down to three attributes: functional qualities like lighting, objective attributes (eyes open or closed, smiling or frowning, etc.), and subjective qualities like emotional expressions.
“Our neural network design detects low-level visual attributes in early layers, like whether the subject is blurry, and then dedicates additional compute and parameters toward more complex objective attributes like whether the subject’s eyes are open, and subjective attributes like whether there is an emotional expression of amusement or surprise. We trained our model using knowledge distillation over a large number of diverse face images using quantization during both training and inference,” Google explains in its blog.
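The "knowledge distillation" Google mentions means training a small on-device model to mimic the softened output distribution of a larger teacher model. A minimal sketch of the core distillation loss, using only NumPy; the temperature value and logits are made-up example numbers, not Google's settings:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()                    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the temperature-softened teacher distribution
    and the student distribution -- the core idea of knowledge distillation.
    A higher temperature exposes more of the teacher's 'dark knowledge'
    about relative class similarities."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -np.sum(teacher_probs * np.log(student_probs + 1e-12))

teacher = np.array([4.0, 1.0, 0.5])    # large teacher model's logits
student = np.array([3.5, 1.2, 0.4])    # small on-device student's logits
print(distillation_loss(student, teacher))
```

Minimizing this loss pushes the student toward the teacher's full probability distribution rather than just its top prediction, which is what lets a compact quantized model retain much of a larger model's judgment.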
The tech giant says that while Top Shot prioritises analysing faces, it also accounts for use cases where a face isn't the primary subject, using additional signals:
Subject motion saliency score – the low-resolution optical flow between the current frame and the previous frame is estimated in the ISP to determine whether there is salient object motion in the scene.
Global motion blur score – estimated from the camera motion and the exposure time. The camera motion is calculated from sensor data from the gyroscope and OIS (optical image stabilization).
“3A” scores – the status of auto exposure, auto focus, and auto white balance is also considered.
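The signals above have to be folded into a single per-frame ranking. As a rough sketch, one could combine them as a weighted sum; the weights and field names here are invented for illustration, whereas the real feature learns how to trade these signals off:

```python
def rank_frames(frames, weights=None):
    """Illustrative ranking of candidate frames by a weighted combination
    of the quality signals described above. Weights are made up for the
    example, not taken from Google's implementation."""
    if weights is None:
        weights = {"face": 0.5, "motion_saliency": 0.2,
                   "motion_blur": 0.2, "three_a": 0.1}

    def score(f):
        return (weights["face"] * f["face"]
                + weights["motion_saliency"] * f["motion_saliency"]
                - weights["motion_blur"] * f["motion_blur"]   # blur hurts
                + weights["three_a"] * f["three_a"])

    return sorted(frames, key=score, reverse=True)

frames = [
    # sharp face but blurry overall
    {"id": 0, "face": 0.9, "motion_saliency": 0.3, "motion_blur": 0.7, "three_a": 0.8},
    # slightly weaker face, but crisp and well exposed
    {"id": 1, "face": 0.8, "motion_saliency": 0.6, "motion_blur": 0.1, "three_a": 0.9},
]
best = rank_frames(frames)[0]
print(best["id"])
```

Note that motion blur enters with a negative weight: a blurry frame should rank lower even if its face attributes score well, which is why the second frame wins here.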
Google tested this feature by collecting data from hundreds of volunteers, along with their opinions on which frames looked best. This shows the effort Google has invested in each camera feature on the new Pixel devices. The Night Sight feature has also been lauded, and you can check out our full review of the Pixel 3 here.