There is a general move towards software-defined cameras on smartphones, rather than reliance on fixed hardware alone, according to Marc Levoy, distinguished engineer at Google, who currently leads the team responsible for modes such as HDR+ and Portrait mode on the Pixel smartphones.
In an interaction via a video-conference call, Levoy said the trend in camera software was towards combining bursts of frames, or what is called computational photography. Google is not the only player relying on computational photography; Apple does so too with the iPhone XS and iPhone XR.
Computational photography on smartphones essentially combines a number of frames at various exposure levels to ensure the best possible result for a photograph. “There is also a move towards machine learning, which is replacing traditional algorithm. In machine learning you have better training data, which results in better accuracy on the tasks,” Levoy explained.
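To illustrate why combining a burst helps, here is a minimal Python sketch. It is not Google's HDR+ pipeline: it assumes the frames are already aligned and shot at the same exposure, and it merges them by plain averaging. The function names (simulate_burst, merge_burst) and the noise figures are hypothetical, chosen only for the demonstration.

```python
import random
import statistics


def simulate_burst(true_value, num_frames, noise_sigma, num_pixels, seed=0):
    """Simulate a burst of noisy frames of a flat grey scene (assumed model)."""
    rng = random.Random(seed)
    return [
        [true_value + rng.gauss(0.0, noise_sigma) for _ in range(num_pixels)]
        for _ in range(num_frames)
    ]


def merge_burst(frames):
    """Merge aligned frames by averaging each pixel across the burst."""
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    # zip(*frames) groups the values that each frame recorded for one pixel.
    return [sum(pixel_values) / num_frames for pixel_values in zip(*frames)]


# Hypothetical numbers: 15 frames, each with noise of standard deviation 10.
frames = simulate_burst(true_value=100.0, num_frames=15,
                        noise_sigma=10.0, num_pixels=2000)
merged = merge_burst(frames)

single_noise = statistics.pstdev(frames[0])
merged_noise = statistics.pstdev(merged)

print(f"noise in one frame:    {single_noise:.2f}")
print(f"noise in merged frame: {merged_noise:.2f}")
```

Averaging N frames with independent noise reduces the noise by roughly the square root of N, which is why a burst of short exposures can stand in for one long exposure. Real pipelines also have to align the frames and reject moving objects, which this sketch leaves out.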
With Pixel phones, Google has shown that smartphone photography does not really need a lot of extra hardware in order to deliver excellent results. The Pixel 3, like the Pixel 2, continues with a single rear camera, which also supports Portrait mode.
What is new this year is that Google has added features such as Super Zoom and Night Sight to improve photography.
However, Levoy does point out that just because software is getting better at producing photos, it does not mean hardware will become irrelevant. “The hardware is separate from computational photography. Yes, it will always matter. For instance, the aperture will control how much SuperZoom you can get. If the lenses used are not good, there will be aberrations with the software. Optics do matter,” he said on the subject.
On Night Sight mode, Levoy explained the camera captures up to 15 frames after the shutter is pressed. However, Night Sight also requires a user to be extra still while taking a picture. “The exposure really varies on how still you are standing. If you are very still, or put the phone on a tripod, it will lengthen the exposure,” he explained.
On the 15-frame capture, Levoy said the number was an arbitrary choice, but if the camera detects that the user is very still or the phone is on a tripod, it will reduce the number of frames and instead increase the exposure length of each shot. Of course, this also means that the final image takes a few extra seconds to be processed and shown to the user. “Night Sight is also taking into account motion metering or movement in the photo in order to ensure the best possible result,” he pointed out.
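The trade-off Levoy describes, fewer but longer frames when the phone is steady, can be sketched roughly as follows. This is an illustration only, not Night Sight's actual logic: the only figure taken from the interview is the 15-frame ceiling, while the motion score, the exposure budget, the per-frame limits and the function name plan_capture are all assumptions made up for the example.

```python
import math

MAX_FRAMES = 15  # the upper limit mentioned in the interview


def plan_capture(motion_score, on_tripod=False, total_exposure_s=4.0,
                 min_frame_s=1 / 15, max_handheld_s=1 / 3, max_tripod_s=1.0):
    """Return (num_frames, per_frame_exposure_s) for a low-light shot.

    motion_score is a hypothetical value from 0.0 (perfectly still) to
    1.0 (very shaky). All time limits here are assumed, not Google's.
    """
    if not 0.0 <= motion_score <= 1.0:
        raise ValueError("motion_score must be between 0.0 and 1.0")

    # A steadier phone is allowed a longer exposure for each frame.
    longest = max_tripod_s if on_tripod else max_handheld_s
    per_frame_s = longest - motion_score * (longest - min_frame_s)

    # Use as many frames as needed to reach the total exposure budget,
    # but never more than the frame ceiling.
    num_frames = min(MAX_FRAMES,
                     max(1, math.ceil(total_exposure_s / per_frame_s)))
    return num_frames, per_frame_s


shaky = plan_capture(motion_score=1.0)
tripod = plan_capture(motion_score=0.0, on_tripod=True)

print("shaky handheld:", shaky)
print("on a tripod:   ", tripod)
```

Under these assumed limits, a shaky handheld shot uses the full 15 short frames, while a tripod shot uses fewer frames with a longer exposure each.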
Levoy also mentioned that Night Sight was a fairly late development for the Pixel 3 phones. Officially, the feature was rolled out nearly one month after the Pixel 3 launch.
While the software-driven approach has worked for Google in delivering excellent results, there are trade-offs. For instance, on the Pixel phones, Portrait mode does not show a live bokeh as it does on some other phones. Instead, the Portrait mode image takes a few seconds to be processed and is displayed only after the user has taken the shot. The reason the Pixel does not implement a live bokeh is that the hardware is still not fast enough, according to Levoy.
“I don’t know how fast the hardware would have to be. We could try to implement it on the CPU or specialised hardware. However, there are advantages with our software-centric approach. We can respond quickly, and update it after we have launched. Of course, the trade-off is that we are not quite as efficient with our software computation as we would be with a hardware approach,” he admitted.
In fact, the software approach also powers the Google Pixel 3’s Super Zoom feature, rather than relying on a 2X or 4X optical zoom lens like other players. So what about secondary lenses on future Pixel phones? After all, the front camera on the Pixel 3 now has a wide-angle lens as well. While Levoy said he could not comment on future products, he did admit that if there was another sensor, one could get higher zoom resolution ratios.
“The fact that we had only one camera this year, certainly motivated us to work very hard on the computational SuperZoom,” he said.
Software might be increasingly driving still photography on phones, but video is still untapped territory and needs work.
Levoy said the main issue when relying on a similar software-driven technique for video was the computation time and power needed. “If you want to try and do these things responsively in software, then you need compute power. The limitation on computational video is the compute power,” he said.