Brains Beat Algorithms at Image Compression

If file sizes are given limits, the human brain outperforms computers at compressing the data needed to recreate an image.

Stephen J. Mraz

April 11, 2019

Given the image on the left, two study participants made the reconstruction on the right. People preferred their reconstruction to the image at the center, a highly compressed version of the original with a file size equal to the amount of data the participants used to make their reconstruction.

Your friend texts you a photo of the dog she’s about to adopt, but all you see is a tan, vaguely animal-shaped haze of pixels. To get you a clearer picture, she sends a link to the dog’s adoption profile because she’s worried about her data limit. One click and your screen fills with much more satisfying descriptions and images of her new best friend.

Sending a link instead of uploading a massive image is just one trick humans use to send information without sending too much data. In fact, this and other tricks might inspire an entirely new class of image-compression techniques, according to a team of Stanford University engineers and high school students.

The researchers asked people to compare two sets of images: those produced by a traditional compression algorithm, the kind computers use to shrink huge images into pixelated blurs, and those created by humans under data-restricted conditions (text-only communication, which could include links to public images). In many cases, the products of human-powered image sharing proved more satisfactory than the algorithm’s work.

“Almost every image compressor we have today is evaluated using metrics that don’t necessarily represent what humans value in an image,” says Stanford researcher Irena Fischer-Hwang. “It turns out our computer algorithms have a long way to go and can learn a lot from the way humans share information.”

The project resulted from a collaboration between researchers led by Tsachy Weissman, professor of electrical engineering, and three high school students who interned in his lab.

“Honestly, we came into this collaboration aiming to give the students something that wouldn’t distract too much from ongoing research,” says Weissman. “But they wanted to do more, and that chutzpah led to a paper and a whole new research thrust for the group. This could very well become among the most exciting projects I’ve ever been involved in.”

Converting images into a compressed format, such as a JPEG, makes them significantly smaller, but some detail is lost in the process. This form of conversion is often called “lossy” for that reason. The resulting image is lower-quality because the algorithm has to sacrifice details about color and luminance so that it consumes less data. Although the algorithms retain enough detail for most cases, Weissman’s interns thought they could do better.
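To make the idea of "lossy" concrete, here is a toy sketch in Python. It is not how JPEG works internally (JPEG quantizes frequency coefficients rather than raw pixel values); it only illustrates the general trade-off of discarding fine detail so that fewer distinct values need to be stored. The function name `quantize` and the sample brightness values are assumptions made for illustration.

```python
def quantize(values, step):
    """Snap each 8-bit value to the nearest multiple of `step`.

    Toy illustration of lossy coding: nearby values collapse into one,
    so the fine differences between them cannot be recovered.
    """
    return [min(255, (v + step // 2) // step * step) for v in values]


# Hypothetical brightness values for six neighboring pixels.
pixels = [12, 13, 14, 200, 203, 207]
coarse = quantize(pixels, 16)

print(coarse)                                # [16, 16, 16, 208, 208, 208]
print(len(set(pixels)), len(set(coarse)))    # 6 2
```

Six distinct values become two, which is cheaper to store, but the small differences among the original pixels are gone for good.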

In their experiments, two students worked together remotely to recreate images using free photo editing software and public images from the internet. One person in the pair had the reference image and guided the second person in reconstructing the photo. Both people could see the reconstruction being done, but the describer could only communicate over text while listening to their partner speaking.

The eventual file size of the reconstructed image was the compressed size of the text messages sent by the describer because that’s what would be required to recreate that image. (The group didn’t include audio information.)
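As a rough sketch of that accounting, the Python snippet below joins a describer's messages into one transcript and measures its size after general-purpose compression. The article does not say which text compressor the group used; `zlib` is an assumption here, and the function name `transcript_size_bytes` and the sample messages (including the example.com link) are hypothetical.

```python
import zlib


def transcript_size_bytes(messages):
    """Return the compressed size, in bytes, of a text-only transcript.

    Assumption: zlib at its highest compression level stands in for
    whatever compressor the researchers actually applied.
    """
    transcript = "\n".join(messages).encode("utf-8")
    return len(zlib.compress(transcript, 9))


# Hypothetical instructions a describer might type.
messages = [
    "use this as the background: https://example.com/savanna.jpg",
    "put a giraffe on the left, facing right",
    "make the sky a little darker",
    "make the sky a little darker still",
]

raw = len("\n".join(messages).encode("utf-8"))
print(raw, transcript_size_bytes(messages))
```

Under this accounting, the compressed transcript size is the "file size" of the human reconstruction, however large the public images it links to may be.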

The students then pitted the human reconstructions against machine-compressed images with file sizes that equaled those of the reconstruction text files. So, if a human team created an image with only 2 kilobytes of text, the students used a computer to compress the original file to 2 kilobytes. With access to the original images, 100 people outside the experiments rated the human reconstruction better than the machine-based compression on 10 out of 13 images.
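One way to match a machine-compressed image to a byte budget is to search over the encoder's quality setting for the highest value whose output still fits. The sketch below shows that search in Python. The article does not describe the codec or the procedure the students used, so everything here is an assumption: `best_quality_within_budget` is a hypothetical helper, `toy_encode` is a stand-in for a real image encoder, and the search assumes output size never shrinks as quality rises.

```python
from typing import Callable, Optional


def best_quality_within_budget(
    encode: Callable[[int], bytes],
    budget_bytes: int,
    lo: int = 1,
    hi: int = 100,
) -> Optional[int]:
    """Binary-search the highest quality whose encoded size fits the budget.

    Assumes len(encode(q)) is non-decreasing in q. Returns None when even
    the lowest quality setting exceeds the budget.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if len(encode(mid)) <= budget_bytes:
            best = mid        # fits: remember it and try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1      # too large: try a lower quality
    return best


def toy_encode(quality: int) -> bytes:
    """Stand-in encoder (hypothetical): 40 bytes of output per quality step."""
    return bytes(40 * quality)


# Budget equal to a 2,000-byte text transcript.
print(best_quality_within_budget(toy_encode, 2000))  # 50
```

With a real codec, `toy_encode` would be replaced by a function that encodes the original image at the given quality and returns the resulting bytes.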

When the original images closely matched public images on the internet, such as a street intersection, the human-made reconstructions performed particularly well. Even reconstructions that combined various images often did well, except in cases that featured human faces. The researchers didn’t ask their judges to explain their rankings, but they have some ideas about the disparities found.

“In some scenarios, such as nature scenes, people didn’t mind if the trees were a little different or the giraffe was a different giraffe; they cared more about them not being blurry, which means traditional computer compression ranked lower,” says researcher Shubham Chandak. “But for human faces, people would rather have the same face even if it’s blurry.”

This apparent weakness in human-based image sharing would improve as more people upload images of themselves to the internet. The researchers are also teaming with a police sketch artist to see how his expertise might make a difference. Even though this work shows the value of human input, the researchers would eventually try to automate the process.

“Machine learning is working on bits and parts of this, and hopefully we can get them working together soon,” says researcher Kedar Tatwawadi. “It seems like a practical compressor that works with this kind of ideology is not very far away.”