A photograph of strawberries comparing the effects of lossy compression

Modern web browsers accept four image formats by default: JPEG, GIF, PNG and SVG. But before talking about formats, let’s introduce two extremely important terms: lossless and lossy compression.

Almost all images are compressed in some way. That is, the raw binary data that makes up the individual pixels and their color is packaged up and rearranged to achieve the smallest file size possible for any particular format. There are many different methods of data compression, but every system comes down to one simple question: whether it physically changes the data in the image.

Lossless compression does not change the data. It rearranges the bits and tries to pack them into a smaller space - think of different arrangements of boxes in a moving van in an attempt to fit more inside - but it does not alter what they represent. You want a lossless compression scheme wherever fidelity to the original data is paramount. .zip is a ubiquitous lossless scheme: bits go in, the information is rearranged and compressed, and exactly the same bits come out when you uncompress the .zip. (You don't want the “Z’s” in a compressed Microsoft Word document to be changed to “k’s” just because it would make the file smaller.)
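To make the "same bits come out" claim concrete, here is a minimal sketch using Python's standard zlib module, which implements the same family of compression used inside .zip files. The sample text is made up for illustration; any bytes would do.

```python
import zlib

# Hypothetical sample data: a repetitive byte string standing in for a document.
original = b"ZZZZZZZZZZ quick brown fox " * 200

compressed = zlib.compress(original)    # pack the data into a smaller space
restored = zlib.decompress(compressed)  # unpack it again

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The compressed version is much smaller, yet the round trip returns every byte unchanged. That is the whole promise of a lossless scheme.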

There are many lossless image formats: TIF, TGA, BMP, RAW, PNG, SVG and PSD among them. Arguably, even GIF is a lossless format, since it preserves every pixel once the image has been reduced to its 256-color palette. Assuming that you are feeding them the best information possible, all of those formats will preserve data completely, without loss or change. The easiest compression scheme to understand is run-length encoding: if there are several pixels of the exact same color one after the other in a horizontal line, rather than recording them separately, the encoder makes a shortcut code for them (say “five red pixels”, rather than “one red pixel, another red pixel…” and so on). GIF itself uses a more elaborate scheme, LZW, which builds shortcut codes for repeated sequences of pixels.
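Run-length encoding is simple enough to sketch in a few lines. The example below is an illustration of the idea only, not the encoder used by any particular image format; the function names and the row of color names are invented for the demonstration.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse consecutive equal values into (count, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]

def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]

# One horizontal line of pixels, written as color names for readability.
row = ["red"] * 5 + ["blue"] * 2 + ["red"]

runs = rle_encode(row)
print(runs)  # [(5, 'red'), (2, 'blue'), (1, 'red')]

# Decoding gives back exactly what went in, so the scheme is lossless.
print(rle_decode(runs) == row)
```

Eight pixels become three shortcut codes. The longer the runs of identical color, the bigger the saving, which is why this approach suits flat graphics far better than photographs.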

The major drawback to a lossless compression scheme (with the exception of SVG, which is predominantly a vector format) is file size. No matter how clever the algorithm, the data must be completely preserved. But what if we could change some of that data - squish it, alter it, or even throw it out - in such a way that the end user is unlikely to spot any changes?

This can’t be done with Word documents… but pixels are very small. If we can change some of them to be more like their neighboring pixels, we would increase the number of shortcuts we could take in describing the image, which in turn would reduce its file size. And that’s exactly what lossy compression does. There are a few lossy compression schemes for images; JPEG is the best known.
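A toy example shows why nudging pixels toward their neighbors pays off. The sketch below rounds grayscale values down to a coarse step and counts how many runs of identical values remain. This is not JPEG's actual algorithm, which is considerably more sophisticated; the pixel values, the step of 16 and the function names are all assumptions chosen for illustration.

```python
from itertools import groupby

def count_runs(values):
    """Number of runs of consecutive identical values."""
    return sum(1 for _ in groupby(values))

def quantize(values, step):
    """Round each value down to a multiple of step. This discards detail."""
    return [(v // step) * step for v in values]

# Hypothetical row of grayscale pixels: a bright area, then a dark one,
# each with slight variation that the eye would barely notice.
row = [200, 201, 199, 202, 200, 90, 92, 91, 89, 90]

lossy = quantize(row, 16)

print(lossy)                                # [192, 192, 192, 192, 192, 80, 80, 80, 80, 80]
print(count_runs(row), count_runs(lossy))   # 10 2
print(lossy == row)                         # False
```

Ten separate values collapse into two runs, so a run-length style encoder has far less to describe. The price is that the original values are gone: nothing in the quantized row says whether a pixel started out as 199 or 202.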

Lossy compression gives significant advantages in terms of file size. However, it comes with one major caveat: the changes made to the image to achieve this compression can't be undone. That is, the original information is lost, and can't be retrieved. (And no amount of digital wizardry can recover it, despite what the movies and television tell you.)
