So, what's so great about "raw" files?

Have you ever heard an enthusiastic photographer saying something like “I shoot everything in raw”? But what is a “raw” file? Sounds like raw meat, doesn’t it? Why do people use these?

Few people think about it twice, but pretty much every photo on the internet is stored as a JPEG file. This also happens to be the kind of file you get out of most digital cameras. In fact, most consumer-grade cameras can give nothing else but JPEG output. This is no coincidence: JPEG’s been around since 1992, and it turns out that it’s a really great file format for photographs. JPEG allows you to store a lot of image information in a reasonably small file, and is quick to decode and write. Unfortunately JPEG is a lossy standard, which means you always lose some image information when creating a JPEG.

Contrary to common belief, this “lossy” property is not the main reason to avoid using JPEG. The JPEG algorithm is actually really clever in the way it loses its information, meaning the human eye often can’t see the difference between a lossy JPEG and its lossless equivalent. Look at the seagull below to see what I mean.

A JPEG file compressed at 95% quality.
File size = 78 KB
Scarcely any degradation artefacts, despite the fact that a losslessly compressed PNG would have required more than 300 KB.
Compressed JPEG at an extremely low 30% quality.
File size = 9 KB.
Compression artefacts are visible (sky, detail in lantern glass), but the image remains perfectly recognizable due to the clever way JPEG works. And this is an extreme example.

No, the reasons why you should be interested in raw files are more subtle…

Let’s first think about how a camera transforms the light hitting its sensor into a JPEG. Without going into too much detail, and assuming a typical digital camera:

  • The photons hit the camera’s sensor, causing an electrical charge to accumulate on the sensor’s (CCD or CMOS) photodiodes
  • The camera reads out the electrical charges, and converts the voltages to digital numbers by sending them through an analog-to-digital converter
  • The separate red, green and blue channels from the sensor’s Bayer grid are interpolated to give an RGB value at every pixel
  • White balance is applied, correcting the recorded colours by taking the ambient light into account
  • Various image processing algorithms are applied to sharpen the image, reduce noise, and possibly correct for distortion and chromatic aberration.
  • The linear pixel intensities are nonlinearly mapped to a tonal curve (similar to how the human eye perceives incoming light).
  • The intensity at every pixel is cast to an 8-bit value (the data originally received from the A/D converter is usually 12 or 14 bit)
  • The 8-bit RGB values are compressed with the JPEG algorithm
  • The metadata (time, camera settings, etc) are added to the JPEG file as an EXIF tag.
  • The file is written to the camera’s memory card
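
To make the middle steps of that list a bit more concrete, here is a toy sketch in Python. It is not what any real camera runs: the function names, the white balance gains and the 2.2 gamma are all assumptions of mine, and the "interpolation" simply collapses each 2x2 Bayer cell into one pixel instead of keeping full resolution.

```python
# Toy in-camera pipeline on a tiny 12-bit RGGB Bayer mosaic.
# Every name and number here is an illustrative assumption.

def demosaic_2x2(mosaic):
    """Collapse each 2x2 RGGB cell into one RGB pixel.
    (Real demosaicking interpolates and keeps the full resolution.)"""
    rgb = []
    for y in range(0, len(mosaic), 2):
        row = []
        for x in range(0, len(mosaic[0]), 2):
            r = mosaic[y][x]
            g = (mosaic[y][x + 1] + mosaic[y + 1][x]) / 2  # two greens per cell
            b = mosaic[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb

def white_balance(pixel, gains):
    """Scale the linear R, G and B values by per-channel gains."""
    return tuple(c * g for c, g in zip(pixel, gains))

def tone_curve(value, max_in=4095, gamma=2.2):
    """Map a linear sensor value to a nonlinear 0..1 value (simple gamma)."""
    v = min(max(value / max_in, 0.0), 1.0)  # clip anything out of range
    return v ** (1 / gamma)

def to_8bit(v):
    """Quantize a 0..1 value to an 8-bit integer."""
    return round(v * 255)

def develop(mosaic, gains=(2.0, 1.0, 1.5)):
    """Demosaic, white balance, tone map and quantize, in that order."""
    return [[tuple(to_8bit(tone_curve(c)) for c in white_balance(p, gains))
             for p in row]
            for row in demosaic_2x2(mosaic)]

mosaic = [
    [1000, 2000, 1000, 2000],   # R G R G
    [2000, 1200, 2000, 1200],   # G B G B
]
print(develop(mosaic))
```

Sharpening, noise reduction, JPEG compression and the EXIF step are left out. The point is only that each stage throws away or reshapes information that the raw numbers still contain.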

That’s a lot of processing. And it’s incredible that the circuitry in your camera can do this at something like 3 frames per second (or faster)!

Now, back to the question: what is a raw file? A raw file (usually) stores the image as it gets out of the A/D converter (the second step listed above), along with all the metadata (camera settings). The raw file stores this data losslessly, which means that it records all the information the camera has access to, before processing. Why is this better than JPEG?

It’s better because a JPEG file is a finished product, produced with the limited processing power available inside the camera itself. That is fine as long as you like what you see. But what if the colours are wrong? Or the image needs more (or less) sharpening? Or there are errors which need to be corrected? A raw file, on the other hand, gives you the opportunity to pitch your powerful desktop PC and software against the data, exactly in the way you want.

Let’s say you have a hot pixel, one which always gives a completely saturated value (see below). In the interpolation step this high value is spread over several adjacent image pixels. Sharpening and the lossy JPEG compression further diffuse this bright artefact. However, if we use software (in our case PixelFixer) to interpolate the hot pixel in the raw file itself, we completely remove the artefact by interpolating a single pixel only.

In JPG, a single green “hot pixel” shows itself as a bright white-green spot.
Enlargement shows how de-mosaicking spreads the deviant green intensity value across multiple pixels.
Interpolating the single deviant pixel in the raw file, then converting to JPG, completely removes the artefact.
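
The idea of repairing one pixel in the raw mosaic can be sketched as below. I don't know how PixelFixer does it internally, so treat this as a guess at the general approach: the function name and the choice of neighbours are my own assumptions. It relies on the fact that in a Bayer grid the same colour filter repeats every two pixels horizontally and vertically.

```python
# Hypothetical hot-pixel repair on a raw Bayer mosaic (not PixelFixer's code).

def fix_hot_pixel(mosaic, y, x):
    """Return a copy of the mosaic where the pixel at (y, x) is replaced by
    the mean of its same-colour neighbours two pixels away in each direction."""
    h, w = len(mosaic), len(mosaic[0])
    neighbours = [mosaic[ny][nx]
                  for ny, nx in ((y - 2, x), (y + 2, x), (y, x - 2), (y, x + 2))
                  if 0 <= ny < h and 0 <= nx < w]
    fixed = [row[:] for row in mosaic]      # leave the original untouched
    if neighbours:                          # tiny grids may have none
        fixed[y][x] = round(sum(neighbours) / len(neighbours))
    return fixed

# A flat grey 5x5 patch with one saturated 12-bit pixel in the middle.
patch = [[500] * 5 for _ in range(5)]
patch[2][2] = 4095

repaired = fix_hot_pixel(patch, 2, 2)
print(repaired[2][2])
```

Because only one raw value changes before de-mosaicking, there is nothing left to smear into the surrounding pixels later on.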

Then there’s dynamic range. JPEG throws data away when the conversion from 12 (or 14) bits to 8 bits is made. The camera might record details in shadow or highlight regions which end up clipped to flat black or white zones in the final JPEG. If you want to correct for over- or underexposure you get a lot more detail out of the raw file than you can from a JPEG.
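
A quick way to see the loss is to count how many distinct 8-bit values survive from a range of raw values. The sketch below is only a rough model under my own assumptions: the function name is made up, and the tone curve is a plain gamma rather than whatever curve a given camera uses.

```python
# Count the distinct 8-bit codes left over from a range of raw sensor values.
# The gamma tone curve is an assumption; real cameras use their own curves.

def levels_after_8bit(lo, hi, bits=12, gamma=1.0):
    """Number of distinct 8-bit outputs for raw values lo..hi (inclusive)."""
    max_in = (1 << bits) - 1
    return len({round(255 * (v / max_in) ** (1 / gamma))
                for v in range(lo, hi + 1)})

# The darkest 64 values of a 12-bit sensor.
print("linear mapping:", levels_after_8bit(0, 63))
print("gamma 2.2 curve:", levels_after_8bit(0, 63, gamma=2.2))
```

With a straight linear mapping those 64 shadow values collapse to just 5 levels. A tone curve spends more of the 8-bit codes on the shadows, so it does better, but it still cannot keep all 64 apart.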

The JPEG image straight from my Nikon D80 looks okay, but the feathers are overexposed

Raw to the rescue! Processing the Nikon D80’s NEF raw file we attain more detail, better exposure of shadows, and recovery of highlight details on the feathers!

Also, if the camera’s white balance is waaay off, it’s difficult to fix from the JPEG file. Photoshop and Picasa have tools for selecting a (supposedly) neutral gray or white patch and correcting the colour from it, but this doesn’t always work. In a raw file no white balance has been applied yet, so you can change it as you please. (This is great for underwater photography, where colour correction is very tricky.)

Oh noes! I left the camera on the wrong white balance setting! The result is a very weird blue landscape. What to do?
Using Picasa’s “auto colour” or “neutral colour picker” the JPEG improves slightly, but the grass is still bluish and the clouds have a strange reddish cast.

Using raw, we can perfectly correct the mistaken white-balance - after the photo is taken!
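
Why is the raw route so much cleaner? White balance is essentially a gain per colour channel, and once a wrong gain has pushed a channel past its maximum, the clipped data is gone. Here is a small sketch of that. The pixel values, the gains and the function names are invented for illustration, and it models only the clipping, not the tone curve or JPEG compression that make things worse in practice.

```python
# White balance as per-channel gains on linear 12-bit values.
# All values and names below are made up for illustration.

def apply_wb(raw_rgb, gains, max_val=4095):
    """Scale each linear channel by its gain, clipping at the maximum."""
    return tuple(min(round(c * g), max_val) for c, g in zip(raw_rgb, gains))

def undo_then_apply(baked, old, new, max_val=4095):
    """Try to reverse already-applied gains, then apply new ones."""
    return tuple(min(round(c / o * n), max_val)
                 for c, o, n in zip(baked, old, new))

raw_pixel = (1500, 2400, 3000)   # a grey card as the sensor recorded it
wrong = (1.0, 1.0, 2.0)          # wrong preset: far too much blue
right = (1.6, 1.0, 0.8)          # gains that make the card neutral

from_raw = apply_wb(raw_pixel, right)               # corrected from raw
baked = apply_wb(raw_pixel, wrong)                  # what the camera baked in
from_baked = undo_then_apply(baked, wrong, right)   # corrected afterwards

print("from raw:  ", from_raw)
print("from baked:", from_baked)
```

Starting from the raw values the card comes out as a neutral (2400, 2400, 2400). Starting from the baked-in version the blue channel was clipped at 4095, so the "corrected" pixel ends up at (2400, 2400, 1638) and keeps a colour cast.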

Since your computer’s CPU is a lot faster than your camera’s, and you have almost no memory or time limit, the image processing algorithms of PC-based raw converters can be a lot more complex than those in your camera. This means that the JPEGs you create on your computer can look better than the ones your camera makes: they can show more detail and handle noise better.

Almost every camera maker has their own raw file format. This tower of Babel includes CRW and CR2 (Canon), NEF (Nikon), PEF (Pentax and Samsung), SR2 and ARW (Sony), RAF (Fuji), RWL (Leica), X3F (Sigma), RAW and RW2 (Panasonic). Adobe tried to improve the situation by creating an openly available standard called DNG (digital negative), but this has so far only found limited acceptance.

If you buy a camera which can take raw images you usually get raw conversion software for Windows and Mac. Linux users are usually out of luck, unless they buy Adobe Photoshop. Google’s Picasa and some other tools like Helicon Filter use the free and open UFRaw toolbox, but this often leads to not-so-great results requiring a lot of manual tweaking to get good colours and sharpness (not recommended).

I’ve come across cameras which do such a good job of JPEG conversion that the results are difficult to match, let alone improve, using the PC-based software. These cameras include the venerable Canon S45 and the new Panasonic Lumix LX3. But even so, the in-camera JPEG conversion gives one no ability to make any significant changes afterwards. In this case JPEG is only the best solution using default settings.

But JPEGs are going to stay with us, don’t worry. Since most raw file formats can only be properly decoded with the manufacturer’s software (and can’t be uploaded directly to, say, Facebook), it is still best to eventually create JPEGs when you want to print or share your photographs. JPEGs are also smaller and faster to display. But if you want to store your best photographs for future editing, it’s best to keep the raw files as well.

In conclusion, I’ll repeat the (true) cliché: RAW files are like negatives, and JPEGs are like prints. RAW files are the means by which we get to make our beautiful JPGs.