How Colour Works

Colour is a powerful thing. It can identify a brand, imply eco-friendliness, gender a toy, raise our blood pressure, calm us down. But what exactly is colour? How and why do we see it? And how do cameras record it? Let’s find out.

 

The Meaning of “Light”

One of the many weird and wonderful phenomena of our universe is the electromagnetic wave, an electric and magnetic oscillation which travels at 186,000 miles per second. Like all waves, EM radiation has the inversely-proportional properties of wavelength and frequency, and we humans have devised different names for it based on these properties.

The electromagnetic spectrum

EM waves with a low frequency and therefore a long wavelength are known as radio waves or, slightly higher in frequency, microwaves; we use them to broadcast information and heat ready-meals. EM waves with a high frequency and a short wavelength are known as x-rays and gamma rays; we use them to see inside people and treat cancer.

In the middle of the electromagnetic spectrum, sandwiched between infrared and ultraviolet, is a range of frequencies between 430 and 750 terahertz (wavelengths 400-700 nanometres). We call these frequencies “light”, and they are the frequencies which the receptors in our eyes can detect.

If your retinae were instead sensitive to electromagnetic radiation of between 88 and 91 megahertz, you would be able to see BBC Radio 2. I’m not talking about magically seeing into Ken Bruce’s studio, but perceiving the FM radio waves which are encoded with his silky-smooth Scottish brogue. Since radio waves can pass through solid objects though, perceiving them would not help you to understand your environment much, whereas light waves are absorbed or reflected by most solid objects, and pass through most non-solid objects, making them perfect for building a picture of the world around you.

Within the range of human vision, we have subdivided and named smaller ranges of frequencies. For example, we describe light of about 590-620nm as “orange”, and below about 450nm as “violet”. This is all colour really is: a small range of wavelengths (or frequencies) of electromagnetic radiation, or a combination of them.
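
To make that concrete, here is a tiny Python sketch that maps a wavelength to one of those conventional names. The band edges are rough, commonly quoted values rather than hard boundaries – real colour naming shades gradually from one hue to the next.

```python
# Toy illustration: naming a wavelength of visible light.
# The band edges below are rough, conventional values, not exact standards.

def colour_name(wavelength_nm):
    """Return an approximate colour name for a wavelength in nanometres."""
    bands = [
        (380, 450, "violet"),
        (450, 485, "blue"),
        (485, 500, "cyan"),
        (500, 565, "green"),
        (565, 590, "yellow"),
        (590, 620, "orange"),
        (620, 700, "red"),
    ]
    for lo, hi, name in bands:
        if lo <= wavelength_nm < hi:
            return name
    return "outside the visible range"

print(colour_name(600))  # "orange"
print(colour_name(430))  # "violet"
```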

 

In the eye of the beholder

Scanning electron micrograph of a retina

The inside rear surfaces of your eyeballs are coated with light-sensitive cells called rods and cones, named for their shapes.

The human eye has about five or six million cones. They come in three types: short, medium and long, referring to the wavelengths to which they are sensitive. Short cones have peak sensitivity at about 420nm, medium at 530nm and long at 560nm, roughly what we call blue, green and red respectively. The ratios of the three cone types vary from person to person, but short (blue) ones are always in the minority.

Rods are far more numerous – about 90 million per eye – and around a hundred times more sensitive than cones. (You can think of your eyes as having dual native ISOs like a Panasonic Varicam, with your rods having an ISO six or seven stops faster than your cones.) The trade-off is that they are less temporally and spatially accurate than cones, making it harder to see detail and fast movement with rods. However, rods only really come into play in dark conditions. Because there is just one type of rod, we cannot distinguish colours in low light, and because rods are most sensitive to wavelengths of 500nm, cyan shades appear brightest. That’s why cinematographers have been painting night scenes with everything from steel grey to candy blue light since the advent of colour film.

The spectral sensitivity of short (blue), medium (green) and long (red) cones

The three types of cone are what allow us – in well-lit conditions – to have colour vision. This trichromatic vision is not universal, however. Many animals have tetrachromatic (four channel) vision, and research has discovered some rare humans with it too. On the other hand, some animals, and “colour-blind” humans, are dichromats, having only two types of cone in their retinae. But in most people, perceptions of colour result from combinations of red, green and blue. A combination of red and blue light, for example, appears as magenta. All three of the primaries together make white.

Compared with the hair cells in the cochlea of your ears, which are capable of sensing a continuous spectrum of audio frequencies, trichromacy is quite a crude system, and it can be fooled. If your red and green cones are triggered equally, for example, you have no way of telling whether you are seeing a combination of red and green light, or pure yellow light, which falls between red and green in the spectrum. Both will appear yellow to you, but only one really is. That’s like being unable to hear the difference between, say, the note D and a combination of the notes C and E. (For more info on these colour metamers and how they can cause problems with certain types of lighting, check out Phil Rhodes’ excellent article on Red Shark News.)
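
Here is a toy numerical sketch of that metamerism. The cone peak wavelengths come from the figures above, but modelling each cone's sensitivity as a Gaussian with a 60nm bandwidth is a simplification for illustration only – real cone response curves are asymmetric and overlap differently.

```python
import numpy as np

# Cone sensitivities are modelled as simple Gaussians peaking at roughly 560,
# 530 and 420 nm (long, medium, short), per the figures above; the 60 nm
# bandwidth is an illustrative assumption, not physiological data.
PEAKS = {"long": 560.0, "medium": 530.0, "short": 420.0}
WIDTH = 60.0

def sensitivity(cone, wavelength_nm):
    return np.exp(-((wavelength_nm - PEAKS[cone]) ** 2) / (2 * WIDTH ** 2))

def responses(lines):
    """Cone responses to a spectrum given as {wavelength_nm: intensity}."""
    return {cone: sum(i * sensitivity(cone, w) for w, i in lines.items())
            for cone in PEAKS}

# Pure spectral yellow: a single line at 575 nm.
yellow = responses({575.0: 1.0})

# Solve for intensities of a 670 nm (red) + 540 nm (green) mixture that triggers
# the long and medium cones exactly as the 575 nm line does.
A = np.array([[sensitivity("long", 670.0), sensitivity("long", 540.0)],
              [sensitivity("medium", 670.0), sensitivity("medium", 540.0)]])
b = np.array([yellow["long"], yellow["medium"]])
red_i, green_i = np.linalg.solve(A, b)
mixture = responses({670.0: red_i, 540.0: green_i})

print(yellow)    # long and medium responses...
print(mixture)   # ...match by construction, so both spectra read as "yellow"
```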

 

Artificial eye

A Bayer filter

Mimicking your eyes, video sensors also use a trichromatic system. This is convenient because it means that although a camera and TV can’t record or display yellow, for example, they can produce a mix of red and green which, as we’ve just established, is indistinguishable from yellow to the human eye.

Rather than using three different types of receptor, each sensitive to different frequencies of light, electronic sensors all rely on separating different wavelengths of light before they hit the receptors. The most common method is a colour filter array (CFA) placed immediately over the photosites, and the most common type of CFA is the Bayer filter, patented in 1976 by an Eastman Kodak employee named Dr Bryce Bayer.

The Bayer filter is a colour mosaic which allows only green light through to 50% of the photosites, only red light through to 25%, and only blue to the remaining 25%. The logic is that green is the colour your eyes are most sensitive to overall, and that your vision is much more dependent on luminance than chrominance.
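
A short sketch of how such a mosaic samples an image may help. The 2×2 pattern below (two green, one red and one blue photosite per block) is one common Bayer layout; real sensors differ in the exact ordering.

```python
import numpy as np

def bayer_mosaic(scene):
    """Sample a full-colour scene (H x W x 3, RGB) the way a Bayer sensor does:
    one value per photosite, in a repeating 2x2 pattern of
        G R
        B G
    so that 50% of photosites see green, 25% red and 25% blue."""
    h, w, _ = scene.shape
    raw = np.zeros((h, w), dtype=scene.dtype)
    raw[0::2, 1::2] = scene[0::2, 1::2, 0]   # red photosites   (25%)
    raw[0::2, 0::2] = scene[0::2, 0::2, 1]   # green photosites (half of the 50%)
    raw[1::2, 1::2] = scene[1::2, 1::2, 1]   # green photosites (the other half)
    raw[1::2, 0::2] = scene[1::2, 0::2, 2]   # blue photosites  (25%)
    return raw

scene = np.random.rand(4, 6, 3)   # stand-in for the light arriving at the sensor
raw = bayer_mosaic(scene)         # what the sensor actually records
```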

A RAW, non-debayered image

The resulting image must be debayered (or more generally, demosaiced) by an algorithm to produce a viewable image. If you’re recording log or linear then this happens in-camera, whereas if you’re shooting RAW it must be done in post.
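
And here, purely as an illustration of the principle, is a naive bilinear interpolation of the green channel from a mosaic like the one above. Real debayering algorithms are far more sophisticated (edge-aware, frequency-based and so on), so treat this as a sketch of the idea rather than how any camera or grading package actually does it.

```python
import numpy as np
from scipy.ndimage import convolve

def interpolate_green(raw):
    """Reconstruct a full-resolution green plane from a mosaic laid out as in the
    previous sketch (green at even-even and odd-odd positions). At green
    photosites the recorded value is kept; elsewhere the four green neighbours
    are averaged."""
    h, w = raw.shape
    green_mask = np.zeros((h, w))
    green_mask[0::2, 0::2] = 1.0
    green_mask[1::2, 1::2] = 1.0
    kernel = np.array([[0.0, 0.25, 0.0],
                       [0.25, 1.0, 0.25],
                       [0.0, 0.25, 0.0]])
    values = convolve(raw * green_mask, kernel, mode="mirror")
    weights = convolve(green_mask, kernel, mode="mirror")
    return values / weights

raw = np.random.rand(4, 6)        # stand-in for a Bayer mosaic like the one above
green = interpolate_green(raw)    # one of the three planes of the final image
```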

This system has implications for resolution. Let’s say your sensor is 2880×1620. You might think that’s the number of pixels, but strictly speaking it isn’t. It’s the number of photosites, and due to the Bayer filter no single one of those photosites holds more than a third of the colour information needed to form a pixel of the final image. Calculating that final image – by debayering the RAW data – reduces the real resolution by roughly 20-33%. That’s why cameras like the Arri Alexa or the Blackmagic Cinema Camera shoot at 2.8K or 2.5K: once the footage is debayered, you’re left with an image of about 2K (cinema standard) resolution.
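
As a rough back-of-the-envelope check of that claim (the 20-33% figure is an estimate, and the true loss depends on the debayering algorithm):

```python
# Rough arithmetic only: apply the quoted 20-33% resolution loss to a sensor
# 2880 photosites wide and compare with the 2048-pixel width of DCI 2K.
sensor_width = 2880
for loss in (0.20, 0.33):
    real_width = sensor_width * (1 - loss)
    print(f"{loss:.0%} loss: ~{real_width:.0f} pixels of usable horizontal resolution")
# 20% loss -> ~2304 px; 33% loss -> ~1930 px; either way, roughly 2K.
```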

 

Colour Compression

Your optic nerve can only transmit about one percent of the information captured by the retina, so a huge amount of data compression is carried out within the eye. Similarly, video data from an electronic sensor is usually compressed, be it within the camera or afterwards. Luminance information is often prioritised over chrominance during compression.

Examples of chroma subsampling ratios

You have probably come across chroma subsampling expressed as, for example, 444 or 422, as in ProRes 4444 (the final 4 being transparency information, only relevant to files generated in postproduction) and ProRes 422. The three digits describe the ratios of colour and luminance information: a file with 444 chroma subsampling has no colour compression; a 422 file retains colour information only in every second pixel; a 420 file, such as those on a DVD or BluRay, contains one pixel of blue info and one of red info (the green being derived from those two and the luminance) to every four pixels of luma.
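
A minimal sketch of 420 subsampling might look like the following. It converts RGB to one luma channel plus two colour-difference channels using the standard Rec.709 luma weights, then averages each 2×2 block of chroma down to a single sample; the random frame is just a stand-in.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Split an RGB frame into luma (Y) and colour-difference (Cb, Cr) channels,
    using the Rec.709 luma coefficients."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (b - y) / 1.8556   # blue-difference channel
    cr = (r - y) / 1.5748   # red-difference channel
    return y, cb, cr

def subsample_420(chroma):
    """Average each 2x2 block down to one chroma sample (dimensions must be even)."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rgb = np.random.rand(4, 8, 3)               # stand-in frame, even height and width
y, cb, cr = rgb_to_ycbcr(rgb)
cb_420, cr_420 = subsample_420(cb), subsample_420(cr)
print(y.size, cb_420.size + cr_420.size)    # 32 luma samples vs 16 chroma samples
```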

Whether every pixel, or only a fraction of them, has colour information, the precision of that colour info can vary. This is known as bit depth or colour depth. The more bits allocated to describing the colour of each pixel (or group of pixels), the more precise the colours of the image will be. DSLRs typically record video in 24-bit colour, more commonly described as 8bpc or 8 bits per (colour) channel. Images of this bit depth fall apart pretty quickly when you try to grade them. Professional cinema cameras record 10 or 12 bits per channel, which is much more flexible in postproduction.
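
The arithmetic behind those figures is simple enough to work through:

```python
# Bit depth determines how many distinct levels each colour channel can take:
# 2^8 = 256 levels per channel at 8 bits, 2^10 = 1024 at 10 bits, 2^12 = 4096
# at 12 bits. The loop below simply works that arithmetic through.

for bits in (8, 10, 12):
    levels = 2 ** bits
    total = levels ** 3  # combinations across the three channels
    print(f"{bits} bits/channel: {levels} levels per channel, "
          f"{total:,} possible colours")
# 8 bits/channel -> ~16.7 million colours ("24-bit colour");
# 10 bits/channel -> ~1.07 billion; 12 bits/channel -> ~68.7 billion.
```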

CIE diagram showing the gamuts of three video standards. D65 is the standard for white.

The third attribute of recorded colour is gamut, the breadth of the spectrum of colours. You may have seen a CIE (Commission Internationale de l’Eclairage) diagram, which depicts the range of colours perceptible by human vision. Triangles are often superimposed on this diagram to illustrate the gamut (range of colours) that can be described by various colour spaces. The three colour spaces you are most likely to come across are, in ascending order of gamut size: Rec.709, an old standard that is still used by many monitors; P3, used by digital cinema projectors; and Rec.2020. The latter is the standard for ultra-HD, and Netflix are already requiring that some of their shows are delivered in it, even though monitors capable of displaying Rec.2020 do not yet exist. Most cinema cameras today can record images in Rec.709 (known as “video” mode on Blackmagic cameras) or a proprietary wide gamut (“film” mode on a Blackmagic, or “log” on others) which allows more flexibility in the grading suite. Note that the two modes also alter the recording of luminance and dynamic range.
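
One way to visualise “ascending order of gamut size” is to compute the area of each colour space’s triangle on the CIE xy diagram from the chromaticity coordinates of its red, green and blue primaries. The shoelace-formula sketch below does exactly that; the coordinates are the commonly quoted values for each standard.

```python
# Chromaticity coordinates (x, y) of each standard's red, green and blue
# primaries, as commonly published. D65 white (0.3127, 0.3290) sits inside all
# three triangles.
PRIMARIES = {
    "Rec.709":  [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)],
    "P3":       [(0.680, 0.320), (0.265, 0.690), (0.150, 0.060)],
    "Rec.2020": [(0.708, 0.292), (0.170, 0.797), (0.131, 0.046)],
}

def triangle_area(points):
    """Shoelace formula for the area of a triangle given three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2

for name, primaries in PRIMARIES.items():
    print(f"{name}: area {triangle_area(primaries):.3f} on the xy diagram")
# Prints in ascending order: Rec.709 (~0.112) < P3 (~0.152) < Rec.2020 (~0.212).
```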

To summarise as simply as possible: chroma subsampling is the proportion of pixels which have colour information, bit depth is the accuracy of that information and gamut is the limits of that info.

That’s all for today. In future posts I will look at how some of the above science leads to colour theory and how cinematographers can make practical use of it.



Camerimage 2017: Wednesday

This is the third and final part of my report from my time at Camerimage, the Polish film festival focused on cinematography. Read part one here and part two here.

 

Up.Grade: Human Vision & Colour Pipelines

I thought I would be one of the few people who would be bothered to get up and into town for this technical 10:15am seminar. But to the surprise of both myself and the organisers, the auditorium of the MCK Orzeł was once again packed – though I’d learnt to arrive in plenty of time to grab a ticket.

Up.grade is an international colour grading training programme. Their seminar was divided into two distinct halves: the first was a fascinating explanation of how human beings perceive colour, by Professor Andrew Stockman; the second was a basic overview of colour pipelines.

Prof. Stockman’s presentation – similar to his TED video above – had a lot of interesting nuggets about the way we see. Here are a few:

  • Our eyes record very little colour information compared with luminance info. You can blur the chrominance channel of an image considerably without seeing much difference; not so with the luminance channel.
  • Light hitting a rod or cone (sensor cells in our retinae) straightens the twist in the carbon double bond of a molecule. It’s a binary (on/off) response and it’s the same response for any frequency of light. It’s just that red, green and blue cones have different probabilities of absorbing different frequencies.
  • There are no blue cones in the centre of the fovea (the part of the retina responsible for detailed vision) because blue wavelengths would be out of focus due to the terrible chromatic aberration of our eyes’ lenses.
  • Data from the rods and cones is compressed in the retina to fit the bandwidth which the optic nerve can handle.
  • Metamers are colours that look the same but are created differently. For example, light with a wavelength of 575nm is perceived as yellow, but a mixture of 670nm (red) and 540nm (green) is also perceived as yellow, because the red and green cones are triggered in the same way in both scenarios. (Isn’t that weird? It’s like being unable to hear the difference between the note D and a combination of the notes C and E. It just goes to show how unreliable our senses really are.)
  • Our perception of colour changes according to its surroundings and the apparent colour of the lighting – a phenomenon perfectly demonstrated by the infamous white-gold/blue-black dress.

All in all, very interesting and well worth getting out of bed for!

At the end of the seminar I caught up with fellow DP Laura Howie, and her friend Ben, over coffee and cake. Then I sauntered over to the Opera Nova and navigated the labyrinthine route to the first-floor lecture theatre, where I registered for the imminent Arri seminar.

 

Arri Seminar: International Support Programme

After picking up my complimentary Arri torch, which was inexplicably disguised as a pen, I bumped into Chris Bouchard. Neither of us held high hopes that the Support Programme would be relevant to us, but we thought it was worth getting the lowdown just in case.

Shooting “Kolkata”

The Arri International Support Programme (ISP) is a worldwide scheme to provide emerging filmmakers with sponsored camera/lighting/grip equipment, postproduction services, and in some cases co-production or sales deals as well. Mandy Rahn, the programme’s leader, explained that it supports young people (though there is no strict age limit) making their first, second or third feature in the $500,000-$5,000,000 budget range. They support both drama and documentary, but not short-form projects, which ruled out any hopes I might have had that it could be useful for Ren: The Girl with the Mark.

Having noted these key details, Chris and I decided to duck out and head elsewhere. While Chris checked out some cameras on the Canon stand, I had a little chat with the reps from American Cinematographer about some possible coverage of The Little Mermaid. We then popped over to the MCK and caught part of a Canon seminar, including a screening of the short documentary Kolkata. Shortly afterwards we were treading the familiar path back to the Opera Nova and the first-floor lecture theatre for a Kodak-sponsored session with Ed Lachman, ASC, only to find it had been cancelled for reasons unknown.

 

Red Seminar: High Resolution Image Processing Pipeline

Next on our radar was a Red panel. I wasn’t entirely sure if I could handle another high resolution seminar, but I suggested we return once more to the MCK anyway and relax in the bar with one eye on the live video feed. Unfortunately we got there to find that the monitors had disappeared, so we had to go into the auditorium, where it was standing room only.

“GLOW” – DP: Christian Sprenger

Light Iron colourist Ian Vertovec was talking about his experience grading the Netflix series GLOW, a highly enjoyable comedy-drama set behind the scenes of an eighties female wrestling show. Netflix wanted the series delivered in high dynamic range (HDR) and wide colour gamut (WCG), of a spec so high that no screens are yet capable of displaying it. In fact Vertovec graded in P3 (the colour space used for cinema projection) which was then mapped to Netflix’s higher specs for delivery. The Rec.709 (standard gamut) version was automatically created from the P3 grade by Dolby Vision software which analysed the episodes frame by frame. Netflix streams a 4,000-nit signal to all viewers, which is then down-converted live (using XML data also generated by the Dolby Vision software) to 100, 650 or 1,000 nits depending on their display. In theory this should provide a consistent image across all screens.

Vertovec demonstrated his image pipeline for GLOW: multi-layer base grade, halation pass, custom film LUT, blur/sharp pass, grain pass. The aim was to get the look of telecined film. The halation pass involved making a copy of the image, keying out all but the highlights, blurring those highlights and layering them back on top of the original footage. I used to do a similar thing to soften Mini-DV footage back in the day!
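
Out of interest, a halation pass along those lines can be sketched in a few lines of Python. The threshold, blur radius and blend strength below are arbitrary illustrative values, not what Light Iron used on GLOW, and the compositing is a simple additive blend rather than a proper screen or grade-node operation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def halation(image, threshold=0.8, blur_sigma=10.0, strength=0.3):
    """image: float array (H x W x 3) with values in 0-1."""
    # Key out all but the highlights.
    highlights = np.where(image > threshold, image, 0.0)
    # Blur those highlights, channel by channel.
    glow = np.stack([gaussian_filter(highlights[..., c], sigma=blur_sigma)
                     for c in range(3)], axis=-1)
    # Layer the blurred highlights back on top of the original footage.
    return np.clip(image + strength * glow, 0.0, 1.0)

frame = np.random.rand(270, 480, 3)   # stand-in for a video frame
softened = halation(frame)
```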

An interesting point was made about practicals in HDR. If you have an actor in front of or close to a practical lamp in frame, it’s a delicate balancing act: the lamp needs to be bright enough to look real, yet not so bright that it hurts your eyes to look at the actor with a dazzling light source next to them. When practicals are further away from your cast they can be brighter, because your eye will naturally track around them as in real life.

Next up was Dan Duran from Red, who explained a new LUT that is being rolled out across their cameras. Most of this went in one ear and out the other!

 

“Breaking Bad”

Afterwards, Chris and I returned to Kung Fusion for another delicious dinner. The final event of the day which I wanted to catch was Breaking Bad‘s pilot episode, screening at Bydgoszcz’s Vue multiplex as part of the festival’s John Toll retrospective. Having binged the entire series relatively recently, I loved seeing the very first episode again – especially on the big screen – with the fore-knowledge of where the characters would end up.

Later Chris introduced me to DP Sebastian Cort, and the three of us decided to try our luck at getting into the Panavision party. We snuck around the back of the venue and into one of the peripheral buildings, only to be immediately collared by a bouncer and sent packing!

This ignoble failure marked the end of my Camerimage experience, more or less. After another drink or two at Cheat we called it a night, and I was on an early flight back to Stansted the next morning. I met some interesting people and learnt a lot from the seminars. There were some complaints that the festival was over-subscribed, and indeed – as I have described – you had to be quick off the mark to get into certain events, but that was pretty much what I had been expecting. I certainly won’t be put off attending again in the future.

To learn more about two of the key issues raised at this year’s Camerimage, check out my Red Shark articles:
