A CCD or CMOS sensor alone cannot detect the color of incident light. Each pixel cavity in the array simply records the intensity of the light that reaches it for as long as the exposure is active. It cannot distinguish how much of each particular color that light contains. But when a color pattern filter is applied to the sensor, each pixel becomes sensitive to only one color: red, green or blue. Because the human eye is more sensitive to green light than to red or blue light, the array has twice as many green sensors as red sensors or blue sensors. The following image shows the color distribution and arrangement of a “Bayer Pattern” filter on a sensor with a size of x * y (with x and y being multiples of 2).
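The repeating structure of such a filter can also be written out in a few lines of code. The following is a minimal sketch, not taken from any camera documentation: the function names are hypothetical, and the choice of a 2 x 2 tile that starts with red ("RGGB") is an assumption, since real sensors may start the tile with a different color.

```python
def bayer_color(col, row, pattern="RGGB"):
    """Return 'R', 'G' or 'B' for the filter over the pixel at (col, row).

    `pattern` lists the colors of the repeating 2 x 2 tile row by row.
    "RGGB" is an assumption made for this sketch.
    """
    return pattern[(row % 2) * 2 + (col % 2)]


def bayer_layout(width, height, pattern="RGGB"):
    """Build the filter layout as a list of strings, one string per sensor row."""
    return ["".join(bayer_color(c, r, pattern) for c in range(width))
            for r in range(height)]


layout = bayer_layout(4, 4)
for line in layout:
    print(line)
# prints:
# RGRG
# GBGB
# RGRG
# GBGB
```

Counting the letters in the printed layout gives 8 green, 4 red and 4 blue positions, which matches the two-to-one ratio described above.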
Since the arrangement of the colors in the Bayer pattern filter is known, an application can use the transmitted raw pixel information to interpolate full RGB color information for each pixel in the camera sensor. Instead of transmitting the raw pixel information, a camera can also convert it to a color coding known as YUV and transmit that. The block diagram below illustrates the conversion process inside a Basler color camera that supports this feature. To keep things simple, we assume that the sensor collects pixel data at an 8 bit depth.
As a first step, an algorithm calculates complete RGB values for every pixel. This means, for example, that even if a pixel is sensitive to green light only, the camera gets full RGB information for the pixel by interpolating the intensity information from adjacent red and blue pixels. This is, of course, just an approximation of the real world. There are many algorithms for RGB interpolation, and the complexity and calculation time of each algorithm will determine the quality of the approximation. Basler color cameras have an effective built-in algorithm for this RGB conversion.
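To make the idea of interpolation concrete, here is a minimal sketch of one very simple approach: each missing color is taken as the average of the neighboring pixels (in a 3 x 3 window) that carry that color. This is only an illustration and not the algorithm built into Basler cameras. The function names and the "RGGB" tile layout are assumptions, and the sensor is assumed to be at least 2 x 2 pixels.

```python
def bayer_color(col, row, pattern="RGGB"):
    """Color of the filter over pixel (col, row); "RGGB" is an assumed layout."""
    return pattern[(row % 2) * 2 + (col % 2)]


def demosaic(raw, pattern="RGGB"):
    """Interpolate full RGB values from raw Bayer data.

    raw: list of rows, each a list of 8 bit intensities.
    Returns a list of rows, each a list of (R, G, B) tuples.
    """
    height, width = len(raw), len(raw[0])
    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            sums = {"R": 0, "G": 0, "B": 0}
            counts = {"R": 0, "G": 0, "B": 0}
            # Collect all pixels in the 3 x 3 window, clipped at the sensor edge.
            for ny in range(max(0, y - 1), min(height, y + 2)):
                for nx in range(max(0, x - 1), min(width, x + 2)):
                    color = bayer_color(nx, ny, pattern)
                    sums[color] += raw[ny][nx]
                    counts[color] += 1
            own = bayer_color(x, y, pattern)
            pixel = {}
            for c in "RGB":
                if c == own:
                    pixel[c] = raw[y][x]  # measured value, kept as is
                else:
                    pixel[c] = round(sums[c] / counts[c])  # interpolated value
            out_row.append((pixel["R"], pixel["G"], pixel["B"]))
        out.append(out_row)
    return out
```

More elaborate algorithms weight the neighbors or follow edges in the image. They cost more calculation time but usually produce fewer color artifacts.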
A disadvantage of RGB conversion is that the amount of data for each pixel is inflated. If a single pixel normally has a depth of 8 bits, after conversion it will have a depth of 8 bits per color (red, green and blue) and will thus have a total depth of 24 bits.
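The effect on the amount of data per frame is easy to work out. The sketch below uses a hypothetical resolution of 640 x 480 pixels, and the helper name is an assumption made for illustration.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Number of bytes needed for one frame at the given depth."""
    return width * height * bits_per_pixel // 8


width, height = 640, 480  # hypothetical sensor resolution

raw_size = frame_bytes(width, height, 8)    # raw Bayer data, 8 bits per pixel
rgb_size = frame_bytes(width, height, 24)   # RGB data, 8 bits per color

print(raw_size, rgb_size, rgb_size / raw_size)
# prints: 307200 921600 3.0
```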
YUV coding converts the RGB signal to an intensity component (Y) that ranges from black to white plus two other components (U and V) which code the color. The conversion from RGB to YUV is linear, occurs without loss of information and does not depend on a particular piece of hardware such as the camera. The standard equations for accomplishing the conversion from RGB to YUV are:
Y = 0.299 * R + 0.587 * G + 0.114 * B
U = 0.493 * (B - Y)
V = 0.877 * (R - Y)
In practice, the coefficients in the equations may deviate a bit due to the dynamics of the sensor used in a particular camera. If you want to know how the RGB to YUV conversion is accomplished in a particular Basler color camera, please refer to the camera’s user manual for the correct coefficients. This information is particularly useful if you want to convert the output from a Basler color camera from YUV back to RGB.
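The following sketch applies the standard equations given above and also inverts them to get from YUV back to RGB. It uses the standard coefficients only, so the constants would need to be replaced by the values from the camera's user manual for a particular camera. The function names are hypothetical. The values are kept as floating point numbers, and U and V can be negative. How a camera scales or offsets them to fit into 8 bits is not covered here.

```python
# Standard coefficients from the equations above.
WR, WG, WB = 0.299, 0.587, 0.114
KU, KV = 0.493, 0.877


def rgb_to_yuv(r, g, b):
    """Convert RGB to YUV using the standard equations."""
    y = WR * r + WG * g + WB * b
    u = KU * (b - y)
    v = KV * (r - y)
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert the equations: solve U and V for B and R, then Y for G."""
    b = y + u / KU
    r = y + v / KV
    g = (y - WR * r - WB * b) / WG
    return r, g, b


y, u, v = rgb_to_yuv(200, 100, 50)
r, g, b = yuv_to_rgb(y, u, v)
print(round(r), round(g), round(b))
# prints: 200 100 50
```

For a gray pixel (R = G = B), Y equals that gray value and U and V are both zero, which shows that U and V carry only the color information.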
The diagram below illustrates how color can be coded with the U and V components and how the Y component codes the intensity of the signal.
This type of conversion is also known as YUV 4:4:4 sampling. With YUV 4:4:4, each pixel gets brightness and color information and the “4:4:4” indicates the proportion of the Y, U and V components in the signal.
To reduce the average amount of data transmitted per pixel from 24 bits to 16 bits, it is more common to include the color information for only every other pixel. This type of sampling is also known as YUV 4:2:2 sampling. Since the human eye is much more sensitive to intensity than it is to color, this reduction is almost invisible even though the conversion represents a real loss of information. YUV 4:2:2 digital output from a Basler color camera has a depth that alternates between 24 bits per pixel and 8 bits per pixel (for an average bit depth of 16 bits per pixel).
As shown in the table below, when a Basler camera is set for YUV 4:2:2 output, each quadlet of image data transmitted by the camera will contain data for two pixels. K represents the number of a pixel in a frame and one row in the table represents a quadlet of data transmitted by the camera.
For every other pixel, both the intensity information and the color information are transmitted and this results in a 24 bit depth for those pixels. For the remaining pixels, only the intensity information is preserved and this results in an 8 bit depth for them. As you can see, the average depth per pixel is 16 bits.
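The packing of two pixels into one quadlet can be sketched as follows. The byte order inside the quadlet (U, Y of pixel K, V, Y of pixel K+1) is an assumption made for this example, so the order actually used by a given camera should be taken from its user manual. The function names are hypothetical, and the input values are assumed to be already scaled to 8 bits.

```python
def pack_yuv422(pixels):
    """Pack (Y, U, V) pixels into quadlets of four 8 bit values.

    U and V are kept only for every other pixel (K, K+2, ...).
    For the pixels in between, only Y is kept.
    """
    if len(pixels) % 2:
        raise ValueError("YUV 4:2:2 packing needs an even number of pixels")
    quadlets = []
    for k in range(0, len(pixels), 2):
        y0, u, v = pixels[k]        # pixel K: intensity and color
        y1, _, _ = pixels[k + 1]    # pixel K+1: intensity only
        quadlets.append((u, y0, v, y1))  # assumed byte order
    return quadlets


def unpack_yuv422(quadlets):
    """Rebuild (Y, U, V) pixels; pixel K+1 reuses the color of pixel K."""
    pixels = []
    for u, y0, v, y1 in quadlets:
        pixels.append((y0, u, v))
        pixels.append((y1, u, v))
    return pixels


pixels = [(10, 100, 200), (20, 110, 210), (30, 120, 220), (40, 130, 230)]
quadlets = pack_yuv422(pixels)
bits_per_pixel = len(quadlets) * 4 * 8 / len(pixels)
print(quadlets, bits_per_pixel)
# prints: [(100, 10, 200, 20), (120, 30, 220, 40)] 16.0
```

Unpacking these quadlets returns the original Y values for all four pixels, but the second and fourth pixels come back with the U and V values of their left neighbors. This is the loss of information mentioned above.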
On all Basler color cameras, you are free to choose between an output mode that provides the raw sensor output for each pixel or a high quality YUV 4:2:2 signal. Some cameras provide RGB / BGR data as well.