A lookup table (LUT) is an array of numbers that may be referred to by subscript. This avoids the recalculation of values each time a number in a table must be referred to. In computer graphics, lookup tables are used to store the starting addresses of each line and the values corresponding to the placement of pixels within a byte.
Professionals working with video- or image-editing software may need to calibrate their monitors to achieve the high level of color accuracy their work requires. Often, the graphics card needs to have more than one LUT in order to calibrate more than one monitor at a time.
As color calibration hardware and software become more mainstream, we recognize the importance of having this information available and easy to find. Intel is currently working on making lookup table information available.
In feature-film visual-effects production, maintaining interactive feedback of high-quality color operations is extraordinarily beneficial to an artist. On the consumer side, enabling the real-time color correction of video and rendered image streams is becoming an increasingly useful tool to shape media's thematic "look." However, directly applying multiple sophisticated color transforms to high-resolution imagery is beyond the real-time capability of modern graphics hardware.
In this chapter, we present an algorithm that leverages three-dimensional lookup tables to enable the real-time color processing of high-resolution imagery. Our approach offers excellent performance characteristics, being independent of both the number of color operators applied and the complexity of the underlying color transform. The techniques presented in this chapter, and variations thereof, have been successfully utilized in the production of numerous motion pictures and should be regarded as "production ready."
Lookup tables (LUTs) are an excellent technique for optimizing the evaluation of functions that are expensive to compute and inexpensive to cache. By precomputing the evaluation of a function over a domain of common inputs, expensive runtime operations can be replaced with inexpensive table lookups. If the table lookups can be performed faster than computing the results from scratch (or if the function is repeatedly queried at the same input), then the use of a lookup table will yield significant performance gains. For data requests that fall between the table's samples, an interpolation algorithm can generate reasonable approximations by taking a weighted average of nearby samples.
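The precompute-then-interpolate idea described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the helper names build_lut and lookup_linear, the 33-sample table size, and the power-curve example are all assumptions chosen for the example.

```python
def build_lut(func, lo, hi, size):
    """Sample func at `size` evenly spaced points across [lo, hi]."""
    step = (hi - lo) / (size - 1)
    return [func(lo + i * step) for i in range(size)]


def lookup_linear(lut, lo, hi, x):
    """Approximate func(x) by blending the two nearest table samples."""
    # Clamp to the table's domain, then convert to a fractional index.
    x = min(max(x, lo), hi)
    pos = (x - lo) / (hi - lo) * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)   # keep i + 1 inside the table
    t = pos - i                       # weight of the upper sample
    return lut[i] * (1.0 - t) + lut[i + 1] * t


# Hypothetical example: cache a power curve in 33 samples.
lut = build_lut(lambda v: v ** 2.2, 0.0, 1.0, 33)
approx = lookup_linear(lut, 0.0, 1.0, 0.4)
exact = 0.4 ** 2.2
```

For a smooth function such as this one, the interpolated value stays close to the exact result even with a small table; a function with sharp features would need denser sampling.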
A lookup table is characterized by its dimensionality, that is, the number of indices necessary to index an output value. The simplest LUTs are indexed by a single variable and thus referred to as one-dimensional (or 1D) LUTs.
Consider an analytical color operator, f(x), applied to an 8-bit grayscale image. The naive implementation would be to step through the image and evaluate the function for each pixel. However, one may observe that no matter how complex the function, it can evaluate to only one of 256 output values (one for each unique input). Thus, an alternate implementation would be to tabulate the function's result for each possible input value, then to transform each pixel at runtime by looking up the stored solution. Assuming that integer table lookups are efficient (they are), and that the rasterized image has more than 256 total pixels (it likely does), using a LUT will lead to a significant speedup.
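As a rough illustration of this tabulate-once, look-up-per-pixel approach, the Python sketch below builds a 256-entry table and applies it with the standard bytes.translate method. The brighten operator and the helper names are hypothetical stand-ins for f(x), not part of the original text.

```python
def make_lut8(func):
    """Tabulate func once for every possible 8-bit input (0..255)."""
    return bytes(min(255, max(0, round(func(v)))) for v in range(256))


def apply_lut8(pixels, lut):
    """Transform every pixel with a single table lookup."""
    # bytes.translate maps each byte through a 256-entry table.
    return pixels.translate(lut)


def brighten(v):
    """Hypothetical color operator: a square-root brightening curve."""
    return 255.0 * (v / 255.0) ** 0.5


lut = make_lut8(brighten)
image = bytes([0, 64, 128, 255])   # a tiny 4-pixel grayscale "image"
result = apply_lut8(image, lut)
```

However expensive brighten is, it runs exactly 256 times; the per-pixel cost is one indexed read.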
Three-dimensional lookup tables offer the obvious solution to the inherent limitation of single-dimensional LUTs, allowing tabular data indexed on three independent parameters, as shown in Figure 24-3.
Interpolation algorithms allow lookup tables to generate results when queried for values between sample points. The simplest method, nearest-neighbor interpolation, is to find and return the nearest table entry. Although this method is fast (requiring only a single lookup), it typically yields discontinuous results and is thus rarely utilized in image processing.
Linear interpolation is adapted to 3D data sets by successively applying 1D linear interpolation along each of the three axes (hence the designation trilinear interpolation). By generating intermediate results based on a weighted average of the eight corners of the bounding cube, this algorithm is typically sufficient for color processing, and it is commonly implemented in graphics hardware. Higher-order interpolation functions use progressively more samples in the reconstruction function, though at significantly higher computation costs. (Straightforward cubic interpolation requires 3³ = 27 texture lookups, but see Chapter 20 of this book, "Fast Third-Order Texture Filtering," for a technique that reduces this to 8 lookups.)
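The successive 1D interpolations can be written out on the CPU as follows. This Python sketch only illustrates the arithmetic that graphics hardware performs for a trilinear fetch; the nested-list LUT layout and the names make_lut3d, lerp, and sample_trilinear are assumptions made for the example.

```python
def make_lut3d(func, size):
    """Sample func(r, g, b) on a size^3 lattice spanning [0, 1]^3."""
    n = size - 1
    return [[[func(r / n, g / n, b / n) for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def lerp(a, b, t):
    """Linearly blend two RGB tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def sample_trilinear(lut, rgb):
    """Weighted average of the eight lattice points surrounding rgb."""
    size = len(lut)
    idx, frac = [], []
    for c in rgb:
        pos = min(max(c, 0.0), 1.0) * (size - 1)
        i = min(int(pos), size - 2)   # lower corner of the bounding cube
        idx.append(i)
        frac.append(pos - i)
    (r, g, b), (tr, tg, tb) = idx, frac
    # Four interpolations along blue, two along green, one along red.
    c00 = lerp(lut[r][g][b], lut[r][g][b + 1], tb)
    c01 = lerp(lut[r][g + 1][b], lut[r][g + 1][b + 1], tb)
    c10 = lerp(lut[r + 1][g][b], lut[r + 1][g][b + 1], tb)
    c11 = lerp(lut[r + 1][g + 1][b], lut[r + 1][g + 1][b + 1], tb)
    return lerp(lerp(c00, c01, tg), lerp(c10, c11, tg), tr)


# Hypothetical transform: swap red and blue, halve green.
lut = make_lut3d(lambda r, g, b: (b, 0.5 * g, r), 5)
out = sample_trilinear(lut, (0.3, 0.6, 0.9))
```

Because the example transform is linear in each channel, trilinear interpolation reproduces it up to floating-point rounding; a nonlinear transform would show a small error that shrinks as the lattice grows.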
At runtime, we transform our input image by sampling the color in the normal fashion and then by performing a dependent texture lookup into the 3D texture (using the result of the 2D texture lookup as the 3D texture's input indices). The output of the 3D texture is our final, color-transformed result.
The fragment shader code, shown in Listing 24-1, is almost as simple as you would expect. The efficiency of this approach is readily apparent, as the entire color-correction process is reduced to a single (3D) texture lookup.
Observe that as the size of the LUT grows large, the correction factor becomes negligible. For example, with a 4,096-entry lookup table (a common 1D LUT size), the scaling factor can be safely ignored, as the error is smaller than the precision of the half pixel format. However, for the small lattice sizes common to 3D LUTs, the effect is visually significant and cannot be ignored; uncorrected 8-bit scaling errors on a 32x32x32 LUT are equivalent to clamping all data outside of [4..251]!
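The [4..251] figure can be checked with a little arithmetic. The Python sketch below assumes the usual texel-center convention, in which sample i of an N-entry axis sits at coordinate (i + 0.5) / N, and assumes the correction takes the common form scale = (N - 1) / N, offset = 1 / (2N); neither formula is reproduced in this excerpt, and the function names are illustrative.

```python
def lut_coord(value, lut_size):
    """Map a normalized input in [0, 1] onto the texel centers of a LUT axis."""
    scale = (lut_size - 1.0) / lut_size   # assumed form of the correction
    offset = 1.0 / (2.0 * lut_size)
    return value * scale + offset


def uncorrected_clamp_range(lut_size, max_code=255):
    """Code values that land on the first and last texel centers
    when the coordinate is used without correction."""
    half_texel = 0.5 / lut_size
    return (round(max_code * half_texel),
            round(max_code * (1.0 - half_texel)))


lo, hi = uncorrected_clamp_range(32)   # (4, 251)
first = lut_coord(0.0, 32)             # 1/64, the center of texel 0
last = lut_coord(1.0, 32)              # 63/64, the center of texel 31
```

With 32 entries per axis, half a texel is 1/64 of the range, or about 4 code values out of 255 at each end. With 4,096 entries the same half texel is 1/8192, which is why the correction can be dropped for large 1D tables.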
This code has obvious acceleration opportunities. First, as the scale and offset factors are constant across the image, they can be computed once and passed in as constants. (Or even better, they can be directly compiled into the fragment shader.) Second, it is good practice to use the smallest LUT that meets your needs (but no smaller). For "primary grading" color corrections, where one only modifies the color of the primaries (RGB) and secondaries (CMY), a 2x2x2 table will suffice! Finally, for consumer-grade applications, an integer texture format may suffice (of course, still with trilinear interpolation enabled). But be aware that 8 bits is not sufficient to prevent banding in LUT color transforms (Blinn 1998).
We cannot create a tabular form of a color transform if we do not first define the bounds of the table. We thus must choose minimum and maximum values to represent in the lookup table. The subtlety is that if we define a maximum that is too low, then we will unnecessarily clamp our data. However, if we define a maximum that is too high, then we will needlessly throw away table precision. (Which means our LUT sampling will be insufficient to re-create all but the smoothest of color transforms.)
We thus want to place our samples in the most visually significant locations, which typically occur closer to the dark end of the gamut. We achieve this effect by wrapping our 3D lookup-table transform with a matched pair of 1D "shaper" LUTs, as shown in Figure 24-7. The ideal shaper LUT maps the input HDR color space to a normalized, perceptually uniform color space. The 3D LUT is then applied normally (though during its computation, this 1D transform must be accounted for). Finally, the image is mapped through the inverse of the original 1D LUT, "unwrapping" the pixel data back into their original dynamic range.
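The wrap-and-unwrap structure can be sketched as below. This is only an illustration under stated assumptions: a base-2 logarithm stands in for the "perceptually uniform" shaper, the HDR bounds of 2^-8 and 2^4 are arbitrary, the shaper is evaluated analytically rather than tabulated, and lut3d_fn is a hypothetical callable standing in for the 3D LUT lookup.

```python
import math

# Hypothetical HDR bounds for the table (assumed, not from the text).
HDR_MIN = 2.0 ** -8
HDR_MAX = 2.0 ** 4
LOG_MIN = math.log2(HDR_MIN)
LOG_RANGE = math.log2(HDR_MAX) - LOG_MIN


def shape(x):
    """Forward shaper: HDR linear value -> normalized log domain in [0, 1]."""
    x = min(max(x, HDR_MIN), HDR_MAX)
    return (math.log2(x) - LOG_MIN) / LOG_RANGE


def unshape(y):
    """Inverse shaper: normalized log domain -> HDR linear value."""
    return 2.0 ** (y * LOG_RANGE + LOG_MIN)


def apply_wrapped(lut3d_fn, rgb):
    """Shape, apply the 3D LUT in the shaped space, then unshape."""
    shaped = tuple(shape(c) for c in rgb)
    graded = lut3d_fn(shaped)   # the LUT must be built for shaped inputs
    return tuple(unshape(c) for c in graded)


# With an identity "LUT", the pixel should round-trip unchanged.
pixel = (0.01, 1.0, 8.0)
round_trip = apply_wrapped(lambda c: c, pixel)
dark_share = shape(1.0)   # fraction of the table spent on values <= 1.0
```

With these bounds, values at or below 1.0 occupy two thirds of the shaped range, so most lattice points fall in the darker region where the text says they matter most.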
This document describes PNG (Portable Network Graphics), an extensible file format for the lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. Sample depths range from 1 to 16 bits.
This International Standard specifies a datastream and an associated file format, Portable Network Graphics (PNG, pronounced "ping"), for a lossless, portable, compressed individual computer graphics image transmitted across the Internet. Indexed-colour, greyscale, and truecolour images are supported, with optional transparency. Sample depths range from 1 to 16 bits. PNG is fully streamable with a progressive display option. It is robust, providing both full file integrity checking and simple detection of common transmission errors. PNG can store gamma and chromaticity data as well as a full ICC colour profile for accurate colour matching on heterogenous platforms. This Standard defines the Internet Media type "image/png". The datastream and associated file format have value outside of the main design goal.
The transform to be applied depends on the nature of the image samples and their precision. If the samples represent light intensity in floating-point or high precision integer form (perhaps from a computer graphics renderer), the encoder may perform "gamma encoding" (applying a power function with exponent less than 1) before quantizing the data to integer values for inclusion in the PNG datastream. This results in fewer banding artifacts at a given sample depth, or allows smaller samples while retaining the same visual quality. An intensity level expressed as a floating-point value in the range 0 to 1 can be converted to a datastream image sample by:
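The conversion formula itself is not reproduced in this excerpt. The Python sketch below uses a round-to-nearest form, floor((2^sampledepth - 1) * intensity^exponent + 0.5), which is believed to match the PNG specification; treat the exact form, the function name encode_sample, and the default exponent of 1/2.2 as assumptions.

```python
import math


def encode_sample(intensity, sample_depth=8, encoding_exponent=1 / 2.2):
    """Gamma-encode a linear intensity in [0, 1] and quantize it to an integer."""
    max_code = (1 << sample_depth) - 1                 # 255 for 8-bit samples
    encoded = intensity ** encoding_exponent           # exponent < 1 brightens darks
    return math.floor(max_code * encoded + 0.5)        # round to nearest code


# Dark intensities receive many more codes than a linear mapping would give.
dark_gamma = encode_sample(0.01)                           # gamma-encoded
dark_linear = encode_sample(0.01, encoding_exponent=1.0)   # plain linear scaling
```

Comparing dark_gamma with dark_linear shows why gamma encoding reduces banding: the darkest 1% of intensities spans far more code values after the power function is applied.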
The quality of rendering can be improved substantially by using a palette chosen specifically for the image, since a colour cube usually has numerous entries that are unused in any particular image. This approach requires more work, first in choosing the palette, and second in mapping individual pixels to the closest available colour. PNG allows the encoder to supply suggested palettes, but not all encoders will do so, and the suggested palettes may be unsuitable in any case (they may have too many or too few colours). Therefore, high-quality viewers will need to have a palette selection routine at hand. A large lookup table is usually the most feasible way of mapping individual pixels to palette entries with adequate speed.
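One way such a table could be organised is sketched below in Python. This is an assumption-laden illustration, not a method prescribed by the PNG specification: each 8-bit channel is quantized to 4 bits, giving a 4,096-entry table whose cells store the index of the nearest palette colour by squared RGB distance. The function names and the 4-bit choice are hypothetical.

```python
def nearest_index(palette, rgb):
    """Index of the palette entry with the smallest squared RGB distance."""
    return min(range(len(palette)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(palette[i], rgb)))


def build_palette_lut(palette, bits=4):
    """Precompute the nearest palette index for every quantized RGB cell."""
    levels = 1 << bits
    shift = 8 - bits
    half = (1 << shift) >> 1   # offset to the centre of each cell
    lut = []
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                centre = ((r << shift) + half,
                          (g << shift) + half,
                          (b << shift) + half)
                lut.append(nearest_index(palette, centre))
    return lut


def map_pixel(lut, rgb, bits=4):
    """Map an 8-bit RGB pixel to a palette index with one table lookup."""
    shift = 8 - bits
    r, g, b = (c >> shift for c in rgb)
    return lut[((r << bits | g) << bits) | b]


# Hypothetical four-colour palette: black, white, red, blue.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
lut = build_palette_lut(palette)
index = map_pixel(lut, (250, 5, 5))   # a near-red pixel
```

The expensive nearest-colour search runs once per table cell rather than once per pixel. The trade-off is that pixels near a cell boundary may map to a slightly suboptimal entry; more bits per channel reduce that error at the cost of a larger table.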