Image Processing with gluas

introduction to pixel molding

Øyvind Kolås


Table of Contents

Preface
1. Digital image representation
Vector images
Coordinate system
Defining shapes
Bitmap images
Sampling
Image compression
File formats and applications
2. The gluas environment
Installing gluas
Windows
In Linux, from source
Using gluas
The lua language
Setting and getting pixel values
Shadow buffer
Samples
A simple program
Gluas reference
Functions
Variables
Gluas compatible software
3. Visual Debugging Aids
Histogram
Chromaticity Diagram
Luminance Profile
4. Point operations
Threshold
Brightness
Contrast
Brightness + Contrast
Invert
Gamma
Levels
Curves
Lookup tables
Excercises
1. Exchange Red and Green
2. Value boost
3. Checkerboard
4. Position dependent contrast
5. Area operations
Sampling area
Edge handling
Convolution Matrix
Smoothing filters
Box blur
Gaussian blur
Speeding up blur by decomposing
Edge preserving
Rank/statistical filters
Median
Unsharp mask
Misc area filters.
Jitter
Excercises
1. Extend the convolve example
2. Adaptive blur
3. Circle
6. Resampling
Affine transformations
Translation
Rotation
Scaling
Other transformations
Perspective transform
Lens correction
Interpolation
Bilinear interpolation
Bicubic interpolation
Decimation
Trilinear interpolation
Excercises
1. Interpolation error visualization
2. Study the assignment
7. Automatic color adjustments
Contrast stretching
Automatic white balance
Grayworld assumption
Component stretching
Automatic spatial algorithms
Local Contrast Enhancement
Excercises
1. Problems with the Local Contrast Enhancement filter
8. Countering artifacts
Static sensor noise
Film grain
CCD sampling noise
Quantification
Interlacing
Line doubling
Linear blend
Temporal resampling
Aliasing
Color casts
A. Assignment, spring 2007
Introduction
Goal
Report
Dataset
Hints

List of Figures

1.1. Coordinate system.
1.2. Vector image
1.3. Raster image
1.4. Sampling grid
1.5. Sample depth
1.6. RGB bands
1.7. Indexed image
1.8. Run-length encoding
1.9. JPEG generation loss
2.1. Screenshot of gluas
2.2. basic gluas program
3.1. histogram
3.2. chromaticity diagram
3.3. luminance profile
4.1. threshold
4.2. brightness
4.3. brightness subtraction
4.4. contrast
4.5. brightness and contrast
4.6. invert
4.7. gamma
4.8. The GIMP's levels tool
4.9. levels
4.10. The GIMP's curves tool
5.1. Sliding window
5.2. Over the edge
5.3. convolve
5.4. blur
5.5. kuwahara
5.6. snn
5.7. median
5.8. jitter
6.1. translation
6.2. rotation
6.3. scaling
6.4. bilinear
6.5. bicubic
6.6. box decimate
7.1. contrast stretching
7.2. grayworld assumption
7.3. component stretching
7.4. lce

List of Tables

1.1. Common/encountered raster dimensions
1.2. Vector File Formats
1.3. Raster File Formats

Colophon

Recursivly look for new association when association fails during naming of a method or class.

--Anonymous Coder

This text was written according to the DocBook XML DTD. Both manually using vim, and through custom ruby scripts.

[Tip]Printing

If you need to print the entire document, use the single xhtml output image_processing.html. Firefox is known to provide a nice print-out using the CSS applied.

xsltproc, db2latex pdftex and latex-ucs are used to generate the PDF version , while xsltproc and Norman Walsh's XSL stylesheets are used for the XHTML version. (validate)

The typographical layout is done with a CSS stylesheet using the XHTML translation of DocBook as a microformat.

[Caution]Caution

The CSS2 stylesheet works with Internet Explorer 7.0 or more modern browsers (FireFox, Safari, Konqueror and Opera).

For figures showing image processing filters with accompanying source code the output shown is created from the exact code used by the figure making sure the output is not out of sync.

The illustrations have been made using gluas, the GNU Image Manipulation Program,, cairo/rcairo and ImageMagick's convert.

This text has been written in various places, Gjøvik, Oslo, Berlin and Brussels.

[Note]Note to Developers/Writers

This document can be seen as a collaborative effort, patches are welcome. New algorithms and/or a port to nickle would be most welcome.

Preface

The rays, to speak properly, are not colored; in them there is nothing else than a certain power and disposition to stir up a sensation of this or that color.

--Sir Isaac Newton

This text tries to explain the craft of digital alteration and conjuring of images to your own liking. It attempts doing so from a programmers mind-set rather than a mathematical one.

The text presumes that the reader is able to understand code to a certain extent, and that the combination of the code showing the actual logical steps involved combined with a basic introduction to the same procedure in english will provide an understanding of what is happening to the image data and why the algorithms yield the output images they do.

Chapter 1. Digital image representation

Virtual image, a point or system of points, on one side of a mirror or lens, which, if it existed, would emit the system of rays which actually exists on the other side of the mirror or lens.

--Clerk Maxwell

Vector images

One way to describe an image using numbers is to declare its contents using position and size of geometric forms and shapes like lines, curves, rectangles and circles; such images are called vector images.

Coordinate system

We need a coordinate system to describe an image, the coordinate system used to place elements in relation to each other is called user space, since this is the coordinates the user uses to define elements and position them in relation to each other.

Figure 1.1. Coordinate system.

Coordinate system.

The coordinate system used for all examples in this document has the origin in the upper left, with the x axis extending to the right and y axis extending downwards.

Defining shapes

It would have been nice to make a smiling face, instead of the dissatisfied face on the left, by using a bezier curve, or the segment of a circle this could be achieved, this being a text focusing mainly on raster graphics though, that would probably be too complex.

A simple image of a face can be declared as follows:

Figure 1.2. Vector image

draw circle
     center        0.5, 0.5
     radius        0.4
     fill-color    yellow
     stroke-color  black
     stroke-width  0.05
draw circle
     center        0.35, 0.4
     radius        0.05
     fill-color    black
draw circle
     center        0.65, 0.4
     radius        0.05
     fill-color    black
draw line
     start         0.3, 0.6
     end           0.7, 0.6
     stroke-color  black
     stroke-width  0.1
         
Vector image
A vector image of a face, and the instructions used to create the image.

The preceding description of an image can be seen as a “cooking recipe” for how to draw the image, it contains geometrical primitives like lines, curves and cirles describing color as well as relative size, position and shape of elements. When preparing the image for display is has to be translated into a bitmap image, this process is called rasterization.

A vector image is resolution independent, this means that you can enlarge or shrink the image without affecting the output quality. Vector images are the preferred way to represent Fonts, Logos and many illustrations.

Bitmap images

Bitmap-, or raster [1] -, images are “digital photographs”, they are the most common form to represent natural images and other forms of graphics that are rich in detail. Bitmap images is how graphics is stored in the video memory of a computer. The term bitmap refers to how a given pattern of bits in a pixel maps to a specific color.

[Note]Note

In the other chapters of introduction to image molding, raster images is the only topic.

Figure 1.3.  Raster image

Raster image
A rasterized form of the letter 'a' magnified 16 times using pixel doubling

A bitmap images take the form of an array, where the value of each element, called a pixel picture element, correspond to the color of that portion of the image. Each horizontal line in the image is called a scan line.

The letter 'a' might be represented in a 12x14 matrix as depicted in Figure 3., the values in the matrix depict the brightness of the pixels (picture elements). Larger values correspond to brighter areas whilst lower values are darker.

Sampling

When measuring the value for a pixel, one takes the average color of an area around the location of the pixel. A simplistic model is sampling a square, this is called a box filter, a more physically accurate measurement is to calculate a weighted Gaussian average (giving the value exactly at the pixel coordinates a high weight, and lower weight to the area around it). When perceiving a bitmap image the human eye should blend the pixel values together, recreating an illusion of the continuous image it represents.

Raster dimensions

The number of horizontal and vertical samples in the pixel grid is called Raster dimensions, it is specified as width x height.

Resolution

Resolution is a measurement of sampling density, resolution of bitmap images give a relationship between pixel dimensions and physical dimensions. The most often used measurement is ppi, pixels per inch [2].

Figure 1.4. Sampling grid

Sampling grid
A rasterized form of the letter 'a' magnified 16 times, where each pixel is represented as a circle instead of a square.

Megapixels

Megapixels refer to the total number of pixels in the captured image, an easier metric is raster dimensions which represent the number of horizontal and vertical samples in the sampling grid. An image with a 4:3 aspect ratio with dimension 2048x1536 pixels, contain a total of 2048x1535=3,145,728 pixels; approximately 3 million, thus it is a 3 megapixel image.

Table 1.1. Common/encountered raster dimensions

DimensionsMegapixelsNameComment
640x4800.3VGAVGA
720x5760.4CCIR 601 DV PALDimensions used for PAL DV, and PAL DVDs
768x5760.4CCIR 601 PAL fullPAL with square sampling grid ratio
800x6000.4SVGA 
1024x7680.8XGAThe currently (2004) most common computer screen dimensions.
1280x9601.2  
1600x12002.1UXGA 
1920x10802.11080i HDTVinterlaced, high resolution digital TV format.
2048x15363.12KTypically used for digital effects in feature films.
3008x19605.3  
3088x20566.3  
4064x270411.1  

Scaling / Resampling

When we need to create an image with different dimensions from what we have we scale the image. A different name for scaling is resampling, when resampling algorithms try to reconstruct the original continous image and create a new sample grid.

Scaling image down

The process of reducing the raster dimensions is called decimation, this can be done by averaging the values of source pixels contributing to each output pixel.

Scaling image up

When we increase the image size we actually want to create sample points between the original sample points in the original raster, this is done by interpolation the values in the sample grid, effectivly guessing the values of the unknown pixels[3].

Sample depth

The values of the pixels need to be stored in the computers memory, this means that in the end the data ultimately need to end up in a binary representation, the spatial continuity of the image is approximated by the spacing of the samples in the sample grid. The values we can represent for each pixel is determined by the sample format chosen.

Figure 1.5. Sample depth

Sample depth
The same image width varying sample depths, note that high frequency areas (detailed areas) have an OK look earlier than low frequency areas.

8bit

A common sample format is 8bit integers, 8bit integers can only represent 256 discrete values (2^8 = 256), thus brightness levels are quantized into these levels.

12bit

For high dynamic range images (images with detail both in shadows and highlights) 8bits 256 discrete values does not provide enough precision to store an accurate image. Some digital cameras operate with more than 8bit samples internally, higher end cameras (mostly SLRs) also provide RAW images that often are 12bit (2^12bit = 4096).

16bit

The PNG and TIF image formats supports 16bit samples, many image processing and manipulation programs perform their operations in 16bit when working on 8bit images to avoid quality loss in processing.

Floating point

Some image formats used in research and by the movie industry store floating point values. Both "normal" 32bit floating point values and a special format called half which uses 16bits/sample. Floating point is useful as a working format because quantization and computational errors are kept to a minimum until the final render.

Floating point representations often include HDR, High Dynamic Range. High Dynamic Range images are images that include sampling values that are whiter than white (higher values than 255 for a normal 8bit image). HDR allows representing the light in a scene with a greater degree of precision than LDR, Low Dynamic Range images.

Colors

The most common way to model color in Computer Graphics is the RGB color model, this corresponds to the way both CRT monitors and LCD screens/projectors reproduce color. Each pixel is represented by three values, the amount of red, green and blue. Thus an RGB color image will use three times as much memory as a gray-scle image of the same pixel dimensions.

Figure 1.6. RGB bands

RGB bands
Color image built up of bands of red, green and blue color (This is image illustrates how a laptop display is constructed, note that this image is preferred to be viewed on a computer screen, and not in print.)

One of the most common pixel formats used is 8bit rgb where the red, green and blue values are stored interleaved in memory. This memory layout is often referred to as chunky, storing the components in seperate buffers is called planar, and is not as common.

Palette / Indexed images

It was earlier common to store images in a palletized mode, this works similar to a paint by numbers strategy. We store just the number of the palette entry used for each pixel. And for each palette entry we store the amount of red, green and blue light.

Figure 1.7. Indexed image

Indexed image
On the left, image using just 16 colors, on the right the palette used for this image. The way an indexed/paletted image works is similar to how paint by numbers work.

Image compression

Bitmap images take up a lot of memory, image compression reduces the amount of memory needed to store an image. For instance a 2.1 megapixel, 8bit RGB image (1600x1200) occupies 1600x1200x3 bytes = 5760000 bytes = 5.5 megabytes, this is the uncompressed size of the image.

Compression ratio is the ratio between the compressed image and the uncompressed image, if the example image mentioned above was stored as a 512kb jpeg file the compression ratio would be 0.5mb : 5.5mb = 1:11.

Lossless Image Compression

When an image is losslessly compressed, repetition and predictability is used to represent all the information using less memory. The original image can be restored. One of the simplest lossless image compression methods is run-length encoding. Run-length encoding encodes consecutive similar values as one token in a data stream.

Figure 1.8. Run-length encoding

Run-length encoding
70,
5, 25,
5, 27,
4, 26,
4, 25,
6, 24,
6, 23,
3, 2, 3, 22,
3, 2, 3, 21,
3, 5, 2, 20,
3, 5, 2, 19,
3, 7, 2, 18,
3, 7, 2, 17,
14, 16,
14, 15,
3, 11, 2, 14,
3, 11, 2, 13,
3, 13, 2, 12,
3, 13, 2, 11,
3, 15, 2, 10,
3, 15, 2, 8,
6, 12, 6, 6,
6, 12, 6, 64
           

In Figure 1.8, “Run-length encoding” a black and white image of a house has been compressed with run length encoding, the bitmap is considered as one long string of black/or white pixels, the encoding is how many bytes of the same color occur after each other. We'll further reduce the amount of bytes taken up by these 72 numerical values by having a maximum span length of 15, and encoding longer spans by using multiple spans separated by zero length spans of the other color.

70,                     15,  0, 15,  0, 15,  0, 10,
 5, 25,                  5, 15,  0, 10,
 5, 27,                  6, 15,  0, 12,
 4, 26,                  4, 15,  0, 11,
 4, 25,                  4, 15,  0, 10,
 6, 24,                  6, 15,  0,  9,
 6, 23,                  6, 15,  0,  8,
 3,  2, 3, 22,           3,  2,  3, 15,  0,  7,
 3,  2, 3, 21,           3,  2,  3, 15,  0,  6,
 3,  5, 2, 20,           3,  5,  2, 15,  0,  5,
 3,  5, 2, 19,           3,  5,  2, 15,  0,  4,
 3,  7, 2, 18,           3,  7,  2, 15,  0,  3,
 3,  7, 2, 17,           3,  7,  2, 15,  0,  2
14, 16,                 14, 15,  0,  1
14, 15,                 14, 15,
 3, 11, 2, 14,           3, 11,  2, 14,
 3, 11, 2, 13,           3, 11,  2, 13,
 3, 13, 2, 12,           3, 13,  2, 12,
 3, 13, 2, 11,           3, 13,  2, 11,
 3, 15, 2, 10,           3, 15,  2, 10,
 3, 15, 2,  8,           3, 15,  2,  8,
 6, 12, 6,  6,           6, 12,  6,  6,
 6, 12, 6, 64            6, 12,  6, 15, 0, 15, 0, 15, 0, 15, 0, 4
  

The new encoding is 113 nibbles long, a nibble i 4bit and can represent the value 0--4, thus we need 57 bytes to store all our values, which is less than the 93 bytes we would have needed to store the image as a 1bit image, and much less than the 750 bytes needed if we used a byte for each pixel. Run length encoding algorithms used in file formats would probably use additional means to compress the RLE stream achieved here.

Lossy Image Compression

Lossy image compression takes advantage of the human eyes ability to hide imperfection and the fact that some types of information are more important than others. Changes in luminance are for instance seen as more significant by a human observer than change in hue.

JPEG is a file format implementing compression based on the Discrete Cosine Transform DCT, together with lossless algorithms this provides good compression ratios. The way JPEG works is best suited for images with continuous tonal ranges like photographs, logos, scanned text and other images with lot's of sharp contours / lines will get more compression artifacts than photographs.

Loss through Generations

Lossy compression algorithms should not be used as a working format, only final copies should be saved as jpeg since loss accumulates over generations.

Figure 1.9. JPEG generation loss

JPEG generation loss

An image specially constructed to show the deficiencies in the JPEG compression algorithm, saved, reopened and saved again 9 times.

JPEG is most suited for photographics content where the adverse effect of the compression algorithm is not so evident.

JPEG is not suited as an intermediate format, only use JPEG for final distribution where filesize actually matters.


File formats and applications

Many applications have their own internal file format, while other formats are more suited for interchange of data. Table ref# lists some common image formats.

Table 1.2. Vector File Formats

ExtensionNameNotes
.aiAdobe Illustrator DocumentNative format of Adobe Illustrator (based on .eps)
.epsEncapsulated PostscriptIndustry standard for including vector graphics in print
.psPostScriptVector based printing language, used by many Laser printers, used as electronic paper for scientific purposes.
.pdfPortable Document FormatModernized version of ps, adopted by the general public as 'electronic print version'
.svgScalable Vector GraphicsXML based W3C standard, incorporating animation, gaining adoption.
.swfShockwave FlashBinary vector format, with animation and sound, supported by most major web browsers.

Table 1.3. Raster File Formats

ExtensionNameNotes
.jpgJoint Photographic Experts GroupLossy compression format well suited for photographic images
.pngPortable Network GraphicsLossless compression image, supporting 16bit sample depth, and Alpha channel
.gifGraphics Interchange Format8bit indexed bitmap format, is superceded by PNG on all accounts but animation
.exrEXRHDR, High Dynamic Range format, used by movie industry..
.raw, .rawRaw image fileDirect memory dump from a digital camera, contains the direct imprint from the imaging sensor without processing with whitepoint and gamma corrections. Different cameras use different extensions, many of them derivatives of TIFF, examples are .nef, .raf and .crw
.dgnDigital NegativeA subset/clarification of TIFF, created by Adobe to provide a standard for storing RAW files, as well as exchanging RAW image data between applications.
.tiff, .tifTagged Image File Format 
.psdPhotoshop DocumentNative format of Adobe Photoshop, allows layers and other structural elements
.xcfGimp Project FileGIMP's native image format.



[1] raster n: formation consisting of the set of horizontal lines that is used to form an image on a CRT

[2] The difference between ppi and dpi, is the difference between pixels and dots - pixels can represent multiple values, whilst a dot is a monochrome spot of ink or toner of a single colorant as produced by a printer. Printers use a process called half toning to create a monochrome pattern the simulates a range of intensity levels.

[3] When using the digital zoom of a camera, the camera is using interpolating to guess the values that are not present in the image. Capturing an image at the maximum analog zoom level, and doing the post processing of cropping and rescaling on the computer will give equal or better results.

Chapter 2. The gluas environment

Gluas is a gimp plug-in providing a enviroment for testing algorithms for image processing. The environment contains a simple editor for entering the algorithms. It uses the lua interpreter.

Gluas can be downloaded from Gluas download page, to be able to use gluas properly you also need an installation of The GIMP.

Figure 2.1. Screenshot of gluas

Screenshot of gluas

Installing gluas

Windows

[Note]Note

Note that the author doesn't know the windows installation procedure in detail. Any contribution to clarify would be welcome.

System wide

The gluas installer on the website provides for a system wide installation of gluas, this is the easiest way to install gluas,

[Caution]Caution

If The GIMP has been installed in a location where the user does not have write privileges, you need to do a per user install.

Per user

Placing the gluas binary (not the installation package) in your users plug-in folder (the location of this folder was announced upon first running The GIMP.

In Linux, from source

Procedure 2.1. Installing gluas from source

  1. Preqrequisites

    You need The GIMP's developer files (headers), as well as an installed lua runtime, with development files (headers). In Debian™ the following packages, and their dependencies need to be installed:

    • liblua50
    • liblua50-dev
    • liblualib50
    • liblualib50-dev
    • gimp
    • libgimp2.0
    • libgimp2.0-dev

    For syntax highlighting libgtksourceview must be installed as well.

  2. Download

    The latest release of the gluas source can be found in the files subfolder of the gluas website. The latest version is the version with the highest version number, at the time of writing that was 0.1.19.

    At the commandline, entering:

    wget http://pippin.gimp.org/plug-ins/gluas/files/gluas-0.1.19.tar.gz

    should be sufficient to complete this step, although checking for a never version using a web browser is recommended.

  3. Extract

    tar xvzf gluas-0.1.19.tar.gz

    The characters after the command tar indicates what the tar applications should do, the meaning of the used characters are:

    x

    extract

    v

    verbose

    z

    ungzip (uncompress, replace this with j, for .tar.bz2 files)

    f

    the input to tar is a file (as opposed to data coming on standard input)

  4. Enter the gluas directory

    cd gluas-0.1.19
  5. Configure

    ./configure

    This will prepare compilation of the gluas plug-in for your specific system, if any components/packages that are needed are not found, the configure script will complain.

    [Tip]Tip

    Running configure takes a little while, prepare a cup of tea while it runs.

  6. Make

    make

    Compiling gluas should not take a long time, the binary produced can either be manually placed in the users private plug-in folder, or be installed systemwide.

    [Tip]Relax

    If you are following the previous tip, you can now drink your tea.

  7. Install

    We need to be the super user to install the application, thus we need to become root.

    su
    Password: joshua
    [Caution]Caution

    If you already are root, ask someone who knows why this is stupid.

    Now we issue the command to do the actual installation of our compiled program.

    make install

    The next time you start the Gimp, gluas should be available from the menu system at: FiltersGenericgluas.

Using gluas

The enviroment set up when processing starts, sets the global variables width and height, functions for setting and getting pixels are provided

The lua language

A programming tutorial for lua is out of scope for this document, the samples given elsewhere in this document should provide a feel for the language, some peculiarities of lua is explained here, look at The lua website, for a short overview when reading sample code Lua Short Reference might come in handy.

Multiple value return

Lua supports returning multiple values from a function, this is useful when dealing with color components. E.g a function can easily return a RGB or LAB triplet in one go.

Setting and getting pixel values

There are functions to set and get individual pixels, they are of the form:

v get_value( x,
  y);
set_value( x,
  y,
  v);

See the section called “Gluas reference” for more functions, and details.

All values used in gluas are floating point values, unless mandated by the color model, the normal range is from 0.0 to 1.0. This range is used instead of an more arbitary range like 0-255, 0-42, 0-100 or 0-65535.

Image dimensions

The width and height of the image in pixels can be accessed through global variables with those names.

Selection

The GIMP takes care of modifying only the portions that are part of the current selection, as well as blending old and new pixel values for partially selected pixels.

The coordinates of the boundingbox of the selection can be accessed through global variables, see bound_x0, bound_y0 and bound_x1, bound_y1.

Shadow buffer

When pixels are set, they don't immediatly overwrite the original pixel values. This is to facilitate area operations. When multipass algorithms are implemented, like a two pass blur, the changes can be comitted to the source buffer by issuing

Samples

Refer to Images Processing using gluas (which this gluas introduciton is a part of) for examples of how the gluas environment can be used.

A simple program

In the following sample the code needed for a simple thresholding filter is documented.

Figure 2.2. basic gluas program

basic gluas program
for y=0, height-1 do
  for x=0, width-1 do
    v = get_value (x,y)

    if v < 0.5 then
       v=1.0
    else
       v=0.0
    end

    set_value (x,y,v)
  end
  progress (y/height)
end

By either pressing the button with gears, or pressing F5 gluas will apply the algorihm described. If any processing errors occur an error panel will appear at the top of the gluas window.

Gluas reference

Functions

Gluas uses the core lua language for processing, but extends it with some functions specific to interaction with The GIMP.

get_rgba, set_rgba

r,g,b,a get_rgba( x,
  y);

Returns the Red, Green, Blue and Alpha components for the coordinates specified by the x,y parameters. The values are taken from the source image which is the original pixel data, unless flush() has been invoked.

RGB values within gluas are values between 0.0 and 1.0.

set_rgba( x,
  y,
  r,
  g,
  b,
  a);

Sets a pixel, using the specified Red, Green, Blue and Alpha component values.

get_rgb, set_rgb

r,g,b get_rgb( x,
  y);
set_rgb( x,
  y,
  r,
  g,
  b);

Functions similar to the RGBA varieties, note that the set_rgb function will not preserve the alpha value of the pixel, but boost it to 1.0 (full opacity).

get_value, set_value

v get_value( x,
  y);
set_value( x,
  y,
  v);

Setters and getters [4] for grayscale values.

get_hsl, set_hsl

h,s,l get_hsl( x,
  y);
set_hsl( x,
  y,
  h,
  s,
  l);

Hue, Saturation, Lightness pixel value setters and getters.

get_hsv, set_hsv

h,s,v get_hsv( x,
  y);
set_hsv( x,
  y,
  h,
  s,
  v);

Hue, Saturation, Value pixel value setters and getters. Value is calculated by gimp as MAX(MAX(r,g),b), i.e. the R,G or B component with the largest value.

get_lab, set_lab

l,a,b get_lab( x,
  y);
set_lab( x,
  y,
  l,
  a,
  b);

CIE Lab pixel setter and getter functions. .

progress

progress( percentage);

Update the progress bar in The GIMP's statusbar to the given percentage, useful when giving user feedback.

flush

flush( );

Copies the data from the temporary output image buffer, over the data in the source image buffer. This makes it possible to do additional processing steps (the values returned with getters will be the values previously set with setters prior to the flush() ).

flush() is mostly useful when implementing multipass algorithms, like a decomposed blur.

Variables

In order to know the dimensions, and some other aspects of the image, some numeric constants are also imported into the lua enviroment you program in.

width

The width of the input image, in pixels.

height

The height of the input image, in pixels.

user_data

When expanding the properties pane in the gluas editor, a slider shows up, the value of this slider can be accessed through this global variable.

Note that the program is run each time the mouse button is released on a new value.

It is also possible to generate animations that can be played back with The GIMP's Animation plug-ins. Then the value of user_data varies from 0.0 to 1.0.

bound_x0, bound_y0

The upper left coordinate of the bounding box of the current selection, may be used to speed up calculations by not taking unselected areas into account.

bound_x1, bound_y1

The bottom right coordinate of the bounding box of the current selection, may be used to speed up calculations by not taking unselected areas into account.

Gluas compatible software

Marco Pontello has developed a Project Dogwaffle plug-in, DogLua, , which is compatible with the core gluas functionality.

Artweaver (a freeware image painting program) also have support for gluas script. Artweaver lua scripting interface plug-in.



[4] The grayscale value is calculated as (r+g+b)/3

Chapter 3. Visual Debugging Aids

When creating tools to modify images, it is often vital to have an model of the image data. Using alternate ways to visualize the data often reveals other properties and/or enhances our ability to observe features.

Histogram

A histogram shows the distribution of low/high (dark/light) samples within an image.

The bars in the left portion of the histogram shows the amount of dark/shadow samples in the image, in the middle the mid-tones are shown and to the right the highlights.

A low contrast image will have a histogram concentrated around the middle. A dark image will have most of the histogram to the right, a bright image uses mostly the right hand side of the histogram, while a high contrast image will have few mid-tone values; making the histogram have an U shape.

The histogram in Figure 3.1, “histogram” spans from left to right, so it makes good use of the available contrast. The image is overexposed beyond repair, this can be seen from the clipping at the right hand (highlight side). This means details of the clouds in the sky are permantently lost.

Figure 3.1. histogram

histogram
-- utility function to set the RGB color for a rectangular region
function draw_rectangle(x0,y0,x1,y1,r,g,b)
  for y=y0,y1 do
    for x=x0,x1 do
      set_rgb (x,y, r,g,b)
    end
  end 
end

function draw_histogram(bins)
  -- first we count the number of pixels belonging in each bin
  local bin = {}
  for i=0,bins do
    bin[i]=0
  end
  for y=0, height-1 do
    for x=0, width-1 do
      l,a,b = get_lab (x,y)
      bin[math.floor(bins * l/100)] = bin[math.floor(bins * l/100)] + 1
    end
    progress (y/height/2)
  end
  -- then we find the maximum number of samples in any bin
  max = 0
  for i=0,bins do
    if bin[i]>max then
      max=bin[i]
    end
  end
  -- and then plot the histogram over the original image.
  for i=0,bins do
    draw_rectangle((width/(bins+1))*i, height-(bin[i]/max*height),
             (width/(bins+1))*(i+1), height, 1, 1, 0)
    progress (i/bins/2+0.5)
  end
end

draw_rectangle(0,0,width,height,0,0,0)
draw_histogram(64)

Chromaticity Diagram

A chromaticity diagram tells us about the color distribution and saturation in an image. All pixels are plotted inside a color circle, distance to center gives saturation and angle to center specifies hue. If the white balance of an image is wrong it can most often be seen by a non centered chromaticity diagram.

[Note]Origin in analog widget

A chromaticity scope in the analog television world were traditionally created by conncting an oscilloscope to the color components of a video stream. This allowed debugging setting of the equipment handling the analog image data.

Figure 3.2. chromaticity diagram

chromaticity diagram
function draw_rectangle(x0,y0,x1,y1,r,g,b)
  for y=y0,y1 do for x=x0,x1 do set_rgb (x,y, r,g,b) end end 
end

function draw_chromascope(scale)
  if width > height then
     scale = (scale * 100)/height
  else
     scale = (scale * 100)/width
  end
  for y=0, height-1 do
     for x=0, width-1 do
        l,a,b = get_lab (x,y)
        set_lab ( (a * scale) + width*0.5,
                  (b * scale) + height*0.5,
                   l,a,b)
     end
     progress (y/height)
  end
end

draw_rectangle(0,0,width,height,0,0,0)
draw_chromascope(4)
draw_rectangle(width*0.5,0,width*0.5,height,1,1,1)
draw_rectangle(0,height*0.5,width,height*0.5,1,1,1)

Luminance Profile

A luminiosity profile is the visualization of the luminance along a straight line in the image. The luminance plot allows a more direct way to inspect the values of an image than a histogram does.

In Figure 3.3, “luminance profile” the yellow curve indicates which line the profile is taken from, the brightness of the snow as well as the relative brightness of the sky compared to the mountains is quite apparent.

Figure 3.3. luminance profile

luminance profile
function draw_rectangle(x0,y0,x1,y1,r,g,b)
  for y=y0,y1 do for x=x0,x1 do set_rgb (x,y, r,g,b) end end 
end

function lerp(v0,v1,r)
  return v0*(1.0-r)+v1*r
end

function draw_line(x0,y0,x1,y1,r,g,b)
  for t=0,1.0,0.001 do
    x = lerp(x0,x1,t)
    y = lerp(y0,y1,t)
    set_rgb (x,y,r,g,b)
  end
end

function lumniosity_profile(x0,y0,x1,y1)
  for y=0,height-1 do
    for x=0,width-1 do
       l,a,b = get_lab (x,y)
       set_lab (x,y, l*0.3,a,b)
    end
  end
  draw_line(x0,y0,x1,y1,1,1,0)
  prev_x, prev_y = -5, height*0.5
  for x=0,width do
     xsrc = lerp(x0,x1, x/width)
     ysrc = lerp(y0,y1, x/width)
     l,a,b = get_lab (xsrc,ysrc)
     y = height - (l/100) * height
     draw_line (prev_x, prev_y, x,y, 1,1,1)
     prev_x, prev_y = x, y
  end
end

lumniosity_profile(0,height*0.2, width,height*0.5)

Chapter 4. Point operations

He told him, point for point, in short and plain.

--Chaucher

The simplest image filters are point operations, where the new value of a pixel are only determined by the original value of that single pixel alone.

Threshold

Thresholding an image is the process of making all pixels above a certain threshold level white, others black.

Figure 4.1. threshold

threshold
function threshold_transform(value, level)
  if value>level then
    return 1
  else
    return 0
  end
end

function threshold_get_value(x,y,level)
  value = get_value(x,y)
  return threshold_transform(value,level)
end

function threshold(level)
   for y=0, height-1 do
     for x=0, width-1 do
       set_value (x,y, threshold_get_value(x,y,level))
     end
     progress(y/height)
   end
   flush()
end

threshold(0.5)

[Note]Note

In Figure 4.1, “threshold” and other examples for point operations, there is more code than strictly neccesary to implement the effect. The additional structuring and modularization of the code is meant to make the actual transformation of the pixel values easier to see seperated from the loops and details of dealing with the pixel data.

Brightness

When changing the brightness of an image, a constant is added or subtracted from the luminnance of all sample values. This is equivalent to shifting the contents of the histogram left (subtraction) or right (addition).

new_value = old_value + brightness

The goal of the code in Figure 4.2, “brightness” is to add a constant amount of light to the sample value in each position in the image raster.

Figure 4.2. brightness

brightness
function add_transform(value, shift)
  return value+shift
end

function add_get_value(x,y,shift)
  value = get_value(x,y)
  return add_transform(value,shift)
end

function add(shift)
   for y=0, height-1 do
     for x=0, width-1 do
       set_value (x,y, add_get_value(x,y,shift))
     end
     progress(y/height)
   end
   flush()
end

add(0.25)

Figure 4.3. brightness subtraction

brightness subtraction
function add_transform(value, shift)
  return value+shift
end

function add_get_value(x,y,shift)
  value = get_value(x,y)
  return add_transform(value,shift)
end

function add(shift)
   for y=0, height-1 do
     for x=0, width-1 do
       set_value (x,y, add_get_value(x,y,shift))
     end
     progress(y/height)
   end
   flush()
end

function subtract(shift)
  add(-shift)
end

subtract(0.25)

Contrast

Changing the contrast of an image, changes the range of luminance values present. Visualized in the histogram it is equivalent to expanding or compressing the histogram around the midpoint value. Mathematically it is expressed as:

new_value = (old_value - 0.5) × contrast + 0.5

Figure 4.4. contrast

contrast
function mul_transform(value, factor)
  return (value-0.5)*factor + 0.5
end

function mul_get_value(x,y,factor)
  value = get_value(x,y)
  return mul_transform(value,factor)
end

function mul(factor)
   for y=0, height-1 do
     for x=0, width-1 do
       set_value (x,y, mul_get_value(x,y,factor))
     end
     progress(y/height)
   end
   flush()
end

mul(2.0)

The subtraction and addition of 0.5 is to center the expansion/compression of the range around 50% gray.

Specifying a value above 1.0 will increase the contrast by making bright samples brighter and dark samples darker thus expanding on the range used. While a value below 1.0 will do the opposite and reduce use a smaller range of sample values.

Brightness + Contrast

It is common to bundle brightness and control in a single operations, the mathematical formula then becomes:

new_value = (old_value - 0.5) × contrast + 0.5 + brightness

Figure 4.5. brightness and contrast

brightness and contrast
function transform(value, brightness, contrast)
  return (value-0.5)*contrast+0.5+brightness
end

function transform_rgb(r,g,b, brightness, contrast)
  return transform(r, brightness, contrast), transform(g, brightness, contrast), transform(b, brightness, contrast)
end

function bcontrast_get_rgb(x,y,brightness, contrast)
   r,g,b=get_rgb(x,y)
   return transform_rgb(r,g,b, brightness, contrast)
end

function bcontrast(brightness, contrast)
    for y=0, height-1 do
      for x=0, width-1 do
        set_rgb(x,y, bcontrast_get_rgb(x,y,brightness,contrast))
      end
    end
    flush ()
end

bcontrast(0.25, 2.0)

Invert

Inverting the sample values in the image, produces the same image that would be found in a film negative. Figure 4.6, “invert”

Figure 4.6. invert

invert
function invert_value(value)
  return 1.0-value
end

function invert_rgb(r,g,b)
  return invert_value(r), 
         invert_value(g),
         invert_value(b)
end

function invert_get_rgb(x,y)
  r,g,b=get_rgb(x,y)
  return invert_rgb(r,g,b)
end

function invert()
  for y=0, height-1 do
    for x=0, width-1 do
      set_rgb(x,y, invert_get_rgb(x,y))
    end
  end
  flush()
end

invert()

Gamma

A CRT monitor doesn't have a linear correspondence between the voltage sent to the electron guns and the brightness shown. The relationship is closely modelled by a powerfunction i.e. display_intensity=pixel_valuegamma. To correct an image for display, assuming the monitor doesn't already have global corrections in place. Will involve applying the function new_value=old_value1.0-gamma.

Figure 4.7. gamma

gamma
function gamma_transform(value, param)
  return value^param
end

function gamma_get_value(x,y,param)
  value = get_value(x,y)
  return gamma_transform(value,param)
end

function gamma(param)
   for y=0, height-1 do
     for x=0, width-1 do
       set_value (x,y, gamma_get_value(x,y,param))
     end
     progress(y/height)
   end
   flush()
end

gamma(1.5)

Levels

The levels tool found in many image processing packages is in it's simplest form just a different way to adjust brightness/ contrast.

Figure 4.8. The GIMP's levels tool

The GIMP's levels tool

The parameters found in levels tools are in the order of spatial placement in dialogs like Figure 4.8, “The GIMP's levels tool”:

  • input blacklevel
  • input whitelevel
  • output blacklevel
  • output whitelevel
  • gamma correction

Figure 4.9. levels

levels
function levels_value(value, in_min, gamma, in_max, out_min, out_max)
  -- normalize
  value = (value-in_min) / (in_max-in_min)
  -- transform gamma
  value = value^gamma
  -- rescale range
  return value * (out_max-out_min) + out_min
end

function levels_rgb(r,g,b, in_min, gamma, in_max, out_min, out_max)
  return levels_value(r, in_min, gamma, in_max, out_min, out_max),
         levels_value(g, in_min, gamma, in_max, out_min, out_max),
         levels_value(b, in_min, gamma, in_max, out_min, out_max)
end

function levels_get_rgb(x,y, in_min, gamma, in_max, out_min, out_max)
  r,g,b=get_rgb(x,y)
  return levels_rgb(r,g,b, in_min, gamma, in_max, out_min, out_max)
end

function levels(in_min, gamma, in_max, out_min, out_max)
  for y=0, height-1 do
    for x=0, width-1 do
      set_rgb(x,y, levels_get_rgb(x,y,
         in_min, gamma, in_max,
         out_min,       out_max))
    end
  end
  flush()
end

levels(0.2, 1.2, 0.8,
       0.0,      1.2)

Curves

Figure 4.10. The GIMP's curves tool

The GIMP's curves tool

A curves filter is a filter providing the user with a free form transformation of the grayscale values in an image.


The view is a plot of the function f(x) of the transform applied, this means that the darkest values in the image are on rthe left, and the brightest on the right. The resulting value for an original value is the y value.

A diagnoal line from lower left to upper right indicates no transformation.

A horizontal line, means all pixels get the same value, along the bottom black, and along the top white.

The transform indicated in Figure 4.10, “The GIMP's curves tool” makes most values in the image darker, but allocates a larger portion of the values available to the brightest portion of the scale. This could be used to see more detail in for instance the sky in a photograph.

Lookup tables

As can be observed in the code examples given in this chapter, there is direct mapping between input pixel values and output pixel values for point process operations.

When dealing with 8bit images, the number of input and output values are greatly reduced, if the processing taking place within the transform is expensive, having a precalculated table for all input values is a common optimizations, this is called a LookUpTable (LUT).

When the sample values stored are 16bit the advantage of a lookup table is still present since the number of entries is 65536, but when moving to 32bit floating point, the number of precalculated values needed is exceedingly large.

Excercises

These exercies are not just exercies for the point operation topic, but intended as initial experiments with the gluas framework.

1. Exchange Red and Green

Write a gluas program that exchanges the red and green component of all pixels in an image.

Is this a point operation?

2. Value boost

Create a program performing the transformation in Figure 4.10, “The GIMP's curves tool”.

3. Checkerboard

Create a program that draws a 8x8 checker board on top of the existing image. The checker board should be black and white, square, cover as large area as possible and be centered on the image.

4. Position dependent contrast

Create a program that has a linearly increasing contrast applied to it. So that the leftmost column of the raster has a contrast of 0.0 applied to it, and the rightmost column of the raster has a contrast of 2.0 applied to it.

Is this a point operation?

Chapter 5. Area operations

If thence he 'scappe, into whatever world, Or unknown region.

--Milton

In Chapter 4, Point operations we looked at operations that calculated the new sample values of a pixel based on the value of the original pixel alone. In this chapter we look at the extension of that concept, operations that take their input from a local neighbourhood of the original pixel.

Sampling area

We can sample with different input windows the most common are a square or circle centered on the pixel as shown in Figure 5.1, “Sliding window”.

Figure 5.1. Sliding window

Sliding window
The 3×3 blue area in the raster on the left, indicates which pixels contribute to the single red pixel in the destination image.

Edge handling

When sampling a local area around a pixel, we need to decide what to do when part of the sampling window is outside the source image. The common way to provide a "fake" value for data outside the image are:

  • Make them transparent.
  • Give them a color (for instance black).
  • Nearest valid pixel.
  • Mirror the image over the edge.
  • Wrap to other side of image.

In gluas the global variable edge_duplicate can be set to 1, this makes gluas return the nearest valid pixel instead of transparent/black.

Figure 5.2. Over the edge

Over the edge
The 3×3 blue area in the raster on the left, indicates which pixels contribute to the single red pixel in the destination image.

Convolution Matrix

A special group of operations are the kernel filters, which use a table of numbers (a matrix) as input. The matrix gives us the weight to be given each input sample. The matrix for a kernel filter is always square and the number of rows/columns are odd. Many image filters can be expressed as convolution filters some are listed in the sourcelisting in Figure 5.3, “convolve”.

Each sample in the sampling window is weighted with the corresponding weight in the convolution matrix. The resulting sum is divided by the divisor (for most operations this is assumed to be the sum of the weights in the matrix). And finally an optional offset might be added to the result.

Figure 5.3. convolve

convolve
-- 3x3 convolution filter

edge_duplicate=1;

function convolve_get_value (x,y, kernel, divisor, offset)
   local i, j, sum
   sum_r = 0   
   sum_g = 0   
   sum_b = 0   
   
   for i=-1,1 do
      for j=-1,1 do
         r,g,b = get_rgb (x+i, y+j)
         sum_r = sum_r + kernel[j+2][i+2] * r
         sum_g = sum_g + kernel[j+2][i+2] * g
         sum_b = sum_b + kernel[j+2][i+2] * b
      end
   end   
   return sum_r/divisor + offset, sum_g/divisor+offset, sum_b/divisor+offset
end

function convolve_value(kernel, divisor, offset)
for y=0, height-1 do
   for x=0, width-1 do      
       r,g,b = convolve_get_value (x,y, kernel, divisor, offset)
       set_rgb (x,y, r,g,b)
   end
   progress (y/height)
end
flush ()
end

function emboss()
    kernel = { { -2, -1,  0},
               { -1,  1,  1},
               {  0,  1,  2}}
    divisor = 1
    offset  = 0
    convolve_value(kernel, divisor, offset)
end

function sharpen()
    kernel = { { -1, -1, -1},
               { -1,  9, -1},
               { -1, -1, -1}}
    convolve_value(kernel, 1, 0)
end

function sobel_emboss()
    kernel = { { -1, -2, -1},
               {  0,  0,  0},
               {  1,  2,  1}}
    convolve_value(kernel, 1, 0.5)
end

function box_blur()
    kernel = { {  1,  1,  1},
               {  1,  1,  1},
               {  1,  1,  1}}
    convolve_value(kernel, 9, 0)
end

sharpen ()

Smoothing filters

Box blur

A convolution matrix with all weights set to 1.0 is a blur, blurring is an operation that is often performed and can be implemented programmatically faster than the general convolution. A blur can be seen as the average of an area, or a resampling which we'll return to in a later chapter.

Figure 5.4. blur

blur
-- a box blur filter implemented in lua

edge_duplicate = 1;

function blur_slow_get_rgba(sample_x, sample_y, radius)
   local x,y
   local rsum,gsum,bsum,asum=0,0,0,0
   local count=0

   for x=sample_x-radius, sample_x+radius do
     for y=sample_y-radius, sample_y+radius do
     
       local r,g,b,a= get_rgba(x,y)
       
       rsum, gsum, bsum, asum = rsum+r, gsum+g, bsum+b, asum+a
       
       count=count+1
     end
   end
   return rsum/count, gsum/count, bsum/count, asum/count
end

function blur_slow(radius)
    local x,y


    for y = 0, height-1 do
        for x = 0, width-1 do
           set_rgba (x,y, blur_slow_get_rgba (x,y,radius))
        end
        progress (y/height)
    end
    flush ()
end


-- seperation into two passes, greatly increases the speed of this
-- filter

function blur_h_get_rgba(sample_x, sample_y, radius)
   local x,y
   local rsum,gsum,bsum,asum=0,0,0,0
   local count=0
   
   y=sample_y
   for x=sample_x-radius, sample_x+radius do
       local r,g,b,a= get_rgba(x,y)
       
       rsum, gsum, bsum, asum = rsum+r, gsum+g, bsum+b, asum+a
       
       count=count+1
   end
   return rsum/count, gsum/count, bsum/count, asum/count
end

function blur_v(radius)
    for y = 0, height-1 do
        for x = 0, width-1 do
           set_rgba (x,y, blur_v_get_rgba(x,y,radius))
        end
        progress(y/height)
    end
    flush()
end

function blur_v_get_rgba(sample_x, sample_y, radius)
   local x,y
   local rsum,gsum,bsum,asum=0,0,0,0
   local count=0
   
   x=sample_x
   for y=sample_y-radius, sample_y+radius do
       local r,g,b,a= get_rgba(x,y)
       
       rsum, gsum, bsum, asum = rsum+r, gsum+g, bsum+b, asum+a
       
       count=count+1
   end
   return rsum/count, gsum/count, bsum/count, asum/count
end

function blur_h(radius)
    for y = 0, height-1 do
        for x = 0, width-1 do
           set_rgba (x,y, blur_h_get_rgba(x,y,radius))
        end
        progress(y/height)
    end
    flush()
end

function blur_fast(radius)
  blur_v(radius)
  blur_h(radius)
end

blur_fast(4)

Gaussian blur

A better looking blur is achieved by setting the weights to a gaussian distribution around the active pixel.

A gaussian blur works by weighting the input pixels near the center of ther sampling window higher than the input pixels further away.

One way to approximate a small gaussian blur, is to set the center pixel in a 3×3 matrix to 2, while the surrounding are set to 1, and set the weight to 10.

Speeding up blur by decomposing

A blur is a time consuming operation especially if the radius is large. A way to optimize a blur is to observe that first blurring horizontally and then vertically gives the same result as doing both at the same time. This has been done to the blur_fast in Figure 5.4, “blur” and can also be done for gaussian blurs.

Edge preserving

Edge preserving smoothing filters are useful for automated feature detection or even just for artistic effects.

Kuwahara

The kuwahara filter is an edge preserving blur filter. It works by calculating the mean and variance for four subquadrants, and chooses the mean value for the region with the smallest variance.

Figure 5.5. kuwahara

kuwahara
-- Kuwahara filter
--
-- 
-- Performs the Kuwahara Filter. This filter is an edge-preserving filter.
--
--
-- ( a  a  ab   b  b)
-- ( a  a  ab   b  b)
-- (ac ac abcd bd bd)
-- ( c  c  cd   d  d)
-- ( c  c  cd   d  d)
--
-- In each of the four regions (a, b, c, d), the mean brightness and the variance are calculated. The
-- output value of the center pixel (abcd) in the window is the mean value of that region that has the
-- smallest variance.
--
-- description copied from http://www.incx.nec.co.jp/imap-vision/library/wouter/kuwahara.html
--
-- implemented by Øyvind Kolås <oeyvindk@hig.no> 2004


-- the sampling window is:  width=size*2+1  height=size*2+1

size = 4

edge_duplicate = 1;

-- local function to get the mean value, and
-- variance from the rectangular area specified
function mean_and_variance (x0,y0,x1,y1)
 local variance
 local mean

 local min         = 1.0
 local max         = 0.0
 local accumulated = 0
 local count       = 0
   
 local x, y

 for y=y0,y1 do
   for x=x0,x1 do
     local v = get_value(x,y)

     accumulated   = accumulated + v
     count         = count + 1
      
     if v<min then min = v end
     if v>max then max = v end

    end
  end

  variance = max-min
  mean     = accumulated /count
   
  return mean, variance
end

-- local function to get the mean value, and
-- variance from the rectangular area specified
function rgb_mean_and_variance (x0,y0,x1,y1)
 local variance
 local mean
 local r_mean
 local g_mean
 local b_mean 
 
 local min           = 1.0
 local max           = 0.0
 local accumulated_r = 0
 local accumulated_g = 0
 local accumulated_b = 0
 local count         = 0
   
 local x, y

 for y=y0,y1 do
   for x=x0,x1 do
     local v     = get_value(x,y)
     local r,g,b = get_rgb (x,y)

     accumulated_r   = accumulated_r + r
     accumulated_g   = accumulated_g + g
     accumulated_b   = accumulated_b + b
     count         = count + 1
      
     if v<min then min = v end
     if v>max then max = v end

    end
  end

  variance = max-min
  mean_r   = accumulated_r /count
  mean_g   = accumulated_g /count
  mean_b   = accumulated_b /count
    
  return mean_r, mean_g, mean_b, variance
end

-- return the kuwahara computed value
function kuwahara(x, y, size)
  local best_mean     = 1.0
  local best_variance = 1.0

  local mean, variance
	
  mean, variance = mean_and_variance (x-size, y-size, x, y)	

  if variance < best_variance then
    best_mean     = mean
    best_variance = variance
  end
	
  mean, variance = mean_and_variance (x, y-size, x+size,y)

  if variance < best_variance then
    best_mean     = mean
    best_variance = variance
  end	
	
  mean, variance = mean_and_variance (x, y, x+size, y+size)

  if variance < best_variance then
    best_mean     = mean
    best_variance = variance
  end	
	
  mean, variance = mean_and_variance (x-size, y, x,y+size)

  if variance < best_variance then
    best_mean     = mean
    best_variance = variance
  end	

  return best_mean
end


-- return the kuwahara computed value
function rgb_kuwahara(x, y, size)
  local best_r, best_g, best_b
  local best_variance = 1.0

  local r,g,b, variance
	
  r,g,b, variance = rgb_mean_and_variance (x-size, y-size, x, y)	

  if variance < best_variance then
    best_r, best_g, best_b = r, g, b
    
    best_variance = variance
  end
	
  r,g,b, variance = rgb_mean_and_variance (x, y-size, x+size,y)

  if variance < best_variance then
    best_r, best_g, best_b = r, g, b
    best_variance = variance
  end	
	
  r,g,b, variance = rgb_mean_and_variance (x, y, x+size, y+size)

  if variance < best_variance then
    best_r, best_g, best_b = r, g, b
    best_variance = variance
  end	
	
  r,g,b, variance = rgb_mean_and_variance (x-size, y, x,y+size)

  if variance < best_variance then
    best_r, best_g, best_b = r, g, b
    best_variance = variance
  end	

  return best_r, best_g, best_b
end

function kuwahara(radius)
  for y=0, height-1 do
    for x=0, width-1 do
      r,g,b = rgb_kuwahara (x,y, radius)
      set_rgb (x,y, r,g,b)
    end
    progress (y/height)
  end
end

kuwahara(2)

Symmetric Nearest Neighbour

An improvement over the kuwahara filter is the symmetric nearest neighbour filter that compares symmetric pairs/quadruples of pixels with the center pixel and only takes into account the one from this set which is closest in value to the center pixel.

Figure 5.6. snn

snn
-- Symmetric Nearest Neighbour as described by Hardwood et al in Pattern
-- recognition letters 6 (1987) pp155-162.

function snn(radius)
local x,y
for y=0, height-1 do
	for x=0, width-1 do
	    local sum=0.0
	    local count=0.0
	    local u,v
	    
	    local valueC = get_value (x,y)
	    
	    for v=-radius,radius do
	      for u=-radius,radius do
	         local valueA = get_value (x+u,y+v)
	         local valueB = get_value (x-u,y-v)
	         
	         if (math.abs(valueC-valueA) <
	             math.abs(valueC-valueB)) then
               sum = sum + valueA
	         else
               sum = sum + valueB	         
	         end
	         count = count + 1	         
	      end
	    end
		set_value (x,y, sum/count)
	end
	progress (y/height)
end
flush()
end

-- snn(6)

-- and a color version operating in CIE Lab 

function deltaE(l1,a1,b1,l2,a2,b2)
  return math.sqrt( (l1-l2)*(l1-l2) + (a1-a2)*(a1-a2) + (b1-b2)*(b1-b2))
end

function snn_color(radius)
local x,y
for y=0, height-1 do
	for x=0, width-1 do
	    local sumL=0.0
	    local suma=0.0
	    local sumb=0.0
	    local count=0.0
	    local u,v
	    
	    local lc,ac,bc = get_lab (x,y)
	    
	    for v=-radius,radius do
	      for u=-radius,radius do
			 local l1,a1,b1 = get_lab (x+u,y+v)	         
	         local l2,a2,b2 = get_lab (x-u,y-v)
	         
	         if (deltaE(lc,ac,bc,l1,a1,b1) <
	             deltaE(lc,ac,bc,l2,a2,b2)) then	             
               sumL = sumL + l1
               suma = suma + a1
               sumb = sumb + b1
	         else
	           sumL = sumL + l2
               suma = suma + a2
               sumb = sumb + b2	         
	         end
	         count = count + 1	         
	      end
	    end
		set_lab (x,y, sumL/count, suma/count, sumb/count)
	end
	progress (y/height)
end
flush()
end

snn_color(6)

Rank/statistical filters

Rank filters operate statistically on the neighbourhood of a pixel. The tree most common rank filters are median, minimum, maximum.

Median

The median filter sorts the sample values in theneighbourhood, and then picks the middle value. The effect is that noise is removed while detail is kept. See also kuwahara. The median filter in The GIMP is called Despeckle [5].

Figure 5.7. median

median
-- a median rank filter

edge_duplicate = 1;

function median_value(sample_x, sample_y, radius)
   local x,y
   local sample = {}
   local samples = 0

   for x=sample_x-radius, sample_x+radius do
     for y=sample_y-radius, sample_y+radius do
     
       local value = get_value(x,y)

       sample[samples] = value
       samples = samples + 1
     end
   end

   table.sort (sample)
   local mid = math.floor(samples/2)
   if math.mod(samples,2) == 1 then
       median = sample[mid+1]
   else
       median = (sample[mid]+sample[mid+1])/2
   end

   return median
end

function median(radius)
    local x,y

    for y = 0, height-1 do
        for x = 0, width-1 do
           set_value (x,y, median_value (x,y,radius))
        end
        progress (y/height)
    end
    flush ()
end

median(3)

Unsharp mask

Unsharp mask is a quite novel way to sharpen an image, it has it's origins in darkroom work. And is a multistep process.

  1. create a duplicate image

  2. blur the duplicate image

  3. calculate the difference between the blurred and the original image

  4. add the difference to the original image

The unsharp mask process works by exaggerating the mach band effect [6].

Misc area filters.

In this section some stranger filters are listed, these filters are not standard run of the mill image filters. But simple fun filters that fall into the category of area filters.

Jitter

Pick an a random pixel within the sampling window.

Figure 5.8. jitter

jitter
-- a jittering filter implemented in lua

edge_duplicate=1 -- set edge wrap behavior to use nearest valid sample

function jitter(amount)
  local x,y
  var2=var1
  for y = 0, height-1 do
    for x = 0, width-1 do
      local nx,ny
      local r,g,b,a
      nx=x+(math.random()-0.5)*amount
      ny=y+(math.random()-0.5)*amount
      r,g,b,a=get_rgba(nx,ny)
      set_rgba(x,y, r,g,b,a)
    end
    progress(y/height)
  end
end

jitter(10)

Excercises

1. Extend the convolve example

Extend the convolve example, making it possible for it to accept a 7×7 matrix for it's values, add automatic summing to find the divisor and experiment with various sampling shapes.

2. Adaptive blur

Create a filter that returns the original pixel value if the variance of the entire sample window is larger than a provided value. How does this operation compare to kuwahara/snn?

3. Circle

Create a filter that renders a white circle in the middle of the image, the background of the circle should be black.



[5] The despeckle filter for gimp is more advanced than a simple median filter, it also allows for being adaptive doing more work in areas that need more work, it can be set to work recursivly. For normal median operation adaptive and recursive should be switched off, the black level set to -1 and the white level to 256.

[6] The mach band effect is an analog sampling artifact that is created in the human visual system. The machband effect is used by the human mind to recognize edges in an image.

Chapter 6. Resampling

I want friends to motion such matter.

--Burton

Up until now, the image has remained in a fixed position, in this chapter we take a look at how we can start repositioning samples.

Affine transformations

Affine transformations are transformations to an image that preserves colinearity, all points on a line remains on a line after transformation. Usually affine transformations are expressed as an matrices, in this text to keep the math level down, I will use geometry and vectors instead.

Translation

Translation can be described as moving the image, e.g. move all the contents of the image 50pixels to the right and 50pixels down.

Instead of traversing the original image, and placing the pixels in their new location we calculate which point in the source image ends up at the coordinate calculated.

Figure 6.1. translation

translation
function translate(dx,dy)
  for y=0, height-1 do
    for x=0, width-1 do
      -- calculate the source coordinates (u,v)
      u = x-dx
      v = y-dy
       
      r,g,b=get_rgba(u,v)
      set_rgb(x,y,r,g,b)
    end
    progress (y/height)
  end
  flush()
end

translate(width*0.25,height*0.25)

Rotation

Rotation is implemented by rotating the unit vectors of the coordinate system, and from the unit vectors calculate the new positions of a given input coordinate, to make the transformation happen in reverse it is enough to negate the angle being passed to our function.

Figure 6.2. rotation

rotation
function rotate(angle)
  for y=0, height-1 do
    for x=0, width-1 do
      -- calculate the source coordinates (u,v)
      u = x * math.cos(-angle) + y * math.sin(-angle)
      v = y * math.cos(-angle) - x * math.sin(-angle)
       
      r,g,b = get_rgba(u,v)
      set_rgb(x,y,r,g,b)
    end
    progress (y/height)
  end
  flush()
end

rotate(math.rad(15))

Scaling

Scaling an image is a matter of scaling both the x and y axes according to a given scale factor when retrieving sample values.

Figure 6.3. scaling

scaling
function scale(ratio)
  for y=0, height-1 do
    for x=0, width-1 do
      -- calculate the source coordinates (u,v)
      u = x * (1.0/ratio)
      v = y * (1.0/ratio)
       
      r,g,b = get_rgba(u,v)
      set_rgb(x,y,r,g,b)
    end
    progress (y/height)
  end
  flush()
end

scale(3)

Other transformations

Perspective transform

Perspective transformations are useful for texture mapping, texture mapping is outside the scope of this text. Affine transformations are a subset of perspective transformations.

To be able to do texturemapping for use in realtime 3d graphics, the algorithm is usually combined with a routine that only paints values within the polygon defined, such a routine is called a polyfiller.

Lens correction

When stitching multiple images to create a panorama/ fake fish-eye image, correction for the lens is needed, the only difference compared to the earlier mentioned affine transformations are that the parameters used to calculate the distortion are based on properties of the lens.

Interpolation

When doing rotation as well as scaling, the coordinates we end up getting pixel values from are not neccesary integer coordinates. When enlarging this problem is especially apparent in Figure 6.3, “scaling” the problem is quite apparent we have "enlarged" each sample (pixel) to cover a larger area, this results in blocky artifacts within the image.

A more correct approach is to make an eduacted guess about the missing value at the sampling coordinates, this means attempting to get the results we would have gotten by sampling that coordinate in the original continous analog image plane.

The method used in the affine samples above is called "nearest neighbour", getting the closest corresponding pixel in the source image.

Bilinear interpolation

Bilinear sampling makes the assumption that values in the image changes continously.

Figure 6.4. bilinear

bilinear
function get_rgb_bilinear (x,y)
  local x0,y0 -- integer coordinates
  local dx,dy -- offset from coordinates
  local r0,g0,b0, r1,g1,b1, r2,g2,b2, r3,g3,b3
  local r,g,b

  x0=math.floor(x)
  y0=math.floor(y)
  dx=x-x0
  dy=y-y0
  
  r0,g0,b0 = get_rgb (x0  ,y0  )
  r1,g1,b1 = get_rgb (x0+1,y0  )
  r2,g2,b2 = get_rgb (x0+1,y0+1)
  r3,g3,b3 = get_rgb (x0  ,y0+1)

  r = lerp (lerp (r0, r1, dx), lerp (r3, r2, dx), dy)
  g = lerp (lerp (g0, g1, dx), lerp (g3, g2, dx), dy)
  b = lerp (lerp (b0, b1, dx), lerp (b3, b2, dx), dy)
 
  return r,g,b
end

function scale(ratio)
  for y=0, height-1 do
    for x=0, width-1 do
      -- calculate the source coordinates (u,v)
      u = x * (1.0/ratio)
      v = y * (1.0/ratio)
      r,g,b=get_rgb_bilinear(u,v)
      set_rgb(x,y,r,g,b)
    end
    progress (y/height)
  end
  flush()
end

function lerp(v1,v2,ratio)
   return v1*(1-ratio)+v2*ratio;
end
-- LERP
-- /lerp/, vi.,n.
--
-- Quasi-acronym for Linear Interpolation, used as a verb or noun for
-- the operation. "Bresenham's algorithm lerps incrementally between the
-- two endpoints of the line." (From Jargon File (4.4.4, 14 Aug 2003)
scale(3)

Bicubic interpolation

Figure 6.5. bicubic

bicubic
function get_rgb_cubic_row(x,y,offset)
  local r0,g0,b0, r1,g1,b1, r2,g2,b2, r3,g3,b3
  r0,g0,b0 = get_rgb(x,y)
  r1,g1,b1 = get_rgb(x+1,y)
  r2,g2,b2 = get_rgb(x+2,y)
  r3,g3,b3 = get_rgb(x+3,y)
  return cubic(offset,r0,r1,r2,r3), cubic(offset,g0,g1,g2,g3), cubic(offset,b0,b1,b2,b3)
end

function get_rgb_bicubic (x,y)
  local xi,yi -- integer coordinates
  local dx,dy -- offset from coordinates
  local r,g,b

  xi=math.floor(x)
  yi=math.floor(y)
  dx=x-xi
  dy=y-yi

  r0,g0,b0 = get_rgb_cubic_row(xi-1,yi-1,dx)
  r1,g1,b1 = get_rgb_cubic_row(xi-1,yi,  dx)
  r2,g2,b2 = get_rgb_cubic_row(xi-1,yi+1,dx)
  r3,g3,b3 = get_rgb_cubic_row(xi-1,yi+2,dx)

  return cubic(dy,r0,r1,r2,r3),
         cubic(dy,g0,g1,g2,g3),
         cubic(dy,b0,b1,b2,b3)
end

function scale(ratio)
  for y=0, height-1 do
    for x=0, width-1 do
      -- calculate the source coordinates (u,v)
      u = x * (1.0/ratio)
      v = y * (1.0/ratio)
      r,g,b=get_rgb_bicubic(u,v)
      set_rgb(x,y,r,g,b)
    end
    progress (y/height)
  end
  flush()
end

function cubic(offset,v0,v1,v2,v3)
  -- offset is the offset of the sampled value between v1 and v2
   return   (((( -7 * v0 + 21 * v1 - 21 * v2 + 7 * v3 ) * offset +
               ( 15 * v0 - 36 * v1 + 27 * v2 - 6 * v3 ) ) * offset +
               ( -9 * v0 + 9 * v2 ) ) * offset + (v0 + 16 * v1 + v2) ) / 18.0;
end

scale(3)

Decimation

When scaling an image down the output contains fewer pixels (sample points) than the input. We are not guessing the value of in-between locations, but combining multiple input pixels to a single output pixel.

One way of doing this is to use a box filter and calculate a weighted average of the contribution of all pixel values.

Figure 6.6. box decimate

box decimate
function get_rgb_box(x0,y0,x1,y1)
  local area=0 -- total area accumulated in pixels 
  local rsum,gsum,bsum = 0,0,0
  local x,y
  local xsize, ysize

  for y=math.floor(y0),math.ceil(y1) do
    ysize = 1.0
    if y<y0 then
      size = size * (1.0-(y0-y))
    end
    if y>y1 then
      size = size * (1.0-(y-y1))
    end
    for x=math.floor(x0),math.ceil(x1) do
      size = ysize
      r,g,b = get_rgb(x,y)
      if x<x0 then
        size = size * (1.0-(x0-x))
      end
      if x>x1 then
        size = size * (1.0-(x-x1))
      end
      rsum,gsum,bsum = rsum + r*size, gsum + g*size, bsum + b*size
      area = area+size
    end
  end
  
  return rsum/area, gsum/area, bsum/area
end

function scale(ratio)
  for y=0, (height*ratio)-1 do
    for x=0, (width*ratio)-1 do
      u0 = x * (1.0/ratio)
      v0 = y * (1.0/ratio)
      u1 = (x+1) * (1.0/ratio)
      v1 = (y+1) * (1.0/ratio)
      r,g,b=get_rgb_box(u0,v0,u1,v1)
      set_rgb(x,y,r,g,b)
    end
    progress (y/(height*ratio))
  end
  flush()
end

scale(0.33)

Trilinear interpolation

With OpenGL and other realtime 3d engines it is common to use mipmaps to increase processing speed and the quality of the resampled texture.

A mipmap is a collection of images of prescaled sizes. The prescaled sizes are 1.0×image dimensions, 0.5×image dimensions, 0.25×image dimensions, 0.125×image dimensions etc.

When retrieving a sample value a bilinear interpolation would be performed at both the version of the image larger than the scaling ratio as well as the image smaller than the scaling ratio. After that a linear interpolation is performed on the two resulting values based on which of the two versions most closely matches the target resolution.

Excercises

1. Interpolation error visualization

Using The GIMP open a test image, create a duplicate of the original image. (Image/Duplicate). Scale the duplicate image to 25% size (Image/Scale). Scale the duplicate image up 400%. Copy the newly scaled image as a layer on top of the original image. (Ctrl+C, change image Ctrl+V), and press the new layer button when the original image is active. Set the mode of the top layer to difference, Flatten the image (Image/Flatten Image), and automatically stretch the contrast of the image. (Layer/Colors/Auto/Stretch Contrast).

Do this for nearest neighbour, bilienar and bicubic interpolation and compare the results.

2. Study the assignment

Read Appendix A, Assignment, spring 2007 and perhaps make initial attempts at demosaicing the Bayer pattern.

Chapter 7. Automatic color adjustments

Any sufficiently advanced technology is indistinguishable from magic.

--Arthur C. Clarke

Algorithms for automatic adjustments attempts to automatically enhance the perceptual quality of an image. Such algorithms are usually used to enhance the perceived quality of images.

Digital cameras, often use such algorithms after aquisition, to make the most out of the data captured by the light sensor.

Companies providing print services for consumers, both developing analog film and scanning it; as well as companies making prints directly from digital files. The selection and tuning of parameters are based on automatically categorizing the image as a portrait, landscape, sunset/sunrise etc.

To achieve the best results possible, manual tuning and adjustments are often needed, but for a normal private use, like holiday pictures automatic enhancements makes higher quality easy, and point and shoot photography possible.

Contrast stretching

Contrast stretching makes sure the input sample with the lowest value is mapped to black; and that the one with the highest to 1.0. The values are recalculated by using linear interpolation.

Contrast stretching automatically counters under and over exposure, as well as the ability to extend the used luminance range.

It is also possible to restrict the algorithm towards the center of the distribution, perhaps with a cut-off value, this allows for a more noise tolerant adjustment.

Figure 7.1. contrast stretching

contrast stretching
function get_min_max()
    min, max = 1000, -1000          
    for y=0, height-1 do
      for x=0, width-1 do
        value = get_value(x,y)
        if value<min then
          min = value
        end
        if value>max then
          max = value
        end
      end  
    end
    return min,max
end

function remap(v, min, max)
    return (v-min) * 1.0/(max-min)
end

function cs_get_rgb(x,y,min,max)
    r,g,b = get_rgb(x,y)
    r = remap(r, min, max)
    g = remap(g, min, max)
    b = remap(b, min, max)
    return r,g,b
end

function contrast_stretch()
    min, max = get_min_max()

    for y=0, height do
      for x=0, width do
        set_rgb(x,y, cs_get_rgb(x,y,min,max))
      end
    end
    flush ()
end

contrast_stretch()

Automatic white balance

In the day of analog point and shoot photography, the white balance of the images were set by the photolab. With digital photography the whitebalance has to either be pre set by the photographer by measurement or guess, or be guessed by algorithms in camera or computer software.

Grayworld assumption

Assuming that we have a good distribution of colors in our scene, the average reflected color should be the color of the light. If the light source is assumed to be white, we know how much the whitepoint should be moved in the color cube.

The compensation calculated by the gray world assumption is an approximation of the measurement taken by digital still and video cameras by capturing an evenly lit white sheet of paper or similar.

Figure 7.2. grayworld assumption

grayworld assumption
function get_avg_a_b ()
    sum_a=0
    sum_b=0

    -- first find average color in CIE Lab space
    for y=0, height-1 do
      for x=0, width-1 do
        l,a,b = get_lab(x,y)
        sum_a, sum_b = sum_a+a, sum_b+b
      end
      progress(y/height)
    end

    avg_a=sum_a/(width*height)
    avg_b=sum_b/(width*height)

    return avg_a,avg_b
end
 
function shift_a_b(a_shift, b_shift)
    for y=0, height do
      for x=0, width do
        l,a,b = get_lab(x,y)
        
        -- scale the chroma distance shifted according to amount of
        -- luminance. The 1.1 overshoot is because we cannot be sure
        -- to have gotten the data in the first place.
        a_delta = a_shift * (l/100) * 1.1
        b_delta = b_shift * (l/100) * 1.1
        a,b = a+a_delta, b+b_delta

        set_lab(x,y,l,a,b)
      end
      progress(y/height)
    end
    flush()
end

function grayworld_assumption()
  avg_a, avg_b = get_avg_a_b()
  shift_a_b(-avg_a, -avg_b)
end

grayworld_assumption()

Component stretching

Component stretching makes the assumption that either direct reflections, or glossy reflections from surfaces can be found in the image, and that it is amongst the brightest colors in the image. By stretching each R,G and B component to it's full range, as in the section called “Contrast stretching” often leads to a better result than grayworld assumption.

If the image is overexposed, this technique does not work, not having a very reflective object will also bias the results in a non desireable way.

Figure 7.3. component stretching

component stretching
function get_min_max_r ()
    min, max = 1000, -1000          
    for y=0, height-1 do
      for x=0, width-1 do
        value, temp, temp = get_rgb (x,y)
        if value<min then
          min = value
        end
        if value>max then
          max = value
        end
      end  
    end
    return min,max
end

function get_min_max_g ()
    min, max = 1000, -1000          
    for y=0, height-1 do
      for x=0, width-1 do
        temp, value, temp = get_rgb (x,y)
        if value<min then
          min = value
        end
        if value>max then
          max = value
        end
      end  
    end
    return min,max
end

function get_min_max_b ()
    min, max = 1000, -1000          
    for y=0, height-1 do
      for x=0, width-1 do
        temp, temp, value = get_rgb (x,y)
        if value<min then
          min = value
        end
        if value>max then
          max = value
        end
      end  
    end
    return min,max
end

function remap(v, min, max)
    return (v-min) * 1.0/(max-min)
end

function cs_get_rgb(x,y,min_r,max_r,min_g,max_g,min_b,max_b)
    r,g,b = get_rgb(x,y)
    r = remap(r, min_r, max_r)
    g = remap(g, min_g, max_g)
    b = remap(b, min_b, max_b)
    return r,g,b
end

function component_stretch()
    min_r, max_r = get_min_max_r ()
    min_g, max_g = get_min_max_g ()
    min_b, max_b = get_min_max_b ()

    for y=0, height do
      for x=0, width do
        set_rgb(x,y, cs_get_rgb(x,y,min_r,max_r, min_g, max_g, min_b, max_b))
      end
    end
    flush ()
end

component_stretch()

Automatic spatial algorithms

Local Contrast Enhancement

ACE and Retinex are algorithms that performs color balancing and color enhancements baded on statistical measures of the spatial neighbourhood of each pixel; doing a kind of local contrast stretching. These algorithms aim to model some of the automatic adaptation that happens in the human visual system.

Figure 7.4. lce

lce
edge_duplicate = 1;

function get_min_max (x0,y0,x1,y1)
    min_r, max_r = 1000, -1000          
    min_g, max_g = 1000, -1000          
    min_b, max_b = 1000, -1000          
    for y=y0,y1 do
      for x=x0,x1 do
        r, g, b= get_rgb (x,y)
        if r<min_r then min_r = r end if r>max_r then max_r = r end
        if g<min_g then min_g = g end if g>max_g then max_g = g end
        if b<min_b then min_b = b end if b>max_b then max_b = b end
      end  
    end
    return min_r,max_r,min_g,max_g,min_b,max_b
end

function remap(v, min, max)
    return (v-min) * 1.0/(max-min)
end

function lce_get_rgb(x,y,radius)
    min_r, max_r, min_g, max_g, min_b, max_b =
          get_min_max (x-radius,y-radius,x+radius,y+radius)
    r,g,b = get_rgb(x,y)
    r = remap(r, min_r, max_r)
    g = remap(g, min_g, max_g)
    b = remap(b, min_b, max_b)
    return r,g,b
end

function lce(radius)

    for y=0, height do
      for x=0, width do
        set_rgb(x,y, lce_get_rgb(x,y,radius))
      end
      progress (y/height)
      print (y/height)
    end
    flush ()
end

lce(32)

[Note]Note

This implementation of an local contrast enhancement doesn't really show the potential of this type of algorithms but the implementation is kept simple to be readable and understandable.

Excercises

1. Problems with the Local Contrast Enhancement filter

The artifacts appearing in Figure 7.4, “lce” are due to the sampling of the pixels used to find the maximum and minimum values to do contrast stretching with. How can the filter be improved to give a smoother appearance?

Chapter 8. Countering artifacts

Artifact: Technology an object, oservation, phenomenon, or result arising from hidden or unexpected causes extraneous to the subject of a study, and therefore spurious and having potential to lead one to an erroneous conclusion, or to invalidate the study. In experimental science, artifacts may arise due to inadvertant contamination of equipment, faulty experimental design or faulty analysis, or unexpected effects of agencies not known to affect the system under study.

--The Collaborative International Dictionary of English v.0.48 [gcide]

Static sensor noise

Some sensors have permanently stuck pixels, or pixels that have a different sensitivity than others. This is a problem that can be fixed by having a "blank" exposure that records the difference between pixels. The pattern found by taking this blank shot can be used to correct both the intensity of recorded values, as well as replacing known to be bad pixels. This is usually called "dark frame" subtraction. This is mostly applicable to very well controlled exposures with stable sensors. Some forms of scientific imaging like astro-photography with sensors that are kept at a constant temperature can make use of dark frame removal.

Some digital cameras keep a track of bad pixels themselves, when getting a RAW file, the bad pixels are usually not retouched.

Film grain

Analog film effectivly has many small light sensitive particles that act as sensors. With fast films, that need little light, the density of these particles is low, which lead to a grainy texture on the images.

Film grain is countered by various lowpass filters, like blurring and median. Other more advanced filters reconstructs vector fields of "flow" in the image.

Note that sometimes the goal is to create noise to match original film grain, this can be achieved by analysis of blank frames of video to extract patterns with statistical properties similar to the film-grain.

CCD sampling noise

CCD's have the same problems as physical film except the noise appears to be even more random. Various noise reduction techniques like bilateral filtering that preserves larges changes while smoothing out small variations.

Quantification

The quantification process can introducate artifacts. Increase in contrast and colorspace conversions can amplify quantification artifacts. The old adage, GIGO, Garbage In, Garbage Out. there isn't very much to do with data that is bad from the outset.

Interlacing

Old analog TV equipment did a neat trick to allow for both a high temporal rate as well as a high number of lines. Only every second line were transmitted, first the odd lines, then the even lines. This way of displaying images only work well for CRT monitors. For LCDs an interlaced image must be resampled to avoid combing artifacts.

Line doubling

Line doubling is the simplest form of deinterlacing, it replaces the odd lines with the even lines. This gets rid of the interlacing effect, but reduces the vertical resolution in half.

Linear blend

"Linear Blend operates by taking a line, then averaging the pixel values in it with the line below, effectively blurring the frame. This almost completely eliminates the effects of interlacing - at times you may notice slight ghosting (instant transition from one thing to another), the image appears to persist for a fraction of a second. This is absolutely not a serious issue, as far as I am concerned - in fact it's really only noticeable if you are me, and a perfectionist." [quote from http://xaw-deinterlace.sourceforge.net/index.php?sec=about]

Temporal resampling

By reconstructing motion vectors for the video frames, it is possible to predict the position of object between frames, thus providing a volume video object where any position can be sampled. Such an advanced approach can be used for both deinterlacing and high quality time stretching.

Aliasing

Aliasing artifacts, result from a spatial sampling frequency that is too low. A common example is moire effect showing up on textiles, like tweed, with detailed repeating patterns like tweed.

One way to counter aliasing is blurring the image. If we have control over the creation of the problematic image, we can also render a high resolution version that is downsampled to the target resolution using decimation.

Color casts

Color casts occur when a the white balance isn't correctly calibrated, usually white balance compensation is done on the raw, higher bitdepth, data from the CCD, to avoid quantization loss in the progress. For more detail on color casts see the previous chapter on automatic algorithms.

Appendix A. Assignment, spring 2007

Introduction

The assignment is due Wed 2007-05-09 12:00.

Goal

You are to implement a program that takes the data from the CMOS sensor, and generates an as good RGB image as possible.

Report

You will write a short report about the strategy taken and how you have implemented it, the report shall be delvivered as a zipped folder through classfronter. The zip file should containing a folder with the source code for your implementation, an <tt>index.html</tt> file containing the main text of the report that also includes the result of processing 3 or more of the sample images using your solution.

Dataset

Your images should attempt creating as good color images as possible from the following Bayer pattern screened grayscale images.

Hints

  • Interpolation for bayer Matrix

  • White balance

  • Contrast stretching

  • Noise