Basic Image Processing

In my most recent Introduction to Computer Programming class (aka CS101), the instructor decided to show us some very simple image processing: blurring and secret image hiding. There are quite a few things he did not mention because they are too complicated, but I think someone might be interested, so I decided to write about them here.

I am not going to talk about how to load or save images; you can easily find documentation online for your favourite language. Our professor covered two algorithms in class, blurring and secret image hiding, and I am going to cover only these two here.

First, how to blur an image. The simplest algorithm, which he demonstrated, is to take the average of a pixel and its 8 neighbours. But where can we go from here? Blurring can also be expressed as a convolution with a “kernel”. A convolution kernel (more info on Wikipedia), in terms of image processing, is a small matrix that gets convolved with the image. For example, this blurring algorithm can also be written as:

$$ M = \begin{bmatrix}\frac{1}{9} & \frac{1}{9} & \frac{1}{9} \\[0.3em] \frac{1}{9} & \frac{1}{9} & \frac{1}{9} \\[0.3em] \frac{1}{9} & \frac{1}{9} & \frac{1}{9}\end{bmatrix} $$

More complicated kernels for other tasks can easily be found on Wikipedia.
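
To make that concrete, here is a minimal sketch (my own, not from class) of convolving a greyscale image with a 3×3 kernel. The int[][] pixel layout and the lazy border handling (edge pixels are simply copied) are assumptions on my part:

    // Apply a 3x3 convolution kernel to a greyscale image stored as an
    // int[height][width] array of values in [0, 255]. Border pixels are
    // copied unchanged to keep the example short.
    public int[][] convolve3x3(int[][] pixels, double[][] kernel) {
        int height = pixels.length;
        int width = pixels[0].length;
        int[][] out = new int[height][width];

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (y == 0 || x == 0 || y == height-1 || x == width-1) {
                    out[y][x] = pixels[y][x];
                    continue;
                }
                double sum = 0;
                for (int ky = -1; ky <= 1; ky++)
                    for (int kx = -1; kx <= 1; kx++)
                        sum += kernel[ky+1][kx+1] * pixels[y+ky][x+kx];
                out[y][x] = (int) Math.round(sum);
            }
        }
        return out;
    }

    // The blur kernel from the matrix above: every entry is 1/9, i.e.
    // the average of a pixel and its 8 neighbours.
    double[][] blurKernel = {
        {1/9d, 1/9d, 1/9d},
        {1/9d, 1/9d, 1/9d},
        {1/9d, 1/9d, 1/9d}
    };

(For a symmetric kernel like this one it does not matter that the code never flips the kernel; a true convolution with an asymmetric kernel would.)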

Next, the secret image hiding algorithm. The idea is simple: store a greyscale image inside the least significant bits (LSBs) of another image. But this actually raises quite a few problems: first, converting an image to greyscale while preserving luminance is not easy; second, since 8 bits are not divisible by three, some colour channels of the hiding image may need to carry more bits than others; and third, lossy image compression can destroy data hidden by this technique.

First, converting colour to greyscale. Our professor provided a very basic formula to do this which, while easy, is sadly very inaccurate:

$$ Y = {{R+G+B}\over{3}} $$

This does not preserve luminance at all: because of how our eyes work, each colour contributes differently to the total luminosity. To add to the problem, the sRGB colourspace (which is what most images are in) is gamma-compressed, which means the stored values do not scale linearly with actual luminosity.

So, to convert an sRGB image to greyscale while preserving luminosity, first we need to convert the RGB values to a linear scale (R′G′B′), as defined by the sRGB specification:

$$ C'=\begin{cases}\frac{C}{12.92}, & C\le0.04045\\\left(\frac{C+0.055}{1.055}\right)^{2.4}, & C>0.04045 \end{cases} $$

Or in Java code (as that is what we are learning):

    // Convert one 8-bit sRGB channel value into its linear value in [0, 1].
    public double decompressGamma(int color) {
        if (color < 0 || color > 255)
            return Double.NaN; // invalid number

        // Scale the 8-bit integer to [0, 1] before applying the formula.
        double c = color/255d;

        if (c <= 0.04045)
            return c/12.92;
        else
            return Math.pow((c+0.055)/1.055, 2.4);
    }

(Note that all colour values in the maths formulas are in [0, 1], while in the Java code the integers are in [0, 255] and the doubles are in [0, 1].)

So we now have the linear colour values nicely stored in doubles. Next is to compute the linear luminance Y′, which according to sRGB is defined by:

$$ Y' = 0.2126 R' + 0.7152 G' + 0.0722 B' $$

Or in Java code:

    // Weighted sum of the linear channels, using the sRGB luminance
    // coefficients from the formula above.
    public double toLuminance(double r, double g, double b) {
        return 0.2126*r + 0.7152*g + 0.0722*b;
    }

Next, we must compress the final luminance value back to a gamma-compressed value by inverting the decompression operation above:

$$ Y=\begin{cases}12.92\ Y', & Y' \le 0.0031308\\1.055\ {Y'}^{\frac{1}{2.4}}-0.055, & Y' > 0.0031308. \end{cases}$$

Or in Java code:

    // Convert a linear luminance value in [0, 1] back into an 8-bit
    // gamma-compressed value in [0, 255].
    public int compressGamma(double color) {
        if (color < 0d || color > 1d)
            return -1; // invalid number

        double result;

        if (color <= 0.0031308)
            result = color*12.92;
        else
            result = Math.pow(color, 1d/2.4)*1.055 - 0.055;

        // Scale back from [0, 1] to the 8-bit integer range.
        return (int) Math.round(result*255);
    }

There, we finally have a luminance-preserving greyscale image!
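
Putting the three methods above together, converting one sRGB pixel to a grey value looks roughly like this (a sketch of my own):

    // Convert one sRGB pixel (r, g, b each in [0, 255]) into a single
    // gamma-compressed grey value in [0, 255], using the methods above.
    public int toGrey(int r, int g, int b) {
        double rLinear = decompressGamma(r);
        double gLinear = decompressGamma(g);
        double bLinear = decompressGamma(b);
        return compressGamma(toLuminance(rLinear, gLinear, bLinear));
    }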

Next is how we hide the data in the other image. If you look at the luminance weighting factors above, you will notice that the factor for G is huge while the one for B is very small. This is because our eyes perceive green very well and blue rather poorly. We need to split the 8 bits into three sections; using this information, I think we should split the bits stored in (R, G, B) as (2, 2, 4) or (3, 1, 4). This keeps the green channel of the cover image as precise as possible while letting the blue channel carry the most hidden bits, where the distortion is least visible.
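
Here is a minimal sketch (mine, not shown in class) of the per-pixel hiding and extraction step, assuming the (3, 1, 4) split and 0xRRGGBB-packed integers like those returned by BufferedImage.getRGB(); which bits of the grey value go to which channel is my own arbitrary choice:

    // Hide one 8-bit grey value in the LSBs of a cover pixel using the
    // (3,1,4) split: 3 bits into red, 1 bit into green, 4 bits into blue.
    public int hidePixel(int coverRgb, int grey) {
        int r = (coverRgb >> 16) & 0xFF;
        int g = (coverRgb >> 8) & 0xFF;
        int b = coverRgb & 0xFF;

        // Slice the grey value: top 3 bits, the next 1 bit, bottom 4 bits.
        int rBits = (grey >> 5) & 0x07;
        int gBits = (grey >> 4) & 0x01;
        int bBits = grey & 0x0F;

        // Clear the low bits of each cover channel and write our bits in.
        r = (r & ~0x07) | rBits;
        g = (g & ~0x01) | gBits;
        b = (b & ~0x0F) | bBits;

        // Keep the alpha byte of the cover pixel untouched.
        return (coverRgb & 0xFF000000) | (r << 16) | (g << 8) | b;
    }

    // The matching extraction: read the low bits back and reassemble them.
    public int extractPixel(int stegoRgb) {
        int r = (stegoRgb >> 16) & 0x07;
        int g = (stegoRgb >> 8) & 0x01;
        int b = stegoRgb & 0x0F;
        return (r << 5) | (g << 4) | b;
    }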

Finally, one must know that lossy image compression such as JPEG works on the same principle as our trick: the least significant bits of an image matter the least to the viewer, so they are exactly what gets destroyed. Thus, you must save your image in a lossless format, for example BMP, PNG or TIFF.
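
For example, with the standard library's ImageIO (the file name here is just a placeholder of mine):

    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    // Write the stego image as PNG, a lossless format, so the hidden
    // LSBs survive exactly as we stored them.
    public void saveLossless(BufferedImage stegoImage) throws IOException {
        ImageIO.write(stegoImage, "png", new File("hidden.png"));
    }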

That’s all for today, thank you for reading!