Efficient ways to determine tilt of an image

I'm trying to write a program that determines the tilt (angle of rotation) of an arbitrary image.

Images have the following properties:

  • Consist of dark text on a light background
  • Occasionally contain horizontal or vertical lines which only intersect at 90 degree angles.
  • Skewed between -45 and 45 degrees.
  • See this image as a reference (it's been skewed 2.8 degrees).

So far, I've come up with this strategy: Draw a route from left to right, always selecting the nearest white pixel. Presumably, the route from left to right will prefer to follow the path between lines of text along the tilt of the image.

Here's my code:

private bool IsWhite(Color c) { return c.GetBrightness() >= 0.5 || c == Color.Transparent; }

private bool IsBlack(Color c) { return !IsWhite(c); }

private double ToDegrees(decimal slope) { return (180.0 / Math.PI) * Math.Atan(Convert.ToDouble(slope)); }

private void GetSkew(Bitmap image, out double minSkew, out double maxSkew)
{
    decimal minSlope = 0.0M;
    decimal maxSlope = 0.0M;
    for (int start_y = 0; start_y < image.Height; start_y++)
    {
        int end_y = start_y;
        for (int x = 1; x < image.Width; x++)
        {
            int above_y = Math.Max(end_y - 1, 0);
            int below_y = Math.Min(end_y + 1, image.Height - 1);

            Color center = image.GetPixel(x, end_y);
            Color above = image.GetPixel(x, above_y);
            Color below = image.GetPixel(x, below_y);

            if (IsWhite(center)) { /* no change to end_y */ }
            else if (IsWhite(above) && IsBlack(below)) { end_y = above_y; }
            else if (IsBlack(above) && IsWhite(below)) { end_y = below_y; }
        }

        decimal slope = (Convert.ToDecimal(start_y) - Convert.ToDecimal(end_y)) / Convert.ToDecimal(image.Width);
        minSlope = Math.Min(minSlope, slope);
        maxSlope = Math.Max(maxSlope, slope);
    }

    minSkew = ToDegrees(minSlope);
    maxSkew = ToDegrees(maxSlope);
}

This works well on some images and not so well on others, and it's slow.

Is there a more efficient, more reliable way to determine the tilt of an image?

Juliet asked Sep 17 '09 21:09


2 Answers

I've made some modifications to my code, and it certainly runs a lot faster, but it's not very accurate.

I've made the following improvements:

  • Using Vinko's suggestion, I avoid GetPixel in favor of working with bytes directly. The code now runs at the speed I need.

  • My original code simply used "IsBlack" and "IsWhite", but this isn't granular enough. The original code traces the following paths through the image:

    http://img43.imageshack.us/img43/1545/tilted3degtextoriginalw.gif

    Note that a number of paths pass through the text. Instead, I now compare the actual brightness values of the center, above, and below pixels and select the brightest one. Basically, I'm treating the bitmap as a heightmap, and the path from left to right follows the contours of the image, resulting in a better path:

    http://img10.imageshack.us/img10/5807/tilted3degtextbrightnes.gif

    As suggested by Toaomalkster, applying a Gaussian blur smooths out the heightmap and gives even better results:

    http://img197.imageshack.us/img197/742/tilted3degtextblurredwi.gif

    Since this is just prototype code, I blurred the image using GIMP rather than writing my own blur function.

    The selected path is pretty good for a greedy algorithm.

  • As Toaomalkster suggested, choosing the min/max slope is naive. A simple linear regression provides a better approximation of the slope of a path. Additionally, I should cut a path short once I run off the edge of the image, otherwise the path will hug the top of the image and give an incorrect slope.
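The blur above was done in GIMP, so the following is only a minimal sketch of how a separable Gaussian blur over a brightness grid might look in code. It is not GIMP's implementation. The names `BrightnessBlur`, `GaussianKernel`, and `Blur` are hypothetical, the grid is a plain `double[,]` indexed as `[y, x]` rather than a `Bitmap`, and clamping to the nearest edge pixel at the borders is an assumption.

```csharp
using System;

static class BrightnessBlur
{
    // Builds a normalized 1-D Gaussian kernel of length 2 * radius + 1.
    public static double[] GaussianKernel(int radius, double sigma)
    {
        double[] kernel = new double[2 * radius + 1];
        double sum = 0.0;
        for (int i = -radius; i <= radius; i++)
        {
            double weight = Math.Exp(-(i * i) / (2.0 * sigma * sigma));
            kernel[i + radius] = weight;
            sum += weight;
        }
        // Normalize so the weights add up to 1 and overall brightness is kept.
        for (int i = 0; i < kernel.Length; i++) kernel[i] /= sum;
        return kernel;
    }

    // Blurs a [height, width] brightness grid with a horizontal pass followed
    // by a vertical pass. Out-of-range coordinates are clamped to the edge.
    public static double[,] Blur(double[,] src, int radius, double sigma)
    {
        int height = src.GetLength(0);
        int width = src.GetLength(1);
        double[] kernel = GaussianKernel(radius, sigma);
        double[,] tmp = new double[height, width];
        double[,] dst = new double[height, width];

        // Horizontal pass: src -> tmp
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                double acc = 0.0;
                for (int i = -radius; i <= radius; i++)
                {
                    int xx = Math.Min(Math.Max(x + i, 0), width - 1);
                    acc += kernel[i + radius] * src[y, xx];
                }
                tmp[y, x] = acc;
            }
        }

        // Vertical pass: tmp -> dst
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                double acc = 0.0;
                for (int i = -radius; i <= radius; i++)
                {
                    int yy = Math.Min(Math.Max(y + i, 0), height - 1);
                    acc += kernel[i + radius] * tmp[yy, x];
                }
                dst[y, x] = acc;
            }
        }
        return dst;
    }
}
```

The grid could be filled from per-pixel brightness values before tracing paths. The radius and sigma would need tuning against the text size, since too little blur leaves gaps between letters and too much merges adjacent lines.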

Code

private double ToDegrees(double slope) { return (180.0 / Math.PI) * Math.Atan(slope); }

private double GetSkew(Bitmap image)
{
    BrightnessWrapper wrapper = new BrightnessWrapper(image);

    LinkedList<double> slopes = new LinkedList<double>();

    for (int y = 0; y < wrapper.Height; y++)
    {
        int endY = y;

        long sumOfX = 0;
        long sumOfY = y;
        long sumOfXY = 0;
        long sumOfXX = 0;
        int itemsInSet = 1;
        for (int x = 1; x < wrapper.Width; x++)
        {
            int aboveY = endY - 1;
            int belowY = endY + 1;

            if (aboveY < 0 || belowY >= wrapper.Height)
            {
                break;
            }

            int center = wrapper.GetBrightness(x, endY);
            int above = wrapper.GetBrightness(x, aboveY);
            int below = wrapper.GetBrightness(x, belowY);

            if (center >= above && center >= below) { /* no change to endY */ }
            else if (above >= center && above >= below) { endY = aboveY; }
            else if (below >= center && below >= above) { endY = belowY; }

            itemsInSet++;
            sumOfX += x;
            sumOfY += endY;
            // Widen to long before multiplying so very wide images cannot overflow int.
            sumOfXX += ((long)x * x);
            sumOfXY += ((long)x * endY);
        }

        // least squares slope = (NΣ(XY) - (ΣX)(ΣY)) / (NΣ(X^2) - (ΣX)^2), where N = elements in set
        if (itemsInSet > image.Width / 2) // path covers at least half of the image
        {
            decimal sumOfX_d = Convert.ToDecimal(sumOfX);
            decimal sumOfY_d = Convert.ToDecimal(sumOfY);
            decimal sumOfXY_d = Convert.ToDecimal(sumOfXY);
            decimal sumOfXX_d = Convert.ToDecimal(sumOfXX);
            decimal itemsInSet_d = Convert.ToDecimal(itemsInSet);
            decimal slope =
                ((itemsInSet_d * sumOfXY_d) - (sumOfX_d * sumOfY_d))
                /
                ((itemsInSet_d * sumOfXX_d) - (sumOfX_d * sumOfX_d));

            slopes.AddLast(Convert.ToDouble(slope));
        }
    }

    double mean = slopes.Average();
    double sumOfSquares = slopes.Sum(d => Math.Pow(d - mean, 2));
    double stddev = Math.Sqrt(sumOfSquares / (slopes.Count - 1));

    // select items within 1 standard deviation of the mean
    var testSample = slopes.Where(x => Math.Abs(x - mean) <= stddev);

    return ToDegrees(testSample.Average());
}

class BrightnessWrapper
{
    byte[] rgbValues;
    int stride;
    public int Height { get; private set; }
    public int Width { get; private set; }

    public BrightnessWrapper(Bitmap bmp)
    {
        Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);

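        // Lock as 24bpp so that GetBrightness can assume 3 bytes (B, G, R) per pixel.
        System.Drawing.Imaging.BitmapData bmpData =
            bmp.LockBits(rect,
                System.Drawing.Imaging.ImageLockMode.ReadOnly,
                System.Drawing.Imaging.PixelFormat.Format24bppRgb);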

        IntPtr ptr = bmpData.Scan0;

        int bytes = bmpData.Stride * bmp.Height;
        this.rgbValues = new byte[bytes];

        System.Runtime.InteropServices.Marshal.Copy(ptr,
                       rgbValues, 0, bytes);

        // Release the lock now that the pixel data has been copied.
        bmp.UnlockBits(bmpData);

        this.Height = bmp.Height;
        this.Width = bmp.Width;
        this.stride = bmpData.Stride;
    }

    public int GetBrightness(int x, int y)
    {
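        // Assumes 3 bytes per pixel stored in B, G, R order.
        int position = (y * this.stride) + (x * 3);
        int b = rgbValues[position];
        int g = rgbValues[position + 1];
        int r = rgbValues[position + 2];
        // Cheap weighted brightness (2R + 3G + 1B) / 6; green counts most.
        return (r + r + b + g + g + g) / 6;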
    }
}

The code is good, but not great. Large amounts of whitespace cause the program to draw a relatively flat line with a slope near 0, which makes the code underestimate the actual tilt of the image.

There is no appreciable difference in the accuracy of the tilt by selecting random sample points vs sampling all points, because the ratio of "flat" paths selected by random sampling is the same as the ratio of "flat" paths in the entire image.
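To sanity-check the regression step in isolation, here is a minimal sketch of the same least-squares slope formula applied to a single path. The names `SlopeFit` and `LeastSquaresSlope` are hypothetical, the path is assumed to be given as one y value per x index, and `double` is used instead of `decimal`, which is a simplification compared to the code above.

```csharp
using System;

static class SlopeFit
{
    // Least-squares slope of the points (x, ys[x]) for x = 0 .. ys.Length - 1:
    // slope = (N * sum(xy) - sum(x) * sum(y)) / (N * sum(x^2) - sum(x)^2)
    public static double LeastSquaresSlope(int[] ys)
    {
        long n = ys.Length;
        long sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int x = 0; x < ys.Length; x++)
        {
            sumX += x;
            sumY += ys[x];
            sumXY += (long)x * ys[x];
            sumXX += (long)x * x;
        }

        double denominator = (double)n * sumXX - (double)sumX * sumX;
        if (denominator == 0.0)
        {
            throw new ArgumentException("At least two points are required.");
        }
        return ((double)n * sumXY - (double)sumX * sumY) / denominator;
    }

    // Converts a slope (rise over run) to an angle in degrees.
    public static double ToDegrees(double slope)
    {
        return Math.Atan(slope) * (180.0 / Math.PI);
    }
}
```

For example, a path that steps one pixel in y for every 20 pixels in x (`path[x] = x / 20` with integer division) should give a slope close to 0.05, which `ToDegrees` turns into a skew of a little under 3 degrees. A path that stays on one row gives a slope of 0, which is the "flat" case described above.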

Juliet answered Sep 20 '22 14:09


GetPixel is slow. You can get an order-of-magnitude speedup using the approach listed here.

Vinko Vrsalovic answered Sep 21 '22 14:09