Usually scanned and photographed documents are skewed.
I would like to get rid of these skews, and do it by a simple and fast way, without heavy-weight libraries like OpenCV, AForge, etc.
|
|
Idea of the Algorithm
Let's divide the original image into vertical strips and calculate the average brightness of the pixels in each line of each strip:
Each line of text leaves a characteristic dark line. If there was no skew, the dark lines in the right strip would coincide with the lines in the left strip. But as a result of the skew, lines are shifted:
Now, if we found such a shift, in which the maximum number of lines in strips coincide, we can calculate the angle of original image to remove skew.
Required shifting can be found if we shift one of the strips on 1, 2, 3, etc. pixels by vertical and calculate the correlation coefficient between the strips (intercorrelation function). Where the correlation reaches the maximum value - the shifting is optimal.
Implementation
Firstly, it is necessary to find the average values of the pixels in the vertical strips of the source image. It is a time-consuming task because the source image can be large, and enumeration of all pixels - a long process.
To do this quickly, we make the trick - use the built-in GDI+ mechanism. We will create a new bitmap whose height is equal to the height of the original image, and the width - equals 2 pixels.
So, we draw the source image on the new bitmap, compressing the image horizontally to 2 pixels:
using (var bmp = new Bitmap(2, sourceImage.Height))
using (var gr = Graphics.FromImage(bmp))
{
gr.InterpolationMode = InterpolationMode.Low;
gr.DrawImage(sourceImage, 0, 0, 2, sourceImage.Height);
...
}
Because GDI+ uses interpolation when decreasing the width of image, each pixel of the new bitmap will be equal to average of pixel values for each line of each strip.
As the quality of interpolation does not matter, we choose the fastest interpolation mode - InterpolationMode.Low
.
At this stage, the image can be a little compressed also vertically to reduce the length of the strips and to increase performance of further calculations.
Next, we need to move the brightness of found pixels in an array int[]
for each strip. To do this, use method Bitmap.LockBits
for fast access to pixels. The brightness of the pixel can be calculated by Color.GetBrightness()
, but I used a faster analog - just take the green color channel Color.G
as brightness.
After we received the averaged strips, we will calculate cross-correlation function between the two:
var sum = 0;
for (int i = 0; i < strip1.Length - shift; i++)
sum += -Math.Abs(strip1[i] - strip2[i + shift]);
Instead of classical correlations, I simply use the sum of absolute differences between strips. It is faster and produces better results.
After these calculations for different values of offset, we will find a shift, where the correlation reaches the maximum.
Now we need only to find the angle of rotation. To do this, we create a vector between the centers of the strips, taking into account the found shift. The horizontal distance between the centers of the strips will:
var dx = sourceImage.Width / 2;
var dy = shift;
Now we can find the angle of rotation:
var angle = Math.Atan2(dy, dx);
Finally, we rotate the source image:
var rotated = new Bitmap(sourceImage.Width, sourceImage.Height);
using (var gr = Graphics.FromImage(rotated))
{
gr.InterpolationMode = InterpolationMode.HighQualityBicubic;
gr.TranslateTransform(sourceImage.Width / 2, sourceImage.Height / 2);
gr.RotateTransform(-(float)angle);
gr.DrawImage(sourceImage, -sourceImage.Width / 2, -sourceImage.Height / 2,
sourceImage.Width, sourceImage.Height);
}
I use the highest level of interpolation InterpolationMode.HighQualityBicubic
. It makes the rotation more slowly, but we get perfect quality of the result image.
Save the image to a file. Enjoy.
Remarks
Above, I've described a simplified implementation. But the real code is a bit more complicated:
- Actually the image is divided in 10 strips (not 2). The narrower the strip, the more accurate we can calculate the offset.
- To find the most accurate results, I compare several pairs of strips. It allows to avoid the unsuccessful cases, when the strip gets into an empty area of the document.
- The shift is searched both up and down.
- To increase performance, when calculating the strips, the original image is compressed to 600 pixels vertically (or less).
- The most time consuming operation - turn the original image by a predetermined angle. In order to do this quickly - you can either reduce the size of the original image or reduce the quality of interpolation.
- This algorithm is not designed for keystone correction (perspective distortion, for example).
Results