C# - Optical Marks Recognition (OMR) Engine 1.0

LamYongXian

23 Dec 2013, CPOL

This is an alternative to C# - Optical Marks Recognition (OMR) Engine 1.0.

Introduction

OMR (optical mark recognition) sheets are answer sheets that are intended to be read by a machine rather than by a human being. This project eliminates the need to buy an OMR reading machine, or even a photo scanner for the computer. Any mobile phone camera of 2 MP or better with autofocus will do the job.

Background

I Googled for a good OMR engine, but in vain, so I decided to make my own. Schools and colleges normally use specialized machines to read OMR answer sheets. In my case, I needed to eliminate the need to buy an OMR reader machine, and even a scanner! Pictures taken with a 2 MP mobile phone camera (autofocus required) can be read by this engine. As a first step, I have created my own sheet types that can be read. The image processing part uses AForge.NET's image processing libraries (included in the download).

[EDITED/ADDED] Will it work with other OMR sheets?

It looks like a lot of people have asked the same question repeatedly in the feedback:

"Can we use any other OMR sheet with this engine?"

And the answer is, "NO".

Why? Because of the paper image extraction algorithm used in the method ExtractPaperFromFlattened(). It identifies the paper in a scanned image by finding the 4 crossed circular symbols located on the paper. THESE SYMBOLS DEFINE THE BOUNDARIES OF THE PAPER, AND FROM THEM THE INFORMATION ABOUT CROPPING, RESCALING AND SKEWING OF THE PAPER IS ESTIMATED.

So, no symbols, NO DETECTION of the paper.

[EDITED/ADDED] Adding to ASP.NET Website as Reference Project in VB (also works in C#)

In order to use this project under ASP.NET, just add the OMR project from the source package to your ASP.NET Visual Studio solution as an existing project.

Then, create a reference from your website project using Add Reference -> Project. When the solution is compiled, an OMR.DLL will appear in the Bin directory.

For usage, in the aspx code behind file (using VB.NET):

VB.NET

'Import OMR project namespace
Imports OMR
Imports OMR.XML

VB.NET

'Loading of left and right symbol images 
Dim compImgRight As System.Drawing.Image = System.Drawing.Image.FromFile(Server.MapPath(".") & "/rc.jpg") 
Dim compImgLeft As System.Drawing.Image = System.Drawing.Image.FromFile(Server.MapPath(".") & "/lc.jpg")

VB.NET

'URL of Specification XML Sheet
Dim strXML As String = Server.MapPath(".") & "/sheets.xml"

'Creating the OpticalReader object 
Dim reader As OpticalReader = New OpticalReader

VB.NET

Dim answerSheetbm As Bitmap = Nothing
If fileUpload1.HasFile Then 'Using FileUpload Control

	'Save to Temp Location
	fileUpload1.SaveAs(Server.MapPath(".") & "/answersheet.png")
	answerSheetbm = New Bitmap(System.Drawing.Image.FromFile(Server.MapPath(".") & "/answersheet.png")) 

	'Resizing bitmap to expected proportion
	ImageUtilities.ResizeImage(answerSheetbm, 210, 210 * answerSheetbm.Height / answerSheetbm.Width) 

	'Extract relevant bitmap based on left/right symbols 
	answerSheetbm = reader.ExtractOMRSheet(answerSheetbm , strXML , OMREnums.OMRSheet.A550, compImgLeft, compImgRight) 

End If

To read the Reg Number simply call:

Dim strRegNum As String = reader.getRegNumOfSheet(answerSheetbm, OMREnums.OMRSheet.A550, strXML, False).ToString()  '000-999

To read each OMR answer row, simply use the SliceOMarkBlock() and rateSliceMax() methods of the OpticalReader object:

VB.NET

 Dim blocks As System.Drawing.Rectangle() = New System.Drawing.Rectangle() { _
OMRSheetReader.GetSheetPropertyLocation(strXML, OMREnums.OMRSheet.A550, OMREnums.OMRProperty.tensBlock1), _
OMRSheetReader.GetSheetPropertyLocation(strXML, OMREnums.OMRSheet.A550, OMREnums.OMRProperty.tensBlock2), _
OMRSheetReader.GetSheetPropertyLocation(strXML, OMREnums.OMRSheet.A550, OMREnums.OMRProperty.tensBlock3), _
OMRSheetReader.GetSheetPropertyLocation(strXML, OMREnums.OMRSheet.A550, OMREnums.OMRProperty.tensBlock4) }

VB.NET

For Each rec As System.Drawing.Rectangle In blocks

	Dim blk As Bitmap() = reader.SliceOMarkBlock(answerSheetbm, rec, 10)
	For Each line As Bitmap In blk
		Dim num As Integer = reader.rateSliceMax(line, 5) '0,1,2,3,4,5
	Next

Next

Note: You can modify the OMR project to replace the MessageBox.Show() calls with Throw New Exception() if you do not want pop-ups appearing during debugging.

The original reader.getRegNumOfSheet() also contains a bug whereby Reg Number '9' will be read as '900' if the second and third rows are not shaded.

I will upload the edited version of this project later on. Great project umar.techBOY!

Using the Code

After adding all the references (AForge and OMR), you can use the simplest method overload to extract a warped OMR sheet from a camera/scanner image.

Raw images must contain one clear view of one of the supported sheet formats (a printable PDF is in the download), e.g.:

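// Load the raw camera/scanner image currently shown in the panel
Bitmap unf = new Bitmap(panel1.BackgroundImage);
OpticalReader reader = new OpticalReader();
// Extract the sheet described in "sheets.xml" and show it in the panel
panel1.BackgroundImage = (System.Drawing.Image)reader.ExtractOMRSheet(
    unf, "sheets.xml", OMREnums.OMRSheet.A550);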

This extracts the sheet from the raw image.

And once the sheet is extracted you can process it using methods like:

OpticalReader rr = new OpticalReader();
// Read the registration number from the extracted sheet and display it
MessageBox.Show(rr.getRegNumOfSheet(panel1.BackgroundImage,
    OMREnums.OMRSheet.A550, "sheets.xml", false).ToString());

Sheet Detection From Camera/Scanner Image

The procedure to detect a sheet involves detection of a sheet's corners. In the printed document corners are marked with specific binary images. We detect them, we detect a sheet.

First of all, we need to flatten the picture using the correct contrast, fill, threshold and invert filters. As a starting point, raw images are inverted with no contrast, brightness or fill correction. Given a threshold, the image is then converted to binary. This image is called a "Flattened Image" and is obtained with the "OMR.OpticalReader.flatten" method.
Once an image is flattened, blob detection starts. In the first stage, blobs of all kinds and sizes above a minimum blob size are detected (the minimum size removes noise-grain blobs).
The left edge of the sheet is detected first, then the right edge.
In the first filter, out of the hundreds of blobs detected in a picture, wrongly sized blobs are removed by checking the ratio of their size to the size of the camera/scanner image.
In the second filter, blobs placed on the wrong side of the image are removed.
In the third filter, blobs with wildly wrong aspect ratios are removed (this rejects the blobs produced by bends or lines on the sheet).
As a last filter, all remaining blobs are compared against a mirrored corner image (mirrored because we inverted the image in the first step).
The blobs that pass the filters are then re-verified: there must be exactly four of them, they must lie on the correct sides of the sheet, and the left and right edges must not differ too much in length.
Verified blobs represent the real locations of the sheet corners in the image coordinate system.
The unflattened image can then be cropped at these points and warped to produce a perfectly rectangular image called the OMR Sheet.
If all the above filters yield exactly 4 corner blobs, the process continues; otherwise, a recursive call is made to the same function with the same parameters but with an altered contrast correction that may yield a better result.
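The flattening step at the top of this list (invert, then threshold to binary) can be sketched roughly as follows. This is only an illustration on a plain grayscale array, not the engine's own OMR.OpticalReader.flatten code: the FlattenSketch class, the byte[,] layout and the threshold parameter are assumptions, and the contrast and fill corrections are left out.

```csharp
using System;

public static class FlattenSketch
{
    // Inverts an 8-bit grayscale image and converts it to binary.
    // gray[y, x] holds 0 (black) to 255 (white); the result holds only 0 or 255.
    // Hypothetical helper for illustration, not part of the OMR project.
    public static byte[,] Flatten(byte[,] gray, byte threshold)
    {
        int height = gray.GetLength(0);
        int width = gray.GetLength(1);
        var flat = new byte[height, width];

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Invert first, so dark ink on white paper becomes bright.
                int inverted = 255 - gray[y, x];

                // Then threshold: bright pixels become white, the rest black.
                flat[y, x] = inverted >= threshold ? (byte)255 : (byte)0;
            }
        }
        return flat;
    }
}
```

After this step, the printed corner symbols show up as white blobs on a black background, which is the form blob detection usually works on.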

See the code, with line-by-line comments, in the "OMR.OpticalReader.ExtractOMRSheet" method.
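To make the filter chain described above more concrete, here is a minimal sketch of the size, side and aspect-ratio filters and of the final four-corner check. It is not taken from the OMR project: BlobBox is a hypothetical stand-in for a detected blob's bounding box, all threshold values are made-up assumptions, and the comparison against the corner image is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a detected blob's bounding box (not an AForge type).
public sealed class BlobBox
{
    public int X, Y, Width, Height;
    public BlobBox(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }
}

public static class CornerFilterSketch
{
    // Keeps the blobs that could be corner symbols on one side of the image.
    // All threshold defaults are illustrative assumptions.
    public static List<BlobBox> FilterCorners(
        IEnumerable<BlobBox> blobs, int imageWidth, bool leftSide,
        double minSizeRatio = 0.02, double maxSizeRatio = 0.10, double maxAspect = 1.5)
    {
        return blobs
            // Filter 1: blob width relative to the image width.
            .Where(b =>
            {
                double ratio = (double)b.Width / imageWidth;
                return ratio >= minSizeRatio && ratio <= maxSizeRatio;
            })
            // Filter 2: the blob centre must lie on the expected half of the image.
            .Where(b => (b.X + b.Width / 2.0 < imageWidth / 2.0) == leftSide)
            // Filter 3: reject elongated blobs such as bends or lines.
            .Where(b => (double)Math.Max(b.Width, b.Height) / Math.Min(b.Width, b.Height) <= maxAspect)
            .ToList();
    }

    // Final check: two corners per side, and edges of similar length.
    public static bool IsValidSheet(List<BlobBox> left, List<BlobBox> right,
        double maxEdgeDifference = 0.2)
    {
        if (left.Count != 2 || right.Count != 2) return false;
        double leftEdge = Distance(left[0], left[1]);
        double rightEdge = Distance(right[0], right[1]);
        double longer = Math.Max(leftEdge, rightEdge);
        return longer > 0 && Math.Abs(leftEdge - rightEdge) / longer <= maxEdgeDifference;
    }

    private static double Distance(BlobBox a, BlobBox b)
    {
        double dx = a.X - b.X, dy = a.Y - b.Y;
        return Math.Sqrt(dx * dx + dy * dy);
    }
}
```

When IsValidSheet returns false, the engine as described would retry with a different contrast correction; that retry is not shown here.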

Reading the Extracted Sheet

Most of the image processing lies in the sheet extraction part. The next stage is to read the OMR sheet.

Normally, OMR sheets offer multiple choices for each question. All the options for the same question are printed side by side on the paper, forming a "block". The locations and sizes of all the blocks, and the number of options in each, are all that needs to be saved to the XML file. Locations are recorded in the coordinate system usually followed in .NET, i.e. the upper left corner is the origin O(0,0), with the positive x axis pointing right and the positive y axis pointing down.

To read a specific block in the sheet (the sheet being the one extracted in the first section), OMR.OpticalReader.getScoreOfSheet can be called. This method executes the above procedure repeatedly to read all the lines in all 4 of the BigAnswerBlocks printed on the sheet.
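As an illustration of how one of the large answer blocks might be cut into per-question lines before each line is read, here is a minimal sketch. It assumes a block is an axis-aligned System.Drawing.Rectangle in the coordinate system described above. SliceRows is a hypothetical helper, not the engine's own SliceOMarkBlock implementation.

```csharp
using System;
using System.Drawing;

public static class BlockSliceSketch
{
    // Splits a block rectangle into lineCount horizontal strips, top to bottom.
    // Hypothetical helper for illustration only.
    public static Rectangle[] SliceRows(Rectangle block, int lineCount)
    {
        if (lineCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(lineCount));

        var rows = new Rectangle[lineCount];
        for (int i = 0; i < lineCount; i++)
        {
            // Boundaries are computed from the block top each time, so integer
            // rounding does not accumulate and the last strip ends at block.Bottom.
            int top = block.Top + i * block.Height / lineCount;
            int bottom = block.Top + (i + 1) * block.Height / lineCount;
            rows[i] = new Rectangle(block.Left, top, block.Width, bottom - top);
        }
        return rows;
    }
}
```

Each resulting rectangle covers one question line, which can then be cropped and passed to the option-reading step.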

Reading The Selected Option Out of the Given Choices

When a multiple choice selection block has been sliced out of the document, it is time to read the selected choice. For reading, the block is divided into as many equal parts as there are options in it. Then the image is converted to binary, based on the mean color of the block. This converts white paper to pure white, and pixels that are more than half filled with ink to pure black.

The black pixel count is recorded for each subdivision of the block.

The darkest subdivision is compared with the others, and if a remarkable difference exists between two subdivisions, the darker one is recorded as "Marked". Depending on the number of Marked choices, it can be decided which option was selected.
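The reading procedure above can be sketched as follows. This is a simplified illustration rather than the engine's code. It assumes the options sit side by side in a grayscale byte[,] block, uses a made-up minRatio to stand in for the "remarkable difference", and reports -1 when no single subdivision clearly stands out instead of counting several Marked choices.

```csharp
using System;
using System.Linq;

public static class MarkReaderSketch
{
    // Returns the zero-based index of the marked option, or -1 if no
    // subdivision is clearly darker than the rest.
    // gray[y, x]: 0 = black, 255 = white. minRatio is an assumed threshold.
    public static int ReadMarkedOption(byte[,] gray, int optionCount, double minRatio = 2.0)
    {
        if (optionCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(optionCount));

        int height = gray.GetLength(0);
        int width = gray.GetLength(1);
        if (height == 0 || width == 0) return -1;

        // The mean brightness of the whole block is the binarisation threshold.
        long sum = 0;
        foreach (byte value in gray) sum += value;
        double mean = (double)sum / gray.Length;

        // Count the pixels darker than the mean in each equal-width subdivision.
        var black = new int[optionCount];
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
                if (gray[y, x] < mean)
                    black[x * optionCount / width]++;

        int darkest = Array.IndexOf(black, black.Max());
        int runnerUp = black.Where((count, i) => i != darkest).DefaultIfEmpty(0).Max();

        // The darkest subdivision must beat the runner-up by a factor of minRatio.
        bool marked = black[darkest] > 0 && black[darkest] >= minRatio * runnerUp;
        return marked ? darkest : -1;
    }
}
```

Returning -1 for both an empty row and an ambiguous one keeps the sketch short; a real reader would probably distinguish "nothing marked" from "several options marked".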

Note

Take a look at the other methods as well. They can read all the choices from a single paper sheet in one method call, and create an XML specification sheet for the two kinds of sheets provided.

The heart of the camera image sheet recognizer lies in the following method.

Points of Interest

What's next is to make an application that takes a folder full of answer sheets for a test paper conducted in a class of 50 students or more. Or, have the address of a scanner attached to a PC and process images one after another. Depending on the registration number written on each sheet, the program should create an XLS file or a PDF so that the results are compiled entirely electronically.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)