Introduction
The Microsoft Office Document Imaging Library (MODI), which comes
with the Office 2003 package, allows us to easily integrate OCR functionality
into our own applications. Although there is a good C# sample,
"OCR with Microsoft® Office", posted on this web site, I needed
something in C++. After searching the Internet and the Microsoft web
site, I could not find anything good regarding MODI's OCR for Visual C++,
so I decided to dig into it and write this sample demo program to
show the basics of MODI's OCR feature. I believe that some people may
be interested in this program, so I am posting it on the CodeProject web site
to share it.
Project Background
This project was first started in Visual C++ 6.0 and then updated to Visual
Studio .NET 2003, and I have included both project files in the demo program. To
run it in Visual C++ 6.0, open MODIVCDemo.dsp manually.
Building the Project and Using the Code
Add the MODI ActiveX control to the project
In Visual C++ 6.0, click "Project->Add To Project->Components and
Controls->Registered ActiveX Control" and select the MODI ActiveX control as
shown below.
Mapping the ActiveX control into the project
Once the MODI ActiveX control is mapped into the project, all of its
wrapper classes are automatically added to the project.
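The wrapper classes used in the OCR code in the next section (IDocument, IImages, IImage, ILayout, IWords, IWord) form a simple containment chain: a document holds images (pages), each image has a layout, and the layout holds the recognized words. As a rough illustration that needs no Office installation, the sketch below models that shape with plain C++ structs. The names Word, Layout, Image, Document, CollectText, and CollectTextDemo are hypothetical stand-ins, not the generated wrapper classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins: only the nesting (document -> images ->
// layout -> words) mirrors the MODI interfaces, nothing else.
struct Word     { std::string text; };
struct Layout   { std::vector<Word> words; };
struct Image    { Layout layout; };
struct Document { std::vector<Image> images; };

// Walk every page and every word, appending each word plus one space.
std::string CollectText(const Document& doc)
{
    std::string out;
    for (std::size_t img = 0; img < doc.images.size(); ++img) {
        const std::vector<Word>& words = doc.images[img].layout.words;
        for (std::size_t i = 0; i < words.size(); ++i) {
            out += words[i].text;
            out += ' ';
        }
    }
    return out;
}

// Small self-check with a two-page document.
void CollectTextDemo()
{
    Word hello; hello.text = "Hello";
    Word world; world.text = "world";
    Word again; again.text = "again";

    Image page1;
    page1.layout.words.push_back(hello);
    page1.layout.words.push_back(world);
    Image page2;
    page2.layout.words.push_back(again);

    Document doc;
    doc.images.push_back(page1);
    doc.images.push_back(page2);

    assert(CollectText(doc) == "Hello world again ");
}
```

The real interfaces return each level through COM out-parameters and reference-counted pointers, so the actual code is longer, but the two nested loops are the same.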
How to OCR in Visual C++
The following sample code shows how to use MODI for OCR.
BOOL CMODIVC6Dlg::bReadOCRByMODIAXCtrl(CString csFilePath,
                                       CString &csText)
{
    // csFilePath is not used here: the text is read from the document
    // currently held by the MODI viewer control (m_MIDOCtrl).
    BOOL bRet = FALSE;
    csText.Empty();

    IUnknown *pUnk = (IUnknown *) m_MIDOCtrl.GetDocument();
    if ( pUnk == NULL )
        return FALSE;

    IDocument *pDoc = NULL;
    HRESULT hr = pUnk->QueryInterface(IID_IDocument, (void**) &pDoc);
    if ( SUCCEEDED(hr) && pDoc != NULL )
    {
        // Run recognition on the whole document.
        hr = pDoc->OCR(miLANG_SYSDEFAULT, 1, 1);
        if ( SUCCEEDED(hr) )
        {
            bRet = TRUE;
            IImages *pImages = NULL;
            long nImages = 0;
            pDoc->get_Images(&pImages);
            if ( pImages != NULL )
                pImages->get_Count(&nImages);
            for ( long img = 0; img < nImages; img++ )
            {
                IImage  *pImage  = NULL;
                ILayout *pLayout = NULL;
                IWords  *pWords  = NULL;
                long nWords = 0;
                pImages->get_Item(img, (IDispatch**) &pImage);
                if ( pImage != NULL )
                    pImage->get_Layout(&pLayout);
                if ( pLayout != NULL )
                    pLayout->get_Words(&pWords);
                if ( pWords != NULL )
                    pWords->get_Count(&nWords);
                for ( long i = 0; i < nWords; i++ )
                {
                    IWord *pWord = NULL;
                    pWords->get_Item(i, (IDispatch**) &pWord);
                    if ( pWord == NULL )
                        continue;
                    BSTR bstrText = NULL;
                    if ( SUCCEEDED(pWord->get_Text(&bstrText)) && bstrText != NULL )
                    {
                        // CString converts the wide-character BSTR itself,
                        // so no fixed-size buffer is needed.
                        csText += CString(bstrText);
                        csText += _T(" ");
                        ::SysFreeString(bstrText);
                    }
                    pWord->Release();   // release every word, not only the last
                }
                if ( pWords  != NULL ) pWords->Release();
                if ( pLayout != NULL ) pLayout->Release();
                if ( pImage  != NULL ) pImage->Release();
            }
            if ( pImages != NULL ) pImages->Release();
        }
        pDoc->Close(0);
        pDoc->Release();
    }
    pUnk->Release();
    return bRet;
}
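Most of the function above is bookkeeping: every interface pointer handed out by a getter has to be released exactly once, on every path. One common way to make that harder to get wrong is a small scope guard that calls Release() in its destructor. The sketch below is a portable illustration of the idea only. FakeUnknown, GetFake, and ReleaseGuard are hypothetical names invented for this example, and FakeUnknown merely imitates COM reference counting so the code compiles without Windows headers. In a real project, ATL's CComPtr serves the same purpose.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a COM object: it only counts references.
struct FakeUnknown
{
    long refs;
    FakeUnknown() : refs(1) {}
    long AddRef()  { return ++refs; }
    long Release() { return --refs; }  // a real COM object deletes itself at 0
};

// Hypothetical getter in the COM style: hands out an AddRef'd pointer
// through an out-parameter, like get_Item() or get_Layout().
void GetFake(FakeUnknown& source, FakeUnknown** out)
{
    source.AddRef();
    *out = &source;
}

// Releases the held pointer when the guard goes out of scope.
template <typename T>
class ReleaseGuard
{
public:
    ReleaseGuard() : m_p(NULL) {}
    ~ReleaseGuard() { Reset(); }

    // Address of the inner pointer, for use as an out-parameter.
    // Any pointer already held is released first.
    T** Out() { Reset(); return &m_p; }

    T* Get() const { return m_p; }
    T* operator->() const { assert(m_p != NULL); return m_p; }

    void Reset()
    {
        if (m_p != NULL) {
            m_p->Release();
            m_p = NULL;
        }
    }

private:
    ReleaseGuard(const ReleaseGuard&);             // not copyable
    ReleaseGuard& operator=(const ReleaseGuard&);
    T* m_p;
};
```

With such a guard, a loop body can declare its pointers as guards and leave early with continue or return without leaking, because each guard releases its pointer when it goes out of scope.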
That's it!
Version History
Version 1: No ActiveX control in the dialog; uses bReadOCRByMODI(...)
Version 2: Adds an ActiveX control to the dialog; uses bReadOCRByMODIAXCtrl(...).
Version 2 is the first demo program posted on CodeProject.