View PDF files in C# using the Xpdf and MuPDF libraries, print PostScript.

Antonio Sandoval


26 Nov 2010, GPL3


A C# wrapper class written in C++/CLI and a sample C# implementation that renders PDF files.

Download code from SVN (Code Google)

Introduction

Xpdf is an open source library released under the GPL. Its authors also offer an ActiveX control under a commercial license, but I wrote this wrapper library to render PDF files in C# some time ago, before I knew about that commercial control.

Background

The basic idea is to create a preview of PDF files in C#. After searching in many places on the internet, I found this wonderful library. The only problem is that the library uses Xlib, and there is no Xlib available for Windows. Fortunately, Xpdf can render the PDF page into a Win32 DC.

Writing the wrapper

C++/CLI can mix managed and unmanaged code thanks to the IJW (It Just Works) technology, so I thought I could link the xpdf lib directly into the wrapper. The problem is that xpdf declares some classes whose names are also declared in .NET. The solution was to compile a separate C++ project with a class that includes only the files needed for the interop.
This library (AFPDFLib) contains a simple class that works as a proxy between C++ and C++/CLI, keeping the xpdf objects on the unmanaged heap.
The C# wrapper is linked statically to AFPDFLib and includes only these headers:
AFPDFDocInterop.h
OutlineItemInterop.h
SearchResultInterop.h
The classes:
AFPDFDoc -> Implements the methods that need xpdf.
AFPDFDocInterop -> Exposes the methods to be wrapped for C#.
PDFWrapper -> The wrapped methods.
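
The proxy arrangement described above is essentially the opaque-pointer idiom. The sketch below is plain C++ and is not the actual AFPDFLib source: `NativeDoc`, `DocInterop`, and their methods are hypothetical stand-ins for an xpdf class and for AFPDFDocInterop. It only illustrates the idea that the part the wrapper sees forward-declares the native type, so none of the library's own declarations reach the managed side.

```cpp
#include <cassert>
#include <string>

// ---- What the wrapper would see (the "interop header" part) ----
class NativeDoc;                 // forward declaration only; no library headers needed

class DocInterop {
public:
    DocInterop();
    ~DocInterop();
    DocInterop(const DocInterop&) = delete;            // one owner per native object
    DocInterop& operator=(const DocInterop&) = delete;

    bool load(const char* fileName);
    int  pageCount() const;

private:
    NativeDoc* impl_;            // the native object lives on the unmanaged heap
};

// ---- Implementation side (compiled together with the native library) ----
// Hypothetical stand-in for a class that comes from the native library.
class NativeDoc {
public:
    bool open(const std::string& name) {
        name_ = name;
        pages_ = name.empty() ? 0 : 3;   // dummy page count, for illustration only
        return !name.empty();
    }
    int pages() const { return pages_; }
private:
    std::string name_;
    int pages_ = 0;
};

DocInterop::DocInterop() : impl_(new NativeDoc()) {}
DocInterop::~DocInterop() { delete impl_; }

bool DocInterop::load(const char* fileName) {
    return fileName != nullptr && impl_->open(fileName);
}

int DocInterop::pageCount() const { return impl_->pages(); }
```

In a real project the first part would sit in its own header and the second part in a source file built with the native library, so only the proxy's declaration is ever included by the C++/CLI code.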

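Marshaling strings:

// Copy the managed string into unmanaged memory as an ANSI C string.
IntPtr ptr = Marshal::StringToCoTaskMemAnsi(fileName);
char *singleByte = (char*)ptr.ToPointer();
try {
    // Pass singleByte to the native xpdf code here.
} finally {
    // Always release the unmanaged copy, even if the native call throws.
    Marshal::FreeCoTaskMem(ptr);
}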

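To release resources, it is necessary to implement IDisposable. In C++/CLI the destructor (~PDFWrapper) is compiled into Dispose, and the finalizer (!PDFWrapper) acts as a fallback when Dispose is never called:

~PDFWrapper()   // destructor: becomes IDisposable::Dispose
{
    this->!PDFWrapper();
}
!PDFWrapper()   // finalizer: releases the native document
{
    _pdfDoc->Dispose();
}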

Using the code

The file xpdfWin-Interop.sln includes all the necessary files. You can also download the latest version from http://www.foolabs.com/xpdf/ and recompile it without the files that require Xlib.

The project build order is as follows: freetype, xpdf, AFPDFLib, PDFLibNet. Once PDFLibNet is compiled, it can be used from C# code:

OpenFileDialog dlg = new OpenFileDialog();
dlg.Filter = "Portable Document Format (*.pdf)|*.pdf";
if (dlg.ShowDialog() == DialogResult.OK)
{
    _pdfDoc = new PDFLibNet.PDFWrapper();
    _pdfDoc.LoadPDF(dlg.FileName);
    _pdfDoc.CurrentPage = 1;

    // The wrapper needs a window handle (HWND), so use a temporary PictureBox.
    PictureBox pic = new PictureBox();
    pic.Width = 800;
    pic.Height = 1024;
    _pdfDoc.FitToWidth(pic.Handle);
    pic.Height = _pdfDoc.PageHeight;
    _pdfDoc.RenderPage(pic.Handle);

    // Draw the rendered page into an off-screen bitmap.
    Bitmap _backbuffer = new Bitmap(_pdfDoc.PageWidth, _pdfDoc.PageHeight);
    using (Graphics g = Graphics.FromImage(_backbuffer))
    {
        IntPtr hdc = g.GetHdc();
        _pdfDoc.RenderHDC(hdc);
        g.ReleaseHdc(hdc);
    }
    pic.Image = _backbuffer;
}

It is necessary to create a PictureBox because the class only implements a method that accepts an HWND: at first, I was trying to implement scrolling in the same control where the PDF is rendered. In the included sample, scrolling is handled by a Panel container.
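
The FitToWidth call in the listing above is what allows the sample to take the control height from PageHeight afterwards. The library's internal calculation is not shown in this article; the sketch below only illustrates the usual arithmetic, assuming page sizes are expressed in PDF points (1/72 inch). `FitResult` and `fitToWidth` are hypothetical names, not part of PDFLibNet.

```cpp
#include <cassert>
#include <cmath>

struct FitResult {
    double dpi;       // resolution at which the page would be rendered
    int    widthPx;   // resulting bitmap width in pixels
    int    heightPx;  // resulting bitmap height in pixels
};

// pageWidthPt / pageHeightPt: page size in points (72 points per inch).
// targetWidthPx: width of the control that the page should fill.
FitResult fitToWidth(double pageWidthPt, double pageHeightPt, int targetWidthPx) {
    FitResult r;
    r.dpi      = 72.0 * targetWidthPx / pageWidthPt;
    r.widthPx  = targetWidthPx;
    r.heightPx = static_cast<int>(std::lround(pageHeightPt * r.dpi / 72.0));
    return r;
}
```

For a US Letter page (612 x 792 points) and an 800-pixel-wide control, this gives roughly 94 DPI and a height of 1035 pixels.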

Xpdf can export the PDF to a PostScript file. For printing, this is the best option if you have a PostScript printer:

PSOutputDev *psOut = new PSOutputDev((char *)fileName, m_PDFDoc->getXRef(),
                                     m_PDFDoc->getCatalog(), fromPage, toPage, psModePS);
if (psOut->isOk()) {
    m_PDFDoc->displayPages(psOut, fromPage, toPage, PRINT_DPI, PRINT_DPI, 0,
                           gTrue, globalParams->getPSCrop(), gTrue);
}
delete psOut;

The file must be sent to the printer in RAW format (see http://support.microsoft.com/kb/322091).

JPG Export

For asynchronous export:

_doc.ExportJpg(filename,
    1,        //From page
    1,        //To page
    150,      //Resolution in DPI
    90);      //Jpg quality

If you need a synchronous operation, you can specify a wait time:

_doc.ExportJpg(filename,
    1,        //From page
    1,        //To page
    150,      //Resolution in DPI
    90,       //Jpg quality
    -1);      //Time to wait, -1 to wait indefinitely.

If the file name does not contain a %d token (for the page number), the procedure replaces .jpg with -page%d.jpg. PDFWrapper exposes two events, ExportJpgProgress and ExportJpgFinished. Both events are raised from the exporting thread, so it is necessary to marshal the call to the UI thread using Invoke; see frmExportJpg for a sample.
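
The naming rule can be sketched as follows. This is not the library's code: `jpgNamePattern` and `jpgNameForPage` are hypothetical helpers, the extension check is case-sensitive, and the handling of names that do not end in .jpg is an assumption, since the article does not describe that case.

```cpp
#include <cassert>
#include <string>

// Returns the pattern used to build the per-page file names.
std::string jpgNamePattern(const std::string& fileName) {
    if (fileName.find("%d") != std::string::npos)
        return fileName;   // the caller already chose where the page number goes

    const std::string ext = ".jpg";
    if (fileName.size() >= ext.size() &&
        fileName.compare(fileName.size() - ext.size(), ext.size(), ext) == 0) {
        // Replace the trailing ".jpg" with "-page%d.jpg", as the article describes.
        return fileName.substr(0, fileName.size() - ext.size()) + "-page%d.jpg";
    }

    // Assumption: this case is not covered by the article; the sketch appends the suffix.
    return fileName + "-page%d.jpg";
}

// Substitutes the page number for the first %d token of the pattern.
std::string jpgNameForPage(const std::string& fileName, int page) {
    const std::string pattern = jpgNamePattern(fileName);
    const std::string::size_type pos = pattern.find("%d");   // always present here
    return pattern.substr(0, pos) + std::to_string(page) + pattern.substr(pos + 2);
}
```

The page number is substituted by hand rather than by passing the name to a printf-style function, so a file name that happens to contain other format specifiers is never interpreted as a format string.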

History:

06\July\2009

Fully deployed solution.
Updated to xpdf version 3.0.2.
FreeType updated to 2.3.1.
When clicking a bookmark or a search result, the page scrolls to the correct position.
PostScript implemented.
Now reads the document Title and Author.

08\July\2009

Some memory leaks corrected
Prerender next page in new thread
Cache of pages
Mouse Scrolling
Mouse Navigation
Load links from page (LinkURI, LinkGoTo)

11\July\2009

Now uses DIB sections, which fixes the problem with the zoom.
Added the PageViewer control, which renders only the visible area.
Open password protected files.
Export to txt
Export to jpg

12\July\2009

Added support for Unicode in Bookmarks, title, subject, keywords...
Added support for named destinations

13\July\2009

Fixed some bugs.
Added support for unicode search

20\July\2009

Multithread jpg export
Fixed other bugs

07\Nov\2009

Added MuPDF as second renderer.

26\Nov\2010

Added support for conversion to SWF using swftools (http://www.swftools.org/)
Added support for conversion to HTML using pdftohtml (http://pdftohtml.sourceforge.net/). I modified HtmlOutputDev because it originally used Ghostscript to render the background; now it uses SplashOutputDev.

Known issues: MuPDF has some problems with transparency, but it is faster than xpdf. There are a couple of memory leaks.

IMPORTANT:

MuPDF uses recursion to analyze the document tree, so it is necessary to increase the stack size to at least 4 MB to avoid problems with some complex files (for C# and VB.NET executables, use editbin with its /STACK option). Sooner or later the recursion causes a stack overflow if the tree is big enough, so until that is fixed, this is the most important issue.
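
To illustrate why a deep tree exhausts the stack, and what a recursion-free traversal looks like, here is a minimal sketch. It is not MuPDF code and not the fix planned for this project: `Tree`, `countRecursive`, `countIterative`, and `makeChain` are hypothetical names, and the tree is a plain adjacency list rather than a PDF object tree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A tree stored as an adjacency list: tree[i] holds the child indices of node i.
using Tree = std::vector<std::vector<int>>;

// Recursive count: consumes one call-stack frame per tree level.
std::size_t countRecursive(const Tree& t, int node) {
    std::size_t n = 1;
    for (int c : t[node]) n += countRecursive(t, c);
    return n;
}

// Iterative count: pending nodes are kept in a heap-allocated vector instead,
// so the depth of the tree no longer affects the call stack.
std::size_t countIterative(const Tree& t, int root) {
    std::size_t n = 0;
    std::vector<int> pending{root};
    while (!pending.empty()) {
        int node = pending.back();
        pending.pop_back();
        ++n;
        for (int c : t[node]) pending.push_back(c);
    }
    return n;
}

// Builds a degenerate tree: a single chain of `depth` nodes.
Tree makeChain(int depth) {
    Tree t(depth);
    for (int i = 0; i + 1 < depth; ++i) t[i].push_back(i + 1);
    return t;
}
```

Both functions agree on small trees, but only the iterative one is safe on a very deep chain, because its memory use grows on the heap rather than on the thread's stack.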

To Do:

- Apply the latest xpdf patch.
- Show multiple pages in the viewer.
- Improve the user interface.
- Implement LoadFromStream for MuPDF.

Some functionality that can be extracted from xpdf is still missing:

- Enable selection, image extraction and instant snapshot.
- Print on non-PostScript printers.

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)