Introduction
PDFRasterizer.NET is a component for rendering PDF documents and is written entirely in C#. It has no dependencies other than the .NET Framework and is packaged as a single assembly, which keeps deployment simple. The component draws to any System.Drawing.Graphics object. Because a Graphics object may represent a raster image, an Enhanced Metafile, a printer or the surface of your Windows form or control, this is the most generic and powerful render target that we could have chosen. It does not introduce any vendor-specific image classes or viewing controls that bring their own idiosyncrasies and lock you in.
This article describes how to use the PDFRasterizer.NET component to:
- convert PDF documents to raster images such as BMP, GIF, PNG, etc.
- display PDF documents in your WinForms application (with and without an EMF)
- programmatically print PDF documents
An Overview of the PDFRasterizer.NET Classes
The PDFRasterizer.NET object model is very simple and consists of just 4 classes: Document, Pages, Page and ExtractedImageInfo. ExtractedImageInfo is used to extract images from the PDF document. See method Page.ExtractImages() in the class reference for details.
Document
This is the top-level class that represents the PDF document that you wish to render. You construct it like this:
using ( FileStream file = new FileStream( "some.pdf", FileMode.Open,
    FileAccess.Read ) )
{
    Document document = new Document( new BinaryReader( file ) );
    // ... now you can enumerate the pages and draw them
}
An overload is available to provide a password. Document has read-only properties to retrieve document information such as the Author, Subject, etc. The property Pages returns the collection of pages inside this document.
Pages
This is the collection of pages inside the PDF document as returned by Document.Pages. It exposes the number of pages (Count) and lets you retrieve a page by index.
Page
Not surprisingly, this class represents a single PDF page and is the most interesting class. The following code demonstrates how to enumerate the pages:
Document document;
// ... see previous code for how to obtain a document object
for ( int pageIndex = 0; pageIndex < document.Pages.Count; pageIndex++ )
{
    Page page = document.Pages[ pageIndex ];
    // ... see further for how to draw the page
}
Once you have a Page object, you can tell it to draw to a System.Drawing.Graphics object:
Page page;                        // obtained from document.Pages as shown above
System.Drawing.Graphics graphics; // created for the render target of your choice
page.Draw( graphics );
That looks pretty simple, doesn't it? The key part is how you create the Graphics object. This depends on the type of render target that you choose. The next sections describe different ways to instantiate a Graphics object.
Convert PDF to a Raster Image (BMP, PNG, TIFF, etc.)
See the ConvertToImage_cs and ConvertToImage_vb samples that are included with the PDFRasterizer.NET application for full source code.
PDF-to-image converter sample (included with the PDFRasterizer.NET installation)
The conversion of a PDF page to a raster image involves the following steps:
- Create a Bitmap object at the correct resolution
- Wrap a Graphics object around the bitmap
- Apply a scale transform to the Graphics object to account for the resolution
- Draw the Page to the Graphics object
- Save the Bitmap using any GDI+ supported image format
using ( FileStream file = new FileStream( "test.pdf", FileMode.Open,
    FileAccess.Read ) )
{
    Document document = new Document( new BinaryReader( file ) );
    Page page = document.Pages[0];
    float dpi = 300;
    float scale = dpi / 72f; // PDF units are points: 72 per inch
    using ( Bitmap bitmap = new Bitmap( (int) ( scale * page.Width ),
        (int) ( scale * page.Height ) ) )
    {
        // dispose the Graphics object before the bitmap is saved
        using ( Graphics graphics = Graphics.FromImage( bitmap ) )
        {
            graphics.ScaleTransform( scale, scale );
            page.Draw( graphics );
        }
        bitmap.Save( "test.bmp", ImageFormat.Bmp );
    }
}
The scale transformation deserves some explanation. Suppose you want to render a PDF page that has page size Letter. This format measures 8.5 x 11 inches (width x height), which corresponds to 612 x 792 points because an inch has 72 points. If you draw this page at a resolution of 72 dots per inch (dpi), the bitmap has 612 (8.5 x 72) columns and 792 (11 x 72) rows of pixels. Now suppose that you want to double the resolution to 144 dpi. The bitmap should then have 8.5 x 144 = 1224 columns and 11 x 144 = 1584 rows of pixels. Hence the scale that is applied to the width and height arguments of the Bitmap constructor.
When Page.Draw is executed, PDFRasterizer.NET executes a sequence of method calls on the Graphics object. It does this in its own coordinate system in which a unit corresponds to a point (1/72 inch). To make sure that coordinate (612, 792) in the PDF coordinate space ends up at (1224, 1584) in the 144 dpi bitmap coordinate space, a scale transform must be applied to the Graphics object before the PDF page draws to it. Hence the graphics.ScaleTransform() call before drawing the page.
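The arithmetic above can be sketched as a small helper. This is only an illustration: the class RasterSize and its members are hypothetical names and not part of PDFRasterizer.NET. It multiplies before dividing so that exact multiples of 72 are not lost to floating-point rounding, and it truncates like the (int) cast in the conversion code.

```csharp
using System;

// Hypothetical helper (not part of PDFRasterizer.NET): converts a page size
// in points (1/72 inch) to a bitmap size in pixels at a given resolution.
public static class RasterSize
{
    public const double PointsPerInch = 72.0;

    // Scale factor to pass to Graphics.ScaleTransform for the given DPI.
    public static double ScaleFor( double dpi )
    {
        return dpi / PointsPerInch;
    }

    // Pixel dimension for a page dimension given in points.
    // Multiplies first, then divides, and truncates to an int.
    public static int ToPixels( double points, double dpi )
    {
        return (int) ( points * dpi / PointsPerInch );
    }
}
```

For a Letter page (612 x 792 points), RasterSize.ToPixels( 612, 144 ) and RasterSize.ToPixels( 792, 144 ) give the 1224 x 1584 pixel bitmap discussed above, and RasterSize.ScaleFor( 144 ) gives the scale factor 2.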
Because you can draw a PDF page to any Graphics object, you can also draw it on the surface of your Windows control. The most obvious place to do this is in the Control.Paint event handler. Typical code to draw a PDF page to the surface of your control looks like this:
Page page;
Panel viewerPanel;
...
private void viewerPanel_Paint(object sender,
    System.Windows.Forms.PaintEventArgs e)
{
    e.Graphics.Clear( Color.White );
    if ( null != page )
    {
        page.Draw( e.Graphics );
        // outline the page; Pens.Gray is a shared pen that needs no disposal
        e.Graphics.DrawRectangle( Pens.Gray, 0, 0,
            (float) page.Width, (float) page.Height );
    }
}
Note that normally you would call e.Graphics.TranslateTransform() and e.Graphics.ScaleTransform() before calling page.Draw() in order to position and size the PDF page.
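One way to obtain the arguments for these two calls is sketched below. The helper PageFit is a hypothetical name, not part of PDFRasterizer.NET, and it assumes that you want the whole page visible and centered inside the control.

```csharp
using System;

// Hypothetical helper (not part of PDFRasterizer.NET): computes the uniform
// scale and the translation that center a page inside a viewing area.
public static class PageFit
{
    // pageWidth/pageHeight are in points; viewWidth/viewHeight are in pixels.
    // offsetX/offsetY are the pixel position of the page's top-left corner.
    public static void Compute( double pageWidth, double pageHeight,
        double viewWidth, double viewHeight,
        out float scale, out float offsetX, out float offsetY )
    {
        // Use the smaller ratio so that the whole page stays visible.
        double s = Math.Min( viewWidth / pageWidth, viewHeight / pageHeight );
        scale = (float) s;
        offsetX = (float) ( ( viewWidth - s * pageWidth ) / 2.0 );
        offsetY = (float) ( ( viewHeight - s * pageHeight ) / 2.0 );
    }
}
```

In the Paint event handler you would then call e.Graphics.TranslateTransform( offsetX, offsetY ) followed by e.Graphics.ScaleTransform( scale, scale ) before page.Draw( e.Graphics ). With the default prepend order, page coordinates are scaled first and then shifted by the pixel offsets.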
The drawback of this simple approach is that the PDF page is drawn each time the Paint event is fired. For complex PDF pages this can be very time-consuming. A better way is to render the PDF page to an Enhanced Metafile once and then play that EMF file repeatedly in your Paint event handler. This is discussed next.
Use EMF to Record and Play a PDF Page
PDF viewer sample (included with the PDFRasterizer.NET installation)
See the ViewPDF_cs and ViewPDF_vb samples that are included with the PDFRasterizer.NET application for full source code.
An enhanced metafile (EMF) can be thought of as a 'recording' of method calls to a Graphics object. Once recorded, you can play the method calls any number of times to a Graphics object. As opposed to a raster format such as BMP or PNG, an EMF preserves the vector graphics. This makes it an excellent format for repeatedly drawing a PDF page at different zoom levels. You simply apply a ScaleTransform according to the zoom level and then play the EMF.
Create an EMF file
The following method creates a System.Drawing.Imaging.Metafile from a Page object.
private Metafile createMetafile( Page page )
{
    Metafile metafile = null;
    using ( Graphics graphics = this.CreateGraphics() )
    {
        System.IntPtr hdc = graphics.GetHdc();
        metafile = new Metafile( hdc, new Rectangle( 0, 0,
            (int) page.Width, (int) page.Height ), MetafileFrameUnit.Point );
        graphics.ReleaseHdc( hdc );
    }
    using ( Graphics metafileGraphics = Graphics.FromImage( metafile ) )
    {
        metafileGraphics.SmoothingMode = SmoothingMode.AntiAlias;
        page.Draw( metafileGraphics );
    }
    return metafile;
}
Play an EMF file
The createMetafile method is called from the Paint event handler when it requires a metafile to draw the corresponding page. The metafile is created lazily, the first time it is needed, and then cached. The following code shows the Paint event handler. The metafiles variable is a member of type Metafile[] that has been initialized to an array of null references when the PDF document was opened.
Document document;
Metafile[] metafiles;
...
private void viewerPanel_Paint(object sender,
    System.Windows.Forms.PaintEventArgs e)
{
    if ( null == document ) return;
    e.Graphics.Clear( Color.White );
    // create the metafile on first use and cache it
    if ( null == metafiles[ selectedPage ] )
    {
        metafiles[ selectedPage ] = createMetafile(
            document.Pages[ selectedPage ] );
    }
    Page page = document.Pages[ selectedPage ];
    // scale the page so that it fits inside the panel
    float scale = (float) Math.Min( viewerPanel.Width / page.Width,
        viewerPanel.Height / page.Height );
    e.Graphics.ScaleTransform( scale, scale );
    e.Graphics.EnumerateMetafile( metafiles[ selectedPage ],
        new Point( 0, 0 ),
        new Graphics.EnumerateMetafileProc( MetafileCallback ) );
}
The MetafileCallback function is boilerplate code and not relevant to the working of PDFRasterizer.NET. See the sample projects ViewPDF_EMF_cs and ViewPDF_EMF_vb for its implementation.
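For orientation only, a callback of this kind commonly looks like the sketch below. It is not taken from the sample projects, whose implementation may differ. The class MetafilePlayer and the helper CopyRecordData are hypothetical names; the callback signature is that of the Graphics.EnumerateMetafileProc delegate, and Metafile.PlayRecord is the standard System.Drawing method for replaying a single record.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

// Sketch of a typical record-playing callback (hypothetical class name).
public class MetafilePlayer
{
    private readonly Metafile metafile;

    public MetafilePlayer( Metafile metafile )
    {
        this.metafile = metafile;
    }

    // Copies the unmanaged record data into a managed array (null if none).
    public static byte[] CopyRecordData( IntPtr data, int dataSize )
    {
        if ( data == IntPtr.Zero ) return null;
        byte[] buffer = new byte[ dataSize ];
        Marshal.Copy( data, buffer, 0, dataSize );
        return buffer;
    }

    // Matches the Graphics.EnumerateMetafileProc delegate signature.
    public bool MetafileCallback( EmfPlusRecordType recordType, int flags,
        int dataSize, IntPtr data, PlayRecordCallback callbackData )
    {
        metafile.PlayRecord( recordType, flags, dataSize,
            CopyRecordData( data, dataSize ) );
        return true; // continue with the next record
    }

    // Plays all records of the metafile to the given Graphics object.
    public void Play( Graphics graphics )
    {
        graphics.EnumerateMetafile( metafile, new Point( 0, 0 ),
            new Graphics.EnumerateMetafileProc( MetafileCallback ) );
    }
}
```

The callback is invoked once per recorded drawing call; returning true tells GDI+ to continue the enumeration.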
Print a PDF Page
See the PrintPDF_cs and PrintPDF_vb samples that are included with the PDFRasterizer.NET application for full source code.
Print PDF sample (included with the PDFRasterizer.NET installation)
Printing involves the following steps:
- create a PrintDocument instance
- set the PrinterSettings and DefaultPageSettings properties of the PrintDocument (e.g. by showing the PageSetupDialog)
- assign an event handler to the PrintDocument.PrintPage event
- call the PrintDocument.Print method to start printing
- handle the PrintDocument.PrintPage event by drawing to the provided Graphics object
Because in the last step the application is provided with a Graphics object that draws to the printer, we can pass this Graphics object to the Page.Draw method. Note that this preserves all vector graphics on the PDF page. Scan conversion is done by the printer.
The following method is called when the user selects 'Print' from the viewer application. The variable pagesList is a ListBox that displays all the pages. The variable selectedPages is an IEnumerator that will be used to print each selected page from top to bottom.
ListBox pagesList;
IEnumerator selectedPages;
...
void print()
{
    if ( null == document ) return;
    selectedPages = pagesList.SelectedIndices.GetEnumerator();
    selectedPages.Reset();
    if ( selectedPages.MoveNext() )
    {
        PrintDocument printDocument = new PrintDocument();
        printDocument.DocumentName = document.Title;
        PageSetupDialog setupDialog = new PageSetupDialog();
        setupDialog.Document = printDocument;
        if ( DialogResult.OK == setupDialog.ShowDialog() )
        {
            printDocument.DefaultPageSettings = setupDialog.PageSettings;
            printDocument.PrinterSettings = setupDialog.PrinterSettings;
            printDocument.PrintPage += new PrintPageEventHandler( printPage );
            printDocument.Print();
        }
    }
}
After the printDocument.Print() method is called, the PrintPage event handler will be called for each page to print. This event handler is implemented as follows:
void printPage( object sender, PrintPageEventArgs e )
{
    // draw in points (1/72 inch), the unit of the PDF coordinate space
    e.Graphics.PageUnit = GraphicsUnit.Point;
    if ( null != selectedPages.Current )
    {
        int pageIndex = (int) selectedPages.Current;
        Page page = document.Pages[ pageIndex ];
        page.Draw( e.Graphics );
    }
    // advance to the next selected page, if any
    e.HasMorePages = selectedPages.MoveNext();
}
The selectedPages enumerator (of type IEnumerator) returns the next index. This index is used to select the corresponding Page from the Pages collection. Next, this page draws to the provided Graphics object that represents the printer. Finally, the event handler reports back whether there are more pages. This is done by trying to move to the next selected page and returning the result.
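The hand-off between print() and printPage() can be tried out without a printer. The following simulation is only a sketch: the class PagingSimulation and its members are hypothetical, and it records page indices in a list instead of drawing pages.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical simulation (no printer involved) of the paging protocol:
// each call handles the current index and reports whether more remain.
public class PagingSimulation
{
    private readonly IEnumerator selectedPages;
    public readonly List<int> Printed = new List<int>();

    public PagingSimulation( IEnumerable selectedIndices )
    {
        selectedPages = selectedIndices.GetEnumerator();
    }

    // Mirrors print(): only start when at least one page is selected.
    public bool Start()
    {
        return selectedPages.MoveNext();
    }

    // Mirrors printPage(): returns the value assigned to e.HasMorePages.
    public bool PrintPage()
    {
        Printed.Add( (int) selectedPages.Current );
        return selectedPages.MoveNext();
    }

    // Plays the role of PrintDocument.Print(): keeps asking for pages
    // until the handler reports that there are no more.
    public void Run()
    {
        if ( !Start() ) return;
        while ( PrintPage() ) { }
    }
}
```

Running the simulation over the indices 0, 2 and 5 records exactly those three indices in order; an empty selection records nothing, which corresponds to print() never creating a PrintDocument.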
Please Send us PDFs that do not Render Correctly
PDF is a very rich format with many different ways to accomplish the same output. In addition there are many producers that each have their own interpretation of the specification. Consequently, you may encounter documents that we do not render correctly. Please send them to support@tallcomponents.com.
TallComponents
You can visit us at http://www.tallcomponents.com?ref=34. Here you can download evaluations for all our PDF-related .NET components and get support.