Display Chinese Characters in PDF created by iTextSharp

13 May 2011
The iTextSharp API is a powerful Open Source tool for creating PDF documents on the fly; it has the ability to generate multilingual PDF but it does not come with the default setting. In order to display Unicode characters such as Simplified Chinese, there are some simple but not-well-documented tri

Introduction

I have been using iTextSharp for some time, but until today I had not managed to create a dynamic PDF that displays both English and Chinese characters. I could not find a single write-up that covers the whole process, so I am summarizing the important steps in one post for anyone still struggling with this tricky issue as I was. My case deals only with Simplified Chinese characters, but I believe the same technique applies to other double-byte character sets as well.

There is plenty of documentation on using the iTextSharp API to create dynamic PDFs, so I will focus only on what those web posts generally leave out: getting Chinese (or Asian characters in general) to display correctly alongside English.

Getting the API

Get the latest version of itextsharp.dll from the iTextSharp download page. Once downloaded and unzipped, there is only one file, itextsharp.dll; I added a reference to it in my C# project and that was it. I also downloaded iTextAsian.dll and iTextAsianCmaps.dll (from the Extras link on the same download page) and added references to them, but it turned out that I did not need them for what I wanted to accomplish, so I later removed them from the project.

Getting the Font File

This was the turning point for me: everything worked once I had the correct TTF file for the Simplified Chinese characters I wanted to display. There are many font files for Simplified Chinese (used mostly in mainland China), and after exhaustive Google searches I used a trick in Microsoft Word to pick one. I opened Word, switched the input language to Simplified Chinese, and typed a few Chinese characters into the document. The font dropdown box at the top-left corner of the menu bar changed automatically from the default “Calibri” to “SimSun”, so that settled it: “simsun.ttf” was the file I needed.

I then typed “simsun.ttf” into the Google search box and picked the top result: http://www.gamefront.com/files/16488629/simsun.ttf/. There is no need to install the font. I just placed the file in a fonts folder inside my ASP.NET project (includes\fonts in the code below; you can name it whatever you like), where the ASP.NET code can read it at run time.
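As a minimal sketch of the "put the font where the runtime code can reach it" step, the helper below builds the font path from an application root and fails early if the file is missing. The class name FontLocator, the method name ResolveFontPath, and the includes\fonts layout are assumptions made for illustration only. In an ASP.NET page, the root could come from HttpContext.Current.Request.PhysicalApplicationPath, as in the function shown later in this article.

```csharp
using System;
using System.IO;

public static class FontLocator
{
    // Hypothetical helper (not part of iTextSharp): combines the
    // application root with the relative font folder and verifies
    // that the TTF file is really there.
    public static string ResolveFontPath(string appRoot, string fontFileName)
    {
        // Path.Combine copes with a trailing separator on appRoot, so no
        // manual "\\" concatenation is needed (this overload needs .NET 4+).
        string path = Path.Combine(appRoot, "includes", "fonts", fontFileName);

        // Fail with a clear message here rather than deep inside the PDF code.
        if (!File.Exists(path))
            throw new FileNotFoundException("Font file not found.", path);

        return path;
    }
}
```

A call such as FontLocator.ResolveFontPath(appRoot, "simsun.ttf") then returns a path that can be handed to BaseFont.CreateFont.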

Generating a Mixed-Language PDF

Now it’s show time. In my case, I have an ASP.NET grid view containing both English and Chinese text pulled from a SQL Server 2008 database. It is a calendar of school events and schedules, so it is natural to offer a “Save as PDF” link that lets the user grab a copy of the school calendar to go. Without Chinese characters this would have been quite straightforward: I wrapped the grid view data in a DataTable object and passed it to a pre-written function called ExportPdfTable, which in turn makes the appropriate iTextSharp API calls to do the dirty work. Below is the code that traverses the DataTable object and writes out a PDF document on the fly:

using System;
using System.Data;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
// ...

/// <summary>
/// Print a table to a Pdf file in tabular format,
/// with header image and text on every page
/// </summary>
/// <param name="PdfFileName">This is the physical file name
/// of the Pdf document that you want to write to</param>
/// <param name="dt">Datatable that contain the data to print</param>
/// <param name="DocTitle">This is the title of the
/// document to be printed once on first page</param>
/// <param name="PageHeader">Header text to go
///    with header image on every page</param>
/// <param name="HeaderImgPath">the physical file path of an image
/// that you want to print out on every page header</param>
/// <returns>true if the table was added to the document; otherwise false</returns>
public static bool ExportPdfTable(string PdfFileName, DataTable dt, 
       string DocTitle,string PageHeader,string HeaderImgPath)
{
    Document doc = new Document(); //iTextSharp document

    try
    {
        //bail out before the document is opened: closing an opened
        //document with no content makes iTextSharp throw
        //("The document has no pages")
        if (dt == null || dt.Columns.Count == 0) return false;
        int cols = dt.Columns.Count;
        int rows = dt.Rows.Count;

        string physicalFile = PdfFileName;

        PdfWriter pw = PdfWriter.GetInstance(doc,
                       new FileStream(physicalFile, FileMode.Create));

        //prepare for inserting header on every page, using Pageevent
        //DocumentEevent e = new DocumentEevent(HeaderImgPath, PageHeader);
        //pw.PageEvent = e;

        doc.Open();

        if (PageHeader.Length > 0)
            doc.Add(new Paragraph(new Phrase("")));

        //note: the title is written with the default font, so it
        //should contain English text only
        if (DocTitle.Length > 0)
            doc.Add(new Phrase(DocTitle));

        //prepare the table object
        PdfPTable t = new PdfPTable(cols);

        t.WidthPercentage = 100;

        //cell object
        PdfPCell c;

        //Use BaseFont to load unicode fonts like Simplified Chinese font
        string fontpath = System.Web.HttpContext.Current.Request.PhysicalApplicationPath + 
                          "\\includes\\fonts\\simsun.ttf";

        //"simsun.ttf" file was downloaded from web and placed in the folder
        BaseFont bf = BaseFont.CreateFont(fontpath,BaseFont.IDENTITY_H, 
                                          BaseFont.EMBEDDED);
                 
        //create new font based on BaseFont
        Font fontContent = new Font(bf, 11);
        Font fontHeader = new Font(bf, 12);

        //write header
        for (int j = 0; j < cols; j++)
        {
            Phrase pr=new Phrase((dt.Columns[j].Caption != null && 
              dt.Columns[j].Caption.Length > 0) ? dt.Columns[j].Caption : 
              dt.Columns[j].ColumnName,fontHeader);

            c = new PdfPCell(pr);
            c.PaddingBottom = 5f;
            c.PaddingTop = 5f;
            c.PaddingLeft = 8f;
            c.PaddingRight = 8f;
            t.AddCell(c);
        }
        //write table content
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {
                //c = new Cell(dt.Rows[i][j].ToString());
                //c.Header = false;
                c = new PdfPCell(new Phrase(dt.Rows[i][j].ToString(),fontContent));
                c.PaddingBottom = 5f;
                c.PaddingTop = 5f;
                c.PaddingLeft = 8f;
                c.PaddingRight = 8f;

                t.AddCell(c);
            }
        }
        return doc.Add(t);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
        return false;
    }
    finally
    {
        doc.Close();
    }
}

The four lines that are critical for allowing Chinese characters to show up in the dynamically generated PDF are highlighted here:

string fontpath = 
  System.Web.HttpContext.Current.Request.PhysicalApplicationPath + 
  "\\includes\\fonts\\simsun.ttf";
//"simsun.ttf" file was downloaded from web and placed in the folder
BaseFont bf = BaseFont.CreateFont(fontpath,
                 BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        
//create new font based on BaseFont
Font fontContent = new Font(bf, 11);
Font fontHeader = new Font(bf, 12);

If the default font were used instead of the fonts created by these four lines, the function would still output the same good-looking PDF file, except that all the Chinese characters would disappear completely. The key to creating a correct base font in this case is loading the correct TTF file and setting the encoding to BaseFont.IDENTITY_H. I tried “UTF-8” and some other values, but none of them seemed to work. I got this tip from a very brief but helpful post at http://stackoverflow.com/questions/1727765/itextsharp-international-text, and I am grateful to its author, StewBob, and to stackoverflow.com for hosting the post.
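To try the font settings in isolation, outside of the grid view code, a stripped-down sketch like the following may help. It writes a single paragraph of mixed text using the same BaseFont.IDENTITY_H and BaseFont.EMBEDDED settings. The class name MixedTextPdf, the method name Write, and the file paths in the usage note are assumptions for illustration, not part of iTextSharp or of my project.

```csharp
using System;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class MixedTextPdf
{
    // Hypothetical helper: writes one paragraph of mixed English/Chinese
    // text to pdfPath, using the TrueType font found at fontPath.
    public static void Write(string fontPath, string pdfPath, string text)
    {
        if (!File.Exists(fontPath))
            throw new FileNotFoundException("Font file not found.", fontPath);

        // Same settings as in ExportPdfTable: Identity-H encoding, font embedded.
        BaseFont bf = BaseFont.CreateFont(fontPath, BaseFont.IDENTITY_H,
                                          BaseFont.EMBEDDED);
        Font font = new Font(bf, 12);

        Document doc = new Document();
        using (FileStream fs = new FileStream(pdfPath, FileMode.Create))
        {
            PdfWriter.GetInstance(doc, fs);
            doc.Open();

            // Every Phrase or Paragraph that may contain Chinese text
            // has to be given the font explicitly.
            doc.Add(new Paragraph(text, font));

            doc.Close();
        }
    }
}
```

A call might look like MixedTextPdf.Write(@"C:\site\includes\fonts\simsun.ttf", @"C:\temp\hello.pdf", "Hello \u4F60\u597D"), where both paths are placeholders. The \u escapes spell the two Chinese characters for "hello", so the test does not depend on how the source file itself is encoded.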

Deployment

Everything ran perfectly on my local development machine: the PDF file was created and the Chinese characters showed up. But after the project was deployed to an external hosting site, a "That assembly does not allow partially trusted callers." error was thrown. What was going on? This was actually not a new problem and had nothing to do with the Chinese font. As it turned out, I could not simply upload itextsharp.dll to the hosting server; I needed to download the iTextSharp source files and compile the DLL with an additional security attribute. I took these extra steps to fix the error:

  1. Downloaded the iTextSharp source files (version 5.xx) from http://sourceforge.net/projects/itextsharp/files/itextsharp/.
  2. Launched the iTextSharp solution, opened AssemblyInfo.cs, and added the following lines to the end of the file:

    [assembly: AssemblyDelaySign(false)]
    [assembly: AssemblyKeyFile("")]
    [assembly: AssemblyKeyName("")]
    //added this attribute to allow the dll to be callable on a hosting website
    //(AllowPartiallyTrustedCallers lives in the System.Security namespace)
    [assembly: AllowPartiallyTrustedCallers()]

  3. Compiled the DLL and copied it to the hosting site's bin folder, and done.

Summary

Wrapping it up, the four key steps are:

  1. Download the correct font file and place it in a folder inside the web project.
  2. Create a BaseFont from the font file, set the encoding to BaseFont.IDENTITY_H, and embed the font in the PDF document (BaseFont.EMBEDDED).
  3. Create new fonts based on the BaseFont and use them in Paragraph or Phrase constructors.
  4. When deploying itextsharp.dll to a remote hosting server, make sure the DLL has been re-compiled with the "AllowPartiallyTrustedCallers" attribute set.

History

  • First submitted on 5/12/2011.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
