
Submitting and Processing PDF Form Data

3 Jan 2016
This article explains the basics of PDF forms. In particular it provides sample code for processing PDF form data in an ASP.NET MVC application.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

If you have to choose between an HTML form and a PDF form - or maybe you are required to support both - then it is good to know about the differences between these two forms and what they have in common. The scope of this article is restricted to classic PDF forms - as opposed to XFA forms.

HTML form

An HTML form looks like this:

<h1>I want pizza!</h1>
<form method="get" action="order">
   <p>Choose a size:</p>
   <input type="radio" name="size" value="small">small<br>
   <input type="radio" name="size" value="medium">medium<br>
   <input type="radio" name="size" value="large">large
   <p>Choose ingredients:</p>
   <input type="checkbox" name="tomatoes" value="tomatoes">tomatoes<br>
   <input type="checkbox" name="onions" value="onions">onions<br>
   <input type="checkbox" name="tuna" value="tuna">tuna<br>
   <input type="checkbox" name="cheese" value="cheese">cheese
   <p>My name:</p>
   <input type="text" name="name" /><input type="submit" value="order" />
</form>

Both the field itself and how it looks are represented by the same element, namely the input element. PDF, on the other hand, completely separates the notion of a field from its representation.
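For illustration, here is a minimal sketch of what the browser sends when this form is submitted. It assumes a hypothetical selection (a large pizza with tomatoes and cheese, ordered by "Ada Lovelace"), and the class and method names are made up for this example. Because the form uses method="get", the name/value pairs are URL-encoded and appended to the action URL; unchecked checkboxes are left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class FormEncoding
{
    // Serializes name/value pairs the way a browser does for a
    // URL-encoded form submission (spaces become '+').
    public static string ToQueryString(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }

    public static void Main()
    {
        // Hypothetical selection; onions and tuna are unchecked and therefore absent.
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("size", "large"),
            new KeyValuePair<string, string>("tomatoes", "tomatoes"),
            new KeyValuePair<string, string>("cheese", "cheese"),
            new KeyValuePair<string, string>("name", "Ada Lovelace")
        };

        // prints: order?size=large&tomatoes=tomatoes&cheese=cheese&name=Ada+Lovelace
        Console.WriteLine("order?" + ToQueryString(fields));
    }
}
```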

PDF form

PDF fields are defined at the document level, and each field may have zero or more visual representations called widgets. Each widget is associated with a page. The following diagram shows this structure and how document, page, field and widget are related:

Throughout this article I will use the following PDF form:

I used Notepad to create the text part and then printed it to PDF. Next, I used Adobe Acrobat Pro DC to add the form elements. The size options "small," "medium" and "large" are radio buttons that share the same group name, "size." The ingredients are checkboxes with corresponding names. Finally, there is a textbox named "name" and a button named "order."
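To make the split between fields and widgets concrete, the pizza form described above could be modeled roughly as follows. This is only a sketch: the types PageModel, WidgetModel and FieldModel are hypothetical and do not correspond to the API of any PDF library. It assumes the radio group is a single field with three widgets, and that every other field has exactly one widget on the same page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
public class PageModel { public int Number { get; set; } }

public class WidgetModel
{
    public PageModel Page { get; set; }      // each widget lives on one page
    public string ExportValue { get; set; }  // e.g. "small" for a radio button
}

public class FieldModel
{
    public FieldModel(string name) { Name = name; Widgets = new List<WidgetModel>(); }
    public string Name { get; private set; }
    public string Value { get; set; }                       // one value per field
    public List<WidgetModel> Widgets { get; private set; }  // zero or more representations
}

public static class PizzaFormModel
{
    public static List<FieldModel> Build(PageModel page)
    {
        // One "size" field, represented by three radio button widgets.
        var size = new FieldModel("size");
        foreach (var option in new[] { "small", "medium", "large" })
            size.Widgets.Add(new WidgetModel { Page = page, ExportValue = option });

        // The remaining fields each get a single widget.
        var fields = new List<FieldModel> { size };
        foreach (var name in new[] { "tomatoes", "onions", "tuna", "cheese", "name", "order" })
        {
            var field = new FieldModel(name);
            field.Widgets.Add(new WidgetModel { Page = page });
            fields.Add(field);
        }
        return fields;
    }

    public static void Main()
    {
        var page = new PageModel { Number = 1 };
        foreach (var field in Build(page))
            Console.WriteLine("{0}: {1} widget(s)", field.Name, field.Widgets.Count);
    }
}
```

The point of the sketch is that the value belongs to the field, while position and appearance belong to the widgets.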

Add a submit button to the PDF

In order to submit a PDF form to a web endpoint, you need to add a button with a submit form action. You typically do this in Adobe Acrobat. Here is what the actions tab of the button properties dialog looks like after adding a submit form action:

If you select the action and click the Edit button, you will see the available options for submitting form data:

The selected export format is HTML, so clicking the button POSTs all form data to the specified URL as URL-encoded name/value pairs, just as an HTML form would. Note that, in contrast to an HTML form, it is not possible to specify GET as the HTTP method. Later on we will see how to handle this request in an ASP.NET MVC application.
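As a rough sketch of what arrives at the server, the snippet below decodes a request body of this kind. The body string is a hypothetical example; the actual values depend on the export values configured for each field in the PDF. The class name SubmittedForm is made up, while HttpUtility.ParseQueryString is the standard .NET helper for URL-encoded data.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

public static class SubmittedForm
{
    // Decodes a URL-encoded request body into name/value pairs.
    public static NameValueCollection Parse(string body)
    {
        return HttpUtility.ParseQueryString(body);
    }

    public static void Main()
    {
        // Hypothetical body of the POST request.
        NameValueCollection form = Parse("size=large&cheese=cheese&name=Ada+Lovelace");

        Console.WriteLine(form["size"]);          // large
        Console.WriteLine(form["name"]);          // Ada Lovelace
        Console.WriteLine(form["tuna"] == null);  // True: the field was not submitted
    }
}
```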

Open the PDF form

Opening the PDF form seems trivial, but it isn't. Let's open this form in the browser and see what happens. You can open it from here: http://www.tallcomponents.com/demos/pizza/form.

As an implementation note, the form is located inside the Content folder of an MVC app and the action method looks like this:

using System.Web.Mvc;

public class PizzaController : Controller
{
   // Serves the PDF form from the Content folder
   public ActionResult Form()
   {
      return File("~/Content/order-pizza.pdf", "application/pdf");
   }
}

There is a good chance that your browser will render the PDF form itself instead of using the Adobe Reader plug-in. Google Chrome renders the PDF with its own built-in viewer, which breaks a great deal of PDF features, including submitting form data. Edge does the same thing. In fact, modern web browsers have dropped, or are in the process of dropping, support for the NPAPI plug-in infrastructure on which the Adobe Reader plug-in relies. If you click the order button in the browser, nothing happens.

This is why Adobe made it possible to submit form data using the latest versions of Adobe Reader. Earlier versions of Adobe Reader did not allow this unless your document was Reader extended. (If you know exactly when this change entered Adobe Reader, then please leave a comment. I tried to Google it but without success.)

To get the full PDF experience when opening PDF documents or forms from the web, you must disable your browser's PDF viewer. Here are the steps for Google Chrome:

  1. Browse to chrome://plugins
  2. Click the disable link of the Chrome PDF Viewer

(Google for similar instructions for other browsers.)

If you now open the form in your browser using the same link, your default system PDF viewer (make sure it is Adobe Reader) opens the PDF outside the browser like this:

Submit form data from Adobe Reader

Clicking the order button in Adobe Reader submits the form data to the endpoint http://www.tallcomponents.com/demos/pizza/order. Here is the ASP.NET MVC controller action that handles this request:

public class PizzaController : Controller
{
   [HttpPost]
   public ActionResult Order(Pizza pizza)
   {
      return View(pizza);
   }
}

Model Pizza:

public class Pizza
{
   public string Size { get; set; }
   public string Tomatoes { get; set; }
   public string Onions { get; set; }
   public string Tuna { get; set; }
   public string Cheese { get; set; }
   public string Name { get; set; }
}

View Order.cshtml:

@model Pizza
<h2>Hi @Model.Name!</h2>
<p>
   Thanks for ordering a @Model.Size pizza.
   Tomatoes: @Model.Tomatoes.
   Onions: @Model.Onions.
   Tuna: @Model.Tuna.
   Cheese: @Model.Cheese.
</p>

Note how MVC takes care of mapping form data to members of Pizza based on their names.
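To show the idea behind this mapping, here is a deliberately naive sketch of name-based binding using reflection. It is not how MVC's model binder is implemented; NaiveBinder is a hypothetical helper, and the form values are made up. Like MVC, it matches names case-insensitively, and any property without a submitted value (for example an unchecked ingredient) simply stays null.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Pizza
{
    public string Size { get; set; }
    public string Tomatoes { get; set; }
    public string Onions { get; set; }
    public string Tuna { get; set; }
    public string Cheese { get; set; }
    public string Name { get; set; }
}

public static class NaiveBinder
{
    // Copies each submitted value to the public string property with the
    // same name, ignoring case. Unknown names are skipped.
    public static T Bind<T>(IDictionary<string, string> form) where T : new()
    {
        var model = new T();
        foreach (KeyValuePair<string, string> pair in form)
        {
            PropertyInfo property = typeof(T).GetProperty(pair.Key,
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);
            if (property != null && property.CanWrite && property.PropertyType == typeof(string))
                property.SetValue(model, pair.Value, null);
        }
        return model;
    }

    public static void Main()
    {
        // Hypothetical submitted data; field names are lowercase as in the form.
        var form = new Dictionary<string, string>
        {
            { "size", "large" }, { "cheese", "cheese" }, { "name", "Ada" }
        };
        Pizza pizza = Bind<Pizza>(form);
        Console.WriteLine(pizza.Size);          // large
        Console.WriteLine(pizza.Tuna == null);  // True
    }
}
```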

After clicking the order button, the following dialog displays:

After clicking Allow, Adobe Reader asks permission to open the response:

Apparently, Adobe Reader saves the response to a temporary location. After clicking Yes, the default browser displays the response:

This is as expected but far from a great user experience.

Return a PDF response

The previous use case returned HTML as a response. Consequently, a browser instance opens and displays the HTML. Let's see what happens if we return the response as PDF.

I have created a second version of the order pizza form that you can open from here: http://www.tallcomponents.com/demos/pizza/form2.

The order button of this PDF submits the data to a second endpoint that returns a PDF response using PDFKit.NET as follows:

[HttpPost]
public ActionResult Order2(Pizza pizza)
{
   // create a new document with a single Letter-sized page
   Document document = new Document();
   Page page = new Page(PageSize.Letter);
   document.Pages.Add(page);
   double margin = 72; // points (72 points = 1 inch)
   // add a text block that is inset by the margin
   MultilineTextShape text = new MultilineTextShape(
      margin, page.Height - margin, page.Width - 2 * margin);
   page.Overlay.Add(text);
   // the message is built from the submitted form data
   Fragment fragment = new Fragment(
      string.Format("Hi {0}!, thanks for ordering a {1} pizza!",
         pizza.Name, pizza.Size),
      Font.Helvetica,
      16);
   text.Fragments.Add(fragment);
   // send to browser
   Response.ContentType = "application/pdf";
   Response.AppendHeader("Content-disposition", "attachment; filename=file.pdf");
   document.Write(Response.OutputStream);
   // the response has already been written, so there is no result to return
   return null;
}

If I now click the order button, a new Adobe Reader instance opens showing the following response:

Return flattened PDF form

A PDF form is said to be flattened if all fields have been replaced by non-editable graphics corresponding to the form data. Note that the fields have not just been disabled or made read-only but they have been removed entirely and replaced with non-interactive content. Let's see how we can return a flattened form as a response.
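The difference between flattening and merely making fields read-only can be sketched with a toy model. The types below are hypothetical and heavily simplified; they are not the API of any PDF library. After flattening, the values survive as static page content while the list of fields is empty.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model for illustration only.
public class SketchField
{
    public string Name { get; set; }
    public string Value { get; set; }
}

public class SketchDocument
{
    public SketchDocument()
    {
        Fields = new List<SketchField>();
        StaticText = new List<string>();
    }

    public List<SketchField> Fields { get; private set; }  // interactive form fields
    public List<string> StaticText { get; private set; }   // non-interactive page content

    // Flattening: paint each field's current value into the page content,
    // then remove the fields so that nothing editable remains.
    public void Flatten()
    {
        foreach (SketchField field in Fields)
            StaticText.Add(field.Value ?? string.Empty);
        Fields.Clear();
    }
}

public static class FlattenDemo
{
    public static void Main()
    {
        var document = new SketchDocument();
        document.Fields.Add(new SketchField { Name = "name", Value = "Ada Lovelace" });
        document.Flatten();

        Console.WriteLine(document.Fields.Count);   // 0
        Console.WriteLine(document.StaticText[0]);  // Ada Lovelace
    }
}
```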

I have created a third version of the order pizza form that you can open from here: http://www.tallcomponents.com/demos/pizza/form3.

The order button of this PDF submits the data to a third endpoint that uses PDFKit.NET to merge the submitted data with the original form and flattens the form as follows:

[HttpPost]
public ActionResult Order3(Pizza pizza)
{
  using (FileStream file = new FileStream(
    Server.MapPath("~/Content/order-pizza3.pdf"),
    FileMode.Open, FileAccess.Read))
  {
    // import submitted data into original form
    Document document = new Document(file);
    FormData data = FormData.Create(System.Web.HttpContext.Current.Request);
    document.Import(data); 
    // flatten form
    foreach (Field field in document.Fields)
    {
      foreach (Widget widget in field.Widgets)
      {
        widget.Persistency = WidgetPersistency.Flatten;
      }
    }
    // send to browser
    Response.ContentType = "application/pdf";
    Response.AppendHeader("Content-disposition", "inline; filename=file.pdf");
    document.Write(Response.OutputStream);
    return null;
  }
}

If I now click the order button, a new Adobe Reader instance opens showing the following response:

If you try to click the fields and change the value, you will see that nothing happens. If you save the response and open the PDF with Adobe Acrobat, you will see that there are no fields.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
