FDF .NET Parser

2 Oct 2012 (CPOL)
Parsing FDF files using Regex

Introduction

FDF stands for Forms Data Format. Similar to XML, FDF is used to store data, for example for archiving purposes.

FDF files are usually served with the MIME type application/vnd.fdf (some sources use application/fdf) and can be opened by the Acrobat PDF plug-in.

Background

Looking inside an FDF file, you will see that its structure is straightforward: it consists of a list of field name-value pairs, followed by a URL to the actual PDF file that contains the form to be filled with this data.

Using the code

Since the structure of the file is straightforward, parsing it was easy too. I used a regex (regular expression) to find the field name-value pairs and the URL. I also added a property that downloads the PDF file and returns it as a byte array.

C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;
using System.Net;

namespace FDF
{
    public class Parser
    {
        // Reads an FDF file and extracts its field name/value pairs and the PDF URL.
        public static FDFData Parse(String FileName)
        {
            FDFData result = new FDFData();

            // File.ReadAllText opens, reads, and closes the file, so no handle is leaked.
            String content = File.ReadAllText(FileName);

            // First alternative: a field entry of the form << /V (value) /T (name)  >>.
            // Second alternative: the /F (url) entry that points to the PDF form.
            string strRegex =
                @"<<\s/V\s\((?<Value>.*?)\)\s/T\s\((?<Name>.*?)\)\s\s>>|/F\s\((?<URL>.*?)\)";
            Regex myRegex = new Regex(strRegex, RegexOptions.None);

            foreach (Match myMatch in myRegex.Matches(content))
            {
                if (myMatch.Groups["URL"].Success)
                {
                    result.Url = myMatch.Groups["URL"].Value;
                }
                else
                {
                    // Strip the backslashes used to escape characters inside the value.
                    // A later duplicate field name overwrites the earlier value.
                    result.Fields[myMatch.Groups["Name"].Value] =
                        myMatch.Groups["Value"].Value.Replace("\\", "");
                }
            }

            return result;
        }
    }
    public class FDFData
    {
        private Dictionary<String, String> _Fields;
        private String _Url;
        private byte[] _PDF;

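        // Downloads the PDF from Url on first access and caches the bytes.
        public byte[] PDF
        {
            get
            {
                if (_PDF == null)
                {
                    // Dispose the WebClient as soon as the download completes.
                    using (WebClient client = new WebClient())
                    {
                        _PDF = client.DownloadData(this.Url);
                    }
                }
                return _PDF;
            }
            set { _PDF = value; }
        }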
        public FDFData()
        {
            _Fields = new Dictionary<string, string>();
            _Url = "";
        }

        public String Url
        {
            get { return _Url; }
            set { _Url = value; }
        }

        public Dictionary<String, String> Fields
        {
            get { return _Fields; }
            set { _Fields = value; }
        }

    }
} 
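
To see what the pattern picks up, it can help to run it against a small in-memory string instead of a file. The following is only a sketch: the class name FdfRegexSketch, the Extract method, and the sample text (including the example.com URL) are made up for illustration, and real FDF files may use spacing that this pattern does not match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's FDF namespace.
public static class FdfRegexSketch
{
    // Same pattern as in the Parser class above.
    private const string Pattern =
        @"<<\s/V\s\((?<Value>.*?)\)\s/T\s\((?<Name>.*?)\)\s\s>>|/F\s\((?<URL>.*?)\)";

    // Hypothetical sample text, laid out so that the pattern matches:
    // one whitespace character between tokens and two before the closing >>.
    public const string Sample =
        "%FDF-1.2\n" +
        "1 0 obj\n" +
        "<< /FDF << /Fields [\n" +
        "<< /V (John) /T (FirstName)  >>\n" +
        "<< /V (Smith) /T (LastName)  >>\n" +
        "] /F (http://example.com/form.pdf) >> >>\n" +
        "endobj\n";

    // Returns the field name/value pairs found in text; url receives the /F entry,
    // or an empty string when no /F entry is found.
    public static Dictionary<string, string> Extract(string text, out string url)
    {
        var fields = new Dictionary<string, string>();
        url = "";

        foreach (Match match in Regex.Matches(text, Pattern))
        {
            if (match.Groups["URL"].Success)
            {
                url = match.Groups["URL"].Value;
            }
            else
            {
                // Remove escaping backslashes, as the Parser class does.
                fields[match.Groups["Name"].Value] =
                    match.Groups["Value"].Value.Replace("\\", "");
            }
        }

        return fields;
    }
}
```

Calling FdfRegexSketch.Extract(FdfRegexSketch.Sample, out url) should yield the entries FirstName = John and LastName = Smith, with url set to http://example.com/form.pdf.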

Points of Interest

FDF files can include duplicate field names with the same or different values. If you run into this case, I think something is wrong in the FDF creation process. To keep the fields searchable by name, I used a Dictionary, which means a later duplicate overwrites the earlier value.
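
If duplicates need to be detected rather than silently overwritten, one option is to collect every value per field name. This is a sketch of an alternative and not what the parser above does; the class name DuplicateFieldSketch and the Group method are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the article's FDF namespace.
public static class DuplicateFieldSketch
{
    // Groups values by field name so that duplicates stay visible
    // instead of the last value replacing the earlier ones.
    public static Dictionary<string, List<string>> Group(
        IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var grouped = new Dictionary<string, List<string>>();

        foreach (var pair in pairs)
        {
            List<string> values;
            if (!grouped.TryGetValue(pair.Key, out values))
            {
                values = new List<string>();
                grouped[pair.Key] = values;
            }
            values.Add(pair.Value);
        }

        return grouped;
    }
}
```

Any field name whose list holds more than one entry appeared more than once in the input, which can be used to flag a suspicious FDF file.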

History 

Version 1.0: 2 Oct 2012.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)