Introduction
Cascading Style Sheets (CSS) let developers build attractive user interfaces for the web, and they are easy to write, use, and maintain. iTextSharp can take advantage of CSS through its built-in HTML to PDF functionality. To get the style information into iTextSharp, the developer has to read the CSS file and convert it into a Dictionary that iTextSharp can consume. This article illustrates a simple solution for that task. The accompanying solution includes unit tests and an ASP.NET project that demonstrate how to use the CSSParser.
Background
While working on an HTML to PDF utility, I needed to parse Cascading Style Sheets. There are many CSS parsers on the internet, but none fit my needs, so I wrote this simple regular-expression-based CSS parser in C# to facilitate PDF generation in iTextSharp. The requirements for the CSS parser are as follows:
Requirements
- Read a CSS file
- Store CSS in a Collection
- Query for the classes and their properties
- Query for the elements and their properties
- Easy to maintain and enhance
- Easily feed the style information into iTextSharp to turn HTML into PDF
- It should be lean
- Something another developer can use
Using the code
The CSSParser inherits from a generic List of KeyValuePair. Each key is a CSS selector, and each value is another list of key/value pairs in which the key is the CSS property name and the value is the property value. I used a generic List instead of a Dictionary because a style sheet can list the same selector or the same property multiple times.
public partial class CSSParser : List<KeyValuePair<String,List<KeyValuePair<String,String>>>>, ICSSParser
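To make the List versus Dictionary choice concrete, here is a minimal sketch. The class name DuplicateSelectorDemo and its methods are made up for illustration and are not part of the CSSParser; the sketch only shows that the nested list shape accepts a repeated selector while Dictionary.Add rejects a repeated key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; not part of the article's CSSParser.
public static class DuplicateSelectorDemo
{
    // Builds the same nested list shape the parser inherits from,
    // with the selector ".note" listed twice.
    public static List<KeyValuePair<string, List<KeyValuePair<string, string>>>> BuildRules()
    {
        var rules = new List<KeyValuePair<string, List<KeyValuePair<string, string>>>>();
        rules.Add(new KeyValuePair<string, List<KeyValuePair<string, string>>>(
            ".note",
            new List<KeyValuePair<string, string>> { new KeyValuePair<string, string>("color", "red") }));
        rules.Add(new KeyValuePair<string, List<KeyValuePair<string, string>>>(
            ".note",
            new List<KeyValuePair<string, string>> { new KeyValuePair<string, string>("font-weight", "bold") }));
        return rules;
    }

    // Returns true when a Dictionary refuses the second ".note" entry.
    public static bool DictionaryRejectsDuplicate()
    {
        var bySelector = new Dictionary<string, string>();
        bySelector.Add(".note", "color: red");
        try
        {
            bySelector.Add(".note", "font-weight: bold");
            return false;
        }
        catch (ArgumentException)
        {
            // Dictionary.Add throws when the key already exists.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuildRules().Count);           // 2
        Console.WriteLine(DictionaryRejectsDuplicate()); // True
    }
}
```

The list keeps both rules in source order, which leaves it to the caller to decide how repeated selectors should be combined.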
The core of the CSS parser is a regular expression that I found on Stack Overflow (http://stackoverflow.com/a/2694121/899290). The CSSGroups regular expression breaks the style sheet into the named groups selector, name, and value. Before the CSS is parsed, the CSSComments regular expression removes CSS comments from the input.
public const String CSSGroups = @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";
public const String CSSComments = @"(?<!"")\/\*.+?\*\/(?!"")";
private Regex rStyles = new Regex(CSSGroups, RegexOptions.IgnoreCase | RegexOptions.Compiled);
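As a rough illustration of what these two patterns capture, the sketch below runs them over a small, compact style sheet. The patterns are copied from above; the class CssRegexDemo, the Describe method, and the sample CSS are assumptions made for this example and do not come from the CSSParser source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the two patterns come from the article.
public static class CssRegexDemo
{
    public const string CSSGroups = @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";
    public const string CSSComments = @"(?<!"")\/\*.+?\*\/(?!"")";

    // Returns one "selector|name=value" line per captured declaration.
    public static List<string> Describe(string css)
    {
        var lines = new List<string>();
        // Remove comments first, then match one rule at a time.
        string stripped = Regex.Replace(css, CSSComments, String.Empty);
        foreach (Match rule in Regex.Matches(stripped, CSSGroups, RegexOptions.IgnoreCase))
        {
            string selector = rule.Groups["selector"].Value.Trim();
            // The name and value groups sit inside a repeated group,
            // so each one holds a capture per declaration.
            CaptureCollection names = rule.Groups["name"].Captures;
            CaptureCollection values = rule.Groups["value"].Captures;
            for (int i = 0; i < names.Count; i++)
            {
                lines.Add(selector + "|" + names[i].Value.Trim() + "=" + values[i].Value.Trim());
            }
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe("/* heading */.title{color:red;font-size:12px}p{margin:0}"))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // .title|color=red
        // .title|font-size=12px
        // p|margin=0
    }
}
```

One thing to keep in mind: in .NET the dot does not match a newline unless RegexOptions.Singleline is set, so the CSSComments pattern as written would leave a comment that spans several lines in place.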
The Read method is responsible for parsing the values in the style sheet and filling the generic List. It uses the .NET Regex class to remove any comments and populate the collection.
public void Read(String CascadingStyleSheet)
{
    this.StyleSheet = CascadingStyleSheet;
    if (!String.IsNullOrEmpty(CascadingStyleSheet))
    {
        // Strip comments first, then match one rule (selector plus declaration block) at a time.
        MatchCollection MatchList = rStyles.Matches(Regex.Replace(CascadingStyleSheet,
            RegularExpressionLibrary.CSSComments, String.Empty));
        foreach (Match item in MatchList)
        {
            // Skip matches that carry no usable selector.
            if (item != null && item.Groups != null &&
                item.Groups[SelectorKey] != null &&
                item.Groups[SelectorKey].Captures != null &&
                item.Groups[SelectorKey].Captures[0] != null &&
                !String.IsNullOrEmpty(item.Groups[SelectorKey].Value))
            {
                String strSelector = item.Groups[SelectorKey].Captures[0].Value.Trim();
                var style = new List<KeyValuePair<String,String>>();
                // The name and value groups hold one capture per declaration in the rule.
                for (int i = 0; i < item.Groups[NameKey].Captures.Count; i++)
                {
                    // className holds the CSS property name, value the property value.
                    String className = item.Groups[NameKey].Captures[i].Value;
                    String value = item.Groups[ValueKey].Captures[i].Value;
                    if (!String.IsNullOrEmpty(className) && !String.IsNullOrEmpty(value))
                    {
                        className = className.TrimWhiteSpace();
                        value = value.TrimWhiteSpace();
                        // Check again: trimming can leave an empty string.
                        if (!String.IsNullOrEmpty(className) && !String.IsNullOrEmpty(value))
                        {
                            style.Add(new KeyValuePair<String,String>(className, value));
                        }
                    }
                }
                this.Add(new KeyValuePair<String,List<KeyValuePair<String,String>>>(strSelector, style));
            }
        }
    }
}
Once the list is populated, it is a simple matter of using LINQ with lambda expressions to pull out the information you need. The Classes and Elements properties expose the values of the style sheet as a Dictionary, which can be fed to iTextSharp.
public Dictionary<String, Dictionary<String,String>> Classes
{
    get
    {
        if (classes == null || classes.Count == 0)
        {
            this.classes = this.Where(cl => cl.Key.StartsWith("."))
                .ToDictionary(cl => cl.Key.Trim(new Char[] { '.' }), cl => cl.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }
        return classes;
    }
}
public Dictionary<String, Dictionary<String,String>> Elements
{
    get
    {
        if (elements == null || elements.Count == 0)
        {
            elements = this.Where(el => !el.Key.StartsWith("."))
                .ToDictionary(el => el.Key, el => el.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }
        return elements;
    }
}
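One caveat follows from standard LINQ behavior: Enumerable.ToDictionary throws an ArgumentException when two items produce the same key, so a style sheet that repeats a class selector, or repeats a property inside one rule, would make these properties throw. The sketch below shows one possible way around that, assuming that a later declaration should override an earlier one. MergeDemo, MergeClasses, and Rule are hypothetical names for this example and are not part of the CSSParser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; not part of the article's CSSParser.
public static class MergeDemo
{
    // Folds repeated class selectors into one entry per class.
    // Assumption: when a property is repeated, the later value wins.
    public static Dictionary<string, Dictionary<string, string>> MergeClasses(
        IEnumerable<KeyValuePair<string, List<KeyValuePair<string, string>>>> rules)
    {
        var merged = new Dictionary<string, Dictionary<string, string>>();
        foreach (var rule in rules.Where(r => r.Key.StartsWith(".", StringComparison.Ordinal)))
        {
            // Same key normalization as the Classes property: drop the dot.
            string className = rule.Key.Trim('.');
            Dictionary<string, string> properties;
            if (!merged.TryGetValue(className, out properties))
            {
                properties = new Dictionary<string, string>();
                merged.Add(className, properties);
            }
            foreach (var declaration in rule.Value)
            {
                // The indexer overwrites instead of throwing on a repeated key.
                properties[declaration.Key] = declaration.Value;
            }
        }
        return merged;
    }

    // Builds a single-declaration rule in the parser's list-entry shape.
    public static KeyValuePair<string, List<KeyValuePair<string, string>>> Rule(
        string selector, string name, string value)
    {
        return new KeyValuePair<string, List<KeyValuePair<string, string>>>(
            selector,
            new List<KeyValuePair<string, string>> { new KeyValuePair<string, string>(name, value) });
    }

    public static void Main()
    {
        var rules = new List<KeyValuePair<string, List<KeyValuePair<string, string>>>>
        {
            Rule(".note", "color", "red"),
            Rule("p", "margin", "0"),
            Rule(".note", "color", "blue")
        };
        var classes = MergeClasses(rules);
        Console.WriteLine(classes.Count);            // 1
        Console.WriteLine(classes["note"]["color"]); // blue
    }
}
```

Whether overriding is the right policy depends on how closely the PDF output needs to follow the CSS cascade; this sketch ignores specificity and only respects source order.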
Using the CSS Parser
The CSSParser gives you two ways to read a Cascading Style Sheet: from a CSS file or from a string. The ReadCSSFile method reads a CSS file and populates the collections. To read a String containing CSS, call the Read method or pass the CSS to the constructor.
void lnkParseCSSFile_Click(object sender, EventArgs e)
{
    CSSParser parser = new CSSParser();
    parser.ReadCSSFile(Server.MapPath("~/CSSParserStyle.css"));
    this.divOriginalCSS.InnerHtml = parser.StyleSheet.FixLineBreakForWeb().FixTabsForWeb().FixSpaceForWeb();
    this.divParsedCSS.InnerHtml = parser.ToString();
    this.spnOriginalCSSLength.InnerText = parser.StyleSheet.Length.ToString();
    this.spnParsedCSSLength.InnerText = this.divParsedCSS.InnerHtml.Length.ToString();
}
Points of Interest
The CSSParser Elements and Classes properties target iTextSharp version 5.x.
History
- Version 1.0 - Initial Release