In this article, you will learn more about Markdig extensions and supporting classes, and how to create the renderer and the parser. You will also see how to add an initialisation extension method.
Markdig, according to its description, "is a fast, powerful, CommonMark compliant, extensible Markdown processor for .NET". While most of our older projects use MarkdownDeep (including an increasingly creaky cyotek.com), current projects use Markdig and thus far, it has proven to be an excellent library.
One of the many overly complicated aspects of cyotek.com is that in addition to the markdown processing, every single block of content is also run through a byzantine number of regular expressions for custom transforms. When cyotek.com is updated to use Markdig, I definitely don't want these expressions to hang around. Enter, Markdig extensions.
Markdig extensions allow you to extend Markdig with additional transforms, including things that might not conform to the CommonMark specification such as YAML blocks or pipe tables.
MarkdownPipeline pipeline;
string html;
string markdown;

markdown = "# Header 1";

// Default pipeline: plain CommonMark, no extensions
pipeline = new MarkdownPipelineBuilder()
  .Build();
html = Markdown.ToHtml(markdown, pipeline);

// Pipeline with the Auto Identifiers extension enabled
pipeline = new MarkdownPipelineBuilder()
  .UseAutoIdentifiers()
  .Build();
html = Markdown.ToHtml(markdown, pipeline);
Example of using an extension to automatically generate id attributes for heading elements.
I recently updated our internal crash aggregation system to be able to create MantisBT issues via our MantisSharp library. In these issues, stack traces include the line number or IL offset in the format #<number>. To my vague annoyance, Mantis Bug Tracker treats these as hyperlinks to other issues in the system, in a similar fashion to how GitHub automatically links to issues or pull requests. It did, however, give me the idea of creating a Markdig extension that provides the same functionality.
Deciding on the Pattern
The first thing you need to do is decide the markdown pattern to trigger the extension. Our example is perhaps a bit too basic as it is a simple #<number>, whereas if you think of other issue systems such as JIRA, it would be <string>-<number>. As well as the "body" of the pattern, you also need to consider the characters which surround it. For example, you might only allow white space, or perhaps brackets or braces - whenever I reference a JIRA issue, I tend to surround it in square braces, e.g., [PRJ-1234].
The other thing to consider is the criteria of the core pattern. Using our example above, should we have a minimum number of digits before triggering, or a maximum? #999999999 is probably not a valid issue number!
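To make those choices concrete, here is a minimal sketch of the pattern rules expressed as a regular expression. It only illustrates the rules and is not how the Markdig parser later in this article works. The IssuePattern class, the FindFirst method and the limit of 7 digits are assumptions made up for this example.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper: illustrates the pattern rules only, not Markdig parsing.
public static class IssuePattern
{
    // Assumed upper bound on the number of digits; adjust to suit your tracker.
    public const int MaxDigits = 7;

    // '#' must follow start-of-text, white space, '(' or '[';
    // the digits must be followed by end-of-text, white space, ')' or ']'.
    private static readonly Regex _regex = new Regex(
        @"(?<=^|[\s(\[])#(?<id>[0-9]{1," + MaxDigits + @"})(?=$|[\s)\]])",
        RegexOptions.CultureInvariant);

    // Returns the digits of the first valid reference, or null if there is none.
    public static string FindFirst(string text)
    {
        Match match = _regex.Match(text);

        return match.Success ? match.Groups["id"].Value : null;
    }
}
```

With these rules, "See issue #123" and "[#42]" both yield an issue number, whereas "abc#123" and "#999999999" do not.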
Extension Components
A Markdig extension is comprised of a few moving parts. Depending on how complicated your extension is, you may not need all parts, or could perhaps reuse existing parts.
- the extension itself (always required)
- a parser
- a renderer
- an object used to represent data in the abstract syntax tree (AST)
- an object used to configure the extension functionality
In this plugin, I'll be demonstrating all of these parts.
Happily, Markdig already has a built-in extension for rendering JIRA links which, together with the original MarkdigJiraLinker extension by Dave Clarke, made a great starting point. As I mentioned at the start, Markdig has a lot of extensions, some simple, some complex - there's going to be a fair chunk of useful code in there to help you with your own.
Supporting Classes
I'm actually going to create the components in a backwards order from the list above, as each step depends on the one before it, so it would make for awkward reading if I was referencing things that don't yet exist.
To get started with some actual code, I'm going to need a couple of supporting classes - an options object for configuring the extension (at the bare minimum, we need to supply the base URI of a MantisBT installation), and also a class to represent a link in the AST.
First, the options class. As well as that base URI, I'll also add an option to determine if the links generated by the application should open in a new window or not via the target attribute.
public class MantisLinkOptions
{
  public MantisLinkOptions()
  {
    this.OpenInNewWindow = true;
  }

  public MantisLinkOptions(string url)
    : this()
  {
    this.Url = url;
  }

  public MantisLinkOptions(Uri uri)
    : this()
  {
    this.Url = uri.OriginalString;
  }

  public bool OpenInNewWindow { get; set; }

  public string Url { get; set; }
}
Next up is the object which will represent our link in the syntax tree. Markdig nodes are very similar to HTML, coming in two flavours - block and inline. In this article, I'm only covering simple inline nodes.
I'm going to inherit from LeafInline and add a single property to hold the Mantis issue number.
There is actually a more specific LinkInline element which is probably a much better choice to use (as it also means you shouldn't need a custom renderer). However, I'm doing this example the "long way" so that when I move onto the more complex use cases I have for Markdig, I have a better understanding of the API.
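For comparison, here is a minimal sketch of that shorter route, assuming the parser has already extracted the issue number as a plain string. The MantisLinkFactory class and its Create method are hypothetical names and are not part of the sample project. LinkInline and LiteralInline are Markdig types, and the idea is that Markdig's built-in link rendering would then produce the anchor tag.

```csharp
using Markdig.Syntax.Inlines;

// Hypothetical helper showing the LinkInline alternative.
public static class MantisLinkFactory
{
    // baseUrl is assumed to end with a slash, e.g. "https://example.com/"
    public static LinkInline Create(string baseUrl, string issueNumber)
    {
        LinkInline link;

        link = new LinkInline
        {
            Url = baseUrl + "view.php?id=" + issueNumber,
            IsClosed = true
        };

        // The visible text of a link is held in its child inlines.
        link.AppendChild(new LiteralInline("#" + issueNumber));

        return link;
    }
}
```

A parser would assign the returned object to processor.Inline in the same way as the custom node described in the rest of this article.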
[DebuggerDisplay("#{" + nameof(IssueNumber) + "}")]
public class MantisLink : LeafInline
{
  public StringSlice IssueNumber { get; set; }
}
String vs StringSlice
In the above class, I'm using the StringSlice struct offered by Markdig. You can use a normal string if you wish (or any other type for that matter), but StringSlice was specifically designed for Markdig to improve performance and reduce allocations. In fact, that's how I heard of Markdig to start with, when I read Alexandre's comprehensive blog post on the subject last year.
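As a small illustration of the idea, a StringSlice simply records inclusive start and end offsets into an existing string. This is only a sketch: the StringSliceDemo class, the sample text and the hand-picked offsets are assumptions for the example.

```csharp
using Markdig.Helpers;

// Hypothetical demo class; only StringSlice itself comes from Markdig.
public static class StringSliceDemo
{
    public static string ExtractIssueNumber()
    {
        const string text = "See issue #123";
        StringSlice slice;

        // Offsets 11 to 13 (both inclusive) cover the digits after the '#'.
        // The slice refers to the original text; a new string is only
        // created when ToString is called.
        slice = new StringSlice(text, 11, 13);

        return slice.ToString();
    }
}
```

Here ExtractIssueNumber returns "123", and the slice's Length property would report 3.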
Creating the Renderer
With the two supporting classes out of the way, I can now create the rendering component. Markdig renderers take an element from the AST and spit out some content. Easy enough - we create a class, inherit from HtmlObjectRenderer<T> (where T is the name of your AST class, e.g., MantisLink) and override the Write method. If you are using a configuration class, then creating a constructor to assign that is also a good idea.
public class MantisLinkRenderer : HtmlObjectRenderer<MantisLink>
{
  private readonly MantisLinkOptions _options;

  public MantisLinkRenderer(MantisLinkOptions options)
  {
    _options = options;
  }

  protected override void Write(HtmlRenderer renderer, MantisLink obj)
  {
    StringSlice issueNumber;

    issueNumber = obj.IssueNumber;

    if (renderer.EnableHtmlForInline)
    {
      // Full HTML: <a href="{base URI}view.php?id={number}">#{number}</a>
      renderer.Write("<a href=\"").Write(_options.Url).Write("view.php?id=").Write(issueNumber).Write('"');

      if (_options.OpenInNewWindow)
      {
        // _blank (with the underscore) is the keyword for a new window or tab
        renderer.Write(" target=\"_blank\" rel=\"noopener noreferrer\"");
      }

      renderer.Write('>').Write('#').Write(issueNumber).Write("</a>");
    }
    else
    {
      // Plain text output: just the prefix and the issue number
      renderer.Write('#').Write(issueNumber);
    }
  }
}
So how does this work? The Write method we're overriding supplies the HtmlRenderer to write to, and the MantisLink object to render.
First, we need to check if we should be rendering HTML by checking the EnableHtmlForInline property. If this is false, then we output the plain text, e.g., just the issue number and the # prefix.
If we are writing full HTML, then it's a matter of building an HTML a tag with the fully qualified URI generated from the base URI in the options object, and the AST node's issue number. We also add a target attribute if the options state that links should open in a new window. If we do add a target attribute, I'm also adding a rel attribute as per MDN guidelines.
Notice how the HtmlRenderer object's Write method happily accepts string, char or StringSlice arguments, meaning we can mix and match to suit our purposes.
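One detail the renderer relies on is that the base URI in the options ends with a slash, as the page name is appended to it directly. A minimal sketch of a helper that tolerates either form follows. MantisUrlBuilder and BuildIssueUrl are hypothetical names and are not part of the sample project.

```csharp
using System;

// Hypothetical helper; the renderer above concatenates the parts directly.
public static class MantisUrlBuilder
{
    public static string BuildIssueUrl(string baseUrl, string issueNumber)
    {
        // Make sure there is exactly one slash between the base URL and the page name.
        string root = baseUrl.TrimEnd('/') + "/";

        // Escaping is a no-op for plain digits, but guards against unexpected input.
        return root + "view.php?id=" + Uri.EscapeDataString(issueNumber);
    }
}
```

Both "https://issues.cyotek.com" and "https://issues.cyotek.com/" would then produce the same link for a given issue number.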
Creating the Parser
With rendering out of the way, it's time for the most complex part of creating an extension - parsing it from a source document. For that, we need to inherit from InlineParser and override the Match method, as well as setting up the characters that would trigger the parse routine - that single # character in our example.
public class MantisLinkInlineParser : InlineParser
{
  private static readonly char[] _openingCharacters =
  {
    '#'
  };

  public MantisLinkInlineParser()
  {
    this.OpeningCharacters = _openingCharacters;
  }

  public override bool Match(InlineProcessor processor, ref StringSlice slice)
  {
    bool matchFound;
    char previous;

    matchFound = false;

    previous = slice.PeekCharExtra(-1);

    if (previous.IsWhiteSpaceOrZero() || previous == '(' || previous == '[')
    {
      char current;
      int start;
      int end;

      slice.NextChar();

      current = slice.CurrentChar;
      start = slice.Start;
      end = start;

      while (current.IsDigit())
      {
        end = slice.Start;
        current = slice.NextChar();
      }

      // slice.Start > start ensures at least one digit followed the #,
      // otherwise a lone # would be treated as an issue reference
      if (slice.Start > start && (current.IsWhiteSpaceOrZero() || current == ')' || current == ']'))
      {
        int inlineStart;

        inlineStart = processor.GetSourcePosition(slice.Start, out int line, out int column);

        processor.Inline = new MantisLink
        {
          Span =
          {
            Start = inlineStart,
            End = inlineStart + (end - start) + 1
          },
          Line = line,
          Column = column,
          IssueNumber = new StringSlice(slice.Text, start, end)
        };

        matchFound = true;
      }
    }

    return matchFound;
  }
}
In the constructor, we set the OpeningCharacters property to a character array. When Markdig is parsing content, if it comes across any of the characters in this array, it will automatically call your extension.
This neatly leads us onto the meat of this class - overriding the Match method. Here, we scan the source document and try to build up our node. If we're successful, we update the processor and let Markdig handle the rest.
We know the current character is going to be # as this is our only supported opener. However, we need to check the previous character to make sure that we are processing a distinct entity, and not a # character that happens to be in the middle of another string.
previous = slice.PeekCharExtra(-1);
if (previous.IsWhiteSpaceOrZero() || previous == '(' || previous == '[')
Here, I use an extension method exposed by Markdig to check if the previous character was either whitespace, or nothing at all, i.e., the start of the document. I'm also checking for ( or [ characters in case the issue number has been wrapped in brackets or square braces.
If we pass this check, then it's time to parse the issue number. First, we advance the character stream (to discard the # opener) and also initialise the values for creating a final StringSlice if we're successful.
slice.NextChar();
current = slice.CurrentChar;
start = slice.Start;
end = start;
As our GitHub/MantisBT issue numbers are just that, plain numbers, we simply keep advancing the stream until we run out of digits.
while (current.IsDigit())
{
end = slice.Start;
current = slice.NextChar();
}
As I'm going to work exclusively with the StringSlice struct, I'm only recording where the new slice will end. Even if you wanted to use a more traditional string, it probably makes sense to keep the above construct and then build your string at the end.
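To see the index bookkeeping in isolation, the same loop can be sketched against a plain string. DigitScanner and TryScan are hypothetical names, not Markdig APIs. As with StringSlice, both start and end are inclusive. The sketch also reports when no digits were found at all, since in that case end never moves past start and so cannot be used to tell the two situations apart.

```csharp
// Hypothetical stand-alone scanner that mirrors the loop above using a plain string.
public static class DigitScanner
{
    // hashIndex is the position of the '#' opener.
    // Returns false when no digit follows the opener.
    public static bool TryScan(string text, int hashIndex, out int start, out int end)
    {
        int position;

        start = hashIndex + 1; // first character after the opener
        end = start;
        position = start;

        while (position < text.Length && text[position] >= '0' && text[position] <= '9')
        {
            end = position; // inclusive index of the last digit seen so far
            position++;
        }

        // If the position never moved, there were no digits at all.
        return position > start;
    }
}
```

For the text "See issue #123 now" and a hashIndex of 10, this gives a start of 11 and an end of 13, and text.Substring(start, end - start + 1) would then build the string "123".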
Once we've run out of digits, we essentially do the reverse of the check we made at the start - now we want to see if the next character is white space, the end of the stream, or a closing bracket/brace.
if (current.IsWhiteSpaceOrZero() || current == ')' || current == ']')
I didn't add a check for this, but potentially you should also look for a matching pair - so if a bracket was used at the start, a closing bracket should therefore be present at the end.
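Such a check could look something like the sketch below. It assumes the character before the # has already been validated as described earlier, and that '\0' represents the start or end of the text, which is how the slice reports running out of characters. BracketPairs and IsBalanced are hypothetical names.

```csharp
// Hypothetical helper for matching the opening and closing characters.
public static class BracketPairs
{
    // previous is the character before '#', next is the character after the last digit.
    public static bool IsBalanced(char previous, char next)
    {
        switch (previous)
        {
            case '(':
                return next == ')';
            case '[':
                return next == ']';
            default:
                // Opened by white space or the start of the text,
                // so expect white space or the end of the text.
                return next == '\0' || char.IsWhiteSpace(next);
        }
    }
}
```

With this in place, (#12) and [#12] would be accepted while (#12] would be rejected.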
Assuming this final check passes, that means we have a valid #<number> sequence, and so we create a new MantisLink object with the IssueNumber property populated with a brand new string slice. We then assign this new object to the Inline property of the processor.
inlineStart = processor.GetSourcePosition(slice.Start, out int line, out int column);

processor.Inline = new MantisLink
{
  Span =
  {
    Start = inlineStart,
    End = inlineStart + (end - start) + 1
  },
  Line = line,
  Column = column,
  IssueNumber = new StringSlice(slice.Text, start, end)
};
I'm not sure if the Line and Column properties are used directly by Markdig, or if they are only for debugging or advanced AST scenarios. I'm also uncertain what the purpose of setting the Span property is - even though I based this code on the code from the Markdig repository, it doesn't seem to quite match up should I print out its contents. This leaves me wondering if I'm setting the wrong values. So far, I haven't noticed any adverse effects though.
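For what it's worth, SourceSpan treats both Start and End as inclusive offsets into the original source, so its Length is End - Start + 1. The sketch below shows the values I would expect for a reference that covers the # opener as well as the digits. Whether the span ought to include the opener is an assumption on my part, and the SpanDemo class, sample text and offsets are made up for illustration.

```csharp
using Markdig.Syntax;

// Hypothetical demo class; only SourceSpan itself comes from Markdig.
public static class SpanDemo
{
    // For the text "See issue #123", the '#' is at offset 10
    // and the last digit is at offset 13.
    public static SourceSpan SpanOfReference()
    {
        return new SourceSpan(10, 13);
    }
}
```

The resulting span has a Length of 4, matching the four characters of #123.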
Creating the Extension
The final component is the core extension. Markdig extensions implement the IMarkdownExtension interface. This simple interface exposes two overloads of a Setup method for configuring the parsing and rendering aspects of the extension.
One of these overloads is for customising the pipeline - we'll add our parser here. The second overload is for setting up the renderer. Depending on the nature of your extension, you may only need one or the other.
As this class is responsible for creating any renderers or parsers your extension needs, that also means it needs to have access to any required configuration classes to pass down.
public class MantisLinkerExtension : IMarkdownExtension
{
  private readonly MantisLinkOptions _options;

  public MantisLinkerExtension(MantisLinkOptions options)
  {
    _options = options;
  }

  public void Setup(MarkdownPipelineBuilder pipeline)
  {
    OrderedList<InlineParser> parsers;

    parsers = pipeline.InlineParsers;

    if (!parsers.Contains<MantisLinkInlineParser>())
    {
      parsers.Add(new MantisLinkInlineParser());
    }
  }

  public void Setup(MarkdownPipeline pipeline, IMarkdownRenderer renderer)
  {
    HtmlRenderer htmlRenderer;
    ObjectRendererCollection renderers;

    htmlRenderer = renderer as HtmlRenderer;
    renderers = htmlRenderer?.ObjectRenderers;

    if (renderers != null && !renderers.Contains<MantisLinkRenderer>())
    {
      renderers.Add(new MantisLinkRenderer(_options));
    }
  }
}
Firstly, I make sure the constructor accepts an argument of the MantisLinkOptions class to pass to the renderer.
In the Setup overload that configures the pipeline, I first check to make sure the MantisLinkInlineParser parser isn't already present; if not, I add it.
In a very similar fashion, in the Setup overload that configures the renderer, I first check to see if an HtmlRenderer renderer was provided - after all, you could be using a custom renderer which wasn't HTML based. If I have got an HtmlRenderer renderer, then I do a similar check to make sure a MantisLinkRenderer instance isn't present, and if not, I create one using the provided options class and add it.
Adding an Initialisation Extension Method
Although you could register extensions by directly manipulating the Extensions property of a MarkdownPipelineBuilder, generally Markdig extensions include an extension method which performs the boilerplate code of checking and adding the extension. The extension below checks to see if the MantisLinkerExtension has been registered with a given pipeline, and if not, adds it with the specified options.
// Extension methods must be declared in a static class; the class name is a placeholder
public static class MantisLinkPipelineExtensions
{
  public static MarkdownPipelineBuilder UseMantisLinks(this MarkdownPipelineBuilder pipeline, MantisLinkOptions options)
  {
    OrderedList<IMarkdownExtension> extensions;

    extensions = pipeline.Extensions;

    if (!extensions.Contains<MantisLinkerExtension>())
    {
      extensions.Add(new MantisLinkerExtension(options));
    }

    return pipeline;
  }
}
Using the Extension
MarkdownPipeline pipeline;
string html;
string markdown;

markdown = "See issue #1";

// Default pipeline: #1 is left as plain text
pipeline = new MarkdownPipelineBuilder()
  .Build();
html = Markdown.ToHtml(markdown, pipeline);

// Pipeline with the Mantis link extension: #1 becomes a hyperlink
pipeline = new MarkdownPipelineBuilder()
  .UseMantisLinks(new MantisLinkOptions("https://issues.cyotek.com/"))
  .Build();
html = Markdown.ToHtml(markdown, pipeline);
Example of using an extension to automatically generate links for MantisBT issue numbers.
Wrapping Up
In this article, I showed how to introduce new inline elements parsed from markdown. This example, at least, was straightforward, but there is more that can be done. More advanced extensions such as pipe tables have much more complex parsers that generate a complete AST of their own.
Markdig supports other ways to extend itself too. For example, the Auto Identifiers extension shown at the start of the article doesn't parse markdown but instead manipulates the AST even as it is being generated. The Emphasis Extra extension injects itself into another extension to add more functionality to that. There appear to be quite a few ways you can hook into the library in order to add your own custom functionality!
A complete sample project can be downloaded from the URL below or from the GitHub page for the project.
Although I wrote this example with Mantis Bug Tracker in mind, it wouldn't take very much effort at all to make it cover innumerable other websites.
Update History
- 5th August, 2017: First published
- 22nd November, 2020: Updated formatting
All content Copyright (c) by Cyotek Ltd or its respective writers. Permission to reproduce news and web log entries and other RSS feed content in unmodified form without notice is granted provided they are not used to endorse or promote any products or opinions (other than what was expressed by the author) and without taking them out of context. Written permission from the copyright owner must be obtained for everything else.