MuPDF is an open-source, high-performance PDF rendering and editing engine written in C. However, compiling its source code does not produce a DLL that can be used from other languages, such as C# or Visual Basic. This article shows how to compile the source code into a dynamic link library.
Note about the download:
The download contains a pre-compiled MuPDF DLL (version 1.19) with a simple C# demo project that uses the DLL to convert PDF documents into image files.
As an alternative to downloading the above files, you can clone my repository on GitHub to get all the code files you need and start from the MuPDF.sln solution file.
Note about SumatraPDF:
A previously written article discussed the possibility of getting a DLL file out of the SumatraPDF project, which relies heavily on MuPDF. By compiling the SumatraPDF project with Visual Studio 2019, it is very easy to get a MuPDF DLL file named libmupdf.dll.
What We Need
The source code of MuPDF can be downloaded from its official website, or from the mirror on GitHub.com. The following list contains all we need to compile the MuPDF source code.
- Visual Studio 2019 with the "Desktop development with C++" workload.
- Windows 10 SDK.
- (Optional, but recommended) Python to extract function names to a definition file for the compilation of the DLL file.
- About 1 GB of free disk space for code compilation.
Compiling the Source Code
The source code of MuPDF is organized into several directories containing thousands of code files.
- source: the C code files
- include: the header files
- thirdparty: the code files of related open-source projects
- platform: projects for code compilation
- resources: resources (fonts, character maps, color profiles, PDF names, etc.) used by the engine
- scripts: some code files which help compilation
The Related Project Files
Navigate to the platform/win32 directory and you will see quite a few C project files and a solution file named mupdf.sln.
Note:
I suggest you make a copy of mupdf.sln and work on that copy later.
Always avoid modifying the source code unless you know what you are doing and want to contribute to the MuPDF project. If you have modified the source code, you may have trouble merging it later, when you want to stay synchronized with the MuPDF project.
Open that solution file and you will see more than 10 projects loaded.
The most important projects for compiling a DLL file are listed below:
- libmupdf: The project to compile a static library file, libmupdf.lib, which will be used to compile the DLL file
- libresources: The project to handle resources, used by libmupdf
- libthirdparty: The related open-source projects, used by libmupdf
- bin2coff: The automation project which generates font resource files for the other projects.
If your goal is to compile the MuPDF DLL file only, the rest of the projects in the solution can be ignored, so you can safely unload them or remove them from the solution.
The projects must be compiled in the reverse of the order listed above, from bin2coff to libmupdf, because of the dependencies among them.
Compiling Source Projects
Before MuPDF 1.17, I used the Windows 7 SDK to compile the source code, and the project files needed to be upgraded before they would compile successfully. From MuPDF 1.17 on, the Windows 10 SDK is used and Visual Studio 2019 is the officially supported compiler, so the projects compile in Visual Studio 2019 without any problem.
About missing code files: If you download the source code, you have to search the web and download all the needed thirdparty source code files yourself. So, it is recommended to clone the project with Visual Studio rather than downloading it.
Creating the DLL
We are very close to what we want now.
To obtain a dynamic library file out of the static library file, we need to create another project.
Tip:
The reason why I don't modify the libmupdf project to make it generate a DLL file is that I want the other projects in the solution that reference libmupdf to remain compilable, and I want to ease the process of synchronizing the source code files later, when the MuPDF source code gets updated.
To create the DLL file, we will have to do three things:
- Create a new project to output a DLL file.
- Reference related static library files in that project.
- Define the functions to be exported from the DLL file.
Setting Up the DLL Project
To create a DLL file, we create a new C project in the MuPDF solution. I simply named it MuPDFLib in my solution.
Select the MuPDFLib project we created above, click the "Project" menu, select the "Properties" command, and modify the project as follows:
- Select All Configurations and All Platforms.
- Set the Platform Toolset to Visual Studio 2019 (v142).
- Change the Windows SDK Version to 10 if it is not already.
- Change the Configuration Type to Dynamic Library (.dll).
- Switch to the Linker/Input section in the project properties dialog, and set the Module Definition File property to "libmupdf.def" (we will generate this file later).
- Click the OK button to close the project properties dialog.
Referencing Related Static Library Files
Right-click References in the MuPDFLib project in the Solution Explorer. From the context menu that pops up, select the Add Reference command, which will open a dialog. In that dialog, check the checkboxes beside libmupdf, libresources and libthirdparty.
Creating the Definition File of DLL Functions
The above settings will not yet compile a workable DLL file out of the MuPDF source code, unless we create the missing libmupdf.def function definition file referenced by the project. A definition file is a list which defines what functions are exported from the compiled DLL file.
Functions in the MuPDF project are declared in header files in the include folder. However, since the project contains scores of header files, extracting those function names manually is no fun.
Fortunately, the developers of SumatraPDF (a PDF viewer application based on MuPDF) created a Python script to generate that def file for us. You can copy the script file from the repository of SumatraPDF and place it in the scripts folder where you store the source code of MuPDF.
The script needs some modifications to reflect the latest changes of MuPDF. I uploaded my own version to my GitHub repository. If you already have Python 2.7 installed on your computer, place the script files, gen_libmupdf.def.py and util.py, into the scripts folder, double-click gen_libmupdf.def.py and it will generate the def file in the platform/win32 folder.
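For readers curious about what such a generator does, here is a minimal Python 3 sketch of the idea. It is not the SumatraPDF script: the regular expression, the assumption that exported functions are declared at the start of a line with an fz_ or pdf_ prefix, and the function names extract_names, build_def and generate are all illustrative. A simple pattern like this may miss some declarations or pick up functions that are not compiled into the library.

```python
import re
from pathlib import Path

# Assumption: exported functions are declared at column 0 as
# "<return type> fz_name(" or "<return type> pdf_name(".
# Lines starting with "typedef" or "static" are skipped; preprocessor
# and indented lines never match because the first character must be
# a letter or an underscore.
DECL_RE = re.compile(
    r"^(?!typedef\b|static\b)[A-Za-z_][\w \t*]*?[ \t*]((?:fz|pdf)_\w+)[ \t]*\("
)


def extract_names(header_text):
    """Return the sorted, de-duplicated function names found in header text."""
    names = set()
    for line in header_text.splitlines():
        match = DECL_RE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def build_def(names):
    """Build the text of a module definition file exporting the given names."""
    return "EXPORTS\n" + "".join("\t" + name + "\n" for name in names)


def generate(include_dir, def_path):
    """Scan every .h file under include_dir and write a def file to def_path."""
    headers = sorted(Path(include_dir).rglob("*.h"))
    text = "\n".join(
        p.read_text(encoding="utf-8", errors="replace") for p in headers
    )
    Path(def_path).write_text(build_def(extract_names(text)), encoding="utf-8")
```

Calling generate("include", "platform/win32/libmupdf.def") from the root of the source tree would write the file where the DLL project expects it; both paths are assumptions based on the folder layout described earlier.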
With the def file generated, we can start compiling the DLL file.
Note:
You may encounter a few LNK2001 errors at compilation indicating that certain functions were not resolved. Simply delete those functions from the automatically generated def file and proceed.
If you don't have Python and you don't want to install it, you may have to manually compose the def file with a code editor, or program your own application to generate the def file.
I suggest you use the Python script to do the dirty job for you.
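Deleting the unresolved names by hand gets tedious when there are many of them. The following Python 3 sketch shows one way to automate it. It assumes that the build log reports each missing function as "error LNK2001: unresolved external symbol NAME" and that the def file lists one export per line; the helper names are illustrative, and the log format should be checked against your own build output before relying on it.

```python
import re

# Assumption: the linker names the missing export right after this phrase.
UNRESOLVED_RE = re.compile(r"error LNK2001: unresolved external symbol (\w+)")


def unresolved_symbols(build_log):
    """Collect the symbol names reported as unresolved in the build log."""
    return set(UNRESOLVED_RE.findall(build_log))


def strip_exports(def_text, symbols):
    """Return the def file text without the lines that export the given symbols."""
    kept = [line for line in def_text.splitlines() if line.strip() not in symbols]
    return "\n".join(kept) + "\n"
```

For example, given a log containing "libmupdf.def : error LNK2001: unresolved external symbol fz_gone" (fz_gone is a made-up name), strip_exports removes the fz_gone line and leaves every other export untouched.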
Eventually, you will obtain the DLL file in the platform/win32/Release (or platform/win32/Debug, etc.) folder. Where you can find the DLL file depends on your compilation configuration.
Examining Exported Functions
So far, we have already obtained the DLL file of the MuPDF engine. You can examine whether the functions have been exported by the DLL file with the DLL Export Viewer, a small utility developed by nirsoft.net.
The exported functions in DLL Export Viewer may look like the following picture:
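A scripted check is another option. The sketch below uses Python's ctypes module, which raises AttributeError when a requested symbol is not exported, so hasattr can be used to probe for names. The helper names and the DLL file name in the usage note are assumptions; note also that the Python interpreter must have the same bitness (32-bit or 64-bit) as the DLL, or loading fails.

```python
import ctypes


def missing_exports(lib, names):
    """Return the names that cannot be resolved as attributes of lib."""
    return [name for name in names if not hasattr(lib, name)]


def check_dll(dll_path, names):
    """Load the DLL at dll_path and report which names it does not export."""
    lib = ctypes.CDLL(dll_path)  # raises OSError if the DLL cannot be loaded
    return missing_exports(lib, names)
```

On Windows, check_dll("MuPDFLib.dll", ["fz_new_context_imp", "fz_drop_context"]) should return an empty list if both functions were exported; the file name here is only a guess based on the project name used in this article.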
Shrinking the DLL File by Excluding Unwanted Fonts
Some of you may notice that the compiled DLL is a little bit large, up to about 34 megabytes. The reason why it takes so much space is that it has embedded many fonts for international character support. We can shrink its size to about 8 megabytes by excluding those font files from the compiled DLL.
To make the customized DLL, open the libmupdf project, expand the !include/fitz folder and click the config.h file. You can see a lot of comments and definition directives there. Scroll down to the comment with the text "Choose which fonts to include", and below it there are several commented-out lines.
As those comments say, by defining TOFU;TOFU_CJK_EXT, you will exclude several huge Noto fonts and CJK extension fonts from the DLL file, and reduce the output size from 34 megabytes to about 8 megabytes.
WARNING:
Be careful when following the instruction in the source code: Enable the following defines to AVOID including unwanted fonts.
If you modify the source code by uncommenting the lines below the above line, you will encounter source code conflicts later when you synchronize with the updated source code.
I advise you NOT to modify the source code; instead, DO modify the libmupdf project properties and add the definitions to the Preprocessor Definitions property, as the screenshot shows. Notice the ";" before "TOFU", which separates the definitions from <different options>.
Recompile the MuPDFLib project and check the size of the DLL file. It should be much smaller.
Consuming Functions in the DLL File
I am not going to write too much in this article on how to consume the functions in the DLL. C# programmers may use Platform Invoke (P/Invoke). Previously, I wrote an article, P/Invoking MuPDF library to render PDF documents, which you may refer to for details.
The download link at the top of this article has a sample project, which demonstrates how to use P/Invoke from C# to consume functions in the compiled DLL file.
Keeping Synchronized with the MuPDF Project
As the MuPDF project evolves, the source code may change.
Here's a short summary on keeping synchronized with the MuPDF project.
- If your source code is cloned from GitHub, update it; otherwise you may download the source code from the official website and overwrite the existing code files. I prefer the former approach since it downloads fewer bytes on each update.
- When updating the source code, merge the GitHub source into your repository.
- Source code conflicts will occur if you have modified the libmupdf project in the above procedure to shrink the size of the DLL. We have to resolve the conflicts by merging the source code. Use the Source (the official code) when merging.
- Don't forget to update the submodules by right-clicking the updated submodules in the Changes pane of Team Explorer and executing the corresponding commands; otherwise your project might not compile. Use the Source when resolving any submodule conflicts.
- Regenerate the def file for DLL exporting by executing the gen_libmupdf.def.py script if necessary.
- Tweak the libmupdf project to shrink the output DLL file by defining font-related preprocessor definitions if you want.
- Recompile the MuPDFLib project and get the DLL file.
- Strip nonexistent functions from the def file if you encounter LNK2001 errors.
- Check the commit history of the MuPDF project and see whether header files (*.h files) in the project have been changed. When changes have occurred, find out what has been changed by comparing commit histories.
- Change your source code to adapt to the new MuPDF project API if it changes. This is essential when the version of MuPDF changes.
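One quick way to spot API changes after an update is to compare the def file generated before the update with the one generated after it. The Python 3 sketch below lists the function names that were added and removed. It assumes a simple def file layout (an EXPORTS line followed by one name per line), its helper names are illustrative, and it only detects names that appear or disappear, not functions whose parameters changed.

```python
def read_exports(def_text):
    """Return the set of names listed after the EXPORTS line of a def file."""
    names = set()
    in_exports = False
    for line in def_text.splitlines():
        token = line.strip()
        if not token or token.startswith(";"):
            continue  # skip blank lines and def file comments
        if token.upper() == "EXPORTS":
            in_exports = True
            continue
        if in_exports:
            names.add(token.split()[0])
    return names


def diff_exports(old_def, new_def):
    """Return (added, removed) name lists between two def file texts."""
    old, new = read_exports(old_def), read_exports(new_def)
    return sorted(new - old), sorted(old - new)
```

Any name in the removed list that your own code calls through P/Invoke is a place where that code has to be adapted.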
History
- Initial publication in November, 2017
- Updated source code in the GitHub fork, and fixed typos in font resource section.
- Updated source code in the GitHub fork, recompiled the DLL, updated article to reflect the latest source code changes on November 24th, 2018
- Updated source code in the GitHub fork, recompiled the DLL, updated article to add more details to keeping sync with the MuPDF project and reflect the latest source code changes on July 18th, 2019
- Updated source code in the GitHub fork to MuPDF 1.16, recompiled the DLL, fixed P/Invoke calling convention issues in demo project on Aug 1st, 2019
- Revised the article and updated source code in the GitHub fork to MuPDF 1.17rc, recompiled the DLL on April 28th, 2020