Convert Microsoft Word to PDF - using Java and LibreOffice (UNO API)

24 May 2016
How to use the programmatic interface of LibreOffice (called the UNO API) to load, manipulate and save documents

Introduction

The steps below create a Java program to load a Microsoft Word document into LibreOffice using the UNO API, make "mail merge" style changes, and save it to PDF format.

Using the Code

The process is:

  1. Set up your environment
  2. Initialise
  3. Load the Microsoft Word document
  4. Substitute the data (mail merge)
  5. Save as PDF
  6. Shutdown

This starting point will let you test all sorts of document conversions and mail merging scenarios.

The example below takes a Microsoft Word document as input and produces a PDF file as output.

Step 1 - Setup

Install LibreOffice, then add the LibreOffice JARs to your class path. These JARs give access to the Java UNO API that the code below calls.

Create a Java project in your favorite editor and add the following JARs from your LibreOffice install to the project class path:

  [Libre Office Dir]/URE/java/juh.jar
  [Libre Office Dir]/URE/java/jurt.jar
  [Libre Office Dir]/URE/java/ridl.jar
  [Libre Office Dir]/program/classes/unoil.jar

Create a new Java class. The following code snippets can be copied to create the working program.
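
Before going further, it can help to confirm that the JARs are actually visible to your project. The following is a minimal standalone sketch, not part of the article's program: `ClasspathCheck` and `isOnClasspath` are hypothetical names chosen for illustration. It tries to locate the `Bootstrap` class that Step 2 imports.

```java
/** Hypothetical helper (not part of LibreOffice): checks whether a class is visible on the class path. */
final class ClasspathCheck {

    /** Returns true if the named class can be located; the class is not initialised. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Bootstrap is the UNO class imported in Step 2.
        String bootstrap = "com.sun.star.comp.helper.Bootstrap";
        System.out.println(bootstrap + " found: " + isOnClasspath(bootstrap));
    }
}
```

If this reports `false`, the class path most likely does not yet include the JARs listed above.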

Step 2 - Starting the LibreOffice Process

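Boot a LibreOffice process that will listen to our requests. The class and main method opened here stay open through the following steps and are closed in Step 6.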

Java
import java.util.Date;
import java.io.File;
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XDesktop;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.text.XTextDocument;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XReplaceDescriptor;
import com.sun.star.util.XReplaceable;

public class MailMergeExample {

public static void main(String[] args) throws Exception {

 // Initialise
 XComponentContext xContext = Bootstrap.bootstrap();

 XMultiComponentFactory xMCF = xContext.getServiceManager();
 
 Object oDesktop = xMCF.createInstanceWithContext(
      "com.sun.star.frame.Desktop", xContext);
 
 XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(
      XDesktop.class, oDesktop);

Step 3 - Loading a Document

The code below loads a template into the LibreOffice engine. Note two things:

  1. It expects to find the template at C:/projects/letterTemplate.doc, so change the path as required.
  2. The load process sets the "Hidden" property to true. Set it to false to see the process working.

Java
// Load the Document
String workingDir = "C:/projects/";
String myTemplate = "letterTemplate.doc";

if (!new File(workingDir + myTemplate).canRead()) {
 throw new RuntimeException("Cannot load template: " + new File(workingDir + myTemplate));
}

XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
 .queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);

String sUrl = "file:///" + workingDir + myTemplate;

// "Hidden" keeps the LibreOffice window from appearing; set it to false to watch the process
PropertyValue[] propertyValues = new PropertyValue[1];
propertyValues[0] = new PropertyValue();
propertyValues[0].Name = "Hidden";
propertyValues[0].Value = Boolean.TRUE;

XComponent xComp = xCompLoader.loadComponentFromURL(
 sUrl, "_blank", 0, propertyValues);

Here is a screen shot of the example Microsoft Word document we are using as a template:

Image 1
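
The loading code builds the document URL by concatenating "file:///" with the path, which works for a simple path such as C:/projects/ but does not escape spaces or other special characters. As a hedged alternative, the standalone sketch below lets the JDK build the URL. `FileUrls` and `toFileUrl` are hypothetical names, and it is an assumption (not something the article tests) that LibreOffice accepts the percent-encoded result in the same way as the hand-built string.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: builds a file URL from a directory and a file name. */
final class FileUrls {

    /** Returns an absolute file URL with special characters (such as spaces) percent-encoded. */
    static String toFileUrl(String dir, String fileName) {
        Path path = Paths.get(dir, fileName).toAbsolutePath().normalize();
        return path.toUri().toString();
    }

    public static void main(String[] args) {
        // The space in the file name is encoded as %20 in the printed URL.
        System.out.println(toFileUrl("C:/projects/", "letter Template.doc"));
    }
}
```

The returned string could then be passed as the first argument of loadComponentFromURL in place of sUrl.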

Step 4 - Search and Replace

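The search and replace looks for three placeholders in the template:

  • "<date>", which is replaced with the current date and time
  • "<addressee>", which is replaced with "Best Friend"
  • "<signatory>", which is replaced with "Your New Boss"

Java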
// Search and replace
XTextDocument xTextDocument = (XTextDocument) UnoRuntime
  .queryInterface(XTextDocument.class, xComp);

XReplaceable xReplaceable = (XReplaceable) UnoRuntime
  .queryInterface(XReplaceable.class, xTextDocument);

// One descriptor is reused for all three replacements below
XReplaceDescriptor xReplaceDescr = xReplaceable.createReplaceDescriptor();

// mail merge the date
xReplaceDescr.setSearchString("<date>");
xReplaceDescr.setReplaceString(new Date().toString());
xReplaceable.replaceAll(xReplaceDescr);

// mail merge the addressee
xReplaceDescr.setSearchString("<addressee>");
xReplaceDescr.setReplaceString("Best Friend");
xReplaceable.replaceAll(xReplaceDescr);

// mail merge the signatory
xReplaceDescr.setSearchString("<signatory>");
xReplaceDescr.setReplaceString("Your New Boss");
xReplaceable.replaceAll(xReplaceDescr);
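
The three replacements above follow the same pattern, so for a longer list of placeholders the pairs could be held in a map and applied in a loop. The standalone sketch below shows that idea on a plain StringBuilder rather than a real document, so it runs without LibreOffice. `PlaceholderMerge` and `mergeAll` are hypothetical names. With UNO, the lambda body would instead make the setSearchString, setReplaceString and replaceAll calls shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical helper: applies a set of placeholder/value pairs through one replace operation. */
final class PlaceholderMerge {

    /** Calls replaceAll once per entry, in the map's iteration order. */
    static void mergeAll(Map<String, String> values, BiConsumer<String, String> replaceAll) {
        for (Map.Entry<String, String> entry : values.entrySet()) {
            replaceAll.accept(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("<addressee>", "Best Friend");
        values.put("<signatory>", "Your New Boss");

        // Stand-in for the document: a StringBuilder instead of a UNO XReplaceable.
        StringBuilder letter = new StringBuilder("Dear <addressee>, regards, <signatory>");
        mergeAll(values, (search, replacement) -> {
            String merged = letter.toString().replace(search, replacement);
            letter.setLength(0);
            letter.append(merged);
        });

        System.out.println(letter); // prints "Dear Best Friend, regards, Your New Boss"
    }
}
```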

Step 5 - Export to PDF

The LibreOffice filter name "writer_pdf_Export" is used to save the document as a PDF. Copy the name exactly as it appears in the code, including the capital E.

Java
// save as a PDF
XStorable xStorable = (XStorable) UnoRuntime
  .queryInterface(XStorable.class, xComp);

propertyValues = new PropertyValue[2];
propertyValues[0] = new PropertyValue();
propertyValues[0].Name = "Overwrite";
propertyValues[0].Value = Boolean.TRUE;
propertyValues[1] = new PropertyValue();
propertyValues[1].Name = "FilterName";
propertyValues[1].Value = "writer_pdf_Export";

// Write the PDF into the working directory under a fixed output name
String myResult = workingDir + "letterOutput.pdf";
xStorable.storeToURL("file:///" + myResult, propertyValues);

System.out.println("Saved " + myResult);

Here is a screen shot of the finished letter output as a PDF:

Image 2
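
The export code writes to a fixed name, letterOutput.pdf. If you would rather name the PDF after the template, the extension can be swapped with a small helper. This is a standalone sketch under that assumption. `PdfNames` and `toPdfName` are hypothetical names, and the helper expects a bare file name rather than a full path.

```java
/** Hypothetical helper: derives a PDF file name from a source document name. */
final class PdfNames {

    /** Replaces the last extension with ".pdf", or appends ".pdf" if there is no extension. */
    static String toPdfName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot > 0 leaves names that merely start with a dot (such as ".hidden") untouched
        String base = (dot > 0) ? fileName.substring(0, dot) : fileName;
        return base + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(toPdfName("letterTemplate.doc")); // prints "letterTemplate.pdf"
    }
}
```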

Step 6 - Shutdown

This terminates the LibreOffice process launched in Step 2. Instead of terminating straight away, the same process could be reused to load, manipulate and save further documents.

Java
 // shutdown
 xDesktop.terminate();
 }
}

The files used in this article are available for download with the original article.

Points of Interest

Multithreading

It's possible, but not advisable, to use this approach from multiple threads. Experience has shown that this leads to instability and unpredictable results. To handle many requests, you could instead launch multiple LibreOffice processes and use each one in a single-threaded manner.
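
One way to keep access single-threaded inside a multi-threaded application is to funnel every conversion job through a single worker thread. The sketch below is a standalone illustration of that idea using only the JDK. `SerialConverter` is a hypothetical name, and the jobs here only return a string where a real job would load, merge and export one document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical wrapper: runs all submitted jobs one at a time on a single worker thread. */
final class SerialConverter implements AutoCloseable {

    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a job; jobs run sequentially in submission order. */
    <T> Future<T> submit(Callable<T> job) {
        return worker.submit(job);
    }

    /** Stops accepting new jobs; jobs already queued still run. */
    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (SerialConverter converter = new SerialConverter()) {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 1; i <= 3; i++) {
                final int n = i;
                // Placeholder job standing in for one document conversion.
                results.add(converter.submit(
                        () -> "job " + n + " ran on " + Thread.currentThread().getName()));
            }
            for (Future<String> result : results) {
                System.out.println(result.get());
            }
        }
    }
}
```

Callers on any thread can submit work, but only the worker thread would ever touch the LibreOffice connection.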

Process and Crash Management

Under a realistic workload, some documents will crash the LibreOffice process. A production version of this approach therefore needs to expect the occasional failure, clean up, and restart the process. Ideally, this would all be transparent to the calling user or program.

Likewise, make sure you clean up any resources you use, both on success and on failure. Here we spawn a separate process, which is definitely something you always want to clean up.
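
The retry-and-clean-up pattern described above could be sketched as follows. This is a standalone illustration using only the JDK, not a tested recovery strategy for LibreOffice. `RetryingRunner` and `runWithRetry` are hypothetical names, the job simulates two crashes before succeeding, and in a real program the cleanup step is where the office process would be terminated so the next attempt can start a fresh one.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a job a limited number of times, cleaning up after every attempt. */
final class RetryingRunner {

    /** Runs the job until it succeeds or maxAttempts is reached; cleanup runs after each attempt. */
    static <T> T runWithRetry(Callable<T> job, Runnable cleanup, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return job.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            } finally {
                cleanup.run(); // runs on success and on failure
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = runWithRetry(
                () -> {
                    if (++calls[0] < 3) {
                        throw new IllegalStateException("simulated crash");
                    }
                    return "converted";
                },
                () -> System.out.println("cleaning up after attempt " + calls[0]),
                5);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "converted after 3 attempts"
    }
}
```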

LibreOffice in 32Bit on Windows

At the time of writing, LibreOffice is only available as a 32-bit build for Windows, and it installs under the C:\Program Files (x86) directory. Your Java application must run on a JRE of the same bitness, so make sure it uses a 32-bit JRE. Look for one under the C:\Program Files (x86)\Java directory.
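
A quick way to see which kind of JRE the application is actually running on is to print the relevant system properties. The standalone sketch below assumes a HotSpot-based JVM for the "sun.arch.data.model" property, which may be absent on other JVMs. `JvmBitness` and `describe` are hypothetical names.

```java
/** Hypothetical helper: reports whether the running JVM is 32-bit or 64-bit, when that is known. */
final class JvmBitness {

    /** Returns a short description such as "64-bit JVM (os.arch=amd64)". */
    static String describe() {
        // "sun.arch.data.model" is set by HotSpot-based JVMs; it may be null elsewhere.
        String model = System.getProperty("sun.arch.data.model");
        String arch = System.getProperty("os.arch");
        String bits = (model == null) ? "unknown" : model;
        return bits + "-bit JVM (os.arch=" + arch + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```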

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)