Cloud services such as Amazon S3, Google
Cloud Storage, Google Docs, Dropbox, Box, Microsoft Azure and others are
excellent repositories for document storage. Each of these services allows
users to off-load document storage to the cloud. Although any type of document
can be stored in the Cloud, it is often desirable to store all documents in the
standard PDF format with all the benefits that come with using PDF. Converting
documents to PDF and then uploading them to the Cloud is typically a two-step
process that requires manual operations such as printing the document to a
virtual printer and logging in to a website to upload it.
In this paper we will describe a solution
for easily converting documents into PDF for storage in the Cloud in a single
automated process.
Requirements (Server Side)
The server side requires a Cloud service that exposes an API to
allow developers to automate the tasks of storing and retrieving documents.
Taking a closer look at cloud services, we
found a number of issues when trying to decide on a service to use.
- There are really no "free" evaluation services. Google Cloud Storage,
  Amazon S3 and Microsoft Azure all require credit card information for
  evaluation.
- Google Docs does offer a free evaluation, but it is limited to one month.
- Although both Dropbox and Box offer free developer evaluation accounts,
  both services require clients or users to first log in to their website
  before file transfers can occur. This breaks up the flow of the application.
After reviewing the available cloud storage services, we ended up focusing
on two: Google Docs, which is part of Google Apps for Business, and the
Microsoft Azure Blob Storage Service.
Requirements (Client Side)
On the client side, we wanted the cloud storage printing solution to meet
the following requirements:
- We needed a tool to convert all user documents into PDF before sending
  them to the Cloud
- We needed direct transfer, with no storage on the client PC, for safety
  and efficiency
- We needed a solution that is easy to implement and use
- We needed a high-performance engine in order not to hinder the user's
  workflow
The Amyuni PDF Converter met all of our requirements. PDF Converter is a
virtual printer driver that converts output sent to it from a printing
application into PDF format. The printer driver, which is certified by
Microsoft for all 32-bit and 64-bit editions of Windows, offers the developer
complete control over the printing process. The Amyuni PDF Converter product,
which is needed to run the sample application described in this paper, can be
downloaded from http://www.amyuni.com/en/enduser/pdfconverterend.
Implementation
The Amyuni PDF Converter printer driver API
enables developers to intercept the datastream coming from the printer driver
and handle it in their own custom application. The developer has the option to
store the datastream locally, to a network drive or even to the cloud.
The intercepting of the datastream is
accomplished by configuring the PDF Converter to call a custom DLL during the
print job. This custom DLL will perform the "work" of uploading to the cloud.
For both the Google and Azure cloud services, the Amyuni PDF Converter is
installed and configured in the same manner. The PDF Converter is first
installed on a PC or server; the developer then uses our API to configure the
printer so that its output is directed to our custom DLL. The code snippet
below illustrates this process.
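CDIntfEx.CDIntfExClass PDF = new CDIntfEx.CDIntfExClass();
// Attach to the installed printer by name
PDF.DriverInit("Amyuni PDF Cloud Converter");
// Activate the printer with the license information
PDF.EnablePrinter(strLicenseTo, strActivationCode);
// Output option flags for the driver (value used by the sample application)
PDF.FileNameOptionsEx = 0x2000000;
// Register the custom DLL that receives the printer output
PDF.SetPrinterParamStr("PageProcessor","PageProc.dll");
// Save this configuration as the printer's default
PDF.SetDefaultConfig();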
The PageProc DLL communicates with the acListener
service and sends it the data stream that it receives from the printer driver.
Although the DLL can be programmed to send the output directly to the cloud, we
chose to use the intermediary acListener service because it gave us the
following advantages:
- Control is quickly returned to the user before the data is fully uploaded
  to the server
- The listener service can authenticate the user only once, whereas the DLL
  would need to authenticate the user each time it is loaded
- Previewing of the PDF document can be easily implemented in the listener
  by using the Amyuni PDF Creator.Net viewer prior to uploading the document.
  PDF Creator.Net is available for download from the following URL:
  http://www.amyuni.com/en/developer/pdfcreator
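The first advantage listed above, returning control to the user before the upload finishes, is commonly achieved with a producer/consumer queue. The following is a minimal sketch of that idea and not the acListener implementation: UploadQueue, Enqueue and the upload delegate are hypothetical names introduced here for illustration, and the delegate stands in for an upload routine such as the Upload() methods shown in this paper.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical sketch: none of these names come from the acListener source.
public sealed class UploadQueue : IDisposable
{
    private readonly BlockingCollection<MemoryStream> _pending =
        new BlockingCollection<MemoryStream>();
    private readonly Task _worker;

    // 'upload' is an assumed stand-in for the code that sends one document to the cloud.
    public UploadQueue(Action<MemoryStream> upload)
    {
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks until a document arrives; the loop ends once
            // CompleteAdding() has been called and the queue is empty.
            foreach (MemoryStream document in _pending.GetConsumingEnumerable())
            {
                try
                {
                    document.Seek(0, SeekOrigin.Begin);
                    upload(document);
                }
                catch (Exception ex)
                {
                    // This sketch only reports the failure; it does not retry.
                    Console.Error.WriteLine("Upload failed: " + ex.Message);
                }
                finally
                {
                    document.Dispose();
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Called when a print job completes; returns without waiting for the upload.
    public void Enqueue(MemoryStream document)
    {
        _pending.Add(document);
    }

    // Stops accepting documents and waits for the pending uploads to finish.
    public void Dispose()
    {
        _pending.CompleteAdding();
        _worker.Wait();
        _pending.Dispose();
    }
}

public static class Program
{
    public static void Main()
    {
        var uploadedSizes = new List<long>();
        using (var queue = new UploadQueue(doc => uploadedSizes.Add(doc.Length)))
        {
            // The four bytes spell "%PDF", the start of a PDF header.
            queue.Enqueue(new MemoryStream(new byte[] { 0x25, 0x50, 0x44, 0x46 }));
            Console.WriteLine("Control returned to the caller");
        }
        Console.WriteLine("Uploads completed: " + uploadedSizes.Count);
    }
}
```

Because a single long-lived worker performs every upload, a design along these lines could also authenticate once and reuse the session, which matches the second advantage in the list.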
Uploading to Google Docs
Google Docs provides the following advantages:
- Free 30-day business account access.
- Extremely large user base.
- Google Docs gives the user a visual representation of the documents.
- Online documents with real-time collaboration. Print from your PC and make
  the document accessible to other users or from other PCs.
In order to use Google Docs (through the Google Documents List API), you
will need to sign up for a Google Apps for Business account. Evaluating this
service does not require a credit card, but it does require you to add an HTML
tag to the index page of your website for authentication.
What distinguishes Google Docs from other
services is that it offers the user a web interface to view their documents and
collaborate with other users.
All of the uploading to Google Docs happens in the Upload() method.
The line below shows how the method obtains the PDF data stream from the
document object:
MemoryStream data = _document.GetData();
The code snippet below, which is in the acListener service, handles the
uploading of the data stream to the cloud.
public void Upload()
{
    // Create the Documents List service client, identified by the application name
    DocumentsService service = new
        DocumentsService("AmyuniTech-AmyuniCloudPrinterApp-v1.0");
    GDataGAuthRequestFactory reqFactory =
        (GDataGAuthRequestFactory)service.RequestFactory;
    reqFactory.KeepAlive = false;
    reqFactory.ProtocolMajor = 3;
    service.setUserCredentials(_googleUsername, _googleUserPassword);
    DocumentEntry entry = null;

    // Get the PDF data stream and rewind it before uploading
    MemoryStream data = _document.GetData();
    data.Seek(0, SeekOrigin.Begin);
    try
    {
        String contentType = (String)DocumentsService.DocumentTypes["PDF"];
        entry = service.Insert(new Uri(DocumentsListQuery.documentsBaseUri),
                               data,
                               contentType,
                               _googleDocumentName) as DocumentEntry;
    }
    catch (Exception ex)
    {
        System.Windows.Forms.MessageBox.Show(ex.Message);
    }
    finally
    {
        // Release the stream whether or not the upload succeeded
        _document.ReleaseStream();
        data.Close();
    }
}
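The Upload() method above calls data.Seek(0, SeekOrigin.Begin) before handing the stream to the service client. Presumably this is because a stream that has just been written is positioned at its end, so a reader would find nothing left to read. The sketch below demonstrates that behaviour with a plain MemoryStream; StreamRewindDemo and CountReadableBytes are hypothetical names, and whether _document.GetData() actually returns a stream positioned at its end is an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamRewindDemo
{
    // Reads from the stream's current position to its end and returns the byte count.
    public static int CountReadableBytes(Stream stream)
    {
        byte[] buffer = new byte[4096];
        int total = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        byte[] content = Encoding.ASCII.GetBytes("%PDF-1.4 sample content");
        using (var data = new MemoryStream())
        {
            // Writing leaves the position at the end of the stream.
            data.Write(content, 0, content.Length);
            int withoutRewind = CountReadableBytes(data);

            // Rewinding makes the full content readable again.
            data.Seek(0, SeekOrigin.Begin);
            int afterRewind = CountReadableBytes(data);

            Console.WriteLine("Readable without rewind: " + withoutRewind);
            Console.WriteLine("Readable after rewind: " + afterRewind);
        }
    }
}
```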
Uploading to Microsoft Azure Blob Storage
The process of uploading to Microsoft Azure Blob Storage Service is similar.
The main advantage of the Microsoft Azure Blob Storage Service for
developers is its extensive .NET documentation.
Microsoft Azure Blob Storage Service
(MABSS) uses a container and blob concept to mimic the file system.
Microsoft defines a Container as "A
container provides a grouping of a set of blobs. All blobs must be in a
container. An account can contain an unlimited number of containers. A container
can store an unlimited number of blobs."
Microsoft defines a Blob as "a file of any
type and size".
In the Upload() method below, the data stream produced by the Amyuni PDF
Converter is passed to the blob object's UploadFromStream() method, which
uploads the file to the cloud:
blob.UploadFromStream(data);
public void Upload()
{
    // Get the PDF data stream and rewind it before uploading
    MemoryStream data = _document.GetData();
    data.Seek(0, SeekOrigin.Begin);
    try
    {
        // Connect to the storage account named in the application settings
        CloudStorageAccount storageAccount =
            CloudStorageAccount.Parse(ConfigurationManager.AppSettings["StorageAccountConnectionString"]);
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

        // Make sure the target container exists, then upload the PDF as a blob
        CloudBlobContainer container = blobClient.GetContainerReference("pdfdocuments");
        container.CreateIfNotExist();
        CloudBlob blob = container.GetBlobReference(_pdfDocumentName + ".pdf");
        blob.UploadFromStream(data);
    }
    catch (Exception ex)
    {
        System.Windows.Forms.MessageBox.Show(ex.Message);
    }
    finally
    {
        // Release the stream whether or not the upload succeeded
        _document.ReleaseStream();
        data.Close();
    }
}
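The container name used above, "pdfdocuments", is hard-coded. If the name were instead taken from configuration or user input, it would be worth checking it against the Azure container naming rules before calling GetContainerReference(). The sketch below encodes the documented rules for ordinary containers: 3 to 63 characters, lowercase letters, digits and hyphens only, starting and ending with a letter or digit, with no consecutive hyphens. ContainerNames and IsValid are hypothetical helper names, and special containers such as $root are deliberately not handled.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Azure storage client library.
public static class ContainerNames
{
    // One letter or digit, followed by letters or digits that may each be
    // preceded by a single hyphen. \A and \z anchor to the whole string.
    private static readonly Regex Pattern =
        new Regex(@"\A[a-z0-9](?:-?[a-z0-9])*\z", RegexOptions.CultureInvariant);

    public static bool IsValid(string name)
    {
        return name != null
            && name.Length >= 3
            && name.Length <= 63
            && Pattern.IsMatch(name);
    }

    public static void Main()
    {
        string[] candidates = { "pdfdocuments", "PDFDocuments", "pdf--docs", "ab" };
        foreach (string candidate in candidates)
        {
            Console.WriteLine(candidate + ": " + IsValid(candidate));
        }
    }
}
```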
The acListener services for both Google Docs and Azure can be requested from Jose by filling out our contact form at http://www.amyuni.com/en/company/contactform/