
The *AdES Collection: CAdES, XAdES, PAdES and ASiC Implementation for Windows in C++

22 Jul 2019
A standard-compliant library for secure signing

Introduction

This is an article about Advanced Electronic Signatures. We will talk about CAdES, XAdES and PAdES and their applications to digital signatures.

CAdES (CMS Advanced Electronic Signatures) is a set of extensions to Cryptographic Message Syntax (CMS) signed data making it suitable for advanced electronic signatures.

In order for a digital signature to be valid in the EU and elsewhere, it has to be in one of the CAdES profiles. These profiles define the way that certificates, CRLs, timestamps, etc. are added to the standard CMS.

The extensions to CMS are added either as authenticated attributes (those that are co-signed with the rest of the message) or unauthenticated attributes that are added after the signature.

Similar extensions exist for specialized forms: XAdES for XML signing (extensions to XML DSIG), PAdES for PDF signing (extensions to PDF) and ASiC, an extension to BDOC which defines how a digital container is structured to contain all data related to the digital signature.

Each of these forms has different levels of information included:

  • The basic form (B). This extends the basic CMS format with:
    • Four mandatory signed attributes:
      • An attribute containing the MIME type of the content, always PKCS#7 data
      • An attribute containing the hash of the message signed
      • An attribute containing the certificate used for signing
      • An attribute containing the time the message was signed
    • Some optional signed attributes:
      • An attribute containing a signing policy
      • An attribute containing a commitment reason
  • The timestamp form (T) which also contains an unauthenticated attribute containing a counter signature from a trusted time signing provider
  • The C form which contains references to chain certificates and CRLs
  • The X form which appends a time stamp to the C form, either Type 1 or Type 2
  • The XL form which contains complete certificate chains and CRLs
  • The XL Type 1 or XL Type 2 which also contain timestamps.
  • The A and LT forms for periodical timestamping.

Our library will, at this time, create CAdES-B, CAdES-T, CAdES-C, CAdES-X Type 2 and CAdES-XL Type 2 forms, and it is able to verify up to CAdES-T level. In the future, more forms may be added.

Coding Considerations

Let's first see why the new protocol adds these attributes. When there are no signed attributes, the hash of the content is encrypted with the private key of the certificate, and this is the digital signature. The CMS can consist of just that information, without even the certificate that was used for signing (although the Windows API includes the certificate nevertheless). This means that a simple CMS might contain only an encrypted hash.

When signed attributes exist, it is the hash of these attributes that is actually encrypted. That is why two signed attributes then become mandatory: the type of the content and the hash of the content. CAdES also requires us to include the certificate used for signing, so the signature can be verified immediately; this also copes with the case where the same public key was used to generate more than one certificate (say, with a different policy).
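To make the point about "what is actually hashed" concrete: per RFC 5652 (section 5.4), the signed attributes are stored in the message under an implicit [0] tag (byte 0xA0), but the digest is computed over the same bytes re-tagged as a SET OF (byte 0x31). The following is only a minimal sketch of that re-tagging step; the helper name is hypothetical and not part of the library, and the Windows message functions normally do this for you.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the library): returns the bytes that are
// hashed when signed attributes are present. In the message they carry the
// implicit [0] tag (0xA0); for hashing, the tag is replaced by SET OF (0x31).
std::vector<uint8_t> SignedAttrsForHashing(const std::vector<uint8_t>& encodedAttrs)
{
    // Precondition: the buffer is the signedAttrs field as found in the CMS.
    assert(!encodedAttrs.empty() && encodedAttrs[0] == 0xA0);
    std::vector<uint8_t> out = encodedAttrs;
    out[0] = 0x31; // only the tag changes; length and content stay identical
    return out;
}
```

The resulting buffer, not the original content, is what goes through the hash and the private-key operation.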

CAdES also forces the CMS to contain a timestamp, not from an external server (as in the -T forms) but from the signer's computer. This provides an indication of the time when the signature was created, whether or not it is considered trusted.

When using CAdES with PDF files, this timestamp is not included since it is already put in the PDF object.

Finally, CAdES allows signing policies to be specified. A policy is just a parameter string. Policies allow external verifiers to find out the reason for signing and other parameters, depending on how the signing provider defines them.

To build a standard CMS, the low level message functions are used:

  • CryptMsgOpenToEncode
  • CryptMsgUpdate
  • CryptMsgOpenToDecode
  • CryptMsgGetParam
  • CryptMsgControl
  • CryptEncodeObjectEx

To add the attributes, our job is easy or hard, depending on what Windows will do for us automatically. The type of the content and the hash of the signed message are added automatically without the need to do anything.

Adding a timestamp is also easy, because CryptEncodeObjectEx can automatically encode it for us:

// Add the timestamp
FILETIME ft = { 0 };
SYSTEMTIME sT = { 0 };
GetSystemTime(&sT);
SystemTimeToFileTime(&sT, &ft);
char buff[1000] = { 0 };
DWORD buffsize = 1000;
CryptEncodeObjectEx(PKCS_7_ASN_ENCODING, szOID_RSA_signingTime, 
                   (void*)&ft, 0, 0, buff, &buffsize);

char* bb = AddMem<char>(mem, buffsize);
memcpy(bb, buff, buffsize);
CRYPT_ATTR_BLOB* b0 = AddMem<CRYPT_ATTR_BLOB>(mem);
b0->cbData = buffsize;
b0->pbData = (BYTE*)bb;
ca[0].pszObjId = szOID_RSA_signingTime;
ca[0].cValue = 1;
ca[0].rgValue = b0;

Our helper, AddMem<>, allocates memory within a vector<vector<char>> for any sort of data that we want to be visible in the entire function. CryptEncodeObjectEx supports szOID_RSA_signingTime so it can automatically encode in ASN.1 format the timestamp for us.
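For readers who want to picture the helper, here is a guess at how something like AddMem<> could be written. This is a sketch under assumptions, not the library's actual code: the default count of 1 and the zero-filling are inferred from the calls shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical reconstruction; the library's real AddMem may differ.
// Each call appends one zero-filled buffer to 'mem' and returns a typed
// pointer into it. The buffers live as long as 'mem' does, and growing the
// outer vector moves the inner vectors without relocating their heap data,
// so pointers returned earlier stay valid.
template <typename T>
T* AddMem(std::vector<std::vector<char>>& mem, std::size_t count = 1)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "intended for plain C structs, pointers and scalars");
    assert(count > 0);
    mem.emplace_back(count * sizeof(T), '\0');
    return reinterpret_cast<T*>(mem.back().data());
}
```

The appeal of this design is that a single `mem` object declared at the top of the signing function releases every temporary structure when it goes out of scope, which suits the many small pointer-linked structures that the crypto API expects.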

Our problems start when we need to encode a SigningCertificateV2, which CryptEncodeObjectEx does not support:

SigningCertificateV2 ::=  SEQUENCE 
    {
    certs        SEQUENCE OF ESSCertIDv2,
    policies     SEQUENCE OF PolicyInformation OPTIONAL
    }

To cope with this, we have to use an ASN.1 compiler such as ASN1C (I've put it also in the repository). But the above ASN.1 definition is not enough, for we also have to define ESSCertIDv2, PolicyInformation and lots of other structures. Fortunately for you, I've put everything into the cades.asn1 file.

The ASN.1 compiler will, based on the ASN.1 definitions, generate a set of .C and .H files for us to use them in encoding, so we can then build a DER message and put it in our CMS:

// Hash of the cert
vector<BYTE> dhash;
HASH hash(BCRYPT_SHA256_ALGORITHM);
hash.hash(c->pbCertEncoded, c->cbCertEncoded);
hash.get(dhash);
BYTE* hashbytes = AddMem<BYTE>(mem, dhash.size());
memcpy(hashbytes, dhash.data(), dhash.size());

SigningCertificateV2* v = AddMem<SigningCertificateV2>(mem,sizeof(SigningCertificateV2));
v->certs.list.size = 1;
v->certs.list.count = 1;
v->certs.list.array = AddMem<ESSCertIDv2*>(mem);
v->certs.list.array[0] = AddMem<ESSCertIDv2>(mem);
v->certs.list.array[0]->certHash.buf = hashbytes;
v->certs.list.array[0]->certHash.size = (DWORD)dhash.size();
// SHA-256 is the default

// Encode it as DER
vector<char> buff3;
auto ec2 = der_encode(&asn_DEF_SigningCertificateV2,
    v, [](const void *buffer, size_t size, void *app_key) ->int
{
    vector<char>* x = (vector<char>*)app_key;
    auto es = x->size();
    x->resize(x->size() + size);
    memcpy(x->data() + es, buffer, size);
    return 0;
}, (void*)&buff3);
char* ooodb = AddMem<char>(mem, buff3.size());
memcpy(ooodb, buff3.data(), buff3.size());
::CRYPT_ATTR_BLOB bd1 = { 0 };
bd1.cbData = (DWORD)buff3.size();
bd1.pbData = (BYTE*)ooodb;
ca[1].pszObjId = "1.2.840.113549.1.9.16.2.47";
ca[1].cValue = 1;
ca[1].rgValue = &bd1;

The same awkward work is needed when we want to add a specific signature policy (OID 1.2.840.113549.1.9.16.2.15) or another optional attribute, the commitment type. Our helpers also include an OID class, created by using parts of code from this project.

After calling CryptMsgUpdate to generate the signed message, we can now add any unauthenticated attributes. CryptEncodeObjectEx supports the PKCS_ATTRIBUTE format, which can contain a timestamp. To get the timestamp, Windows provides us with the function CryptRetrieveTimeStamp.

To add extra certificates to the message, we could use the ASN.1 compiler, but including all the X.509 type declarations is a pain. Instead, we only encode a simple ASN.1 sequence manually, then we get the encoded certificate or CRL directly from the PCCERT_CONTEXT or PCCRL_CONTEXT structure.
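Encoding "a simple ASN.1 sequence manually" boils down to writing the SEQUENCE tag (0x30), a DER length, and then the already-encoded items. The sketch below illustrates that idea in portable C++; the function names are hypothetical and the library's own code may be organized differently. In practice, each item would be the bytes behind pbCertEncoded or pbCrlEncoded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends a DER length: short form below 128, otherwise 0x80|n followed by
// the n length bytes in big-endian order.
void AppendDerLength(std::vector<uint8_t>& out, std::size_t len)
{
    if (len < 0x80) { out.push_back(static_cast<uint8_t>(len)); return; }
    uint8_t tmp[sizeof(std::size_t)];
    int n = 0;
    while (len > 0) { tmp[n++] = static_cast<uint8_t>(len & 0xFF); len >>= 8; }
    out.push_back(static_cast<uint8_t>(0x80 | n));
    while (n > 0) out.push_back(tmp[--n]);
}

// Wraps items that are already DER-encoded in a SEQUENCE (tag 0x30).
std::vector<uint8_t> WrapInSequence(const std::vector<std::vector<uint8_t>>& items)
{
    std::size_t total = 0;
    for (const auto& it : items) total += it.size();
    assert(total > 0); // this sketch expects at least one non-empty item
    std::vector<uint8_t> out;
    out.push_back(0x30);
    AppendDerLength(out, total);
    for (const auto& it : items) out.insert(out.end(), it.begin(), it.end());
    return out;
}
```

Because certificates and CRLs come out of the Windows store already DER-encoded, no X.509 type definitions are needed for this step.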

Using the Library

Our Sign() function looks like this:

struct CERTANDCRL
    {
        PCCERT_CONTEXT cert;
        std::vector<PCCRL_CONTEXT> Crls;
    };
struct CERT
    {
        CERTANDCRL cert;
        std::vector<CERTANDCRL> More;
    };

HRESULT Sign(LEVEL lev,const char* data,DWORD sz,const std::vector<CERT>& Certificates, 
             SIGNPARAMETERS& Params,std::vector<char>& Signature);    

Here:

  • lev is LEVEL::CMS,B,T,C,X,XL
  • data and sz are the data to sign and its size
  • Certificates contain all the certificates that are used to sign the message. A message can be signed by more than one certificate. Each entry also contains an optional list of CRLs, and extra certificates and their CRLs to be added. If you specify a level less than CAdES-C, no CRLs or extra certificates are added.
  • Params is an optional structure that defines:
    • Hashing algorithm (default SHA-256)
    • Whether the message is attached or detached
    • Optional signing Policy
    • Timestamp parameters (URL, Policy, Nonce, Extensions)
    • The OID of an optional commitment type (1.2.840.113549.1.9.16.6.1 to 6)
  • Signature receives the signature

Our Verify function looks like this:

HRESULT AdES::Verify(const char* data, DWORD sz, LEVEL& lev,const char* omsg, 
          DWORD len,std::vector<char>* msg,std::vector<PCCERT_CONTEXT>* Certs,
          std::vector<string>* Policies)

Where:

  • data and sz are the signature to verify and its size
  • lev receives the detected level (currently up to T level)
  • omsg and len contain the original message in case the signature is detached
  • msg (optional) receives the original message if the signature is attached
  • Certs (optional) receives an array with the certificates used to sign the message
  • Policies (optional) receives an array of detected signing policies, if found, per signature

Our project contains the library and a test project. It also includes a binary copy of the ASN.1 Compiler and the required include files. The library also provides a XAdES-T implementation and an ASiC-S implementation.

At the time of this writing, the ETSI CAdES tools have two bugs:

  • If you include a policy, it is evaluated as an error (it shows as error in their own testing samples too :p)
  • If you include multiple certificates, the timestamp must be applied to each encrypted hash. However, the ETSI tools apply each timestamp to the encrypted hash of the first signature only.

XAdES

CMS can sign any sort of binary data, so why an XML-specific method? Simply because an application that is not aware of cryptography cannot read the data if it is enclosed in a CMS format. XML signing allows cryptographic elements to be present in an XML document while an application can still read its data.

XMLDSIG is a protocol describing how an XML document may be signed. It defines three methods of signing:

  • Detached: The signature resides in another location
  • Enveloping: The signature contains the element to be signed
  • Enveloped: The element to be signed contains the signature as a child node

A detached XML signature can refer to any sort of data, not just XML.

My library supports all the above signature modes. It uses my XML library, updated to support canonicalization.

Current limitations:

  • Signing up to XAdES-XL
  • Up to XMLDSIG verification
  • No CDatas, namespaces or comments in XML files
  • Hash support SHA-1 and SHA-256/384

XML Canonicalization

There are unlimited valid representations for the same XML data, for example:

<foo /> equals to <foo></foo>
<foo val="yo" a=   "b" /> equals to <foo  a="b" val="yo" />

Therefore, to make sure that the hash does not vary, we have to put the XML into canonical form, i.e., a standard mapping of the same data to the same XML text. This process is rather intricate; some of the rules (but not all) are listed below:

  • DocType headers are removed
  • Elements must not be closed with />
  • Attributes are sorted alphabetically, but xmlns: namespaced attributes go first
  • Specific whitespace is trimmed
  • Namespace declarations propagate to children (this was a real pain for me in trying to find out why hashing did not match)

A nice practical guide is here. For the sake of simplicity, our library does not support CDatas, comments, or namespaces. The official document is here.
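Two of the rules above (no self-closing tags, and sorted attributes with namespace declarations first) can be illustrated with a toy serializer. This is a deliberately simplified sketch and not the library's canonicalizer: the names are hypothetical, attribute values are assumed to need no escaping, and namespace-qualified attributes are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Attr = std::pair<std::string, std::string>; // name, value

// True for "xmlns" and "xmlns:prefix" declarations.
static bool IsNamespaceDecl(const std::string& name)
{
    return name == "xmlns" || name.compare(0, 6, "xmlns:") == 0;
}

// Serializes an element without children in a canonical-looking way:
// namespace declarations first, remaining attributes sorted by name, one
// space between attributes, and an explicit end tag instead of "/>".
std::string CanonicalEmptyElement(const std::string& name, std::vector<Attr> attrs)
{
    assert(!name.empty());
    std::sort(attrs.begin(), attrs.end(), [](const Attr& a, const Attr& b) {
        const bool na = IsNamespaceDecl(a.first);
        const bool nb = IsNamespaceDecl(b.first);
        if (na != nb) return na;      // declarations sort before attributes
        return a.first < b.first;     // otherwise plain lexicographic order
    });
    std::string out = "<" + name;
    for (const auto& a : attrs) out += " " + a.first + "=\"" + a.second + "\"";
    out += "></" + name + ">";
    return out;
}
```

With this, both spellings of the foo element from the example above serialize to the same string, which is exactly the property hashing needs.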

If you use detached XML signatures to sign an XML file, then this file need not be canonicalized, because detached XML signatures can work on plain binary data. However, if you consider detached signatures, why is there a need for XML signature anyway?

XMLDSIG

The signing process is as follows:

  • Canonicalization of the element to be signed, if not using detached signatures
  • Hashing of the element
  • Creating a SignedInfo element which contains everything to be signed (transforms, message hash, algorithm)
  • Signing of the above element
  • Creating an element which contains all the information
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
        <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
        <SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"/>
        <Reference URI="">
            <Transforms>
                <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
            </Transforms>
            <DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
            <DigestValue>...</DigestValue>
        </Reference>
    </SignedInfo>
    <SignatureValue xmlns:ds="http://www.w3.org/2000/09/xmldsig#">...</SignatureValue>
    <KeyInfo>
        <X509Data>
            <X509Certificate>...</X509Certificate>
        </X509Data>
    </KeyInfo>
</Signature>

The reference URI is empty when it is an enveloped signature. For detached signatures, it contains a URI to the data. Note that the signature value is not a complete PKCS#7 message, but only the encrypted hash. Therefore, we cannot use CryptSignMessage to build it; we will use the low level message functions instead (CryptMsgOpenToEncode, CryptMsgUpdate, etc.).

The SignedInfo element can contain as many references as we want, allowing us to sign many portions of data in one operation.

Windows has a CryptXML API to create a XMLDSIG but, since it is useless for XAdES, we will not use it here.

XAdES builds on XMLDSIG with the following rules:

  • All elements are namespaced with ds:
  • SignedInfo element contains references to a set of SignedProperties
  • SignedInfo contains also a reference to the certificate. This allows us to sign in one operation the message, the signed properties and the certificate.
  • Unsigned properties are also added, similar to CAdES. When using XAdES-T, the entire ds:SignatureValue element is timestamped, not just the signature.
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<doc>
    <e2>hohoho</e2>
    <e3 id="elem3"/>
    <e6 a="http://www.w3.org">
        <e7 b="http://www.ietf.org">
            <e8 c="">
                <e9 d="http://www.ietf.org"/>
            </e8>
        </e7>
    </e6>
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
     Id="xmldsig-345B805C-ED11-469F-920A-AA82A6E02876">
        <ds:SignedInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
            <ds:CanonicalizationMethod
                Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
            <ds:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"/>
            <ds:Reference URI="">
                <ds:Transforms>
                    <ds:Transform Algorithm=
                        "http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
                </ds:Transforms>
                <ds:DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
                <ds:DigestValue>TOSPv1v7yuYBhD56IgG5Wp8+3pkWmJEUO+QecU5/A3g=</ds:DigestValue>
            </ds:Reference>
            <ds:Reference Type="http://uri.etsi.org/01903#SignedProperties"
             URI="#xmldsig-345B805C-ED11-469F-920A-AA82A6E02876-sigprops">
                <ds:Transforms>
                    <ds:Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
                </ds:Transforms>
                <ds:DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
                <ds:DigestValue>0o9tulf/mGgQCINlIJ/fcZHc6DziU6XH9x8iXiqMat8=</ds:DigestValue>
            </ds:Reference>
            <ds:Reference URI="#xmldsig-345B805C-ED11-469F-920A-AA82A6E02876-keyinfo">
                <ds:Transforms>
                    <ds:Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
                </ds:Transforms>
                <ds:DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
                <ds:DigestValue>hflupFvhNJWQir2grNd7QK8RWtm0m2pAE8QNdRd8jIQ=</ds:DigestValue>
            </ds:Reference>
        </ds:SignedInfo>
        <ds:SignatureValue xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
         Id="xmldsig-345B805C-ED11-469F-920A-AA82A6E02876-sigvalue">
         lknnInZl2Xxp1ZeMLM+qUj/vyoyMvkxFoOB0EcqE0z14eEW1xmpLqWT/GcJRTqceOrFLZ98C6JXtIh1
         mdhF45Avo3ZC98I3ZU/jdwZ3nOlKRa0NB8+sSQADPD3CKwLIgJh07Nr3xlHenc/yqn1whLTVU7aC1tc
         MYXYQhyeux2DJ7+qyDTKgqKIMoH4NMc+JPMp3qwu0dxqBlgZz0g43kEpsgjrakwtqRp4VqFnmHQOsIr
         6XEnBNPXk8tTV+5yshHkSF1ELRHV2feSr7RvNHA5ZtRFSs4jCd24gyVT/P5YR8MIaN3Ir4ictp9SnCX
         a+/0+g6BfsKP1ykOIk5dQzOy5w==</ds:SignatureValue>
        <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
         Id="xmldsig-345B805C-ED11-469F-920A-AA82A6E02876-keyinfo">
            <ds:X509Data>
                <ds:X509Certificate>...</ds:X509Certificate>
            </ds:X509Data>
        </ds:KeyInfo>
        <ds:Object>
            <xades:QualifyingProperties xmlns:xades="http://uri.etsi.org/01903/v1.3.2#"
             xmlns:xades141="http://uri.etsi.org/01903/v1.4.1#"
             Target="#xmldsig-345B805C-ED11-469F-920A-AA82A6E02876">
                <xades:SignedProperties xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
                 xmlns:xades="http://uri.etsi.org/01903/v1.3.2#"
                 xmlns:xades141="http://uri.etsi.org/01903/v1.4.1#"
                 Id="xmldsig-345B805C-ED11-469F-920A-AA82A6E02876-sigprops">
                    <xades:SignedSignatureProperties>
                        <xades:SigningTime>2018-09-09T10:12:24Z</xades:SigningTime>
                        <xades:SigningCertificateV2>
                            <xades:Cert>
                                <xades:CertDigest>
                                    <ds:DigestMethod Algorithm=
                                        "http://www.w3.org/2001/04/xmlenc#sha256"/>
                                    <ds:DigestValue>//0HypHbOffTJiry5S2iLFrxs6D1iPRmKZ4ShysSwxE=
                                    </ds:DigestValue>
                                </xades:CertDigest>
                                <xades:IssuerSerialV2>
                                    <ds:X509SerialNumber>18446744073709551615
                                    </ds:X509SerialNumber>
                                </xades:IssuerSerialV2>
                            </xades:Cert>
                        </xades:SigningCertificateV2>
                        <xades:SignaturePolicyIdentifier>
                            <xades:SignaturePolicyId>
                                <xades:SigPolicyId>
                                    <xades:Identifier>1.3.6.1.5.5.7.48.1</xades:Identifier>
                                </xades:SigPolicyId>
                                <xades:SigPolicyHash>
                                    <ds:DigestMethod Algorithm=
                                           "http://www.w3.org/2001/04/xmlenc#sha256"/>
                                    <ds:DigestValue>i8brzJOzs5A+2MFR/jxNzm+LaGGBQ7pNHV2uImgbY68=
                                           </ds:DigestValue>
                                </xades:SigPolicyHash>
                            </xades:SignaturePolicyId>
                        </xades:SignaturePolicyIdentifier>
                    </xades:SignedSignatureProperties>
                    <xades:SignedDataObjectProperties>
                        <xades:CommitmentTypeIndication>
                            <xades:CommitmentTypeId>
                                <xades:Identifier>http://uri.etsi.org/01903/v1.2.2#ProofOfOrigin
                                </xades:Identifier>
                                <xades:Description>Indicates that the signer recognizes 
                                 to have created, approved and sent the signed data object
                                </xades:Description>
                            </xades:CommitmentTypeId>
                            <xades:AllSignedDataObjects/>
                        </xades:CommitmentTypeIndication>
                    </xades:SignedDataObjectProperties>
                </xades:SignedProperties>
                <xades:UnsignedProperties>
                    <xades:UnsignedSignatureProperties>
                        <xades:SignatureTimeStamp>
                            <ds:CanonicalizationMethod Algorithm=
                             "http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
                            <xades:EncapsulatedTimeStamp>...</xades:EncapsulatedTimeStamp>
                        </xades:SignatureTimeStamp>
                    </xades:UnsignedSignatureProperties>
                </xades:UnsignedProperties>
            </xades:QualifyingProperties>
        </ds:Object>
    </ds:Signature>
</doc>

This is a valid XAdES-T message containing signed properties and a timestamp.

In the UnsignedProperties element, more levels can be added (for example, the C level).

Using the Code

struct FILEREF
{
	const char* data = 0; // pointer to data
	DWORD sz = 0; // size, or 0 if null terminated XML
	const char* ref = 0;
	std::string mime = "application/octet-stream";
};

HRESULT XMLSign(LEVEL lev, std::vector<FILEREF>& data,const std::vector<CERT>& Certificates,
                SIGNPARAMETERS& Params, std::vector<char>& Signature);

where:

  • lev is a value from the LEVEL enum (XMLDSIG, B,T). If you use XMLDSIG, then the XML file is verifiable also with the Windows CryptXML API.
  • data contains the data to be signed. Each structure in the vector has:
    • A pointer to the bytes. If this is XML data and the signing mode is ENVELOPED, then this is a null terminated string and the second parameter (DWORD) is zero. In this case, the data is signed as canonicalized XML and returned in enveloped mode.
    • If the mode is detached, then the two parameters contain the pointer and size to the data, which is signed as raw.
    • The third parameter is the URI reference put in the signature. If this is an enveloped signature and the data is XML, this can be zero.
    • The fourth parameter is the MIME type of the content, by default, application/octet-stream.
  • Certificates contain the certificates to use for signing. If the mode is ENVELOPED, only one certificate is allowed.
  • Params is a structure that defines:
    • Attached from the ATTACHTYPE enum (DETACHED, ENVELOPING or ENVELOPED)
    • Hashing algorithm (default SHA-256, SHA-1 can also be specified)
    • Signing Policy
    • Timestamp parameters (URL, Policy, Nonce, Extensions)
    • The OID of a commitment type (1.2.840.113549.1.9.16.6.1 to 6)
  • Signature receives the signature.

If the mode is ENVELOPED, then the returned Signature is a single XMLElement which contains the original data and the enveloped signature.

If the mode is ENVELOPING with 1 certificate, then a single ds:Signature element is returned which contains the signature and a ds:Object element which contains the original data. If there are multiple certificates, multiple ds:Signature elements are returned in a <root> root element.

If the mode is DETACHED with 1 certificate, then a single ds:Signature element is returned. If there are multiple certificates, multiple ds:Signature elements are returned in a <root> root element.

For enveloped signatures, multiple signing is not possible: when you add a signature to an XML element that already contains a ds:Signature element, the first signature is rendered invalid (the hash of the content would change).

The XAdES file produced by my library validates as 100% correct at the ETSI conformance checker tools. :)

PAdES

Unlike CAdES or XAdES, PAdES does not define any new protocol for encryption, but it describes meta-information on how to sign a PDF file. In the PDF file, you can include either a CAdES format or a XAdES one inside the PDF as a detached signature. The levels of signing are similar to what we have seen so far (B, T, etc.) with a few exceptions, so at the moment, our library will be able to create B-B, B-T and B-LT signatures.

Here follows a simple Hello World, PDF file:

%PDF-1.7

1 0 obj  % entry point
<<
  /Type /Catalog
  /Pages 2 0 R
>>
endobj

2 0 obj
<<
  /Type /Pages
  /MediaBox [ 0 0 200 200 ]
  /Count 1
  /Kids [ 3 0 R ]
>>
endobj

3 0 obj
<<
  /Type /Page
  /Parent 2 0 R
  /Resources <<
    /Font <<
      /F1 4 0 R 
    >>
  >>
  /Contents 5 0 R
>>
endobj

4 0 obj
<<
  /Type /Font
  /Subtype /Type1
  /BaseFont /Times-Roman
>>
endobj

5 0 obj  % page content
<<
  /Length 44
>>
stream
BT
70 50 TD
/F1 12 Tf
(Hello, world!) Tj
ET
endstream
endobj

xref
0 6
0000000000 65535 f 
0000000010 00000 n 
0000000079 00000 n 
0000000173 00000 n 
0000000301 00000 n 
0000000380 00000 n 
trailer
<<
  /Size 6
  /Root 1 0 R
>>
startxref
492
%%EOF

My library includes a small and very experimental PDF parser which supports many simple PDF files. Much of the code has been compared with the results of jSignPDF. If you can't load a specific PDF file, let me know.

PDF Signatures

A PDF signature has the following properties:

  • It is always detached.
  • It is put inside a special object in the PDF file. The PDF file is first created with enough space to hold the signature, initially filled with zeroes.
22 0 obj
<</Contents <000000 .... 00>
/Type/Sig/SubFilter/ETSI.CAdES.detached/M(D:20181006080704+00'00')
/ByteRange [0 64944 124946 1312]/Filter/Adobe.PPKLite>>
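Filling the zero placeholder once the signature bytes are known can be sketched as follows. This is an illustration only, with a hypothetical function name and hypothetical parameters: it overwrites the leading hex digits of the /Contents string and leaves the rest as zeroes, so the file length and all offsets stay unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: writes 'sig' as hex digits over the zero-filled
// placeholder. 'hexStart' is the index of the first digit after '<' and
// 'hexLen' is the number of digits reserved. Returns false if the
// placeholder is out of bounds or too small for the signature.
bool FillContentsPlaceholder(std::string& pdf, std::size_t hexStart,
                             std::size_t hexLen, const std::vector<uint8_t>& sig)
{
    static const char digits[] = "0123456789ABCDEF";
    assert(hexLen % 2 == 0); // this sketch reserves whole bytes only
    if (hexStart + hexLen > pdf.size() || sig.size() * 2 > hexLen) return false;
    for (std::size_t i = 0; i < sig.size(); ++i) {
        pdf[hexStart + 2 * i]     = digits[sig[i] >> 4];
        pdf[hexStart + 2 * i + 1] = digits[sig[i] & 0x0F];
    }
    return true; // untouched digits remain '0'
}
```

This is why the placeholder has to be generous: a signature that carries timestamps, certificates or CRLs can be much larger than a bare CMS, and the space cannot grow after hashing.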

The byterange parameter specifies the portion of the PDF file that is signed. Theoretically, you can sign any portion, but Adobe Reader rejects any signature unless the entire PDF file is signed. Therefore, the byte range is from the start to the '<' character before the 00s, and from the '>' character after the 00s to the end of file. This is the part that will be hashed.
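The four numbers in /ByteRange are two (offset, length) pairs, and they follow directly from the positions of the two delimiters. The helper below is a hypothetical sketch that assumes, as is common practice, that the '<' and '>' characters themselves are left out of the hashed ranges.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: computes the /ByteRange values for a file of
// 'fileSize' bytes whose /Contents hex string opens with '<' at offset
// 'ltOffset' and closes with '>' at offset 'gtOffset'.
// Assumption: both delimiters are excluded from the signed ranges.
std::array<std::size_t, 4> ComputeByteRange(std::size_t fileSize,
                                            std::size_t ltOffset,
                                            std::size_t gtOffset)
{
    assert(ltOffset < gtOffset && gtOffset < fileSize);
    const std::size_t secondStart = gtOffset + 1;
    // First pair: bytes [0, ltOffset). Second pair: bytes after '>' to EOF.
    return {{ 0, ltOffset, secondStart, fileSize - secondStart }};
}
```

Under that assumption, a file of 126258 bytes with '<' at offset 64944 and '>' at offset 124945 yields the values 0, 64944, 124946 and 1312, matching the sample dictionary shown above.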

If you put a standard CMS, then the marking is adbe.pkcs7.detached. If you put a CAdES-level signature, the marking is ETSI.CAdES.detached.

The main difference between ordinary CAdES signatures and those placed in a PDF file is that the signature must not contain a timestamp from the current clock, as this information is already put in the PDF file with the /M parameter. Therefore, the OID szOID_RSA_signingTime is not added to the signature when SIGNPARAMETERS.PAdES = true.

PDF Reconstruction

To add a signature, a new revision must be added to the PDF. Therefore my library:

  • creates a new root, a pointer to the signature created and a new xref table, mentioning the old revision
  • replaces info, contents, pages and kids objects with the necessary structures to hold the PDF signature while still pointing to the old data
  • signs with CAdES

The following is the added revision after using my library to the above hello world PDF file:

42 0 obj
<</F 132/Type/Annot/Subtype/Widget/Rect[0 0 0 0]/FT/Sig/DR<<>>/T(Signature1)/
   V 40 0 R/P 7 0 R/AP<</N 41 0 R>>>>
endobj
40 0 obj
... signature
endobj
43 0 obj
<</BaseFont/Helvetica/Type/Font/Subtype/Type1/Encoding/WinAnsiEncoding/Name/Helv>>
endobj
44 0 obj
<</BaseFont/ZapfDingbats/Type/Font/Subtype/Type1/Name/ZaDb>>
endobj
41 0 obj
<</Type/XObject/Resources<</ProcSet [/PDF /Text /ImageB /ImageC /ImageI]>>/
Subtype/Form/BBox[0 0 0 0]/Matrix [1 0 0 1 0 0]/Length 8/FormType 1/Filter/FlateDecode>>stream
xœ     endstream
endobj
7 0 obj
<</Type/Page/MediaBox[0 0 595 842]/Rotate 0/Parent 3 0 R/Resources<</ProcSet[/PDF/Text]/
  ExtGState 22 0 R/Font 23 0 R>>/Contents 8 0 R/Annots[42 0 R]>>
endobj
3 0 obj
<</Type/Pages/Kids[7 0 R 

24 0 R
28 0 R
32 0 R]/Count 4/Rotate 0>>
endobj
1 0 obj
<</Type/Catalog/AcroForm<</Fields[ 42 0 R]/DR<</Font<</Helv 43 0 R/ZaDb 44 0 R>>>>/
   DA(/Helv 0 Tf 0 g )/SigFlags 3>>/Pages 3 0 R>>
endobj
2 0 obj
<</Producer(AdES Tools https://www.turboirc.com)/ModDate(D:20181009160112+00'00')>>
endobj
xref
0 4
0000000000 65535 f 
0000166315 00000 n 
0000166460 00000 n 
0000166230 00000 n 
7 1
0000166062 00000 n 
40 5
0000105528 00000 n 
0000165857 00000 n 
0000105400 00000 n 
0000165681 00000 n 
0000165780 00000 n 
trailer
<</Root 1 0 R/Prev 104519/Info 2 0 R>>
startxref
166559
%%EOF
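Each line of a plain-text xref table, like the ones in the listing above, is a fixed-width record of exactly 20 bytes: a 10-digit number, a 5-digit generation, the letter n (in use) or f (free), and a two-byte end of line. A minimal sketch of producing such a record follows; the function name is hypothetical, and a space plus line feed is assumed as the end-of-line pair.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats one classic xref entry. For in-use entries
// 'value' is the byte offset of the object; for free entries it is the
// number of the next free object.
std::string FormatXrefEntry(unsigned long value, unsigned generation, bool inUse)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%010lu %05u %c \n",
                  value, generation, inUse ? 'n' : 'f');
    std::string entry(buf);
    assert(entry.size() == 20); // holds as long as the fields fit their widths
    return entry;
}
```

The fixed width is what lets a reader seek straight to the entry of object N without parsing the entries before it.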

XRef Streams

I won't go too far here, since this article is about PAdES and not about PDF. However, some PDF files store their XRef not as a plain-text table but as an object which contains the xref table compressed with zlib. In that case, the XRef generated is an object:

465 0 obj
<</Type/XRef/Index [0 1 358 2 362 1 364 1 460 6 ]/W[1 4 2]/Root 364 0 R/Prev 116/Info 362 0 R/
Size 472/ID[<C570CC80F0638E5337E581345C7449FB><17DEE6C01B14632A778C3FC1D5297D97>]/Length 77/
Filter/FlateDecode>>stream
....
endstream
endobj

The format of this stream is beyond the scope of this article, but it contains the same data as the above text-only XRef.

LT Type

A PDF file cannot have parts that are unsigned. Therefore, putting the XL information (certificates and CRLs) into the CMS as unsigned attributes will not make our signature automatically XL compatible. We have to create a special dictionary, called DSS, which contains indirect references to the certificates and CRLs, and this is put before signing so it is also signed.

Using the Code

HRESULT PDFSign(LEVEL lev,const char* data,DWORD sz,const std::vector<CERT>
& Certificates, SIGNPARAMETERS& Params,std::vector<char>& Signature);

HRESULTERROR PDFVerify(const char* d, DWORD sz, std::vector<PDFVERIFY>& VerifyX);

The parameters in this function call are identical to those of the Sign() function described in the CAdES section, except that you must set PAdES = true in the SIGNPARAMETERS structure, and you must pass only one certificate. If you pass more, PAdES will work successfully but Adobe Reader is not able to read multiple certificates in one element (you must re-sign the PDF). Currently, the library works up to the XL level:

  • Pass LEVEL::CMS -> Sign normally (old method)
  • Pass LEVEL::B -> Sign PAdES B-B
  • Pass LEVEL::T -> Sign PAdES B-T
  • Pass LEVEL::XL -> Sign PAdES B-LT

The output is fully compliant with the ETSI verification tools.

My library can sign most PDF files. Password protected PDF files cannot be signed. Signing a PDF file which is already signed might cause incompatibilities. If you have issues, let me know.

Since it is CAdES, it supports multiple certificates. However, Adobe Reader will only show information about the last certificate found in the collection. While the ETSI tools will successfully validate such a PDF, Acrobat only supports a recursive signature: You sign the first PDF file, creating a new signed PDF, then you sign this new PDF to another new PDF. This means that the new signature will also sign the entire previous signature.

ASiC

ASiC files are ZIP files that contain raw data and digital signatures.

ASiC-S

The simple version of the container, named ASiC-S, can hold one document. This is a ZIP file which contains the following:

  • An optional mimetype file, which contains the mime type of the container, application/vnd.etsi.asic-s+zip.
  • The document to be signed. It can be any file, including another ASiC.
  • A META-INF folder, which contains:
    • Either a signature.p7s, a detached CAdES signature on the document file, or,
    • A signatures.xml which contains a detached XAdES signature of the document

Because the signature is always detached, if the document to be signed is itself an XML file, there is no need to canonicalize it.
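
When the optional mimetype file is present, a reader can identify the container without a full ZIP parser. The sketch below assumes that mimetype is the first entry of the archive and is stored uncompressed, and it reads the fields of the first ZIP local file header directly. SniffAsicMimeType is a hypothetical helper and not part of my library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Reads an n-byte little-endian unsigned value at offset off.
static std::uint32_t Le(const std::vector<std::uint8_t>& b, std::size_t off, int n)
{
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return v;
}

// Returns the content of a leading, uncompressed "mimetype" entry,
// or an empty string if the archive does not start with one.
// Assumption: the sizes are stored in the local header (no data descriptor).
std::string SniffAsicMimeType(const std::vector<std::uint8_t>& zip)
{
    const std::size_t headerSize = 30;  // fixed part of a local file header
    const std::size_t nameLen = 8;      // strlen("mimetype")
    if (zip.size() < headerSize + nameLen) return {};
    if (Le(zip, 0, 4) != 0x04034B50) return {};        // "PK\3\4" signature
    if (Le(zip, 8, 2) != 0) return {};                 // method 0 = stored
    if (Le(zip, 26, 2) != nameLen) return {};          // file name length
    if (std::memcmp(zip.data() + headerSize, "mimetype", nameLen) != 0) return {};

    const std::size_t extra = Le(zip, 28, 2);          // extra field length
    const std::size_t size = Le(zip, 18, 4);           // compressed size
    const std::size_t start = headerSize + nameLen + extra;
    if (start > zip.size() || size > zip.size() - start) return {};

    const auto first = zip.begin() + static_cast<std::ptrdiff_t>(start);
    return std::string(first, first + static_cast<std::ptrdiff_t>(size));
}
```

A caller would compare the result with the expected container MIME type and fall back to inspecting the META-INF folder when the string is empty.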

ASiC-E

The extended version of the container, named ASiC-E, can hold any number of documents. This is a ZIP file which contains the following:

  • An optional mimetype file, which contains the MIME type of the container, application/vnd.etsi.asic-e+zip.
  • The documents to be signed. They can also be placed in directories.
  • An ASiCManifest.xml file inside the META-INF folder:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<ASiCManifest xmlns:ns2="http://www.w3.org/2000/09/xmldsig#"
              xmlns="http://uri.etsi.org/02918/v1.2.1#">
    <SigReference MimeType="application/x-pkcs7-signature"
                  URI="META-INF/signature.p7s"/>
    <DataObjectReference URI="file1.txt">
        <ns2:DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
        <ns2:DigestValue>...</ns2:DigestValue>
    </DataObjectReference>
    <DataObjectReference URI="test/hello2.xml">
        <ns2:DigestMethod Algorithm="http://www.w3.org/2001/04/xmlenc#sha256"/>
        <ns2:DigestValue>...</ns2:DigestValue>
    </DataObjectReference>
</ASiCManifest>

This file contains references to all the files inside the container (in the above example, to file1.txt and to hello2.xml inside the folder named test).

  • signatures.xml, signatures1.xml, signatures2.xml, etc. or signatures.p7s, signatures1.p7s, etc., which reference all or parts of the manifest file and sign them. There can also be other manifest files (ASiCManifest1.xml, etc.) which reference a different set of files.
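
A manifest with the layout of the ASiCManifest.xml example can be assembled with plain string concatenation. The following is a minimal sketch, not the code my library uses: BuildAsicManifest and XmlEscape are hypothetical names, the MimeType value is the one used for a CAdES signature, and the base64 SHA-256 digests are assumed to have been computed by the caller.

```cpp
#include <string>
#include <utility>
#include <vector>

// Escapes the characters that must not appear verbatim in an XML attribute value.
static std::string XmlEscape(const std::string& s)
{
    std::string out;
    for (char c : s)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}

// sigUri: path of the signature file inside the container.
// files:  pairs of (URI inside the container, base64 SHA-256 digest of the file).
std::string BuildAsicManifest(const std::string& sigUri,
                              const std::vector<std::pair<std::string, std::string>>& files)
{
    std::string x = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\" ?>\n";
    x += "<ASiCManifest xmlns:ns2=\"http://www.w3.org/2000/09/xmldsig#\" "
         "xmlns=\"http://uri.etsi.org/02918/v1.2.1#\">\n";
    x += "    <SigReference MimeType=\"application/x-pkcs7-signature\" URI=\""
         + XmlEscape(sigUri) + "\"/>\n";
    for (const auto& f : files)
    {
        x += "    <DataObjectReference URI=\"" + XmlEscape(f.first) + "\">\n";
        x += "        <ns2:DigestMethod Algorithm=\"http://www.w3.org/2001/04/xmlenc#sha256\"/>\n";
        x += "        <ns2:DigestValue>" + f.second + "</ns2:DigestValue>\n";
        x += "    </DataObjectReference>\n";
    }
    x += "</ASiCManifest>\n";
    return x;
}
```

The returned string is what gets stored as META-INF/ASiCManifest.xml, and it is this file that the detached signature then covers.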

The Code

HRESULT ASiC(ALEVEL alev, ATYPE typ, LEVEL lev,
             std::vector<std::tuple<const BYTE*, DWORD, const char*>>& data,
             std::vector<CERT>& Certificates, SIGNPARAMETERS& Params,
             std::vector<char>& fndata);

where:

  • alev is the container mode, either S or E
  • typ is the signing mode, either CAdES or XAdES
  • The rest of the parameters are passed to the CAdES and XAdES functions; see the respective articles for a full description
  • fndata receives the container zip data

MIME

ASiC is interesting, but many existing applications support MIME. Using my MIME library, you can put multiple files inside a MIME container, which is then signed with CAdES or, with one of my own experimental functions, with XAdES.

HTML

To take it further, I've created enveloped signatures in HTML. HTML cannot be canonicalized easily, so I've injected the signature between the <html> and the next tag. The file is parsed as binary, and the result is an XAdES-XL signature. Whether browsers will like my implementation in the future - who knows?

Portable Executable (Experimental Support)

Since Windows PE executables can be digitally signed with a PKCS#7, we can use CAdES to sign them. Since signtool.exe is of no use for this, we have to parse the PE file ourselves:

  • Parse the file with my PE class
  • Hash the part of the file to be hashed:
    • Beginning up to checksum
    • After the checksum and up to the certificate entry table
    • After the certificate entry table to the end of the headers
    • All the sections, sorted
    • Extra data that may appear after the sections
  • Sign that part with CAdES
  • Update the certificate entry table
  • Append the detached signature to the end of the file.

HRESULT PESign(LEVEL levx, const char* d, DWORD sz, const std::vector<CERT>& Certificates,
               SIGNPARAMETERS& Params, std::vector<char>& res);
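
The first three hashing steps (everything in the headers except the checksum and the certificate table entry) can be sketched as follows. This does not use my PE class. HeaderHashRanges is a hypothetical helper that takes the field offsets from the published PE format (the checksum at offset 64 of the optional header, the certificate table entry as data directory 4) and returns the byte ranges to feed to the hash. The sections and the trailing data are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open byte range [first, second) inside the PE image.
using ByteRange = std::pair<std::size_t, std::size_t>;

// Reads an n-byte little-endian unsigned value at offset off.
static std::uint32_t ReadLE(const std::vector<std::uint8_t>& b, std::size_t off, int n)
{
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    return v;
}

// Returns the three header ranges that are hashed, skipping the 4-byte
// checksum and the 8-byte certificate table directory entry.
// Returns an empty vector if the buffer does not look like a PE file.
// Simplification: NumberOfRvaAndSizes is not checked.
std::vector<ByteRange> HeaderHashRanges(const std::vector<std::uint8_t>& pe)
{
    std::vector<ByteRange> r;
    if (pe.size() < 0x40 || pe[0] != 'M' || pe[1] != 'Z') return r;

    const std::size_t nt = ReadLE(pe, 0x3C, 4);          // e_lfanew
    // Need "PE\0\0" (4) + file header (20) + optional header magic (2).
    if (nt > pe.size() || pe.size() - nt < 26) return r;
    if (ReadLE(pe, nt, 4) != 0x00004550) return r;       // "PE\0\0"

    const std::size_t opt = nt + 24;                     // optional header
    const std::uint32_t magic = ReadLE(pe, opt, 2);
    std::size_t dirs = 0;                                // data directories
    if (magic == 0x10B) dirs = opt + 96;                 // PE32
    else if (magic == 0x20B) dirs = opt + 112;           // PE32+
    else return r;

    const std::size_t checksum = opt + 64;               // CheckSum field
    const std::size_t certEntry = dirs + 4 * 8;          // directory index 4
    if (certEntry + 8 > pe.size()) return r;

    const std::size_t headersEnd = ReadLE(pe, opt + 60, 4);  // SizeOfHeaders
    if (headersEnd < certEntry + 8 || headersEnd > pe.size()) return r;

    r.push_back({0, checksum});              // beginning up to the checksum
    r.push_back({checksum + 4, certEntry});  // up to the certificate entry
    r.push_back({certEntry + 8, headersEnd}); // to the end of the headers
    return r;
}
```

The caller would feed these ranges, in order, to the hash object, continue with the sections sorted by their file offset, and finish with any data that follows the last section.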

A Few Last Words...

I guess you will need me if you live in the European Union anytime soon. But until then...

GOOD LUCK.


History

  • 22nd July, 2019: PAdES Verification, XAdES-B verification, article join, EXE stuff added
  • 15th October, 2018: Better PDF parser, PAdES B-T, PAdES B-LT
  • 14th October, 2018: Changed parameters
  • 9th October, 2018: Enhanced PDF parser, 100% ETSI compliance
  • 3rd October, 2018: Work on PAdES, ETSI bugs
  • 23rd September, 2018: Added multiple certificate support
  • 22nd September, 2018: Added enveloping mode
  • 15th September, 2018: Added CAdES-XL Type 2
  • 14th September, 2018: Parameter updating
  • 14th September, 2018: Added CAdES-C and CAdES-X Type 2
  • 12th September, 2018: Canonicalization info, ETSI tools
  • 1st September, 2018: Added commitment types
  • 31st August, 2018: Added tech info about CMS
  • 28th August, 2018: Typos fixed
  • 19th August, 2018: First release

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
