Cryptographic Interoperability: Digital Signatures

20 Oct 2009 · CPOL · 21 min read
Sign and verify messages using Crypto++, Java, and C#.

Introduction

The Crypto++ mailing list occasionally receives questions regarding creating and verifying digital signatures among various libraries. This article will examine signing and verifying messages between Crypto++, C#, and Java. In addition, the C# sample presents AsnKeyBuilder and AsnKeyParser, which allow us to serialize and reconstruct keys in PKCS#8 and X.509. This frees us from the CLR's limitation of XML serialization using the irregular format of RFC 3275. RFC 3275 specifies XML-Signature Syntax and Processing [13]. Sections 4.4.1 and 4.4.2 specify definitions related to DSA (and RSA) parameters such as the KeyInfo element and KeyValue element.

The Digital Signature Algorithm will be used as the test case. There are a few reasons for this choice. First is popularity. Second, as we will see below, different signatures are created for the same key and message due to a per-message random variable. Next, DSA signatures are represented in at least three different formats, which makes conversions necessary. Finally, we will use strings and streams rather than byte arrays, which adds more interoperability issues.

Below, we will see that a signed message is the tuple { message, signature }. When we verify a message, we require the message, the signature, and the signer's public key. This brings to light two problem areas. The first issue is keys and their exchange. The second is defining what exactly will be signed and later verified.

The first issue was examined in Cryptographic Interoperability: Keys [1]. The key interoperability article discusses importing and exporting public and private keys in Crypto++, Java, and C# in a portable manner using PKCS#8 and X.509.

This article will examine the second issue: understanding what will be (or has been) signed. As with the previous article, we examine the details of the process so that when things go wrong, we can understand why and then correct the issue. Topics to be visited in this article are as follows. Though the impact of strings and streams appears early, we visit the topic last.

  • Digital Signatures
    • Key Generation
    • Message Signing
    • Message Verification
  • Signature Formats
    • IEEE P1363
    • DER Encoding
    • OpenPGP
  • Generating Keys, Signing, and Verifying
    • Crypto++
    • Java
    • C#
  • Strings and Streams
    • Crypto++
    • Java
    • C#

Our examples will use the Digital Signature Standard specified in FIPS 186-2 [11]. The standard prescribes three approved signature schemes. We will use the Digital Signature Algorithm (DSA) as opposed to the RSA digital signature algorithm (RSASS) or the Elliptic Curve Digital Signature Algorithm (ECDSA).

FIPS 186-2 specifies the use of a 1024 bit p, a 160 bit q, and SHA-1 as the hash. FIPS 186-3 [2] uses larger hashes (SHA-2), larger values for p (up to 3072 bits), and larger values for q (up to 256 bits). FIPS 186-3 is currently in draft status.

Downloads

There are three downloads which are available. Each archive is the project for creating and verifying signatures. For those who only want the source code, Table 1 identifies the download of interest.

Filename                   Language
CryptoPPInteropSign.zip    C++/Crypto++
JavaInteropSign.zip        Java
CSInteropSign.zip          C#

Table 1: Source Code Archives

Digital Signatures

A digital signature is the electronic equivalent of a hand written signature. It uses a public and private key pair for its operations. The signer signs the message using the private key, and the verifier confirms the signature on the message using the public key.

The DSA is a special case of the ElGamal signature system [12]. The security of DSA is derived from discrete logarithms. There are actually two instances of the problem: the first is logarithms in the multiplicative group Z_p^*, for which the index-calculus method applies. The second is the logarithm problem in the cyclic subgroup of order q, where current methods run in square root time.

DSA is a Signature Scheme with Appendix. This means that the message must be presented to the verifier function. This is in contrast to a Signature Scheme with Recovery. In a recovery system, the message is folded into the signature, so the message does not have to be sent with the signature. The verification routine will extract the message from the signature in a recovery system.

Key Generation

A DSA key is generated as follows [12]. Below, the size of q is fixed by FIPS 186 at 160 bits. Though the original FIPS 186 specification allows p between 512 and 1024 bits inclusive, FIPS 186-2 [11] fixes p at 1024 bits. This means that some libraries enforce a bit size of 1024 at step three.

  1. Select a prime number q such that 2^{159} < q < 2^{160}
  2. Choose t so that 0 ≤ t ≤ 8
  3. Select a prime number p such that 2^{511+64t} < p < 2^{512+64t} with the additional property that q divides (p-1)
  4. Select a generator α of the unique cyclic group of order q in Z_p^*
  5. To compute α, select an element g in Z_p^* and compute α = g^{(p-1)/q} mod p
  6. If α = 1, perform step five again with a different g
  7. Select a random a such that 1 ≤ a ≤ q-1
  8. Compute y = α^a mod p

The public key is (p, q, α, y). The private key is a. We usually encounter the private key specified as x.
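The steps above can be sketched with Java's BigInteger. This is only an illustration under stated assumptions: the class name DsaKeyGenSketch is hypothetical, t is fixed at 0 (a 512 bit p) purely to keep the search fast, and the primality tests are probabilistic. It is not the FIPS 186 parameter generation procedure, which also produces a seed and counter.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

// Hypothetical sketch of the eight key generation steps with t = 0.
final class DsaKeyGenSketch {
    final BigInteger p, q, alpha, y, a;

    private DsaKeyGenSketch(BigInteger p, BigInteger q, BigInteger alpha,
                            BigInteger y, BigInteger a) {
        this.p = p; this.q = q; this.alpha = alpha; this.y = y; this.a = a;
    }

    static DsaKeyGenSketch generate(SecureRandom rnd) {
        final BigInteger one = BigInteger.ONE;

        // Step 1: probablePrime sets the top bit, so q has exactly 160 bits
        BigInteger q = BigInteger.probablePrime(160, rnd);

        // Steps 2-3 with t = 0: search for a 512 bit prime p = q*m + 1,
        // which guarantees that q divides p-1 (m is kept even so p is odd)
        BigInteger p;
        do {
            BigInteger m = new BigInteger(352, rnd).setBit(351).clearBit(0);
            p = q.multiply(m).add(one);
        } while (p.bitLength() != 512 || !p.isProbablePrime(64));

        // Steps 4-6: alpha = g^((p-1)/q) mod p, retrying while alpha is 0 or 1
        BigInteger exponent = p.subtract(one).divide(q);
        BigInteger alpha;
        do {
            BigInteger g = new BigInteger(511, rnd);
            alpha = g.modPow(exponent, p);
        } while (alpha.compareTo(one) <= 0);

        // Step 7: private key a with 1 <= a <= q-1
        BigInteger a;
        do {
            a = new BigInteger(160, rnd);
        } while (a.signum() == 0 || a.compareTo(q) >= 0);

        // Step 8: public value y = alpha^a mod p
        return new DsaKeyGenSketch(p, q, alpha, alpha.modPow(a, p), a);
    }

    public static void main(String[] args) {
        DsaKeyGenSketch key = generate(new SecureRandom());
        System.out.println("p bits: " + key.p.bitLength()
            + ", q bits: " + key.q.bitLength());
    }
}
```

In practice we would let the library generate the key, as the later sections do. The sketch only makes the relationship between p, q, α, a, and y concrete.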

Message Signing

To sign a document of arbitrary size using an appendix scheme, two steps occur:

  • hash the document
  • decrypt the hash of the document as if it were an instance of ciphertext using the private key

In DSA, the details of signing the binary message m (document) of arbitrary length are as follows [12]. Notice that we are signing a binary message (there is no notion of a string at this level), and the message can be any length. Because the message can be any length, the message is digested with a hash function — h(m).

  1. Generate a random per-message value k such that 0 < k < q
  2. Compute r = (α^k mod p) mod q
  3. If r = 0, perform step one again with a different k
  4. Compute k^{-1} mod q
  5. Calculate s = k^{-1}{h(m) + ar} mod q
  6. If s = 0, perform step one again with a different k

The signature on m is (r, s). Message m and (r, s) should be sent to the verifier. We need to observe that both r and s are 20 bytes, since a modular reduction is being performed (steps 2 and 5) using q, a 160 bit value. This will gain significance later when we begin verifying messages between Crypto++ and C# (which use the IEEE P1363 signature format) and Java (which uses a DER encoding of a signature).
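A minimal sketch of the signing steps follows, using toy numbers so the arithmetic can be checked by hand. Everything here is an assumption for illustration: the class name DsaSignSketch, the tiny group p = 23, q = 11, α = 4, the private key a = 7, the stand-in digest h(m) = 5, and the fixed k = 3. A real signer draws k at random for every message and uses SHA-1 for h(m).

```java
import java.math.BigInteger;

// Hypothetical sketch of DSA signing steps 2 through 6 on toy parameters.
final class DsaSignSketch {
    /** Returns {r, s}, or null when r or s is 0 and a new k must be chosen. */
    static BigInteger[] sign(BigInteger p, BigInteger q, BigInteger alpha,
                             BigInteger a, BigInteger hm, BigInteger k) {
        BigInteger r = alpha.modPow(k, p).mod(q);                    // step 2
        if (r.signum() == 0) return null;                            // step 3
        BigInteger kInv = k.modInverse(q);                           // step 4
        BigInteger s = kInv.multiply(hm.add(a.multiply(r))).mod(q);  // step 5
        if (s.signum() == 0) return null;                            // step 6
        return new BigInteger[] { r, s };
    }

    public static void main(String[] args) {
        BigInteger[] sig = sign(
            BigInteger.valueOf(23), BigInteger.valueOf(11), BigInteger.valueOf(4),
            BigInteger.valueOf(7), BigInteger.valueOf(5), BigInteger.valueOf(3));
        // r = (4^3 mod 23) mod 11 = 18 mod 11 = 7
        // s = 4 * (5 + 7*7) mod 11 = 216 mod 11 = 7
        System.out.println("r = " + sig[0] + ", s = " + sig[1]); // prints "r = 7, s = 7"
    }
}
```

Both r and s are reduced modulo q, which is why neither can be wider than q.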

Message Verification

To verify a document of arbitrary size using an appendix scheme, three steps occur:

  • hash the document
  • encrypt the signature (produced in step two of the Message Signing process) using the signer's public key, which recovers the signer's hash
  • verify the calculated hash from step one of the Message Verification process matches the recovered hash from step two of the Message Verification process

The short story of the above is we are comparing our calculated hash of the document with the signer's calculated hash of the document after we remove the signer's encryption operation. The DSA details are as follows [12]. Below, recall that (r, s) is the signature on binary message m, with h(m) digesting the arbitrary length message.

  1. Obtain the public key (p, q, α, y)
  2. Verify 0 < r < q and 0 < s < q (reject the signature otherwise)
  3. Compute w = s^{-1} mod q
  4. Compute u1 = w•h(m) mod q
  5. Compute u2 = rw mod q
  6. Compute v = (α^{u1} y^{u2} mod p) mod q

The signature is valid if and only if v = r.
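The verification steps can be sketched the same way. The values are hypothetical toy numbers chosen so the result can be checked by hand: p = 23, q = 11, α = 4, public value y = 4^7 mod 23 = 8, stand-in digest h(m) = 5, and the signature (r, s) = (7, 7), which is what signing that digest with private key 7 and k = 3 yields. The class name DsaVerifySketch is likewise an assumption.

```java
import java.math.BigInteger;

// Hypothetical sketch of DSA verification steps 2 through 6 on toy parameters.
final class DsaVerifySketch {
    static boolean verify(BigInteger p, BigInteger q, BigInteger alpha, BigInteger y,
                          BigInteger hm, BigInteger r, BigInteger s) {
        // Step 2: reject unless 0 < r < q and 0 < s < q
        if (r.signum() <= 0 || r.compareTo(q) >= 0) return false;
        if (s.signum() <= 0 || s.compareTo(q) >= 0) return false;

        BigInteger w = s.modInverse(q);              // step 3
        BigInteger u1 = w.multiply(hm).mod(q);       // step 4
        BigInteger u2 = r.multiply(w).mod(q);        // step 5
        BigInteger v = alpha.modPow(u1, p)           // step 6
            .multiply(y.modPow(u2, p)).mod(p).mod(q);
        return v.equals(r);
    }

    static boolean verifyToy(int hm, int r, int s) {
        return verify(BigInteger.valueOf(23), BigInteger.valueOf(11),
            BigInteger.valueOf(4), BigInteger.valueOf(8),
            BigInteger.valueOf(hm), BigInteger.valueOf(r), BigInteger.valueOf(s));
    }

    public static void main(String[] args) {
        System.out.println(verifyToy(5, 7, 7)); // prints "true"
        System.out.println(verifyToy(5, 7, 8)); // altered s, prints "false"
        System.out.println(verifyToy(6, 7, 7)); // altered digest, prints "false"
    }
}
```

Changing either the digest or the signature changes v, so the comparison with r fails.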

Signature Formats

For those of us who have followed Cryptographic Interoperability: Keys [1], we are not yet finished with standards and formats. There are three formats which Crypto++ supports, with IEEE P1363 being native to the library. The remaining two formats are DER encoding and OpenPGP. If we receive a format other than P1363, we would use Crypto++'s DSAConvertSignatureFormat to convert signature (r, s) to the P1363 format.

Recall the signature on m is (r, s). From our exploration of Message Signing, recall that q is 160 bits. Both r and s are a residue of a modular reduction using q, so each is 160 bits (20 bytes).

IEEE P1363

Both Crypto++ and C# use the format described in IEEE P1363 [9]. The P1363 signature is a concatenation of r and s, denoted r || s. The concatenation results in a signature that is exactly 40 bytes in length.
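As a rough sketch of the concatenation, the following Java encodes r and s as fixed-width big endian fields. The class and method names (P1363Sketch, toFixed, encode) are hypothetical. The detail worth noticing is that BigInteger.toByteArray may return 21 bytes (a leading 0x00 sign byte) or fewer than 20 bytes, so each value has to be normalized to the field width before it is concatenated.

```java
import java.math.BigInteger;

// Hypothetical sketch: encode (r, s) as r || s with fixed-width fields.
final class P1363Sketch {
    /** Big endian, left padded with zeros to exactly width bytes. */
    static byte[] toFixed(BigInteger v, int width) {
        byte[] raw = v.toByteArray(); // two's complement, may carry a leading 0x00
        int start = 0;
        while (start < raw.length - 1 && raw[start] == 0) start++;
        int len = raw.length - start;
        if (v.signum() < 0 || len > width)
            throw new IllegalArgumentException("value does not fit in " + width + " bytes");
        byte[] out = new byte[width];
        System.arraycopy(raw, start, out, width - len, len);
        return out;
    }

    static byte[] encode(BigInteger r, BigInteger s, int width) {
        byte[] out = new byte[2 * width];
        System.arraycopy(toFixed(r, width), 0, out, 0, width);
        System.arraycopy(toFixed(s, width), 0, out, width, width);
        return out;
    }

    public static void main(String[] args) {
        // 20 byte fields match the 160 bit q discussed above
        byte[] sig = encode(BigInteger.valueOf(7), BigInteger.ONE.shiftLeft(159), 20);
        System.out.println(sig.length); // prints "40"
    }
}
```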

DER Encoding

Java uses DER encoding of (r, s). According to the Java Cryptography Architecture API Specification & Reference [8], the syntax of the signature is as follows. The Java signature is consistent with DSS-Sig-Value of RFC 3279, Algorithms and Identifiers for the Internet X.509 Public Key Infrastructure [14]. Refer to Section 2.2.2, DSA Signature Algorithm.

SEQUENCE ::= {
  r INTEGER,
  s INTEGER }

It does not appear we can request any other format from Java. This will only be a minor inconvenience in Crypto++, since the Crypto++ library offers a conversion routine.
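One development that postdates this article: Java 9 and later list additional standard signature algorithm names with an "inP1363Format" suffix. The sketch below assumes such a runtime and that its default provider supplies "SHA1withDSAinP1363Format"; on older runtimes getInstance fails and the conversion routines remain necessary. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

// Hypothetical sketch: ask a Java 9+ runtime for r || s instead of DER.
final class JavaP1363FormatSketch {
    static int p1363SignatureLength() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
            kpg.initialize(1024); // 1024 bit p with a 160 bit q
            KeyPair keys = kpg.generateKeyPair();

            byte[] message = "Crypto Interop: \u9aa8".getBytes(StandardCharsets.UTF_8);

            // Assumption: this algorithm name is available (Java 9 and later)
            Signature signer = Signature.getInstance("SHA1withDSAinP1363Format");
            signer.initSign(keys.getPrivate());
            signer.update(message);
            return signer.sign().length;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(p1363SignatureLength());
    }
}
```

With a 160 bit q, the returned length should be the same 40 bytes that Crypto++ and C# produce.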

In C#, we will need to convert the format from DER to P1363. To create a DSASignatureConverter class for C#, examine the code for AsnKeyParser. There is not much to the converter since the signature is a well defined structure. Call NextSequence to remove the outer sequence, and then return the concatenation of the two parsed integers r and s. Before returning r || s, verify each is 20 bytes in length and adjust accordingly: a DER INTEGER carries a leading 0x00 byte when the high bit of the value is set, and is shorter than 20 bytes when the value has leading zero bytes. See the discussion below at CryptoInteropSign.aspx?msg=3240277#xx3240277xx.
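The conversion logic is the same in any language, so here it is sketched in Java rather than C#. This is not the article's DSASignatureConverter: the class DerToP1363Sketch is hypothetical, it handles only short-form DER lengths (enough for a 160 bit q, where the whole structure is at most 46 bytes), and it performs only minimal validation.

```java
// Hypothetical sketch: DER SEQUENCE { r INTEGER, s INTEGER } to r || s.
final class DerToP1363Sketch {
    static byte[] convert(byte[] der, int width) {
        if (der.length < 8 || der[0] != 0x30)
            throw new IllegalArgumentException("expected a SEQUENCE");
        int seqLen = der[1] & 0xFF;
        if (seqLen >= 0x80 || seqLen != der.length - 2)
            throw new IllegalArgumentException("only short-form lengths are handled");

        byte[] out = new byte[2 * width];
        int pos = 2;
        for (int i = 0; i < 2; i++) {            // i = 0 is r, i = 1 is s
            if (pos + 2 > der.length || der[pos++] != 0x02)
                throw new IllegalArgumentException("expected an INTEGER");
            int len = der[pos++] & 0xFF;
            if (len >= 0x80 || pos + len > der.length)
                throw new IllegalArgumentException("bad INTEGER length");
            int start = pos;
            pos += len;

            // Drop the 0x00 sign byte DER adds when the top bit of the value is set
            while (len > width && der[start] == 0) { start++; len--; }
            if (len > width)
                throw new IllegalArgumentException("INTEGER wider than " + width + " bytes");

            // Right align, which left pads short values with zeros
            System.arraycopy(der, start, out, (i + 1) * width - len, len);
        }
        return out;
    }

    public static void main(String[] args) {
        // r = 0x80 (encoded with a sign byte), s = 0x01
        byte[] der = { 0x30, 0x07, 0x02, 0x02, 0x00, (byte) 0x80, 0x02, 0x01, 0x01 };
        System.out.println(convert(der, 20).length); // prints "40"
    }
}
```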

OpenPGP

OpenPGP is specified in RFC 2440, "The OpenPGP Message Format" [10]. OpenPGP uses Signature Packets to represent a signature on a message. In the case of DSA, these are the two MPI (multiprecision integers) r and s. Section 5.2.2 specifies the Version 3 Signature Packet Format while Section 5.2.3 specifies the Version 4 Signature Packet Format. Again, the Crypto++ library offers a conversion routine.

Generating Keys, Signing, and Verifying

This section will examine the signing and verification process. Generating keys was visited in key interoperability, so we will focus on what is required for the case of DSA. We will also detail Crypto++ since it is not documented as well as Java and C#. Finally, to achieve interoperability, we will apply the cryptographic transformations to byte arrays of a string encoded using UTF-8. Our message will be the wide string 'Crypto Interop: \u9aa8', which is shown in Figure 1.

Figure 1: Message to be Signed

Crypto++

Key Generation

To generate a DSA key for signing messages, we perform the following in Crypto++. Though we can generate the key using the DSA::Signer constructor, we chose to defer so that we can exercise the library. After we generate the key, we save it through the overridden Save of class PKCS8PrivateKey. There are no copy constructors for the PrivateKey and PublicKey classes, so the calls to AccessPrivateKey and AccessPublicKey receive a reference.

C++
// Random source for key generation
AutoSeededRandomPool prng;

DSA::Signer signer;
PrivateKey& privateKey = signer.AccessPrivateKey();

privateKey.GenerateRandom( prng );
privateKey.Save(FileSink("private.dsa.cpp.key"));

We then construct a verifier object. We do this so we can access the public key of the pair. Unfortunately, we cannot access it through the private key. We then save it using the overridden X509PublicKey::Save.

C++
DSA::Verifier verifier( signer );
PublicKey& publicKey = verifier.AccessPublicKey();
publicKey.Save(FileSink("public.dsa.cpp.key"));

Message Signing

We sign our message with Crypto++ as follows. We start with a wide string. The message is then converted to a UTF-8 string and stored in narrow.

C++
// Crypto++ Load Private Key
DSA::Signer signer;
...

// Convert Wide String to UTF-8
wstring wide = L"Crypto Interop: \u9aa8";
string narrow;

WideCharToMultiByte( CP_UTF8, ... );
...

// Crypto++ is bytes in, bytes out
const byte* data = (const byte*)narrow.data();
size_t messageLength = narrow.length();

// Set up for SignMessage
byte* signature = new byte[ signer.MaxSignatureLength() ];

// SignMessage needs a random source for the per-message value k
AutoSeededRandomPool prng;
size_t length = signer.SignMessage( prng, data, messageLength, signature );

After we convert the wide string to UTF8 using WideCharToMultiByte, we set up a buffer (signature) to hold the signature. The signature will be at most MaxSignatureLength bytes. Crypto++ returns 0x28 (40 bytes) as the maximum DSA signature length. We expect this since the signature (r, s) is a concatenation of two 160 bit residues.

Finally, we call SignMessage on the signer object. SignMessage requires a pseudo random source due to the per message variable k. The function returns the actual length of the signature in bytes, which is also MaxSignatureLength. Next, we save both the message (which we signed) and the signature to files. Note that we cannot presume the original string (L"Crypto Interop: \u9aa8") will be what is actually signed after compilation and string and stream construction.

C++
// mfs: message filestream
// sfs: signature filestream
ofstream mfs, sfs;
mfs.open("dsa.cpp.msg", ios_base::binary );
sfs.open("dsa.cpp.sig", ios_base::binary );

// Save Message which was Signed
mfs.write( narrow.c_str(), narrow.length() );

// Save Signature on Message
sfs.write( (const char*)signature, length );

In Figure 2, we examine the contents of the message in file dsa.cpp.msg. We see the regular UTF-8 encoding of the string: each ASCII character occupies a single byte, except for the last Han character which expands to three bytes.

Figure 2: Message File Contents

We next examine the results of creating multiple signatures on the same message and the contents of the file dsa.cpp.sig. We run the routine twice using the same private key and compare the results side by side in Figure 3. If we recall the Message Signing process, we were required to select a random per-message value k. Because k is random, the algorithm produces different signatures on the same message.

Figure 3: Different Signatures on Message due to Random k
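The same effect can be observed with any library that draws k at random. The Java sketch below is an illustration rather than the article's Crypto++ sample: the class name RandomKSketch is hypothetical, and it assumes the runtime's default provider offers "SHA1withDSA" with a 1024 bit key. It signs one message twice with one key, then checks that the two signatures differ and that both still verify.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;

// Hypothetical sketch: one key, one message, two different valid signatures.
final class RandomKSketch {
    static byte[] sign(PrivateKey key, byte[] message) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA1withDSA");
        signer.initSign(key);
        signer.update(message);
        return signer.sign();
    }

    static boolean verify(PublicKey key, byte[] message, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA1withDSA");
        verifier.initVerify(key);
        verifier.update(message);
        return verifier.verify(signature);
    }

    /** True when two signatures on the same message differ yet both verify. */
    static boolean differButBothVerify() {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
            kpg.initialize(1024);
            KeyPair keys = kpg.generateKeyPair();
            byte[] message = "Crypto Interop: \u9aa8".getBytes(StandardCharsets.UTF_8);

            byte[] first = sign(keys.getPrivate(), message);
            byte[] second = sign(keys.getPrivate(), message);

            return !Arrays.equals(first, second)
                && verify(keys.getPublic(), message, first)
                && verify(keys.getPublic(), message, second);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(differButBothVerify());
    }
}
```

A practical consequence is that signatures cannot be compared byte for byte across runs or libraries. The only meaningful test is verification.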

It is important that we make the distinction that in Figure 2, dsa.cpp.msg is the message that we signed, and not the original string. When Java or C# verifies our Crypto++ message, they will verify the bytes in this file, and then reconstruct the original string.

Message Signing (DER Encoded)

Should our message and signature require DER encoding for systems such as Java, we perform the following. Below, the process is examined after the signing process and before we write the signature to disk.

C++
// signature is treated here as a std::string holding the P1363 signature

// Determine size of required buffer
length = DSAConvertSignatureFormat( NULL, 0, DSA_DER,
    (const byte*)signature.c_str(), signature.length(), DSA_P1363 );

// A buffer for the conversion
byte* buffer = new byte[ length ];

// We are P1363 format. Java desires DER encoding
length = DSAConvertSignatureFormat( buffer, length, DSA_DER,
    (const byte*)signature.c_str(), signature.length(), DSA_P1363 );

Message Verification

The verification process will abandon the standard library's streams in favor of a Crypto++ FileSource. Here, a FileSource will place the contents of a file in a std::string. In Crypto++'s usage below, a string is similar to a vector: it is a collection of bytes. Crypto++ uses a Unix pipeline paradigm. The file is the source, so we need a destination, which is the StringSink.

C++
// std::string used as a byte array
string message, signature;

FileSource( "dsa.cpp.msg", true, new StringSink( message ) );
FileSource( "dsa.cpp.sig", true, new StringSink( signature ) );

Next, we verify the message. Recall that Crypto++ is bytes in, bytes out, hence the const byte* cast.

C++
bool result = verifier.VerifyMessage(
    (const byte*)message.c_str(), message.length(),
    (const byte*)signature.c_str(), signature.length() );

And finally, the conversion back to a wide string, the results of which are shown in Figure 4.

Figure 4: Message Verification and Conversion

Message Verification (DER Encoded)

Recall that Java DER encodes the signature (r, s) on m. When we receive a DER encoded signature from Java, we perform the following.

C++
FileSource( "dsa.java.msg", true, new StringSink( message ) );
FileSource( "dsa.java.sig", true, new StringSink( signature ) );

// First, a buffer for the conversion
size_t length = verifier.SignatureLength();
byte* buffer = new byte[ length ];

// DER encoded from Java. We desire P1363 format
length = DSAConvertSignatureFormat( buffer, length,
    DSA_P1363, signature.c_str(), signature.length(), DSA_DER );

// Reinitialize signature so that it can be used
//   in the verifier below with minimal effort
signature = string( (const char*)buffer, length );
delete[] buffer;

// Verify the Signature on the Message
bool result = verifier.VerifyMessage(
   (const byte*)message.c_str(), message.length(),
   (const byte*)signature.c_str(), signature.length() );

Java

Java enjoys greater popularity with better documentation, so the following is presented for completeness. The Java Cryptography Architecture (JCA) Reference Guide [8] answers most questions.

Key Generation

Our code to create a DSA key pair in Java is as follows. At the completion of the routine, we would serialize the keys for future use using getEncoded. getEncoded returns the key in its default encoding format, which is PKCS#8 for the private key and X.509 for the public key.

Java
KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");

kpg.initialize(1024, new SecureRandom());
KeyPair keys = kpg.generateKeyPair();

PrivateKey privateKey = keys.getPrivate();
PublicKey publicKey = keys.getPublic();

Message Signing

To sign a message using our generated keys, we perform the following.

Java
// Retrieve the Private Key
PrivateKey privateKey = LoadPrivateKey("private.dsa.java.key");

// Create the signer object
Signature signer = Signature.getInstance("DSA");
signer.initSign(privateKey, new SecureRandom());

// Prepare the Message
String s = "Crypto Interop: \u9aa8";

// Save the binary of the String which we will sign
byte[] message = s.getBytes("UTF-8");

// Sign the message
signer.update(message);
byte[] signature = signer.sign();

We then save the byte arrays message and signature to disk for verification by other libraries.

Message Verification

Verifying a message is as follows. Below, we verify a message generated in C#.

Java
// Load the public key
PublicKey publicKey = LoadPublicKey("public.dsa.cs.key");

// Load the message from file
byte[] message = LoadMessageFile("dsa.cs.msg");    

// Load the signature on the message from file
byte[] signature = LoadSignatureFile("dsa.cs.sig");

// Initialize Signature Object
Signature verifier = Signature.getInstance("DSA");
verifier.initVerify(publicKey);

// Load the message into the verifier
verifier.update(message);

// Verify the Signature on the Message
boolean result = verifier.verify(signature);

Unlike Crypto++ and C#, the Java code expects the signature (r, s) on the message m to be in DER encoded format. Attempting to verify a P1363 signature results in an encoding exception. As a workaround, our Crypto++ and C# source code will DER encode the signature for Java.

C#

Key Generation

Cryptographic Interoperability: Keys [1] is a fairly comprehensive treatment of generating, loading and saving keys, so we will only revisit the basics. Below we create a key pair for use in C#.

C#
CspParameters csp = new CspParameters();
csp.KeyContainerName = "DSA Test (OK to Delete)";

csp.ProviderType = PROV_DSS_DH;    // 13
csp.KeyNumber = AT_SIGNATURE;      // 2

DSACryptoServiceProvider dsa = new DSACryptoServiceProvider(1024, csp);

// Keys
DSAParameters privateKey = dsa.ExportParameters(true);
DSAParameters publicKey = dsa.ExportParameters(false);

Since we used the DSACryptoServiceProvider constructor which accepts a CSP and an integer bit count, the constructor created a key pair for us. We also specify PROV_DSS_DH and AT_SIGNATURE per MSDN (we could actually specify PROV_DSS if inclined). Finally, we export the keys by calling ExportParameters. We then use the AsnKeyBuilder class to convert the DSAParameters to a PKCS#8 or X.509 encoded key for serialization.

Message Signing

To sign a message, we perform the following. dsa is the service provider from above. Recall that the signature is in P1363 format, so the DSASignatureFormatter performs a concatenation of r and s, the two 20 byte arrays.

C#
DSASignatureFormatter signer = new DSASignatureFormatter(dsa);

//Set the hash algorithm to SHA1.
signer.SetHashAlgorithm("SHA1");

String m = "Crypto Interop: \u9aa8";
Encoding e = Encoding.GetEncoding("UTF-8");
byte[] message = e.GetBytes(m);

// Hash the Message
SHA1 sha = new SHA1CryptoServiceProvider();
byte[] hash = sha.ComputeHash(message);

// Create the Signature for h(m)
byte[] signature = signer.CreateSignature(hash);

We would then serialize the message m and the signature (r, s) on the message m.

Message Verification

For the details of loading a DSA key, please see Cryptographic Interoperability: Keys [1]. We reconstruct a public or private key using LoadPublicKey and LoadPrivateKey of the AsnKeyParser as follows. The AsnKeyParser handles the heavy lifting of parsing a key described using ASN.1 syntax. Be prepared to catch a BerDecodeError if the key is malformed.

C#
AsnKeyParser keyParser = new AsnKeyParser("public.dsa.cs.key");
DSAParameters publicKey = keyParser.ParseDSAPublicKey();

Next we move on to opening the container. In this case, the DSACryptoServiceProvider uses a constructor which accepts only the CSP (as opposed to a CSP and integer bit count). This indicates to the provider that we do not want a key pair generated. Note that we use PROV_DSS rather than PROV_DSS_DH because we no longer have parameters such as J and the seed.

C#
CspParameters csp = new CspParameters();
csp.KeyContainerName = "DSA Test (OK to Delete)";

csp.ProviderType = PROV_DSS;      // 3
csp.KeyNumber = AT_SIGNATURE;     // 2
// Load key into provider
DSACryptoServiceProvider dsa = new DSACryptoServiceProvider(csp);

dsa.ImportParameters(publicKey);

Once the provider accepts our parameters at the call to ImportParameters, the exercise becomes academic. Figure 5 shows a private key which has been reconstructed using AsnKeyParser. Note the missing seed and parameter J (the group factor). This is due to PKCS#8 and X.509 — there is no specification for serializing the parameters.

Figure 5: PKCS#8 Private Key Parameters

Below, we read the byte[] arrays which constitute the message and signature.

C#
byte[] message = LoadMessage();
byte[] signature = LoadSignature();

Finally, the code to verify a signature. Interestingly, DSASignatureDeformatter does not accept a hash object. We have to provide a string describing our choice to SetHashAlgorithm.

C#
SHA1 sha = new SHA1CryptoServiceProvider();
byte[] hash = sha.ComputeHash(message);

// Verifier
DSASignatureDeformatter verifier = new DSASignatureDeformatter(dsa);
verifier.SetHashAlgorithm("SHA1");

bool result = verifier.VerifySignature(hash, signature);

Be aware that C# can throw a CryptographicException stating 'Length of the DSA signature was not 40 bytes.' We expect this from Java since Java uses a DER encoding while C# uses the P1363 format.

Figure 6: DER Encoded Signature Exception

As described under DER Encoding in Signature Formats, C# must convert a DER encoded signature to P1363 before verifying it: remove the outer sequence, then concatenate the two parsed integers r and s after adjusting each to 20 bytes.

If we receive a cryptographic exception when exiting Main stating 'Keyset does not exist,' we should explicitly dispose of the container. Multiple methods in the sample open a container named 'DSA Test (OK to Delete)', and each method sets PersistKeyInCsp = false. When garbage collection occurs, each managed object attempts to delete the shared native resource. To avoid the situation, we must finalize the object by calling Dispose, Close, or Clear in the method which opened the resource, which is usually not recommended.

Figure 7: Keyset Does Not Exist Exception

Strings and Streams

Though strings and streams are a great convenience, they create the most problems for us when signing and verifying messages. This is because strings are simply encodings that ultimately become byte arrays, which then have a cryptographic transformation applied. Inconsistencies are usually introduced in one of two places when we convert from a string to a byte array.

The first can be introduced when a stream is allowed to choose an encoding for the conversion of a string. The second is introduced when programmers (one implementing the signer, the other implementing the verifier) select different encoding conversions for the same string. Note that this situation does not arise when we explicitly use byte arrays.
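A short Java sketch makes the second case concrete. The class name EncodingMismatchSketch is hypothetical. It encodes this article's test message two ways; since a signature covers bytes and not characters, a signer using one encoding and a verifier using the other are effectively working with two different messages.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: one string, two encodings, two different byte arrays.
final class EncodingMismatchSketch {
    static final String MESSAGE = "Crypto Interop: \u9aa8";

    static byte[] utf8() {
        return MESSAGE.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] utf16le() {
        return MESSAGE.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // 16 ASCII characters at one byte each, plus three bytes for U+9AA8
        System.out.println(utf8().length);    // prints "19"
        // 17 UTF-16 code units at two bytes each
        System.out.println(utf16le().length); // prints "34"
        System.out.println(Arrays.equals(utf8(), utf16le())); // prints "false"
    }
}
```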

Since we will use strings at times, we need to decide what type of encoding to use. To this end, the Unicode Consortium recommends UTF-8 for data exchange [3]. Since nearly every major library supports UTF-8, we will use it throughout for consistent results. Our choice of UTF-8 is a compromise between interoperability, compression, and channel efficiency. We should also be aware that there are other Unicode encoding schemes (for example, SCSU and BOCU-1) that are more efficient for storage and data exchange [4,5].

The Consortium also defines how conversion occurs between character sets such as UTF-7 and UTF-8. The conversion algorithms are implemented in Windows functions WideCharToMultiByte and MultiByteToWideChar when using a code page of CP_UTF8 [6].

Crypto++

Crypto++ is agnostic with respect to strings and streams. For Crypto++, it is bytes in and bytes out. Unlike Java and C#, there is no method which takes a high level string. However, the C++ standard library does affect a string when using a stream.

We already know that Visual C++ uses a UTF-16 encoding for wide strings. In the following, we will explore the effects of a stream on the string in Visual C++. For our first example, consider the program listed below.

C++
wstring ws = L"crypto";
wofstream ofs;

ofs.open("out.cpp.bin", ios_base::binary);
ofs << ws;

ofs.close();

When we examine its file output as in Figure 8, we see that a conversion has taken place despite the fact that we are using wide versions of the standard library and specified binary mode.

Figure 8: Wide Stream Binary Output

To investigate further, we specify the Han character for bone which is U+9AA8. We could use a European code point, but we may as well hit the topic hard. Unfortunately, the standard C++ library has completely failed us in this case. In Figure 9 below, Visual Studio IntelliSense correctly displays the character, while the standard library produces an empty file. In fact, removing binary mode still produces the same result. This is a known issue with Microsoft's stream class.

Figure 9: Failed Wide Stream Binary Output

To work around this issue, we have two choices. The first workaround involves iterating over the characters of the wide string, while writing them individually to the stream as shown below. The net effect is that we are writing a UTF-16 stream. Depending on how we chose to output the bytes, we achieve either a big endian (UTF-16BE) or little endian (UTF-16LE) stream. The result of the single Asian character is shown in Figure 10.

C++
wstring::const_iterator it = ws.begin();
for( ; it != ws.end(); ++it )
{
    // Little Endian: low byte first, then high byte
    ofs.put( (char)(*it & 0x00FF) );
    ofs.put( (char)((*it & 0xFF00) >> 8) );
}

Figure 10: UTF-16LE Output

In the second, we use WideCharToMultiByte [7] and a multibyte stream. The net effect is that we are writing a UTF-8 stream. The corrected program is shown below. Figure 11 displays the output, which is the UTF-8 encoded wide string: 0xE9 0xAA 0xA8.

C++
wstring ws = L"\u9aa8";
char* utf = NULL;

// UTF-8 Encode. A length of -1 means nChars includes the terminating null.
int nChars = WideCharToMultiByte( CP_UTF8, 0, ws.c_str(), -1, NULL, 0, NULL, NULL );
utf = new char[ nChars ];
WideCharToMultiByte( CP_UTF8, 0, ws.c_str(), -1, utf, nChars, NULL, NULL );

// Narrow (multibyte) stream
ofstream ofs;
ofs.open("out.cpp.bin", ios_base::binary);
ofs << utf;
...

Figure 11: UTF-8 Output

Java

We will now explore what occurs in Java. We have two cases to examine: writing the string using the stream's writeUTF method; and the effects of calling getBytes on the string. First we examine writeUTF below.

Java
DataOutputStream dos = new DataOutputStream( new FileOutputStream("out.java.bin"));

String s = "crypto";

dos.writeUTF(s);

In Figure 12, we see that the DataOutputStream's method prefixed the encoding with a 16 bit length. For interoperability, this is probably a poor choice.

Figure 12: Java DataOutputStream Output
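The length prefix is easy to confirm without touching the disk. The following sketch (the class name WriteUtfSketch is hypothetical) writes to an in-memory stream and compares the result with a plain UTF-8 encoding of the same string.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: writeUTF output versus a plain UTF-8 byte array.
final class WriteUtfSketch {
    static byte[] viaWriteUTF(String s) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(bos);
            dos.writeUTF(s); // two byte big endian length, then the encoded bytes
            dos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] viaGetBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(viaWriteUTF("crypto").length); // prints "8"
        System.out.println(viaGetBytes("crypto").length); // prints "6"
    }
}
```

If a signer hashes the eight byte form and a verifier hashes the six byte form, verification fails even though both sides started with the same string.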

Next, we modify the Java program to explore the various byte arrays returned from getBytes.

Java
DataOutputStream dos = new DataOutputStream( new FileOutputStream("out.java.bin"));

String s = "crypto";
byte[] b = s.getBytes();

dos.write(b, 0, b.length);

Using getBytes, our output is the same as Figure 8: the string has been encoded in the platform's default character set. Next we run the program requesting a UTF-16 encoding:

Java
byte[] b = s.getBytes("UTF16");

In Figure 13, we see that we have a big endian array with a byte order mark. Again, for interoperability, this is probably a poor choice.

Figure 13: Java getBytes("UTF16") Output

When we run the Java program using getBytes("UTF8") and getBytes("UTF32"), we find that no byte order mark is written to the stream. In the case of UTF-32, the array is again big endian. We observe expected (and desired) results using UTF-8.
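These observations can be reproduced side by side. In the sketch below, the class name BomSketch and its hex helper are hypothetical, and the charset names are the canonical ones ("UTF-16", "UTF-32") rather than the aliases used above. It encodes the single character 'c' so the byte order and any byte order mark are easy to read.

```java
import java.nio.charset.Charset;

// Hypothetical sketch: byte order and byte order mark for several encodings.
final class BomSketch {
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02X", b & 0xFF));
        return sb.toString();
    }

    static String encode(String s, String charsetName) {
        return hex(s.getBytes(Charset.forName(charsetName)));
    }

    public static void main(String[] args) {
        System.out.println(encode("c", "UTF-8"));    // prints "63"
        System.out.println(encode("c", "UTF-16"));   // prints "FEFF0063" (BOM, big endian)
        System.out.println(encode("c", "UTF-16LE")); // prints "6300" (no BOM)
        System.out.println(encode("c", "UTF-32"));   // prints "00000063" (no BOM, big endian)
    }
}
```

Naming the byte order explicitly (UTF-16LE or UTF-16BE) suppresses the byte order mark, which is one way to match the C# output when UTF-16 cannot be avoided.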

C#

Our first C# example examines the result of using a default encoding when writing a string with a StreamWriter. The program is shown below, while its output is shown in Figure 14.

C#
using (TextWriter writer = new StreamWriter("out.cs.bin"))
{
    String s = "crypto";

    writer.Write(s);
}

Figure 14: C# StreamWriter Output

As with Java, we observe the stream's use of a default encoding which is UTF-8. Next we modify our program to use a UTF-16 encoding.

C#
using (BinaryWriter writer = new BinaryWriter(
    new FileStream("out.cs.bin", FileMode.Create, FileAccess.ReadWrite)))
{
  String s = "crypto";
  Encoding e = Encoding.GetEncoding("UTF-16");

  byte[] b = e.GetBytes(s);

  writer.Write(b);
}

The results are shown in Figure 15. We observe that a little endian stream is written. Additionally, unlike Java, there is no byte order mark.

Figure 15: C# BinaryWriter Output

As with Java, we have to contend with byte order when using UTF-16. This again leads us to UTF-8, which does not have byte order and byte order mark issues.

Acknowledgements

  • Wei Dai for Crypto++ and his invaluable help on the Crypto++ mailing list
  • Dr. A. Brooke Stephens who laid my Cryptographic foundations

Checksums

  • CryptoPPInteropSign.zip
    • MD5: 2647DE3E5E06A07F8CD05F911D75DC3B
    • SHA-1: 74DBED5386D64C9041EE66CFCF884C79F8961B0C
  • JavaInteropSign.zip
    • MD5: BA74E602379395177681172EFC73591E
    • SHA-1: B38D7755F6D6D9D4C6ADBE698EF77B771236AB4C
  • CSInteropSign.zip
    • MD5: 20A78F6E7817F523923CE2E0B21E95E9
    • SHA-1: 2D69E88935B549993D974C8A41D0091BE3547C5A

References

[1] J. Walton, Cryptographic Interoperability: Keys, April 2008, CryptoInteropKeys.aspx.

[2] FIPS 186-3 Draft, Digital Signature Standard, http://csrc.nist.gov/publications/drafts/fips_186-3/Draft-FIPS-186-3%20_March2006.pdf.

[3] Unicode Consortium, UTF-16 for Processing, Unicode Technical Note #12, http://unicode.org/notes/tn12/.

[4] Unicode Consortium, A Survey of Unicode Compression, Unicode Technical Note #14, http://unicode.org/notes/tn14/.

[5] Unicode Consortium, Fast Compression Algorithm for Unicode Text, Unicode Technical Note #31, http://unicode.org/notes/tn31/.

[6] D. Schmitt, International Programming for Microsoft Windows, Microsoft Press, ISBN 1-5723-1956-9.

[7] MSDN, WideCharToMultiByte, http://msdn2.microsoft.com/en-us/library/ms776420(VS.85).aspx.

[8] Java Cryptography Architecture (JCA) Reference Guide, http://java.sun.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html.

[9] IEEE P1363, Standard Specifications For Public-Key Cryptography.

[10] RFC 2440, OpenPGP Message Format, November 1998, http://www.ietf.org/rfc/rfc2440.txt.

[11] FIPS 186-2, Digital Signature Standard, January 2000, http://csrc.nist.gov/publications/fips/fips186-2/fips186-2-change1.pdf.

[12] A. Menezes, et al., Handbook of Applied Cryptography, CRC Press, ISBN 0-8493-8523-7, pp. 451-2.

[13] RFC 3275, XML-Signature Syntax and Processing, March 2002, http://www.ietf.org/rfc/rfc3275.txt.

[14] RFC 3279, Algorithms and Identifiers for the Internet X.509 Public Key Infrastructure, April 2002, http://www.ietf.org/rfc/rfc3279.txt.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)