Introduction
This article presents a major code update to the initial release, based largely on readers' suggestions.
I have removed the Win32 calls in CalcMD5FromFile() and replaced them with UNIX-compatible fopen(), fread(), etc. I believe the code is now fully portable.
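As a rough illustration of what reading a file with only fopen() and fread() looks like, here is a minimal sketch. It is not the code from CalcMD5FromFile(); the helper name CountBytesChunked() and its return convention are made up for this example, and it only counts bytes where a hash routine would process them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of DCipher): reads a file in fixed-size
// chunks using only the C standard library. Returns the total number of
// bytes read, or -1 if the file cannot be opened.
long CountBytesChunked(const char* path, std::size_t chunkSize)
{
    assert(path != nullptr && chunkSize > 0);

    // "rb" opens in binary mode so no newline translation takes place.
    std::FILE* fp = std::fopen(path, "rb");
    if (fp == nullptr)
        return -1;

    std::vector<unsigned char> buffer(chunkSize);
    long total = 0;
    std::size_t got = 0;

    // fread() returns the number of bytes actually read; it returns 0 at
    // end-of-file (or on error), which ends the loop.
    while ((got = std::fread(buffer.data(), 1, buffer.size(), fp)) > 0)
    {
        // A hash routine would process buffer[0 .. got) at this point.
        total += static_cast<long>(got);
    }

    std::fclose(fp);
    return total;
}
```

Because nothing here depends on Win32, the same source should build on any platform with a standard C++11 compiler.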
This update replaces the previous encryption scheme with a Rijndael block cipher implementation that wraps a class by Szymon Stefanek. The Rijndael implementation I am using is a 128-bit CBC block cipher with a 256-bit key. The DCipher class wraps this in a stream cipher class, which encrypts files in data chunks. You can set the chunk size by calling SetChunkSize(int numKB) before encrypting. If you don't set the size, it will use a default chunk size of 65 KB. There are a few items to note here:
- The larger the chunk size, the more secure the encryption. Ideally you would encrypt a file in a single data chunk to take full advantage of the cipher's block chaining.
- However, you must ensure adequate available memory. Both an input and an output memory block must be allocated, so to use a 100K data chunk you must have 200K + 16 bytes of memory available. While the heap can use virtual memory, DCipher validates the chunk size against available physical memory to prevent sensitive data from being written to disk.
- Finally, you must use the same chunk size for decryption that you used for encryption, or the data will be corrupted.
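The memory rule in the second item can be written out as simple arithmetic. The sketch below is only an illustration: RequiredBufferBytes() and ChunkFits() are hypothetical helpers, not DCipher methods, and it assumes 1K means 1024 bytes.

```cpp
#include <cassert>
#include <cstddef>

// The extra 16 bytes mentioned in the text (one Rijndael block).
const std::size_t kRijndaelBlockBytes = 16;

// Hypothetical helper: memory needed to process one chunk, following the
// rule "input buffer + output buffer + 16 bytes".
std::size_t RequiredBufferBytes(int numKB)
{
    assert(numKB > 0);
    // Assumption: 1K = 1024 bytes.
    const std::size_t chunkBytes = static_cast<std::size_t>(numKB) * 1024;
    return 2 * chunkBytes + kRijndaelBlockBytes;
}

// Hypothetical helper: true when a chunk of numKB fits in the given
// amount of free memory (how that amount is obtained is platform specific).
bool ChunkFits(int numKB, std::size_t availableBytes)
{
    return RequiredBufferBytes(numKB) <= availableBytes;
}
```

Under these assumptions, a 100K chunk needs 2 * 102400 + 16 = 204816 bytes.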
I've also changed the interface: the class methods now accept plain-text passwords, which the class converts to a 256-bit hex Rijndael session key.
Also, you no longer initialize the session in the constructor; everything is passed directly to the functions. To encrypt or decrypt, you simply pass a full file path, a password, and a 19-element integer array (the 19-byte signature). See the excerpt below:
int EncryptFile(const char *src, const char *pwd, int head[19]);
int DecryptFile(const char *src, const char *pwd, int head[19]);
char* GetSignature(const char *file);
void SetChunkSize(int numKB);
char* CalcMD5FromString(const char *s8_Input);
char* CalcMD5FromFile (const char *s8_Path);
The integer array lets you give multiple apps unique signatures so that, if you are using app-generated keys, one app cannot corrupt another app's file. It also lets you give individual files unique signatures (e.g., a timestamp; see the demo app code for an example) if you want to use file hashing to check for modification.
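One possible way to build such an array is sketched below. This is not the demo app's code: MakeSignature() is a hypothetical helper, and the layout (a 5-digit application ID followed by a 14-digit YYYYMMDDhhmmss timestamp, one decimal digit per element) is an assumption chosen only because it adds up to 19 elements.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: fills a 19-element signature with a 5-digit
// application ID followed by a 14-digit timestamp (YYYYMMDDhhmmss).
// Assumption: each array element holds a single decimal digit (0-9).
void MakeSignature(int appId, int year, int month, int day,
                   int hour, int minute, int second, int head[19])
{
    assert(appId >= 0 && appId <= 99999);
    assert(year >= 0 && year <= 9999);
    assert(month >= 1 && month <= 12);
    assert(day >= 1 && day <= 31);
    assert(hour >= 0 && hour <= 23);
    assert(minute >= 0 && minute <= 59);
    assert(second >= 0 && second <= 59);

    // 19 digits plus the terminating null character.
    char digits[20];
    std::snprintf(digits, sizeof(digits), "%05d%04d%02d%02d%02d%02d%02d",
                  appId, year, month, day, hour, minute, second);

    // Convert each character to its numeric value.
    for (int i = 0; i < 19; ++i)
        head[i] = digits[i] - '0';
}
```

With this layout, the first five elements identify the application and the remaining fourteen make the signature unique per file, as long as two files are not signed within the same second.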
The GetSignature() method was added to support hash-based modification checks. You could store a file's signature, its plaintext file hash, and its encrypted file hash in a database, then use either the signature or the file hash as the look-up key. Either way, you can verify whether the file was modified after encryption (if so, it won't decrypt correctly) and then restore a backup or notify the user of an invalid data file. You could also compare plaintext file hashes after decryption to ensure the file was decrypted properly.
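The look-up idea could be sketched roughly as follows, with an in-memory std::map standing in for the database. Everything here (FileRecord, HashStore, CheckEncryptedFile()) is hypothetical and not part of DCipher; in a real app the signature would presumably come from GetSignature() and the hash from CalcMD5FromFile().

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record: the hashes stored when the file was encrypted.
struct FileRecord
{
    std::string plainHash;      // hash of the file before encryption
    std::string encryptedHash;  // hash of the file after encryption
};

// Stand-in for a database table, keyed by the file's signature string.
typedef std::map<std::string, FileRecord> HashStore;

enum CheckResult { kUnknownFile, kModified, kIntact };

// Compares the current hash of an encrypted file with the stored one.
CheckResult CheckEncryptedFile(const HashStore& store,
                               const std::string& signature,
                               const std::string& currentEncryptedHash)
{
    assert(!signature.empty());

    HashStore::const_iterator it = store.find(signature);
    if (it == store.end())
        return kUnknownFile;    // no record for this signature

    return (it->second.encryptedHash == currentEncryptedHash)
               ? kIntact        // safe to attempt decryption
               : kModified;     // restore a backup or warn the user
}
```

After a successful decryption, the same record's plainHash can be compared with a fresh hash of the decrypted file.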
Also, Rijndael requires the data length to be an exact multiple of 16 bytes. Padding characters are added if needed to meet this requirement. The DCipher class records the amount of padding in the final two bytes of the signature (making a total tag of 25 bytes, as DCipher also uses an internal 4-byte signature to check file status). On decryption, DCipher reads this value and strips the padding bytes, so the decrypted file exactly matches the original file.
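The padding arithmetic can be illustrated with a short sketch. These helpers are hypothetical and are not taken from DCipher; they assume that no padding is added when the length is already a multiple of 16, which matches "added if needed" above.

```cpp
#include <cassert>
#include <cstddef>

// Rijndael block size in bytes.
const std::size_t kBlockSize = 16;

// Number of padding bytes needed to reach the next multiple of 16.
// Assumption: already-aligned data gets no padding (result 0).
std::size_t PaddingNeeded(std::size_t dataBytes)
{
    return (kBlockSize - dataBytes % kBlockSize) % kBlockSize;
}

// Size of the data once padding has been appended.
std::size_t PaddedSize(std::size_t dataBytes)
{
    return dataBytes + PaddingNeeded(dataBytes);
}

// Size to restore on decryption, given the recorded padding count.
std::size_t OriginalSize(std::size_t paddedBytes, std::size_t padCount)
{
    assert(padCount < kBlockSize && padCount <= paddedBytes);
    return paddedBytes - padCount;
}
```

For example, a 100-byte file needs 12 padding bytes to reach 112 bytes, and stripping those 12 bytes after decryption restores the original 100.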
Using the code
Using this code is simple. The demo download demonstrates using this class in an MFC app.
To use the code, copy DCipher.h, DCipher.cpp, rijndael.h, and rijndael.cpp to your project folder, then add them to your project using Project/Add To Project/Files. Then, in the class file that will call these functions (usually a view class or dialog class), add #include "DCipher.h" at the top of the file with your other include statements.
I'm going to describe using this in an MFC app, since non-MFC users should already understand a const char*.
The EncryptFile() and DecryptFile() methods require that you pass the full filename (including path) along with a password and your signature array. Normally you would declare your signature array globally or generate it inside a function, and retrieve your file path from a CFileDialog, from the CDocument class's GetPathName(), from the application class's m_lpCmdLine variable if the file is being opened from a double-click or command-line request, or from the HDROP handle if it's a drag/drop operation. In any case, I will hard-code them in this example for simplicity and readability. Your code to encrypt a file would then look like this:
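int sig1[] = {1,1,0,0,4,8,6,0,2,9,4,1,1,0,5,6,0,7,3};
CString szFilename = "c:\\somepath\\somefile.ext";
CString szPass = "SomePassKey";
DCipher dc;
int nResult = dc.EncryptFile((LPCTSTR)szFilename, (LPCTSTR)szPass, sig1);
if(nResult < 1)
{
    // nResult < 1: handle this case here (presumably a failure)
}
else
{
    // otherwise: continue (presumably the file was encrypted)
}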
Decrypting the file would use the exact same code, except you would call DecryptFile() instead of EncryptFile().
CalcMD5FromString() returns a mixed-case hash from string input, which is pretty self-explanatory.
CalcMD5FromFile() returns a 32-character uppercase hash of a file. This doesn't just hash the string value of the filename; it hashes the file contents and is used to check whether a file has been modified. For that reason, you do not want to use this as an encryption key: the hash changes once the file is encrypted, so you would never be able to decrypt the file unless the key was stored somewhere (which I think is a bad idea).
As an example, calling these methods in an MFC app would look like this:
DCipher dc;
// Hash of the file contents (uppercase hex)
CString szFileHash = dc.CalcMD5FromFile((LPCTSTR)szFullFilePath);
// Hash of a text string (mixed-case hex)
CString szStringHash = dc.CalcMD5FromString((LPCTSTR)szTextString);
That's all there is to it. It's an extremely easy class to use. It will work on any kind of file, including images. I have tested this class on both Win2000 and XP (always compiled with VC6), with files ranging from 1K to 300K. I have not tested with extremely large files (say 2GB), but it is very fast on the files I have tested.
I don't know of any compiler issues. Should you find anything that needs to be corrected or improved, or should you make significant modifications to the code, please pass those on to me so that I can include them in future updates. I will gladly credit you for your contributions.
Disclaimers
I should state that I did not author the MD5 code. I found it in the comments section of a CodeProject article, posted by a user with the alias Elmue. As he didn't claim authorship of the code, I'm assuming it may have originally come from another source. If anyone knows who the rightful author is, I would be happy to give credit here.
I only made minor modifications:
- Originally, the file hash and string hash methods used the same string conversion function, which returned an uppercase hex string. I added a separate return function for the string hash method that returns a mixed-case hex string, which makes for a stronger encryption key.
- I replaced the Win32 file reading functions with UNIX-compatible functions for portability.
History
- 2005.Jun.12
Initial CodeProject release.
- 2005.Jun.19
Changed Interface, added user passwords, added Rijndael algorithm.
- 2005.Jun.21
Removed Win32 calls in MD5 - hopefully fully portable now.
- 2005.Jul.5
Added GetSignature(), SetChunkSize() and memory validation.