Introduction
This article discusses string compression with optional encryption, implemented in pure VB.NET code with no external tools required.
It can easily be integrated into existing projects. As the code is kept simple, it's suitable for beginners, and converting it to C# is straightforward.
Background
In need of a routine to quickly and safely deflate and inflate big strings, I searched the net for a solution. A comprehensive set of functionality didn't show up, so I decided to write this class module, which encapsulates everything needed to complete the task.
Using the Code
Although strings of any length can be processed, compressing short strings (e.g. 'Hello World!') is counterproductive, as the compressed result is even bigger than the original. The CompressionRatio property of the class tells you how effective the compression was. You can then decide whether to use the compressed string; if so, a prefix and suffix can be applied to it automatically, to distinguish compressed from uncompressed content afterwards.
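To illustrate why short strings grow, here is a minimal sketch that measures the effect with the built-in GZipStream. The helper names GZipBytes and Ratio are hypothetical, and the ratio is assumed to be compressed size divided by original size; the class's CompressionRatio property may be defined differently.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module RatioSketch
    ' Hypothetical helper: GZip-compresses a byte array in memory.
    Function GZipBytes(ByVal data As Byte()) As Byte()
        Using ms As New MemoryStream()
            Using gz As New GZipStream(ms, CompressionMode.Compress, True)
                gz.Write(data, 0, data.Length)
            End Using ' closing the GZipStream writes the final block and trailer
            Return ms.ToArray()
        End Using
    End Function

    ' Assumed definition: compressed size divided by original size.
    Function Ratio(ByVal plain As String) As Double
        Dim raw As Byte() = Encoding.UTF8.GetBytes(plain)
        If raw.Length = 0 Then Return 0.0
        Return GZipBytes(raw).Length / raw.Length
    End Function

    Sub Main()
        ' A value above 1 means the "compressed" data is bigger than the input.
        Console.WriteLine(Ratio("Hello World!"))
        ' Long, repetitive input compresses well, so the value is far below 1.
        Console.WriteLine(Ratio(New String("a"c, 10000)))
    End Sub
End Module
```

The GZip format adds a fixed header and trailer of 18 bytes in total, which alone is more than the 12 bytes of 'Hello World!'.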
Process overview:
Plain text -> byte array -> GZip compression -> (optional) encryption -> Base64 string = shrunk text
Shrunk text -> byte array -> (optional) decryption -> GZip decompression -> string = plain text
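The two chains above can be sketched roughly as follows, leaving out the optional encryption step. This is not the class's actual code: Shrink and Expand are hypothetical names, and only standard .NET calls (GZipStream, Convert.ToBase64String, Convert.FromBase64String) are used.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module PipelineSketch
    ' Plain text -> byte array -> GZip -> Base64 string.
    Function Shrink(ByVal plain As String, ByVal enc As Encoding) As String
        Dim raw As Byte() = enc.GetBytes(plain)
        Using ms As New MemoryStream()
            Using gz As New GZipStream(ms, CompressionMode.Compress, True)
                gz.Write(raw, 0, raw.Length)
            End Using
            ' Optional encryption of ms.ToArray() would happen here.
            Return Convert.ToBase64String(ms.ToArray())
        End Using
    End Function

    ' Base64 string -> byte array -> GZip decompression -> plain text.
    Function Expand(ByVal shrunk As String, ByVal enc As Encoding) As String
        Dim packed As Byte() = Convert.FromBase64String(shrunk)
        ' Optional decryption of packed would happen here.
        Using source As New MemoryStream(packed)
            Using gz As New GZipStream(source, CompressionMode.Decompress)
                Using target As New MemoryStream()
                    Dim chunk(4095) As Byte
                    Dim count As Integer = gz.Read(chunk, 0, chunk.Length)
                    While count > 0
                        target.Write(chunk, 0, count)
                        count = gz.Read(chunk, 0, chunk.Length)
                    End While
                    Return enc.GetString(target.ToArray())
                End Using
            End Using
        End Using
    End Function

    Sub Main()
        Dim plain As String = New String("a"c, 2000)
        Dim shrunk As String = Shrink(plain, Encoding.UTF8)
        Console.WriteLine(shrunk.Length < plain.Length)          ' True for this repetitive input
        Console.WriteLine(Expand(shrunk, Encoding.UTF8) = plain) ' True: the round trip is lossless
    End Sub
End Module
```

Base64 turns every 3 bytes into 4 characters, so the final string is about a third longer than the compressed byte array. That is another reason short inputs do not benefit.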
The code is simple to use. Here's the quick way to compress a string:
Dim CompStr As New clsCompressedString(System.Text.Encoding.UTF8)
CompStr.UnCompressed = "some large text content..."
MsgBox("The compressed string is: " & CompStr.Compressed)
... and the way back:
Dim CompStr As New clsCompressedString(System.Text.Encoding.UTF8)
CompStr.Compressed = "..."
MsgBox("The uncompressed string is: " & CompStr.UnCompressed)
Error handling is kept to a minimum. The class returns empty strings when fed corrupt data or supplied with the wrong passphrase.
Optional encryption uses the .NET built-in RijndaelManaged class at maximum key length, with simplified usage: you only need to provide a single passphrase for encryption and decryption. The encryption key and IV are derived from the passphrase using SHA256 and MD5 hash values.
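As a rough sketch of how such a passphrase-based setup might look: the mapping below (SHA256 hash as the 256-bit key, MD5 hash as the 128-bit IV) is an assumption based on the hash sizes, not taken from the class itself, and CreateCipher, Encrypt and Decrypt are hypothetical names.

```vbnet
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module CryptoSketch
    ' Assumption: SHA256 (32 bytes) supplies the key, MD5 (16 bytes) the IV.
    Function CreateCipher(ByVal passphrase As String) As RijndaelManaged
        Dim secret As Byte() = Encoding.UTF8.GetBytes(passphrase)
        Dim cipher As New RijndaelManaged()
        cipher.KeySize = 256   ' maximum key length
        cipher.BlockSize = 128 ' the IV must match the block size (16 bytes)
        Using sha As SHA256 = SHA256.Create()
            cipher.Key = sha.ComputeHash(secret)
        End Using
        Using md As MD5 = MD5.Create()
            cipher.IV = md.ComputeHash(secret)
        End Using
        Return cipher
    End Function

    Function Encrypt(ByVal data As Byte(), ByVal passphrase As String) As Byte()
        Using cipher As RijndaelManaged = CreateCipher(passphrase)
            Using transform As ICryptoTransform = cipher.CreateEncryptor()
                Return transform.TransformFinalBlock(data, 0, data.Length)
            End Using
        End Using
    End Function

    Function Decrypt(ByVal data As Byte(), ByVal passphrase As String) As Byte()
        Using cipher As RijndaelManaged = CreateCipher(passphrase)
            Using transform As ICryptoTransform = cipher.CreateDecryptor()
                Return transform.TransformFinalBlock(data, 0, data.Length)
            End Using
        End Using
    End Function

    Sub Main()
        Dim data As Byte() = Encoding.UTF8.GetBytes("some large text content...")
        Dim locked As Byte() = Encrypt(data, "my passphrase")
        ' Decrypting with the same passphrase restores the original bytes.
        Console.WriteLine(Encoding.UTF8.GetString(Decrypt(locked, "my passphrase")))
    End Sub
End Module
```

With this scheme the same passphrase always yields the same key and IV, so identical inputs produce identical ciphertext. Decrypting with a different passphrase typically throws a CryptographicException, which a caller can catch and turn into an empty result.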
The demo project shows all features available.
Points of Interest
With string conversions involved, text encoding has to be addressed properly. Otherwise some or all characters could get messed up in the process of compression/decompression, depending on what content you try to compress/decompress.
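A small sketch of what goes wrong when the encodings do not match: the text is converted to bytes with one encoding and read back with another. RoundTrip is a hypothetical helper, and the sample word is built with ChrW so the source file's own encoding does not matter.

```vbnet
Imports System
Imports System.Text

Module EncodingSketch
    ' Hypothetical helper: string -> bytes with one encoding, bytes -> string with another.
    Function RoundTrip(ByVal plain As String, ByVal writeEnc As Encoding, ByVal readEnc As Encoding) As String
        Return readEnc.GetString(writeEnc.GetBytes(plain))
    End Function

    Sub Main()
        ' German "Groesse" spelled with o-umlaut (U+00F6) and sharp s (U+00DF).
        Dim original As String = "Gr" & ChrW(&HF6) & ChrW(&HDF) & "e"

        ' Same encoding on both sides: the text survives.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8, Encoding.UTF8) = original)  ' True

        ' Mismatched encodings: the non-ASCII characters are lost.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8, Encoding.ASCII) = original) ' False
    End Sub
End Module
```

This is why the constructor shown earlier takes an encoding: the same one has to be used when compressing and when decompressing.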
Why Not Use ICSharpCode.SharpZipLib?
Well, you can easily alter the compression routines in the class to use ZipLib. I experimented with that, and it turned out that ZipLib (0.85.4.369) is only up to 7% more efficient than the built-in GZip. To get this slightly better compression, you have to set ZipLib to the highest compression level (9). But that comes at a price: ZipLib at the highest level is very slow compared to GZip and therefore takes several times longer to compress a huge string. So I prefer GZip for this task, as it is fast, reliable and doesn't require linking to additional DLLs, and I avoid the licensing and security issues that can come with comprehensive third-party code.
Preferences could change when it comes to binary file compression, where ZipLib might outperform GZip, but binary file compression was not the task at hand.
History
- 1st July, 2008: This is the first version. Participate and help to optimize and extend the code.