SharpZipLib or DotNetZip... Which Should You Use?

5 Aug 2012 · CPOL · 9 min read
A comparison of these two free zip libraries, complete with a multi-threaded VB.NET class wrapper for each

Introduction

This example project comes with two classes that you might be interested in - clsSharpZipLib and clsDotNetZip. These are wrappers for the core functionality of the two libraries, because as user-friendly as they try to be (especially DotNetZip), there is still quite a bit of code that goes into using them. If you're like me, you won't want to rewrite it for every project you work on. Both libraries also block while they work, so you'll need to create some threads if you don't want to lock up your UI... and then there's the issue of tracking progress.

These classes handle it all for you. They offer a handy callback you can use to track progress, get error messages, etc. This example project shows you how to use it all.

As nice as this is, it's really not the point of this article though. This article is a comparison of the two zip libraries, because it seems that while there are quite a few opinions on them out there, the hard data I found was either misleading or just plain wrong. Using this example project, you will be able to determine for yourself which library is right for you.

Background

Recently, I found myself needing to add limited zip functionality to the project I was working on. My application needed to zip files as fast as possible, but would never be called upon to open the zip files it created. Years ago, I had written a wrapper for #ZipLib, and so I dug it out - and looked on in chagrin at what I'd produced back then... it needed a rewrite.

So I found myself rewriting this wrapper... and struggling to figure out why #ZipLib was SO SLOW. At first, I thought it was my code, but Googling for a solution I discovered lots of people asking about the same thing. It seems that this is just how #ZipLib is. If you use it to compress your files (as opposed to adding them to a zip archive uncompressed), expect it to take about twice as long as WinRAR or WinZip, even at the lowest compression level.

I found some reviews on DotNetZip - the other free solution, and was disappointed to read that even though people thought it was much easier to work with, it was a little slower. Luckily, this wasn't true.

In fact, I eventually found a Stack Overflow page where one of the posters wrote about DotNetZip's ParallelDeflateOutputStream class. This class uses multiple threads to compress files, increasing the compression speed by using all the cores in your system.
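
For readers who haven't met this class, here is a minimal sketch of driving ParallelDeflateOutputStream directly. It assumes a reference to Ionic.Zip.dll; the module name, method name, and 64 KB buffer size are my own choices for illustration and do not come from the example project. Note that the stream writes raw DEFLATE data, not a zip archive.

VB.NET
Imports System.IO
Imports Ionic.Zlib

Module ParallelDeflateSketch

    ' Compresses sourcePath into a raw DEFLATE stream at targetPath.
    ' The result is NOT a zip archive; it has no zip headers or directory.
    Public Sub CompressFile(ByVal sourcePath As String, ByVal targetPath As String)
        Using input As FileStream = File.OpenRead(sourcePath)
            Using output As FileStream = File.Create(targetPath)
                Using deflater As New ParallelDeflateOutputStream(output, _
                                          Ionic.Zlib.CompressionLevel.BestSpeed)
                    Dim buffer(64 * 1024 - 1) As Byte
                    Dim bytesRead As Integer = input.Read(buffer, 0, buffer.Length)

                    ' Each Write hands data to the stream, which compresses
                    ' blocks on background threads.
                    Do While bytesRead > 0
                        deflater.Write(buffer, 0, bytesRead)
                        bytesRead = input.Read(buffer, 0, buffer.Length)
                    Loop
                End Using
            End Using
        End Using
    End Sub

End Module

As far as I can tell from the DotNetZip documentation, you rarely need to do this by hand: when you save through the ZipFile class, the library switches to parallel compression on its own for files above a size threshold.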

After much testing, the project I built to refine and test the wrapper class I would eventually use became this - an application designed to compare the performance of each library.

If you choose to run this app, you will be able to create zip files, list their contents (in a primitive way), choose files to extract, set compression levels (if you choose to use compression at all), set the password, choose to use Zip64 (or not), and most importantly choose which of the two libraries to use to perform your zip operation by clicking a check box.

Every operation is timed, and the results are displayed after the zip / unzip is complete.

I developed this test application on my Windows 7 x64 quad-core laptop. It has 4 GB of RAM and a 5400 RPM hard drive. During my testing, I used a source folder containing almost 7000 files totalling 873 MB.

On this machine, DotNetZip uses all 4 cores while compressing large files, and compresses my test source folder in less than half the time it takes #ZipLib. The actual average times are:

  • #ZipLib: Compression level 1, creates a 684 MB file, completes in 1 minute, 45 seconds.
  • DotNetZip: Compression level 1, creates a 690 MB file, completes in 45 seconds.

I think it's interesting to note here that it takes WinRAR 1 minute and 5 seconds, on average, to compress these same files.

When it comes to extracting these files, #ZipLib beats DotNetZip by about 15 seconds.

Again, these are average times. It was hard for me to write 1 minute, 45 seconds here for #ZipLib, because there were test runs where it actually took almost 3 minutes. The times listed above were observed after repeated zipping and extracting, when the Windows file cache was working as well as it possibly can.

I realize that my testing is just that - my testing, run on my hardware - and that these numbers will be different elsewhere.

If you're interested, I invite you to download the example project and do some testing of your own. If you choose to post your results here, we'll all have a better understanding of how these two libraries stack up against each other.
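
If you'd rather take a quick measurement without the example app, something like the following sketch may be enough. It calls DotNetZip directly (reference Ionic.Zip.dll) and times a level 1 compression of a folder; the module and function names are hypothetical, and this is not the timing code the example project uses.

VB.NET
Imports System
Imports System.Diagnostics
Imports Ionic.Zip

Module ZipTimingSketch

    ' Zips sourceFolder into zipPath at compression level 1
    ' and returns how long the whole operation took.
    Public Function TimeDotNetZip(ByVal sourceFolder As String, _
                                  ByVal zipPath As String) As TimeSpan
        Dim watch As Stopwatch = Stopwatch.StartNew()

        Using zip As New ZipFile()
            zip.CompressionLevel = Ionic.Zlib.CompressionLevel.BestSpeed ' level 1
            zip.UseZip64WhenSaving = Zip64Option.AsNecessary
            zip.AddDirectory(sourceFolder)  ' includes subfolders
            zip.Save(zipPath)               ' the compression happens here
        End Using

        watch.Stop()
        Return watch.Elapsed
    End Function

End Module

Run it several times and compare the first (cold cache) run against the later ones; as noted above, the file cache makes a large difference.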

I realize that most people arriving at this page probably came here for a quick and easy way to add zip functionality to their VB.NET app, so I took the time to separate the functionality of each library into its own wrapper class. If you decide you want to use DotNetZip, just copy clsDotNetZip into your project, add a reference to Ionic.Zip.dll, have a look at the example project (or a look below - it's very simple) to see how to implement it, and you're off. For #ZipLib, it's clsSharpZipLib and ICSharpCode.SharpZipLib.dll.

Using the Code

Instantiation

Instantiating one of the classes looks like this:

VB.NET
Dim zipLib As clsSharpZipLib = New clsSharpZipLib(zipPath, _
                                                  clsSharpZipLib.ZipAccessFlags.Create, _
                                                  1024 * 1024, _
                                                  nudCompression.Value, _
                                                  cbZip64.Checked, _
                                                  tbPassword.Text, _
                                                  100, _
                                                  AddressOf zipCallback)

Now, I think this is pretty straightforward. But then I wrote the class, so I would - so I'll explain what we have here.

  • zipPath - is a string containing the path of the zip file you want to open, or the location of one you would like to create.
  • clsSharpZipLib.ZipAccessFlags.Create - This is a public enum you'll find in the class. It's a file access flag - it tells the class what you'll be doing with the zip file.
  • 1024 * 1024 - This is the size of the buffer you'd like this class to work with.
  • nudCompression.Value - This is the NumericUpDown control I use to specify the compression level in the example project. Valid values are 0 - 9.
  • cbZip64.Checked - Yep - This is a checkbox control. If you've checked it, then you'll be compressing using Zip64.
  • tbPassword.Text - Self explanatory.
  • 100 - This is the callback update speed in milliseconds. I'm passing the value 100 here, so the callback in this example project will fire once every 100 milliseconds, containing data you can use to update your user interface about the current operation.
  • AddressOf zipCallback - zipCallback is the address of the callback Sub in the example project. All the good stuff happens off the UI thread, so if you want information about how your zip operation's going, you'll have to supply one of these.
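
To make the last two parameters more concrete, here is a rough sketch of how a worker can report progress through a callback no more often than a given interval. Everything in it (ThrottledWorker, ProgressCallback, the fake 1 MB "copy" step) is hypothetical and only illustrates the idea; it is not the code inside clsSharpZipLib or clsDotNetZip.

VB.NET
Imports System
Imports System.Diagnostics
Imports System.Threading

Public Class ThrottledWorker

    Public Delegate Sub ProgressCallback(ByVal bytesDone As Long, ByVal bytesTotal As Long)

    Private ReadOnly report As ProgressCallback
    Private ReadOnly intervalMs As Integer

    Public Sub New(ByVal intervalMs As Integer, ByVal report As ProgressCallback)
        Me.intervalMs = intervalMs
        Me.report = report
    End Sub

    ' Starts the work on a background thread so the caller (the UI) is not blocked.
    Public Sub Start(ByVal bytesTotal As Long)
        Dim worker As New Thread(New ThreadStart(Sub() DoWork(bytesTotal)))
        worker.IsBackground = True
        worker.Start()
    End Sub

    Private Sub DoWork(ByVal bytesTotal As Long)
        Dim sinceLastReport As Stopwatch = Stopwatch.StartNew()
        Dim bytesDone As Long = 0

        Do While bytesDone < bytesTotal
            ' Stand-in for compressing or copying one buffer of data.
            bytesDone = Math.Min(bytesTotal, bytesDone + 1024 * 1024)

            ' Report at most once per interval, however fast the loop runs.
            If sinceLastReport.ElapsedMilliseconds >= intervalMs Then
                report(bytesDone, bytesTotal)
                sinceLastReport.Restart()
            End If
        Loop

        ' Always send one final update so the UI can show 100%.
        report(bytesTotal, bytesTotal)
    End Sub

End Class

The callback runs on the worker thread, which is why the receiving Sub in a form has to check InvokeRequired before touching any controls.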

In the example project, I track the overall progress in bytes, and the progress of the current file being processed in bytes. Doing it like this just seems like the right way to do it, and makes for smooth and accurate progress bars.

The Callback

This is what the callback sub in the example app looks like:

VB.NET
Private Sub zipCallback(ByRef zipData As clsCompareLibs.ZipData)
 
        Static lastName As String
 
        If Me.InvokeRequired Then
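            ' Called from the worker thread: re-run this Sub on the UI thread.
            ' (callback is a delegate for zipCallback, declared elsewhere in the form.)
            Me.Invoke(callback, zipData)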
        Else
 
            With zipData
                If .fileList IsNot Nothing AndAlso .fileList.Count > 0 Then
                    ' We've received a list of files. Add them to the listbox.
                    Dim names As New List(Of String)
                    currentEntries.Clear()
 
                    For Each entry As clsCompareLibs.ShortEntry In .fileList
                        names.Add(entry.name)
                        currentEntries.Add(entry)
                    Next
 
                    Me.lbFileList.Items.AddRange(names.ToArray())
 
                    Me.lblFileName.Text = "Complete."
 
                    Try
                        zipLib.Close()
                    Catch ex As Exception
                        ' Nothing useful to do if Close() fails; the listing is already on screen.
                    End Try
 
                    me.Cursor = System.Windows.Forms.Cursors.Default
                Else
                    ' We're updating the UI with progress data here.

                    ' Have we moved on to a new file?
                    If lastName <> zipData.currentFileName Then
                        ' If so, set the progress bar to 0.
                        pbCurrentFile.Value = 0 
                        lastName = zipData.currentFileName
                    End If
 
                    lblFileName.Text = .operationTitle
                    If .currentFileName <> "" Then lblFileName.Text += ": ...\" & _
                                              Path.GetFileName(.currentFileName)
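                    ' Guard both divisors, and convert explicitly (required with Option Strict On).
                    If .currentFileBytesCopied > 0 AndAlso .currentFileLength > 0 _
                                                   AndAlso .totalBytes > 0 Then
                        pbCurrentFile.Value = CInt((.currentFileBytesCopied / .currentFileLength) * 100)
                        pbTotalBytes.Value = CInt((.totalBytesCopied / .totalBytes) * 100)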
 
                        pbCurrentFile.Refresh()
                        pbTotalBytes.Refresh()
                    End If
 
                    If .complete Then
                        zipLib.Close()
                        If .cancel Then 
                            lblFileName.Text = "Canceled."
                            pbCurrentFile.Value     = 0
                            pbTotalBytes.Value      = 0
                        Else
                            endTime = Now
                            If endTime.Subtract(startTime).TotalSeconds > 60 then
                                lblFileName.Text = "Complete. This operation took " & _
                                    endTime.Subtract(startTime).Minutes.ToString() & _
                                    " minutes, and " & endTime.Subtract(startTime).Seconds.ToString() _
                                    & " seconds."
                            Else
                                lblFileName.Text = "Complete. This operation took " & _
                                    endTime.Subtract(startTime).TotalSeconds.ToString("N1") & _
                                    " seconds."
                            End If
                        End If
                        
                        tsbZipFiles.Visible         = True
                        tsbZipFiles.Enabled         = True
                        tsbListZipEntries.Visible   = True
                        tsbCancel.Visible           = False
                        me.Cursor = System.Windows.Forms.Cursors.Default
                    End If
 
                    If .errorMessage <> "" Then 
                        MsgBox("" & .errorMessage, MsgBoxStyle.Critical, "Zip Example App")
                        me.Cursor = System.Windows.Forms.Cursors.Default
                    End If
                End If
            End With
        End If
 
    End Sub 

You'll see that each time the callback fires, you get the state of the current operation. Everything you need to track progress is there - the current file's name, the number of bytes currently transferred for that file, the total number of bytes being copied, the current total transferred, the title of the operation (i.e.: "Extracting", or "Zipping"), error messages, etc.
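
The InvokeRequired check at the top of the callback is the part that most often trips people up, so here it is in isolation. This is a stripped-down sketch with made-up names (ProgressForm, ProgressHandler, pbTotal); the callback field plays the same role as the callback variable passed to Me.Invoke in the listing above, but I am assuming its declaration rather than copying it from the example project.

VB.NET
Imports System
Imports System.Windows.Forms

Public Class ProgressForm
    Inherits Form

    Private Delegate Sub ProgressHandler(ByVal percent As Integer)

    Private ReadOnly pbTotal As New ProgressBar()

    ' Created once and reused every time a call has to be marshaled.
    Private ReadOnly callback As ProgressHandler

    Public Sub New()
        callback = AddressOf OnProgress
        pbTotal.Dock = DockStyle.Top
        Controls.Add(pbTotal)
    End Sub

    ' Safe to call from any thread.
    Public Sub OnProgress(ByVal percent As Integer)
        If Me.InvokeRequired Then
            ' We are on a worker thread: run this same Sub on the UI thread.
            Me.Invoke(callback, percent)
        Else
            ' We are on the UI thread: touching controls is allowed here.
            pbTotal.Value = Math.Max(0, Math.Min(100, percent))
        End If
    End Sub

End Class

Invoke blocks the worker until the UI thread has processed the update. If that ever becomes a bottleneck, BeginInvoke is the non-blocking alternative.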

Specifying Files to be Zipped / Unzipped

I tried to make the class interface as much like a generic list as possible. To add files to a zip file, you use the Add() method. Add() accepts either a String or a List(Of String), and each entry can be the path of a single file or of a folder. If you're passing a folder, you can also specify whether you want the class to recurse into subdirectories.

To get a list of entries in the zip file, use ListZipEntries(). The method itself doesn't return the list: like everything else in this class, the work happens off the UI thread, and the list arrives in the callback as zipData.fileList. The first branch of the callback above shows how to retrieve it.

To extract files from a zip, you call - you guessed it - Extract(). The Extract() method has three overloads: you can pass it a single String containing the entry to be extracted and a target folder, a List(Of String) containing entries to be extracted and a target folder, or a List(Of ShortEntry) and a String containing the target folder.

A List(Of ShortEntry) is what you get back when you call ListZipEntries(). ShortEntries are just structures containing the entry name, its size, and its index. Passing Extract() a List(Of ShortEntry) will improve performance.
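
For comparison, this is roughly what listing and extracting look like when you call DotNetZip directly, without the wrapper, its threading, or its callback. It is a sketch only: the module and method names are mine, it needs a reference to Ionic.Zip.dll, and it does not handle password-protected entries.

VB.NET
Imports System
Imports Ionic.Zip

Module ExtractSketch

    ' Prints each entry's name and size, then extracts it to targetFolder.
    ' Runs on the calling thread, so it will block a UI if called from one.
    Public Sub ListAndExtract(ByVal zipPath As String, ByVal targetFolder As String)
        Using zip As ZipFile = ZipFile.Read(zipPath)
            For Each entry As ZipEntry In zip
                Console.WriteLine("{0} ({1} bytes)", entry.FileName, entry.UncompressedSize)

                ' Existing files in targetFolder are overwritten without asking.
                entry.Extract(targetFolder, ExtractExistingFileAction.OverwriteSilently)
            Next
        End Using
    End Sub

End Module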

But Isn't There More?

I'm sure you're aware that these libraries do more than just zip and unzip files. DotNetZip alone can create self-extracting zip files, break zip files up into parts, and much more. I didn't try to wrap all the functionality of each library... it would have taken me forever - and remember, all I needed to start with was a way to quickly zip some files for my current project. If you want more functionality than this, you'll need to add it yourself.

Points of Interest

The relevance of these two libraries, and of this article, may actually be in question. As of .NET 4.5, Microsoft includes a ZipArchive class as part of the framework. I had a quick look at it on MSDN, and as of this writing it isn't anywhere near as flexible as DotNetZip, though I'm sure that will change over time. What may keep these libraries relevant is superior performance and functionality that Microsoft doesn't offer, though I think this will put #ZipLib out of the running, as its compression performance isn't great and it seems to have been abandoned by its developer.

I guess we'll see what happens.
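
If you want to try the framework route yourself, the simplest entry point I know of is the static ZipFile helper class that ships alongside ZipArchive. The sketch below assumes a project targeting .NET 4.5 with references to System.IO.Compression and System.IO.Compression.FileSystem; the module and method names are hypothetical.

VB.NET
Imports System.IO.Compression

Module FrameworkZipSketch

    ' Zips the contents of sourceFolder using the fastest compression level.
    ' The final argument (False) leaves the folder's own name out of the entry paths.
    Public Sub ZipFolder(ByVal sourceFolder As String, ByVal zipPath As String)
        ZipFile.CreateFromDirectory(sourceFolder, zipPath, CompressionLevel.Fastest, False)
    End Sub

    ' Extracts every entry in the archive to targetFolder.
    Public Sub UnzipTo(ByVal zipPath As String, ByVal targetFolder As String)
        ZipFile.ExtractToDirectory(zipPath, targetFolder)
    End Sub

End Module

Both calls block and neither reports progress, so the threading and callback concerns discussed in this article still apply. As far as I can see, there is also no password option, which is one of the gaps that keeps the third-party libraries interesting.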

History

  • 08/04/2012 - Fixed a crash if the list zip entries button is clicked without a zip file selected
  • 08/05/2012 - Rebuilt the project with Option Explicit and Option Strict on

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)