
#zlib - Modifying Archives

14 May 2007
A way to modify Zip archives without extracting them completely.

Demo app

Introduction

This article briefly explains some classes that let you modify an archive without completely extracting it. My solution is not really beautiful, but it works. I used a mix of the existing C# wrapper code for zlib (together with minizip) and some of my own code. A better way would be to write a wrapper class for libzip, and probably the best way would be to enhance SharpZipLib so that it can modify archives. So feel free to work on that ;-).

Also note that these classes do not support advanced settings such as compression levels or archive comments, because I didn't need them. You could easily add these features, though.

Background

Anyone who needs compression or decompression in C# will most probably use SharpZipLib from IC#Code. So did I, until now. I had been using that library to compress the data my program generates. Fortunately, the output was all of the same type, so I was able to save it in a single file. But now I have different types of data that belong to different parts of the program, so splitting the data up into several files would be a good way to handle this. I wanted to modify this archive in memory, because it is rather ugly to completely extract it.

I searched around for a while and found only an old C# wrapper class by Gerry Shaw. I knew that SharpZipLib was no solution for me, so I downloaded the wrapper and tried to get it working. It had some bugs that caused weird behaviour, which vanished after I replaced all the unsafe parts with normal C# code (you have to know, Gerry's code is a little bit weird ;). Besides that, I adjusted the code heavily to my needs. I replaced his rather hard-to-handle ZipReader and ZipWriter classes with the ZipStream and ZipArchive classes, which make reading from and writing to an archive much easier. It kind of hides the ugliness of the unmanaged zlib functions :P.

Using the code

I have made modifying archives (hopefully) very simple. You just need to instantiate a ZipArchive, which gives you access to an array of streams.

C#
// Don't forget to add these using directives
using System;
using System.IO;
using OrganicBit.Zip;

// Open the archive for reading and writing
ZipArchive Archive = new ZipArchive("test.zip", FileAccess.ReadWrite);

// Read the whole entry into a buffer
byte[] Data = new byte[Archive["test.txt"].Length];

Archive["test.txt"].Read(Data,0,Data.Length);

Console.WriteLine("Contents of 'test.txt'\n"+ 
        System.Text.Encoding.Default.GetString(Data));
Console.WriteLine("Contents of the new file:");

// Replace the entry with the text typed by the user
Data = System.Text.Encoding.Default.GetBytes(Console.ReadLine());

Archive["test.txt"].Write(Data,0,Data.Length);

// Close the archive when you are done
Archive.Close();

Note that when test.zip does not exist, you have to open it with FileAccess.Write; a new archive is then created automatically. Likewise, when you write to an existing archive and the file is not in the archive yet, it is created.

To see what is in an archive, you can simply enumerate your instance:

C#
// Besides the Name, a ZipEntry
// instance also offers lots of other information
foreach(ZipEntry entry in Archive) 
  Console.WriteLine(entry.Name);

You can also check whether a file exists, or delete one:

C#
if(Archive.Contains("test.txt"))
  Console.WriteLine("File test.txt is there");

// Archive.Delete() deletes the whole archive
// when it doesn't contain any other files
bool success = Archive.DeleteFile("test.txt");
if(success) 
  Console.WriteLine("File has been deleted");
else
  Console.WriteLine("File is not in archive");
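
Since minizip itself has no delete operation (see Points of interest), removing an entry comes down to rewriting the archive without it. The following is only a rough sketch of that idea, not the code from the download: it assumes a rewrite-and-replace approach, uses .NET's built-in System.IO.Compression instead of the zlib wrapper, and recompresses each entry instead of copying the raw compressed data. The names ZipDeleteSketch and DeleteEntryByRewrite are made up for this illustration. On the .NET Framework, it needs references to System.IO.Compression and System.IO.Compression.FileSystem.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Note: ZipArchive here is System.IO.Compression.ZipArchive,
// not the OrganicBit.Zip.ZipArchive class from this article.
static class ZipDeleteSketch
{
    // Hypothetical helper: rewrites the archive without the given entry.
    // Returns false and leaves the archive untouched if the entry is missing.
    public static bool DeleteEntryByRewrite(string zipPath, string entryName)
    {
        string tempPath = zipPath + ".tmp";
        File.Delete(tempPath); // no-op if there is no leftover temp file
        bool found = false;

        using (ZipArchive source = ZipFile.OpenRead(zipPath))
        using (ZipArchive target = ZipFile.Open(tempPath, ZipArchiveMode.Create))
        {
            foreach (ZipArchiveEntry entry in source.Entries)
            {
                if (entry.FullName == entryName)
                {
                    found = true; // skip the entry we want to drop
                    continue;
                }

                // Copy every other entry into the new archive
                ZipArchiveEntry copy = target.CreateEntry(entry.FullName);
                using (Stream input = entry.Open())
                using (Stream output = copy.Open())
                    input.CopyTo(output);
            }
        }

        if (!found)
        {
            File.Delete(tempPath);
            return false;
        }

        // Replace the original archive with the rewritten one
        File.Delete(zipPath);
        File.Move(tempPath, zipPath);
        return true;
    }
}
```

Calling ZipDeleteSketch.DeleteEntryByRewrite("test.zip", "test.txt") would then play the role of Archive.DeleteFile("test.txt") above. The cost is that the whole archive is rewritten for every deletion, which is worth keeping in mind for large archives.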

Limitations

  • Can only create/modify .zip archives.
  • Hard-coded settings (comment/compression level/etc.).
  • No encryption.
  • Most probably limited to 2 GB archives or even less.

Points of interest

The unpleasant part of zlib together with minizip is that you can't open an archive for reading and writing simultaneously. I hid that fact in my classes and exposed only the Read and Write functions. Behind the scenes, my class closes and reopens the archive whenever you switch from a read to a write function or vice versa.

Second, zlib/minizip offer no way to modify or delete files in an archive. However, the minizip homepage links to a small C++ example by Ivan A. Krestinin that shows how to delete files. I converted that function to C# and added it to my ZipArchive class. So, whenever you write to an existing entry, the old file gets deleted first!
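
The close-and-reopen behaviour for switching between reading and writing can be illustrated with a small stand-alone sketch. It is not the ZipArchive class from the download: all names (ArchiveMode, ModeSwitchingArchive, EnsureMode) are hypothetical, and the native open and close calls are replaced by log entries so the switching is visible.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; this only illustrates the pattern.
enum ArchiveMode { Closed, Reading, Writing }

sealed class ModeSwitchingArchive
{
    private ArchiveMode mode = ArchiveMode.Closed;

    // Stands in for the native open/close calls so the switching can be inspected.
    public readonly List<string> Log = new List<string>();

    // Reopens the archive only when the requested mode differs from the current one.
    private void EnsureMode(ArchiveMode wanted)
    {
        if (mode == wanted) return;
        CloseCurrent();
        Log.Add("open for " + wanted);
        mode = wanted;
    }

    private void CloseCurrent()
    {
        if (mode == ArchiveMode.Closed) return;
        Log.Add("close " + mode);
        mode = ArchiveMode.Closed;
    }

    public void Read(string entry)
    {
        EnsureMode(ArchiveMode.Reading);
        Log.Add("read " + entry);
    }

    public void Write(string entry)
    {
        EnsureMode(ArchiveMode.Writing);
        Log.Add("write " + entry);
    }

    public void Close()
    {
        CloseCurrent();
    }
}
```

With this sketch, two reads in a row share one open handle, while a read followed by a write causes one close and one reopen. That is also why alternating between reading and writing in a tight loop would be comparatively slow with such a design.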

Credits

...go to the following people:

  • The zlib and minizip developers.
  • Gerry Shaw, for his C# wrapper for zlib, on which my work is based.
  • Ivan A. Krestinin, for his example on how to delete files from an archive.

History

  • 2006-04-08
    • Inserted the missing LGPL header.
    • Fixed two severe bugs.
  • 2006-04-25
    • Article creation.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
