Introduction
The .NET Framework doesn't have much in the way of built-in archive support. If you want to zip up some files, you can use the open source SharpZipLib or a commercial package like Xceed. There are also Zip APIs in the J# library, but you can't count on those assemblies being installed. A recent article on The Code Project called SimpleUnzipper inspired me to write a counterpart, aptly named SimpleZip.
Using the Code
SimpleZip consists of one class with one public method and about 450 lines of code. It can easily be added to an existing project, and the code is small enough to be readily understood and modified. The one static method is as follows:
static void ZipTo(IEnumerable<string> fileNames, Stream archiveStream)
Its use should be self-explanatory: pass the paths of the files to archive and a writable stream to receive the archive.
Points of Interest
The archiveStream parameter is of particular interest in that it does not have to be a seekable stream. To see why this is interesting (and important), it's worth taking a moment to understand a little of the Zip archive structure.
The Zip archive has a simple structure that supports file compression. The basic structure is:
[local file header 1]
[file data 1]
[data descriptor 1]
.
.
.
[local file header n]
[file data n]
[data descriptor n]
[archive decryption header]
[archive extra data record]
[central directory]
[end of central directory record]
Local file header:
local file header signature     4 bytes  (0x04034b50)
version needed to extract       2 bytes
general purpose bit flag        2 bytes
compression method              2 bytes
last mod file time              2 bytes
last mod file date              2 bytes
crc-32                          4 bytes
compressed size                 4 bytes
uncompressed size               4 bytes
file name length                2 bytes
extra field length              2 bytes
file name                       (variable size)
extra field                     (variable size)
The local file header is of interest in that the compressed size, uncompressed size and CRC-32 all come before the actual file bits. This format makes it difficult to stream an archive from a generator without backtracking in the stream to update the size and CRC-32 values. This format is likely in part due to the "file" orientation of the original product.
Fortunately, a later update allows us to place the size and CRC-32 values after the file content in the data descriptor section. To do this, bit 3 of the general purpose bit flag is set. The decompressor then knows to look for the size and CRC-32 values after the file data.
With this modification, it is now possible to write to non-seekable streams. An example of a non-seekable stream you may encounter is Page.Response.OutputStream from ASP.NET. Without this modification, the archive would have to be written to an intermediate stream (likely a file). Security restrictions sometimes preclude using temporary files. And besides, it's just a bit wasteful.
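The header and data descriptor layout described above can be sketched as follows. This is a minimal illustration, not the code from the listing: the names DataDescriptorSketch, WriteLocalHeader and WriteDataDescriptor are hypothetical, the timestamp is a fixed placeholder, and BinaryWriter is used because it always writes little-endian values.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
static class DataDescriptorSketch
{
    // Bit 3 of the general purpose flag: CRC-32 and sizes follow the file data.
    const ushort DataDescriptorFlag = 1 << 3;

    // Writes a local file header with the CRC-32 and size fields zeroed,
    // deferring the real values to the data descriptor.
    public static void WriteLocalHeader(BinaryWriter writer, string name, ushort method)
    {
        byte[] nameBytes = Encoding.ASCII.GetBytes(name);
        writer.Write((uint)0x04034b50);          // local file header signature
        writer.Write((ushort)20);                // version needed to extract (2.0)
        writer.Write(DataDescriptorFlag);        // general purpose bit flag
        writer.Write(method);                    // compression method (0 = stored, 8 = deflated)
        writer.Write((ushort)0);                 // last mod file time (placeholder: 00:00:00)
        writer.Write((ushort)0x0021);            // last mod file date (placeholder: 1980-01-01)
        writer.Write((uint)0);                   // crc-32, not yet known
        writer.Write((uint)0);                   // compressed size, not yet known
        writer.Write((uint)0);                   // uncompressed size, not yet known
        writer.Write((ushort)nameBytes.Length);  // file name length
        writer.Write((ushort)0);                 // extra field length
        writer.Write(nameBytes);                 // file name
    }

    // Writes the data descriptor that follows the file data.
    public static void WriteDataDescriptor(
        BinaryWriter writer, uint crc, uint compressedSize, uint uncompressedSize)
    {
        writer.Write(crc);
        writer.Write(compressedSize);
        writer.Write(uncompressedSize);
    }
}
```

A writer would call WriteLocalHeader, stream the (possibly compressed) file bytes, and then call WriteDataDescriptor with the values gathered along the way, never seeking backwards.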
.NET has support for the Deflate compression algorithm used in PKZip and other popular Zip archive programs. While there are other more efficient compression algorithms available, Deflate does a "good enough" job for most situations. It's also fast and supported on many platforms. But herein lies a second problem. At the end of the deflation, we have to emit the compressed size. If we're writing to a non-seekable stream, however, we can't use the Position or Length properties of the stream, since on non-seekable streams these properties typically throw a NotSupportedException.
The number of bytes going into the compressor is in general different from the number of bytes it emits, so counting the input does not help either.
To work around this problem, a second in-line stream is added that simply counts the bytes as they go by. Since it sits after the compressor (Deflate) stream, it accurately counts the number of bytes that will be written to the archive stream.
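The chaining can be sketched in isolation as follows. This is a stripped-down illustration under assumed names (ByteCountingStream, CountingSketch and DeflateAndCount are hypothetical); the CounterStream class in the full listing is the version SimpleZip actually uses.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical write-only wrapper that counts the bytes passing through it.
class ByteCountingStream : Stream
{
    readonly Stream inner;
    long written;

    public ByteCountingStream(Stream inner) { this.inner = inner; }

    public long BytesWritten { get { return written; } }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override void Flush() { inner.Flush(); }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
    public override void Write(byte[] buffer, int offset, int count)
    {
        inner.Write(buffer, offset, count);
        written += count;
    }
}

static class CountingSketch
{
    // Deflates data into archive and returns the number of compressed bytes written.
    // The archive stream is never asked for its Position or Length.
    public static long DeflateAndCount(byte[] data, Stream archive)
    {
        var counter = new ByteCountingStream(archive);
        using (var deflate = new DeflateStream(counter, CompressionMode.Compress, true))
        {
            deflate.Write(data, 0, data.Length);
        } // disposing the DeflateStream flushes the final compressed block into the counter
        return counter.BytesWritten;
    }
}
```

The count is read only after the DeflateStream has been disposed, because the compressor may hold back its last block until then.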
By now it should be apparent that one of my goals is to use this Zip generator in a Web application. With this class, I can easily send a Zip file as a response to a Web request as follows:
Page.Response.Clear();
Page.Response.ContentType = "application/zip";
Page.Response.AddHeader("Content-Disposition", "attachment;filename=" + "archive.zip");
SimpleZip.ZipTo(files, Page.Response.OutputStream);
Page.Response.End();
A couple more points about the code. CounterStream, like DeflateStream, has an option to leave the contained stream open when it is closed. The normal behavior of a stream wrapper is to close the contained stream when it is closed or disposed. Here the archive stream belongs to the caller, so it must stay open until the caller closes it.
Another annoying omission in the .NET library is the lack of a CRC-32 implementation. One is included here.
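For readers who would rather not paste a 256-entry table, the same table can be computed at startup. The following is a sketch under assumed names (Crc32Sketch, BuildTable and Compute are hypothetical), using the reflected polynomial 0xEDB88320, which also appears as an entry in the hard-coded table in the listing.

```csharp
using System;
using System.Text;

// Hypothetical alternative to a hard-coded lookup table.
static class Crc32Sketch
{
    static readonly uint[] Table = BuildTable();

    // Builds the 256-entry lookup table for the reflected polynomial 0xEDB88320.
    static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint n = 0; n < 256; n++)
        {
            uint c = n;
            for (int k = 0; k < 8; k++)
            {
                c = (c & 1) != 0 ? 0xEDB88320 ^ (c >> 1) : c >> 1;
            }
            table[n] = c;
        }
        return table;
    }

    // Computes the CRC-32 of count bytes starting at offset.
    public static uint Compute(byte[] bytes, int offset, int count)
    {
        uint crc = 0xFFFFFFFF;
        for (int i = offset; i < offset + count; i++)
        {
            crc = (crc >> 8) ^ Table[(crc ^ bytes[i]) & 0xFF];
        }
        return ~crc;
    }
}
```

A quick sanity check is the ASCII string "123456789", whose CRC-32 under this variant is the standard check value 0xCBF43926.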
Since the code is small enough, I've included it here in the article. A tip about compiling this code bears mentioning. I've used Visual Studio 2008 to build this project, and you'll see that it contains some of the new syntax extensions introduced in C# 3.0. The code targets .NET 2.0 and does not use any of the .NET Framework 3.0/3.5 libraries. Should you need to add this to a Visual Studio 2005 project, it should only take a few minutes to replace the C# 3.0 syntax with suitable C# 2.0 syntax.
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
namespace BlueOnionSoftware.SimpleZip
{
static class SimpleZip
{
public static void ZipTo(IEnumerable<string> fileNames, Stream archiveStream)
{
if (fileNames == null)
{
throw new ArgumentNullException("fileNames");
}
if (archiveStream == null)
{
throw new ArgumentNullException("archiveStream");
}
if (archiveStream.CanWrite == false)
{
throw new ArgumentException("archiveStream is not writable");
}
var compressedFiles = new List<CompressedFile>();
using (var counterStream = new CounterStream(archiveStream, true))
{
foreach (string file in fileNames)
{
CompressedFile compressedFile =
new CompressedFile(file, counterStream.Position);
compressedFile.EmitFileHeader(counterStream);
compressedFile.EmitFile(counterStream);
compressedFiles.Add(compressedFile);
}
long centralDirectoryOffset = counterStream.Position;
foreach (CompressedFile compressedFile in compressedFiles)
{
compressedFile.EmitCatalogFileHeader(counterStream);
}
long centralDirectorySize =
counterStream.Position - centralDirectoryOffset;
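// End of central directory record
Write((uint)0x06054b50, counterStream); // signature
Write((ushort)0, counterStream); // number of this disk
Write((ushort)0, counterStream); // disk where the central directory starts
Write((ushort)compressedFiles.Count, counterStream); // entries on this disk
Write((ushort)compressedFiles.Count, counterStream); // total entries
Write((uint)centralDirectorySize, counterStream); // size of the central directory
Write((uint)centralDirectoryOffset, counterStream); // offset of the central directory
Write((ushort)0, counterStream); // archive comment length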
counterStream.Flush();
}
}
class CompressedFile
{
string FileName { get; set; }
int RelativeOffset { get; set; }
ushort GeneralPurpose { get; set; }
ushort StoreMethod { get; set; }
uint DosDateTime { get; set; }
uint Checksum { get; set; }
int CompressedLength { get; set; }
int UncompressedLength { get; set; }
string Name { get; set; }
public CompressedFile(string fileName, long relativeOffset)
{
FileName = fileName;
Name = NormalizeFileName(fileName);
RelativeOffset = (int)relativeOffset;
FileInfo fileInfo = new FileInfo(FileName);
GeneralPurpose = 8;
DosDateTime = GetDosDateTime(fileInfo.LastWriteTime);
StoreMethod = (ushort)((fileInfo.Length < 50) ? 0 : 8);
}
public void EmitFileHeader(Stream stream)
{
GeneralPurpose = stream.CanSeek ? (ushort)0 : (ushort)8;
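Write((uint)0x04034b50, stream); // local file header signature
Write((ushort)0x14, stream); // version needed to extract (2.0)
Write(GeneralPurpose, stream); // general purpose bit flag
Write(StoreMethod, stream); // compression method
Write(DosDateTime, stream); // last mod file time and date
Write(Checksum, stream); // crc-32 (still zero; real value follows in the data descriptor)
Write((uint)CompressedLength, stream); // compressed size (still zero)
Write((uint)UncompressedLength, stream); // uncompressed size (still zero)
Write((ushort)Name.Length, stream); // file name length
Write((ushort)0, stream); // extra field length
Write(Encoding.ASCII.GetBytes(Name), stream); // file name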
stream.Flush();
}
public void EmitFile(Stream stream)
{
using (var fileStream = File.OpenRead(FileName))
{
switch (StoreMethod)
{
case 0:
StoreFile(stream, fileStream);
break;
case 8:
DeflateFile(stream, fileStream);
break;
}
}
// Data descriptor: the values the local file header could not know in advance.
Write(Checksum, stream);
Write((uint)CompressedLength, stream);
Write((uint)UncompressedLength, stream);
}
void StoreFile(Stream archiveStream, Stream sourceStream)
{
int count = 1;
byte[] buffer = new byte[1024];
uint crc = Crc32.BeginChecksum();
while (count != 0)
{
count = sourceStream.Read(buffer, 0, buffer.Length);
UncompressedLength += count;
crc = Crc32.UpdateChecksum(crc, buffer, 0, count);
archiveStream.Write(buffer, 0, count);
}
Checksum = Crc32.EndChecksum(crc);
CompressedLength = UncompressedLength;
}
void DeflateFile(Stream archiveStream, Stream sourceStream)
{
using (var counterStream = new CounterStream(archiveStream, true))
{
using (var deflateStream = new DeflateStream
(counterStream, CompressionMode.Compress, true))
{
int count = 1;
byte[] buffer = new byte[1024];
uint crc = Crc32.BeginChecksum();
while (count != 0)
{
count = sourceStream.Read(buffer, 0, buffer.Length);
UncompressedLength += count;
crc = Crc32.UpdateChecksum(crc, buffer, 0, count);
deflateStream.Write(buffer, 0, count);
}
Checksum = Crc32.EndChecksum(crc);
}
CompressedLength = (int)counterStream.Position;
}
}
public void EmitCatalogFileHeader(Stream stream)
{
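Write((uint)0x02014b50, stream); // central file header signature
Write((ushort)0x14, stream); // version made by
Write((ushort)0x14, stream); // version needed to extract
Write(GeneralPurpose, stream); // general purpose bit flag
Write(StoreMethod, stream); // compression method
Write(DosDateTime, stream); // last mod file time and date
Write(Checksum, stream); // crc-32
Write((uint)CompressedLength, stream); // compressed size
Write((uint)UncompressedLength, stream); // uncompressed size
Write((ushort)Name.Length, stream); // file name length
Write((ushort)0, stream); // extra field length
Write((ushort)0, stream); // file comment length
Write((ushort)0, stream); // disk number start
Write((ushort)0, stream); // internal file attributes
Write((uint)0x00000020, stream); // external file attributes (archive bit)
Write((uint)RelativeOffset, stream); // relative offset of local header
Write(Encoding.ASCII.GetBytes(Name), stream); // file name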
stream.Flush();
}
}
static class Crc32
{
readonly static uint[] table = new uint[] {
0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA,
0x076DC419, 0x706AF48F, 0xE963A535, 0x9E6495A3,
0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988,
0x09B64C2B, 0x7EB17CBD, 0xE7B82D07, 0x90BF1D91,
0x1DB71064, 0x6AB020F2, 0xF3B97148, 0x84BE41DE,
0x1ADAD47D, 0x6DDDE4EB, 0xF4D4B551, 0x83D385C7,
0x136C9856, 0x646BA8C0, 0xFD62F97A, 0x8A65C9EC,
0x14015C4F, 0x63066CD9, 0xFA0F3D63, 0x8D080DF5,
0x3B6E20C8, 0x4C69105E, 0xD56041E4, 0xA2677172,
0x3C03E4D1, 0x4B04D447, 0xD20D85FD, 0xA50AB56B,
0x35B5A8FA, 0x42B2986C, 0xDBBBC9D6, 0xACBCF940,
0x32D86CE3, 0x45DF5C75, 0xDCD60DCF, 0xABD13D59,
0x26D930AC, 0x51DE003A, 0xC8D75180, 0xBFD06116,
0x21B4F4B5, 0x56B3C423, 0xCFBA9599, 0xB8BDA50F,
0x2802B89E, 0x5F058808, 0xC60CD9B2, 0xB10BE924,
0x2F6F7C87, 0x58684C11, 0xC1611DAB, 0xB6662D3D,
0x76DC4190, 0x01DB7106, 0x98D220BC, 0xEFD5102A,
0x71B18589, 0x06B6B51F, 0x9FBFE4A5, 0xE8B8D433,
0x7807C9A2, 0x0F00F934, 0x9609A88E, 0xE10E9818,
0x7F6A0DBB, 0x086D3D2D, 0x91646C97, 0xE6635C01,
0x6B6B51F4, 0x1C6C6162, 0x856530D8, 0xF262004E,
0x6C0695ED, 0x1B01A57B, 0x8208F4C1, 0xF50FC457,
0x65B0D9C6, 0x12B7E950, 0x8BBEB8EA, 0xFCB9887C,
0x62DD1DDF, 0x15DA2D49, 0x8CD37CF3, 0xFBD44C65,
0x4DB26158, 0x3AB551CE, 0xA3BC0074, 0xD4BB30E2,
0x4ADFA541, 0x3DD895D7, 0xA4D1C46D, 0xD3D6F4FB,
0x4369E96A, 0x346ED9FC, 0xAD678846, 0xDA60B8D0,
0x44042D73, 0x33031DE5, 0xAA0A4C5F, 0xDD0D7CC9,
0x5005713C, 0x270241AA, 0xBE0B1010, 0xC90C2086,
0x5768B525, 0x206F85B3, 0xB966D409, 0xCE61E49F,
0x5EDEF90E, 0x29D9C998, 0xB0D09822, 0xC7D7A8B4,
0x59B33D17, 0x2EB40D81, 0xB7BD5C3B, 0xC0BA6CAD,
0xEDB88320, 0x9ABFB3B6, 0x03B6E20C, 0x74B1D29A,
0xEAD54739, 0x9DD277AF, 0x04DB2615, 0x73DC1683,
0xE3630B12, 0x94643B84, 0x0D6D6A3E, 0x7A6A5AA8,
0xE40ECF0B, 0x9309FF9D, 0x0A00AE27, 0x7D079EB1,
0xF00F9344, 0x8708A3D2, 0x1E01F268, 0x6906C2FE,
0xF762575D, 0x806567CB, 0x196C3671, 0x6E6B06E7,
0xFED41B76, 0x89D32BE0, 0x10DA7A5A, 0x67DD4ACC,
0xF9B9DF6F, 0x8EBEEFF9, 0x17B7BE43, 0x60B08ED5,
0xD6D6A3E8, 0xA1D1937E, 0x38D8C2C4, 0x4FDFF252,
0xD1BB67F1, 0xA6BC5767, 0x3FB506DD, 0x48B2364B,
0xD80D2BDA, 0xAF0A1B4C, 0x36034AF6, 0x41047A60,
0xDF60EFC3, 0xA867DF55, 0x316E8EEF, 0x4669BE79,
0xCB61B38C, 0xBC66831A, 0x256FD2A0, 0x5268E236,
0xCC0C7795, 0xBB0B4703, 0x220216B9, 0x5505262F,
0xC5BA3BBE, 0xB2BD0B28, 0x2BB45A92, 0x5CB36A04,
0xC2D7FFA7, 0xB5D0CF31, 0x2CD99E8B, 0x5BDEAE1D,
0x9B64C2B0, 0xEC63F226, 0x756AA39C, 0x026D930A,
0x9C0906A9, 0xEB0E363F, 0x72076785, 0x05005713,
0x95BF4A82, 0xE2B87A14, 0x7BB12BAE, 0x0CB61B38,
0x92D28E9B, 0xE5D5BE0D, 0x7CDCEFB7, 0x0BDBDF21,
0x86D3D2D4, 0xF1D4E242, 0x68DDB3F8, 0x1FDA836E,
0x81BE16CD, 0xF6B9265B, 0x6FB077E1, 0x18B74777,
0x88085AE6, 0xFF0F6A70, 0x66063BCA, 0x11010B5C,
0x8F659EFF, 0xF862AE69, 0x616BFFD3, 0x166CCF45,
0xA00AE278, 0xD70DD2EE, 0x4E048354, 0x3903B3C2,
0xA7672661, 0xD06016F7, 0x4969474D, 0x3E6E77DB,
0xAED16A4A, 0xD9D65ADC, 0x40DF0B66, 0x37D83BF0,
0xA9BCAE53, 0xDEBB9EC5, 0x47B2CF7F, 0x30B5FFE9,
0xBDBDF21C, 0xCABAC28A, 0x53B39330, 0x24B4A3A6,
0xBAD03605, 0xCDD70693, 0x54DE5729, 0x23D967BF,
0xB3667A2E, 0xC4614AB8, 0x5D681B02, 0x2A6F2B94,
0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D};
static public uint BeginChecksum()
{
return 0xffffffff;
}
static public uint UpdateChecksum
(uint crc, byte[] bytes, int offset, int count)
{
for (int i = offset; i < offset + count; i++)
{
byte index = (byte)(((crc) & 0xff) ^ bytes[i]);
crc = (uint)((crc >> 8) ^ table[index]);
}
return crc;
}
static public uint EndChecksum(uint crc)
{
return ~crc;
}
}
class CounterStream : Stream
{
int Count { get; set; }
Stream InternalStream { get; set; }
bool LeaveOpen { get; set; }
public CounterStream(Stream stream, bool leaveOpen)
{
if (stream == null)
{
throw new ArgumentNullException("stream");
}
InternalStream = stream;
LeaveOpen = leaveOpen;
}
public override bool CanRead
{
get { return false; }
}
public override bool CanSeek
{
get { return false; }
}
public override bool CanWrite
{
get { return InternalStream.CanWrite; }
}
public override void Flush()
{
InternalStream.Flush();
}
public override long Length
{
get { throw new System.NotSupportedException(); }
}
public override long Position
{
get
{
return Count;
}
set
{
throw new System.NotSupportedException();
}
}
public override int Read(byte[] buffer, int offset, int count)
{
throw new System.NotSupportedException();
}
public override long Seek(long offset, SeekOrigin origin)
{
throw new System.NotSupportedException();
}
public override void SetLength(long value)
{
throw new System.NotSupportedException();
}
public override void Write(byte[] buffer, int offset, int count)
{
InternalStream.Write(buffer, offset, count);
Count += count;
}
protected override void Dispose(bool disposing)
{
try
{
if (disposing)
{
if (InternalStream != null && LeaveOpen == false)
{
InternalStream.Close();
}
}
}
finally
{
InternalStream = null;
base.Dispose(disposing);
}
}
}
static uint GetDosDateTime(DateTime date)
{
return (uint)(
((date.Year - 1980 & 0x7f) << 25) |
((date.Month & 0xF) << 21) |
((date.Day & 0x1F) << 16) |
((date.Hour & 0x1F) << 11) |
((date.Minute & 0x3F) << 5) |
(date.Second >> 1));
}
static void Write(uint number, Stream stream)
{
byte[] buffer = BitConverter.GetBytes(number);
stream.Write(buffer, 0, buffer.Length);
}
static void Write(ushort value, Stream stream)
{
byte[] buffer = BitConverter.GetBytes(value);
stream.Write(buffer, 0, buffer.Length);
}
static void Write(byte[] buffer, Stream stream)
{
stream.Write(buffer, 0, buffer.Length);
}
static string NormalizeFileName(string fileName)
{
string name = fileName.Replace("\\", "/");
name = name.Replace("../", "");
int i = name.IndexOf(':');
if (i != -1)
{
name = name.Substring(i + (Path.IsPathRooted(name) ? 2 : 1));
}
return name;
}
}
}
Limitations
SimpleZip will not leave WinZip and its brethren quivering in their boots anytime soon. While small and easy to use, there are lots of things SimpleZip does not support. For instance, you cannot modify an existing archive. The .NET Deflate algorithm is not as effective as other implementations, in part due to its streaming interface. Also, large archives (> 2 GB) are not supported, since the code tracks sizes and offsets as 32-bit integers and does not implement the Zip64 extensions. There is no support for file/archive comments or encryption.
Still, I think SimpleZip will work satisfactorily in many situations. Feedback, enhancements and bug fixes are welcome.
History
- 25th March, 2008: Article posted
Note: You can find updates and bug fixes to this code here.