Zip/Unzip using the java.util.zip .NET namespace and more

12 Feb 2013 | CPOL | 5 min read
Zip/Unzip using java.util.zip from managed code.

Screenshot - zipstrip2.png

Table of Contents

  • Introduction
  • Background
  • Listing the content of a Zip file
  • Zipping files from a folder
  • Unzipping files from a Zip file
  • Changing the Zip file
  • Improvements over version 1.0
  • Using the wild chars for the filter
  • Explorer folder context menu
  • History

    Introduction

    When dealing with Zip files, you have a few choices: native APIs from third-party DLLs, Java APIs, or .NET APIs. If you rush to use the APIs from the System.IO.Compression .NET namespace, you will be very disappointed. For reasons only Microsoft knows, the support in .NET 2.0 is limited to compressing single streams and is missing entirely for multi-file archives. This was probably one reason why third-party .NET libraries like SharpZipLib cropped up. If you don't trust free software, you might be surprised to find that .NET support for multi-file archives is buried in the J# assemblies, which offer parity with the Java APIs.

    To make a useful application out of it, I started with an existing CodeProject application that is very handy when backing up source code. I replaced the SharpZipLib references and used the Microsoft J# APIs instead. When porting the application, I noticed that the SharpZipLib APIs look very similar to the J# APIs, which made my work much easier. To make this utility more enticing to use, I've added quite a few features that I will detail below.

    Background

    To use Microsoft's API for multi-file zips and Java streams, you have to add the vjslib.dll and vjslibcw.dll .NET assemblies as project references. They are part of the J# distribution pack. The Java-like types will show up in the java.util.zip namespace. Since Microsoft's documentation on this topic is quite sparse, I often had to rely on IntelliSense to figure things out. For simplicity's sake, some nonessential UI code is omitted below and can be found only in the source code provided.

    Listing the content of a Zip file

    Below is a snippet, edited for simplicity, that enumerates the files in the archive:

    C#
    public static List<string> GetZipFileNames(string zipFile)
    {
        ZipFile zf = null;
        List<string> list = new List<string>();
        try
        {
            zf = new ZipFile(zipFile);
            java.util.Enumeration enu = zf.entries();
            while (enu.hasMoreElements())
            {
                ZipEntry zen = enu.nextElement() as ZipEntry;
                if (zen.isDirectory())
                    continue;//ignore directories
                list.Add(zen.getName());
            }
        }
        catch(Exception ex)
        {
            //wrap any failure in a user-friendly message, keeping the cause
            throw new ApplicationException("Please drag/drop only valid " +
                      "zip files\nthat are not password protected.",ex);
        }
        finally
        {
            if (zf != null)
                zf.close();
        }
        return list;
    }

    As you probably noticed, ZipEntry and ZipFile are easy to use for this goal.
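
    Because the J# classes mirror Java's java.util.zip, the same enumeration can be sketched in plain Java, which is easy to run without the J# redistributable. This is only an illustration of the API parity, not code from the utility; the class name ZipListingSketch is made up for the example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of the article's source code.
final class ZipListingSketch {
    // Returns the names of all file entries in the archive, skipping directories.
    static List<String> getZipFileNames(Path zipFile) throws IOException {
        List<String> names = new ArrayList<>();
        // try-with-resources plays the role of the finally { zf.close(); } block
        try (ZipFile zf = new ZipFile(zipFile.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```

    The method names (entries, hasMoreElements, nextElement, isDirectory, getName) are the same ones used in the C# snippet above.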

    Zipping files from a folder

    Below is a helper method used to zip the files in a folder:

    C#
    //Relies on class fields set up by the caller: _zos (the open ZipOutputStream),
    //_buffer (the sbyte[] copy buffer) and m_trimIndex (where the relative
    //entry name starts in the full path).
    private static void _CreateZipFromFolder(string Folder, IsFileStrippableDelegate IsStrip)
    {
        System.IO.DirectoryInfo dirInfo =
            new System.IO.DirectoryInfo(Folder);
        System.IO.FileInfo[] files = dirInfo.GetFiles("*");//all files
        foreach (System.IO.FileInfo file in files)
        {
            if (IsStrip != null && IsStrip(file.FullName))
                continue;//skip, don't zip it
            java.io.FileInputStream instream = new java.io.FileInputStream(file.FullName);
            try
            {
                int bytes = 0;
                string strEntry = file.FullName.Substring(m_trimIndex);
                _zos.putNextEntry(new ZipEntry(strEntry));
                while ((bytes = instream.read(_buffer, 0, _buffer.Length)) > 0)
                {
                    _zos.write(_buffer, 0, bytes);
                }
                _zos.closeEntry();
            }
            finally
            {
                instream.close();//release the file even if zipping fails
            }
        }

        //recurse into the subfolders
        foreach (System.IO.DirectoryInfo folder in dirInfo.GetDirectories("*"))
        {
            _CreateZipFromFolder(folder.FullName, IsStrip);
        }
    }

    The IsStrip delegate acts as a filter: any file for which it returns true is left out of the archive.
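
    For readers who want to try the filter idea without J#, here is a minimal sketch in plain Java, whose java.util.zip classes the J# library mirrors. It is not the utility's code: FolderZipSketch and zipFolder are hypothetical names, a Predicate stands in for the IsStrip delegate, and Files.walk replaces the explicit recursion. Unlike the helper above, it writes entry names with forward slashes, which is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the article's source code.
final class FolderZipSketch {
    // Zips every regular file under 'folder' into 'zipFile', skipping the files
    // for which 'isStrip' returns true (the role IsStrip plays in the article).
    static void zipFolder(Path folder, Path zipFile, Predicate<Path> isStrip) throws IOException {
        // Collect the file list first so the walk is finished before writing starts.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(folder)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> !isStrip.test(p))
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path file : files) {
                // Entry names are relative to the root folder and use '/' separators.
                String entryName = folder.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(entryName));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

    A call such as FolderZipSketch.zipFolder(src, target, p -> p.toString().endsWith(".pdb")) would leave the .pdb files out, assuming the target archive is located outside the source folder.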

    Unzipping files from a Zip file

    Below is a piece of code, edited for brevity, that unzips the files from a Zip:

    C#
    ZipInputStream zis = new ZipInputStream(new java.io.FileInputStream(file));
    try
    {
        ZipEntry ze = null;
        while ((ze = zis.getNextEntry()) != null)
        {
            if (ze.isDirectory())
                continue;//ignore directories
            string fname = ze.getName();
            bool bstrip = IsStrip != null && IsStrip(fname);
            if (!bstrip)
            {
                //unzip entry
                int bytes = 0;
                string filePath = Folder + @"\" + fname;
                if (!Directory.Exists(Path.GetDirectoryName(filePath)))
                    Directory.CreateDirectory(Path.GetDirectoryName(filePath));
                FileStream filestream = new FileStream(filePath, FileMode.Create);
                BinaryWriter w = new BinaryWriter(filestream);
                try
                {
                    while ((bytes = zis.read(_buffer, 0, _buffer.Length)) > 0)
                    {
                        for (int i = 0; i < bytes; i++)
                        {
                            unchecked
                            {
                                //_buffer is sbyte[]; keep the bit pattern
                                w.Write((byte)_buffer[i]);
                            }
                        }
                    }
                }
                finally
                {
                    //close the writer while it is still in scope
                    w.Close();
                    filestream.Close();
                }
            }
            zis.closeEntry();
        }
    }
    finally
    {
        zis.close();
    }

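    Again, the IsStrip delegate acts as a filter, this time deciding which entries are not extracted. Also, I had to mix java.io with the System.IO namespace because of the sbyte[] array: the J# streams read into a buffer of signed bytes, so each value is cast to byte before it is written through System.IO.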
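
    The sbyte[] remark comes down to Java's byte type being signed, which is why the J# stream methods surface as sbyte in C#. The small Java sketch below (SignedByteSketch is a made-up name) shows the same bit pattern read as a signed and as an unsigned value; masking with 0xFF here yields the unsigned value, which is what the unchecked cast to byte produces in the C# loop.

```java
// Hypothetical demo class, not part of the article's source code.
final class SignedByteSketch {
    // Java's byte is signed (-128..127). Masking with 0xFF recovers the
    // unsigned value (0..255) that a .NET byte would hold for the same bits.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xE9;               // bit pattern 1110 1001
        System.out.println(raw);              // prints -23
        System.out.println(toUnsigned(raw));  // prints 233
    }
}
```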

    Changing the Zip file

    You cannot modify a Zip file in place with this API. However, you can create another Zip and copy only selected files into it. When the transfer is complete, we rename the new file to the original name, so it looks as if we changed the Zip. The method below, edited for brevity, receives a list of strings with the unwanted files:

    C#
    public static void StripZip(string zipFile, List<string> trashFiles)
    {
        ZipOutputStream zos = null;
        ZipInputStream zis = null;
        //stays true in this edited version; the error handling that
        //would reset it is not shown here
        bool bsuccess = true;
        //replace the 'zip' extension with 'tmp'
        string strNewFile = zipFile.Remove(zipFile.Length - 3, 3) + "tmp";
        zos = new ZipOutputStream(new java.io.FileOutputStream(strNewFile));
        zis = new ZipInputStream(new java.io.FileInputStream(zipFile));
        ZipEntry ze = null;
        while ((ze = zis.getNextEntry()) != null)
        {
            if (ze.isDirectory())
                continue;//ignore directories
            string fname = ze.getName();
            bool bstrip = trashFiles.Contains(fname);
            if (!bstrip)
            {
                //copy the entry from zis to zos
                int bytes = 0;
                //deal with password protected files
                zos.putNextEntry(new ZipEntry(fname));
                while ((bytes = zis.read(_buffer, 0, _buffer.Length)) > 0)
                {
                    zos.write(_buffer, 0, bytes);
                }
                zis.closeEntry();
                zos.closeEntry();
            }
        }
        if (zis != null)
            zis.close();
        if (zos != null)
            zos.close();
        if (bsuccess)
        {
            //keep the original as a .old backup and swap in the new file
            System.IO.File.Delete(zipFile + ".old");
            System.IO.File.Move(zipFile, zipFile + ".old");
            System.IO.File.Move(strNewFile, zipFile);
        }
        else
            System.IO.File.Delete(strNewFile);
    }
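
    The copy-and-rename idea can also be tried in plain Java, since the J# classes follow java.util.zip. The sketch below is an illustration under a few assumptions rather than the utility's code: StripZipSketch is a hypothetical name, the temporary file is named by appending ".tmp", and no ".old" backup is kept.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the article's source code.
final class StripZipSketch {
    // Rewrites 'zipFile' without the entries named in 'trashFiles' by copying
    // the remaining entries into a temporary archive and swapping it in.
    static void stripZip(Path zipFile, Set<String> trashFiles) throws IOException {
        Path tmp = zipFile.resolveSibling(zipFile.getFileName() + ".tmp");
        try {
            try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zipFile));
                 ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(tmp))) {
                byte[] buffer = new byte[8192];
                ZipEntry ze;
                while ((ze = zis.getNextEntry()) != null) {
                    if (ze.isDirectory() || trashFiles.contains(ze.getName())) {
                        continue; // the next getNextEntry() call skips this entry's data
                    }
                    zos.putNextEntry(new ZipEntry(ze.getName()));
                    int n;
                    while ((n = zis.read(buffer, 0, buffer.length)) > 0) {
                        zos.write(buffer, 0, n);
                    }
                    zos.closeEntry();
                }
            }
            // Replace the original only after both streams were closed cleanly.
            Files.move(tmp, zipFile, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // does nothing after a successful move
        }
    }
}
```

    If copying fails, the exception propagates and the original archive is left untouched, which corresponds to the bsuccess == false branch above.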

    Improvements over version 1.0

    To make this tool more attractive, I've added some improvements of my own. The first one you will notice is a checked list box that allows manual changes on the fly. My favorite is the ability to edit the list of filter extensions, which is bound to the CPZipStripper.exe.xml file through a DataTable. Here is an edited snapshot of this file:

    XML
    <configuration>
      <maskRow maskField="*.plg" />
      <maskRow maskField=".opt" />
      <maskRow maskField=".ncb" />
      <maskRow maskField=".suo" />
      <maskRow maskField="*.pdb" />
    ......
    </configuration>

    Reading data from the config file

    Notice that in the application configuration file, we keep not only the appSettings node, but also the files, the paths, and most importantly, the DataTable content. Loading the data from this XML file into the respective lists and the DataSet is easy:

    C#
    XmlDocument xd = new XmlDocument();
    xd.Load(cfgxmlpath);
    //use plain xml xpath for the rest
    m_paths.Clear();
    XmlNode xnpath = xd["configuration"]["paths"];
    if(xnpath!=null)
    {
        foreach(XmlNode xn in xnpath.ChildNodes)
        {
            m_paths.Add(xn.InnerXml);                        
        }    
    }
    XmlNode xnfile = xd["configuration"]["files"];
    if(xnfile!=null)
    {
        foreach(XmlNode xn in xnfile.ChildNodes)
        {
            m_files.Add(xn.InnerXml);                        
        }    
    }
    //use the data set
    m_extensions.Clear();
    _dataSet = new DataSet("configuration");
    DataTable mytable = new DataTable("maskRow");
    DataColumn exColumn = new DataColumn("maskField", 
        Type.GetType("System.String"), null, MappingType.Attribute);
    mytable.Columns.Add(exColumn);
    _dataSet.Tables.Add(mytable);
    _dataSet.Tables[0].ReadXml(MainForm.cfgxmlpath);
    for (int i = 0; i < _dataSet.Tables[0].Rows.Count; i++)
    {
        DataRow row = _dataSet.Tables[0].Rows[i];
        string val = row[0].ToString().ToLower();
        if (val.Length > 0)//no empty mask
        {
        //.....code eliminated for brevity
        }
        else
        {   //don't show empty rows
            row.Delete();
        }
    }
    _dataSet.Tables[0].AcceptChanges();

    Writing data into the config file

    Using WriteXml from the DataSet will eliminate the data that does not belong to the table. For this reason, we have to save it before calling WriteXml, and restore it afterwards:

    C#
    XmlDocument xd = new XmlDocument();
    //get the original
    xd.Load(MainForm.cfgxmlpath);
    //save nodes not part of the dataset
    XmlNode xnpath = xd["configuration"]["paths"];
    XmlNode xnfile = xd["configuration"]["files"];
    XmlNode lastFolderpath = xd["configuration"]["LastUsedFolder"];
    
    //write the masks
    _dataSet.WriteXml(MainForm.cfgxmlpath);
    
    //restore the old saved nodes
    xd.Load(MainForm.cfgxmlpath);
    if(xnpath != null)
        xd.DocumentElement.AppendChild(xnpath);
    if (xnfile != null)
        xd.DocumentElement.AppendChild(xnfile);
    if (lastFolderpath != null)
        xd.DocumentElement.AppendChild(lastFolderpath);
    xd.Save(MainForm.cfgxmlpath);

    Using the wild chars for the filter

    I'm not going to go into detail on this one, but as you may have noticed in the XML snippet, the masks can contain the * and ? wildcard characters. It's a good idea to set up the configuration as the first thing you do when you open this application.
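
    For readers curious how such masks are typically evaluated, here is one common approach, sketched in Java: translate the mask into a regular expression. This is an assumption about a reasonable implementation, not a description of what the utility does internally; WildcardMaskSketch and its methods are hypothetical names, and the case-insensitive match is a choice made for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the article's source code.
final class WildcardMaskSketch {
    // Translates a mask such as "*.pdb" into a regular expression:
    // '*' matches any run of characters, '?' matches exactly one character,
    // and every other character is matched literally.
    static Pattern compile(String mask) {
        StringBuilder regex = new StringBuilder();
        for (char c : mask.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True when the whole file name matches the mask.
    static boolean matches(String mask, String fileName) {
        return compile(mask).matcher(fileName).matches();
    }
}
```

    With this sketch, matches("*.pdb", "app.PDB") is true, while matches("file?.txt", "file12.txt") is false because ? stands for exactly one character.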

    Screenshot - config.png

    Explorer folder context menu

    I've added some new functionality for using the context menu from Explorer. You need to start the exe once before you can right-click a folder and zip it.

    Screenshot - Menu.png

    History

    As a utility, I consider version 2.x to be an improvement over the old one. You can use it to some extent as a WinZip replacement, but it lacks features like encryption. The .NET 2.0 runtime and the J# distribution package have to be installed on your machine to run it. If you have problems running the exe on its own, it might be because one of them is missing. If that's the case, I recommend installing the MSI file I've created, or downloading vjredist_32bit.zip and installing it locally.

    Version 2.3

    I've added email capability to the application using the Outlook wrapper and made a few small enhancements. I have also included the deployment project that creates the MSI file.

    Version 2.4

    I've forced the assembly to run in 32-bit mode to avoid some 64-bit issues and added some custom actions to the installer.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)