Reducing the Size of .NET Applications

4 Oct 2004
An article on reducing the size of .NET executables.

Introduction

In this article, I will explain how to reduce the size of a .NET application by zipping the application and unzipping it in memory at run time using a pure .NET solution. Compressors for binary files such as [UPX] use the same technique. However, they do not work with the pseudo EXE files of .NET. As we will see, it is relatively easy to pack a .NET application, because we can make use of .NET's built-in support for metadata and reflection.

I have written a free, open source tool called [.NETZ] that fully automates all the steps explained in this article, so you do not need to write any code yourself. Read this article only if you are interested in how the .NETZ tool works.

First, I will explain how the technique works for packing the main EXE file. For this article I will suppose we have a .NET EXE application named 'app.exe' whose size we need to reduce. It can be a self-contained EXE file or a file that depends on other DLLs. Then I will show how DLLs can be handled. Finally, I will briefly mention an alternate solution. The example code here is in C#.

Selecting a Compression Library

The first thing we need to do is select a suitable compression library. I have used #ziplib [ZLIB], which supports (among others) the standard zip format. We can use any zip-compatible program to zip our application. For example, the following command-line zip utility command can be placed in the build batch file:

pkzip25 -add app.zip app.exe

The .NETZ tool, however, performs the zipping step programmatically.

Depending on the program, 'app.zip' may be 60% or more smaller than 'app.exe'. Once we have zipped the application, we need to write a small starter application in C# to unzip and start the application at run time. This means that #ziplib must be available when we unzip the application. The good news is that the #ziplib source code is available online. If we choose to use only the usual zip format, then we can remove from the library all the classes that support zipping and other formats, and leave only the unzip code. If we do this, the size of the #ziplib library is reduced to only 60 KB (the compiled 'zip.dll' file). This 60 KB is the only size overhead of this method, because we need to distribute the unzip library with the application starter.

If 'app.exe' is over 200 KB, which is normally the case for a small GUI exe, then the size of 'app.zip' plus the unzip library will still be under 200 KB. We could also use another, faster and smaller zip library. It does not have to be written in any of the .NET languages; a native library can be accessed using PInvoke. However, it should support uncompressing Byte[] arrays or System.IO.Stream objects, so that we can process the data in memory.
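To make the in-memory requirement concrete, here is a minimal sketch that compresses and decompresses a byte array without touching the disk. It does not use #ziplib; it assumes a framework version that includes System.IO.Compression and uses GZipStream as a stand-in. The class name InMemoryCodec and its method names are hypothetical.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper: round-trips a byte array through gzip in memory.
public static class InMemoryCodec
{
    // Compress a byte array entirely in memory.
    public static byte[] Compress(byte[] raw)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed data.
            using (GZipStream gz = new GZipStream(output, CompressionMode.Compress))
            {
                gz.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the stream has been closed.
            return output.ToArray();
        }
    }

    // Decompress a byte array entirely in memory.
    public static byte[] Decompress(byte[] packed)
    {
        using (MemoryStream input = new MemoryStream(packed))
        using (GZipStream gz = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gz.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Any library that offers an equivalent pair of operations on byte arrays or streams would fit the approach described in this article.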

Packing the Data

The next step is to write a starter application and pack 'app.zip' as part of it. The easiest way to do this in .NET is to pack 'app.zip' as a resource file. I will show only example code in this article to keep it simple. If you are interested in the real code, I recommend that you have a look at the .NETZ tool source code. Code similar to the following will produce a valid .NET resource file 'app.resources':

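// Requires: using System.IO; using System.Resources;
// Read the whole zip archive into a byte array.
FileStream fs = new FileStream("app.zip", FileMode.Open, FileAccess.Read);
byte[] data = new byte[fs.Length];
fs.Read(data, 0, data.Length);
fs.Close();

// Store the bytes in a .NET resource file; Close() writes the file to disk.
ResourceWriter rm = new ResourceWriter("app.resources");
rm.AddResource("appdata1", data);
rm.Close();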

Note that we have given the resource the name 'appdata1' so that we can access it later.
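A quick way to check that a byte array survives the trip through the resource format is to write it to a stream and read it back. The following is a minimal sketch that works on a MemoryStream rather than on 'app.resources'; the class name ResourceRoundTrip and its method names are hypothetical.

```csharp
using System.Collections;
using System.IO;
using System.Resources;

// Hypothetical helper: packs a byte array as a named resource and reads it back.
public static class ResourceRoundTrip
{
    // Produce the bytes of a .resources file holding one named byte array.
    public static byte[] Pack(string name, byte[] payload)
    {
        MemoryStream ms = new MemoryStream();
        // Disposing the writer generates the resource data and closes the stream.
        using (ResourceWriter writer = new ResourceWriter(ms))
        {
            writer.AddResource(name, payload);
        }
        // ToArray still works after the stream has been closed.
        return ms.ToArray();
    }

    // Look up a named byte array in the bytes of a .resources file.
    public static byte[] Unpack(string name, byte[] resources)
    {
        using (ResourceReader reader = new ResourceReader(new MemoryStream(resources)))
        {
            foreach (DictionaryEntry entry in reader)
            {
                if ((string)entry.Key == name)
                {
                    return (byte[])entry.Value;
                }
            }
        }
        return null; // no resource with that name
    }
}
```

In the starter application itself the lookup goes through ResourceManager instead, because the resource file is embedded in the assembly.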

The Starter Application

The starter application 'starter.exe' loads the resource file, gets the data, unzips it in memory, and uses reflection to start the 'app.exe' application. The code invoked by Main(string[] args) will look like the following:

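// "app" is the base name of the embedded 'app.resources' file.
// Assembly.GetExecutingAssembly() also works from a static method, where 'this' is not available.
ResourceManager rm = new ResourceManager("app", Assembly.GetExecutingAssembly());
byte[] data = (byte[])rm.GetObject("appdata1");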

We have to unzip the application data in memory. The code for #ziplib is:

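string zipPath = "app.exe";
MemoryStream zipFile = new MemoryStream(data);

ZipFile zf = new ZipFile(zipFile);
ZipEntry ze = zf.GetEntry(zipPath);
Stream zs = zf.GetInputStream(ze);
byte[] uzdata = new byte[ze.Size];

// A decompressing stream may return fewer bytes than requested, so read in a loop.
int offset = 0;
while (offset < uzdata.Length)
{
    int n = zs.Read(uzdata, offset, uzdata.Length - offset);
    if (n <= 0) break;
    offset += n;
}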

Then we can create an assembly from the byte array:

Assembly assembly = Assembly.Load(uzdata);

Once we have an assembly, the easiest thing to do is to invoke its entry point, which corresponds to the Main(string[] args) method in the original 'app.exe', passing it the original command line arguments passed to the Main(string[] args) method of the starter:

assembly.EntryPoint.Invoke(null, new object[]{args});
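The call above assumes that the packed application declares Main with a string[] parameter. An entry point can also be declared without parameters, and passing an argument array to such a method makes Invoke throw. The following is a minimal sketch of a helper that checks the signature first; EntryPointInvoker and the two stand-in Main methods are hypothetical and only serve to make the example self-contained.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: invokes an entry point with or without a string[] parameter.
public static class EntryPointInvoker
{
    public static object InvokeEntryPoint(MethodInfo entryPoint, string[] args)
    {
        if (entryPoint == null)
        {
            throw new ArgumentNullException(nameof(entryPoint));
        }

        // Main() takes no arguments; Main(string[]) takes exactly one.
        object[] parameters = entryPoint.GetParameters().Length == 0
            ? null
            : new object[] { args };

        // Entry points are static, so the target object is null.
        return entryPoint.Invoke(null, parameters);
    }

    // Stand-ins for the two common shapes of Main, used for illustration only.
    public static int MainWithArgs(string[] args) { return args.Length; }
    public static int MainWithoutArgs() { return -1; }
}
```

In the starter, the helper would be called as EntryPointInvoker.InvokeEntryPoint(assembly.EntryPoint, args).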

Alternatively, we can rely on reflection code to find the types in the assembly and invoke methods on them. This can be useful when 'app.exe' has no entry point, or when we want to invoke other methods. The startup time could even be shorter than when starting 'app.exe' directly, because less data has to be read from disk.

To compile 'starter.exe' from 'starter.cs', we will use the following command (supposing it is a Win exe):

csc /t:winexe /out:starter.exe starter.cs AssemblyInfo.cs
    /r:zip.dll /res:app.resources /win32icon:App.ico

The 'App.ico' file can be extracted from the original 'app.exe'; 'AssemblyInfo.cs' can be generated using reflection information from 'app.exe'.

We can rename 'starter.exe' back to 'app.exe' later, if we like. This way, we distribute 'starter.exe' and 'zip.dll', which together are smaller than 'app.exe' alone.

The .NETZ tool handles all these steps automatically. It uses the System.CodeDom.Compiler.ICodeCompiler interface with CSharpCodeProvider to compile the starter code programmatically.

Handling DLLs

If 'app.exe' depends on other DLLs, we normally do not need to do anything. However, sometimes we may also want to zip the DLL files. The technique described here works only for applications that follow the .NET XCOPY paradigm, that is, when the DLLs are used by a single application. It will not work if the DLLs are placed in the GAC, or are shared with another application that is not aware of the technique.

Let us suppose 'lib.dll' is a DLL file required by 'app.exe' that we would like to zip. First, we link 'app.exe' with the normal unzipped version of 'lib.dll' as we normally do.

.NET has a built-in mechanism for resolving types and assemblies. When it fails, however, we can supply the assembly ourselves. This functionality is exposed by a hook in the System.AppDomain class. We need to handle the following event:

AppDomain currentDomain = AppDomain.CurrentDomain;
currentDomain.AssemblyResolve += new ResolveEventHandler(MyResolveEventHandler);

This code needs to be placed in the Main(...) method of the starter application. The trick for this event to be activated is to place the 'app.exe' assembly activation code described above in a separate method that is called by the starter's Main method.
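To see when the hook fires, the following minimal sketch registers a handler and then asks for an assembly that does not exist. The handler only records the requested name and returns null, so the load still fails; a real handler would return the unzipped assembly instead. ResolveHookDemo, LastRequested and TryLoad are hypothetical names.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical demo: records which assembly name the runtime failed to resolve.
public static class ResolveHookDemo
{
    // Name passed to the handler the last time it was called.
    public static string LastRequested;

    private static Assembly OnResolve(object sender, ResolveEventArgs args)
    {
        LastRequested = args.Name;
        return null; // null means "we cannot supply this assembly either"
    }

    // Returns true if the assembly could be loaded, false otherwise.
    public static bool TryLoad(string assemblyName)
    {
        AppDomain.CurrentDomain.AssemblyResolve += OnResolve;
        try
        {
            Assembly.Load(assemblyName);
            return true;
        }
        catch (IOException)
        {
            // FileNotFoundException and FileLoadException both derive from IOException.
            return false;
        }
        finally
        {
            AppDomain.CurrentDomain.AssemblyResolve -= OnResolve;
        }
    }
}
```

The handler is called only after the normal lookup has failed, which is why the zipped DLL must not be found under its original name.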

After we zip 'lib.dll' into 'lib.zip', we may also pack it as a resource file with the starter application, as we did with 'app.zip'. This can be preferable if we want to have a single exe file. Otherwise, we may leave it as a separate file. However, we need to give the file a name different from 'lib.dll', because .NET will look for this name, and the zipped file would look like a corrupted DLL to .NET. We can leave the name 'lib.zip', or be creative and rename 'lib.zip' to 'lib.dllz'. Alternatively, we can save the 'lib.zip' data in a SQL database table and retrieve it from there, etc.

The code to activate the DLL in MyResolveEventHandler will look like the following. Here, we suppose that the zipped DLL is a file in the same directory as the starter application:

public static Assembly MyResolveEventHandler(object sender, ResolveEventArgs args)
{
    // args.Name is usually a full display name such as "lib, Version=...";
    // keep only the simple name before the first comma (if there is one).
    int i = args.Name.IndexOf(',');
    string dllName = (i >= 0) ? args.Name.Substring(0, i) : args.Name;

    // the dllName will equal "lib" in our example;
    // we map it to the zipped file name
    dllName += ".dllz";

    // read the file and unzip the data as above
    // code omitted ...
    byte[] uzdata = ...

    return Assembly.Load(uzdata);
}

This way, the types found in the DLL will be resolved to the AppDomain.
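The name that arrives in ResolveEventArgs is an assembly display name, so the mapping to the zipped file name can also be done with System.Reflection.AssemblyName instead of searching for the comma by hand. The following is a minimal sketch of that mapping only; ZippedNameMapper and ToZippedFileName are hypothetical names, and the '.dllz' extension follows the example in this article.

```csharp
using System.Reflection;

// Hypothetical helper: maps a requested assembly name to the zipped file name.
public static class ZippedNameMapper
{
    public static string ToZippedFileName(string requestedName)
    {
        // AssemblyName parses both "lib" and "lib, Version=..., Culture=...".
        string simpleName = new AssemblyName(requestedName).Name;
        return simpleName + ".dllz";
    }
}
```

With this helper, a request for "lib, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" maps to the file name "lib.dllz".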

The .NETZ tool's starter code has logic to search for the compressed DLL files in the same way as .NET searches for normal DLL files. It also supports private DLL paths.

Another Alternative

Another possible technique to pack .NET executables would be to rely on native platform code.

.NET itself uses a technique similar to [UPX] to create native images. The only difference is that the CLR data segment is not zipped. Every time we run a .NET pseudo EXE file or access a DLL, the information in the PE header [COFF] is read and the CLR data segment is extracted. The CLR is then initialized and the data segment is given to it as an assembly byte array. For EXE files, this work is done by calling the _CorExeMain function in 'mscoree.dll'. For DLL files, a similar function, _CorDllMain, is called inside the usual DllMain [SSCLI]. If the CLR data segment is zipped, then these two functions of 'mscoree.dll' need to be changed to unzip the data before passing it to the CLR.

This can be done by a third party. In this case, we need to unzip the CLR data segment and then modify the EXE header, so that the unzipped data is in the locations expected by _CorExeMain and _CorDllMain. However, it would be better if Microsoft supported this as an option in the future. This would make the size of .NET pseudo EXE files comparable to that of Java JAR files. This technique would also work with DLLs placed in the GAC.

Closing Up

In this article, I demonstrated the technique used in the .NETZ tool to compress .NET executable files. If you need more details, download the latest version of .NETZ and have a look at its source code.

History

  • 16 Jul 2004
    • Added some code to show how to make a stream from a byte array, before using ZipFile.
    • Corrected some spelling errors :)
  • 20 Jul 2004
    • Showed how to process DLL files etc., to reflect readers' comments.
  • 29 Jul 2004
    • .NETZ tool version 0.1.1 source code released.
    • Minor errors in the code samples corrected.
  • 01 Oct 2004
    • Minor corrections in article.

References

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.