WebCacheTool: Manipulate the IE Browser Cache From the Command-Line

23 Feb 2006
This article presents the WebCacheTool, a command-line utility to make it easier and faster to list, view, and delete files residing in the Internet Explorer browser cache.

Introduction

When developing or testing a web application, it's sometimes necessary to manually view or delete files residing in the browser cache. I most often encounter this situation when working on JavaScript (.js) files used by my web pages, when I want to ensure that my test clients are using the latest revision. Unfortunately, finding files in the cache in the Explorer-like window provided by Internet Explorer is cumbersome and slow.

I have developed the WebCacheTool command-line utility to make it easier and faster to list, view, and delete files residing in the Internet Explorer browser cache. In this article, I will describe the usage and implementation of this tool.

Disclaimer: All of the warnings provided by Internet Explorer regarding the risks associated with pulling arbitrary and/or unfamiliar files out of the browser cache apply. In general, you should not copy files out of their temporary locations in the cache for use elsewhere on your system.

WebCacheTool Usage

WebCacheTool is a command-line utility suitable for use in batch files and scripts by both developers and system administrators. The syntax is:

WebCacheTool <command> <arguments>

Unless otherwise noted, the arguments are case-sensitive. The following commands are supported:

ls

Lists the cache entries matching a given pattern. The pattern is a case-insensitive .NET regular expression. The information provided on each cache entry includes:

  • Source URL
  • Size
  • Last Access Time
  • Last Modified Time
  • Expiration Time
  • File System Path

Examples:

WebCacheTool ls http://www.example.com/example.html$

Lists the specific named file, if present.

WebCacheTool ls \.example\.com

Lists any and all entries from *.example.com.

WebCacheTool ls \.example\.com.*\.gif$

Lists the .gif files from *.example.com.

WebCacheTool ls \.js$

Lists the .js files in the cache.

WebCacheTool ls .

Dumps the entire contents of the cache.

See the MSDN documentation on the .NET regular expression language for more information on constructing useful path expressions.
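To make the matching rules concrete, here is a minimal sketch of how a pattern like the ones above can be tested against a URL. It assumes the pattern is applied with Regex.IsMatch and RegexOptions.IgnoreCase, which is consistent with the examples but is not taken from the tool's source; the CachePatternDemo class and its Matches method are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CachePatternDemo
{
    // Hypothetical helper (not part of WebCacheTool): returns true when the
    // URL contains a match for the pattern. The search is case-insensitive
    // and unanchored, so the pattern may match anywhere in the URL unless
    // it uses ^ or $ itself.
    public static bool Matches(string url, string pattern)
    {
        return Regex.IsMatch(url, pattern, RegexOptions.IgnoreCase);
    }
}
```

Under these assumptions, `\.js$` matches both `http://www.example.com/app.js` and `http://www.example.com/APP.JS` but not `http://www.example.com/app.json`, and `\.example\.com` does not match the bare host `http://example.com/` because the pattern requires a literal dot before "example". Note also that an unescaped dot matches any character, so a pattern such as `http://www.example.com/example.html$` is slightly more permissive than it looks.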

The underlying WinInet APIs (discussed in more detail below) also support retrieval of cookies and history items via the magic "Cookie:" and "Visited:" strings as follows:

WebCacheTool ls Cookie:

Lists all cookies on the system.

WebCacheTool ls Cookie:.*\.example\.com

Lists cookies from the example.com domain.

WebCacheTool ls Visited:

Lists all of the history items.

info

Lists detailed information about one or more specific files. This command does not support regular expression patterns; it takes exact names and prints the same information as ls, in a format that is often easier to read.

Examples:

WebCacheTool info http://www.example.com/example.html 
                  http://www.example.com/example.gif

Prints detailed information on the two files, if found.

WebCacheTool info "Cookie:john smith@www.example.com/"

Prints detailed information about john smith's cookie from www.example.com.

rm

Removes the specific file or files from the cache, if found.

Example:

WebCacheTool rm http://www.example.com/example.html

cat

Prints the contents of the specific file or files to standard output. Run this command only on text files!

Example:

WebCacheTool cat http://www.example.com/example.html

help

Prints a short command summary.

WebCacheTool Implementation

WebCacheTool uses managed wrappers around the documented WinInet APIs. Unfortunately, these APIs are somewhat painful to wrap, largely because they use a variable-length structure idiom: you call the API once to learn how large a buffer it needs, allocate that many bytes, and then call it again. So while your first instinct might be to declare the PInvoke parameters as type INTERNET_CACHE_ENTRY_INFO, what you really need to do is replace those parameters with IntPtrs to globally allocated buffers and take charge of unmarshalling the structure yourself. Of course, that also means you must manage the globally allocated memory yourself, and doing that correctly clutters up the code. Here is probably the simplest example, my wrapper for the GetUrlCacheEntryInfo API:

// The buffer-size parameter is in/out in the Win32 API, so it is passed by ref.
[DllImport("wininet.dll", SetLastError=true)]
private static extern bool 
        GetUrlCacheEntryInfo( string lpszUrlName, 
        IntPtr lpCacheEntryInfo, 
        ref UInt32 lpdwCacheEntryInfoBufferSize );

public static INTERNET_CACHE_ENTRY_INFO 
       GetUrlCacheEntryInfo( string url )
{
    IntPtr buffer = IntPtr.Zero;
    UInt32 structSize = 0;

    // First call: no buffer is supplied, so the API reports the
    // required size in structSize.
    bool apiResult = GetUrlCacheEntryInfo( url, 
                     buffer, ref structSize );
    CheckLastError( url, true );

    try
    {
        // Second call: supply a buffer of the reported size.
        buffer = Marshal.AllocHGlobal( (int) structSize );
        apiResult = GetUrlCacheEntryInfo( url, 
                          buffer, ref structSize );
        if( apiResult )
        {
            return (INTERNET_CACHE_ENTRY_INFO) 
                    Marshal.PtrToStructure( buffer, 
                    typeof( INTERNET_CACHE_ENTRY_INFO ) );
        }

        CheckLastError( url, false );
    }
    finally
    {
        // Compare against IntPtr.Zero rather than calling ToInt32(),
        // which can overflow on 64-bit systems.
        if( buffer != IntPtr.Zero )
        {
            try { Marshal.FreeHGlobal( buffer ); }
            catch{}
        }
    }

    Debug.Assert( false, "We should either early-return" + 
                         " or throw before we get here" );
    // Make the compiler happy even though
    // we never expect this code to run.
    return new INTERNET_CACHE_ENTRY_INFO();
}

Another aspect that these APIs force you to deal with is the Windows FILETIME structure, which is how the last-accessed, last-modified, and expires times are exposed in the INTERNET_CACHE_ENTRY_INFO structure. .NET 1.x defines System.Runtime.InteropServices.FILETIME (System.Runtime.InteropServices.ComTypes.FILETIME in .NET 2.0) to stand in for this. .NET also provides a DateTime.FromFileTime factory method. Unfortunately, as I discuss in more detail in my blog, I find the .NET definitions to be flawed due to their incorrect use of signed types. Thus, I have defined my own FILETIME struct for use in interop:

[StructLayout(LayoutKind.Sequential)]
public struct FILETIME
{
    public UInt32 dwLowDateTime;
    public UInt32 dwHighDateTime;
}

Along with the Win32 APIs FileTimeToSystemTime and SystemTimeToTzSpecificLocalTime:

// Both functions return a Win32 BOOL, which is 32 bits wide and marshals as C# bool.
[DllImport("kernel32.dll", SetLastError=true)]
public static extern bool FileTimeToSystemTime(ref FILETIME 
                        FileTime, ref SYSTEMTIME SystemTime);

[DllImport("kernel32.dll", SetLastError=true)]
public static extern bool SystemTimeToTzSpecificLocalTime(IntPtr 
       lpTimeZoneInformation, ref SYSTEMTIME lpUniversalTime, 
       out SYSTEMTIME lpLocalTime);

we now have a powerful replacement for the .NET FILETIME manipulation routines. Note, however, that date-times are always resolved relative to the current value of Daylight Saving Time, for reasons that are discussed in more detail here.
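To illustrate why unsigned fields are convenient here, the following sketch combines the two 32-bit halves of a FILETIME into the single 64-bit count of 100-nanosecond intervals since January 1, 1601 (UTC) that the structure represents. This is a purely managed alternative shown for illustration, not the Win32-based conversion the tool uses; the FileTimeSketch class and its method names are hypothetical, while DateTime.FromFileTimeUtc is the standard .NET method.

```csharp
using System;

public static class FileTimeSketch
{
    // Combines the low and high halves into one 64-bit tick count.
    // Because both halves are unsigned, a low half with its top bit set
    // is not sign-extended when it is widened to 64 bits.
    public static long ToTicksSince1601(uint low, uint high)
    {
        return (long)(((ulong)high << 32) | low);
    }

    // Converts the combined tick count to a UTC DateTime.
    // FromFileTimeUtc throws ArgumentOutOfRangeException for values
    // that are negative or beyond the range DateTime can represent.
    public static DateTime ToDateTimeUtc(uint low, uint high)
    {
        return DateTime.FromFileTimeUtc(ToTicksSince1601(low, high));
    }
}
```

For example, the halves 0xD53E8000 (low) and 0x019DB1DE (high) combine to 116444736000000000, which corresponds to midnight on January 1, 1970 (UTC). The low half in this example has its top bit set, which is exactly the case where a signed field would need extra care.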

Future Directions and Possible Enhancements

  • A .NET Framework 2.0 upgrade, in which the ls command would yield each result matching the pattern as it is found, rather than collecting them all to return at the end. This would decrease memory consumption and latency. Note that I implemented ls in anticipation of this feature; otherwise, I would have considered violating one of the command design principles by having the command itself process the output.
  • Switches that control the format of the output, particularly for the ls command.
  • Sorting options for ls output.
  • Wildcard and/or pattern support for all commands.
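The first item above could be sketched with a C# 2.0 iterator. This is only an illustration under stated assumptions: the StreamingLsSketch class and FindMatches method are hypothetical, and the input sequence of URL strings stands in for whatever enumerates the real cache entries.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StreamingLsSketch
{
    // Lazily yields each URL that matches the pattern as soon as it is
    // encountered, instead of building a complete result list first.
    // Nothing runs until the caller starts enumerating the result.
    public static IEnumerable<string> FindMatches(
        IEnumerable<string> urls, string pattern)
    {
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
        foreach (string url in urls)
        {
            if (regex.IsMatch(url))
            {
                yield return url;
            }
        }
    }
}
```

A caller can then print each match inside a foreach loop over FindMatches(...), so the first result appears before the rest of the cache has been scanned, and only one entry needs to be held in memory at a time.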

History

  • Initial release: 02/14/2006.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.