Introduction
When developing or testing a web application, it's sometimes necessary to manually view or delete files residing in the browser cache. I most often encounter this situation when working on JavaScript (.js) files used by my web pages, when I want to ensure that my test clients are using the latest revision. Unfortunately, finding files in the cache in the Explorer-like window provided by Internet Explorer is cumbersome and slow.
I have developed the WebCacheTool command-line utility to make it easier and faster to list, view, and delete files residing in the Internet Explorer browser cache. In this article, I will describe the usage and implementation of this tool.
Disclaimer: All of the warnings provided by Internet Explorer regarding the risks associated with pulling arbitrary and/or unfamiliar files out of the browser cache apply. In general, you should not copy files out of their temporary locations in the cache for use elsewhere on your system.
WebCacheTool Usage
WebCacheTool is a command-line utility suitable for use in batch files and scripts by both developers and system administrators. The syntax is:
WebCacheTool <command> <arguments>
Unless otherwise noted, the arguments are case-sensitive. The following commands are supported:
ls
Lists the cache entries matching a given pattern. The pattern is a case-insensitive .NET regular expression. The information provided on each cache entry includes:
- Source URL
- Size
- Last Access Time
- Last Modified Time
- Expiration Time
- File System Path
Examples:
WebCacheTool ls http://www.example.com/example.html$
Lists the specific named file, if present.
WebCacheTool ls \.example\.com
Lists any and all entries from *.example.com.
WebCacheTool ls \.example\.com.*\.gif$
Lists the .gif files from *.example.com.
WebCacheTool ls \.js$
Lists the .js files in the cache.
WebCacheTool ls .
Dumps the entire contents of the cache.
See the MSDN documentation on the .NET regular expression language for more information on constructing useful path expressions.
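The matching behavior implied by the examples above can be tried outside the tool. The following is a minimal sketch, not the tool's actual code: it assumes that ls tests each source URL with an unanchored, case-insensitive Regex.IsMatch call, and the PatternDemo class, the Filter method, and the sample URLs are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Returns the URLs in which the pattern matches anywhere in the string,
    // ignoring case (an assumption about how ls applies its pattern).
    public static List<string> Filter(IEnumerable<string> urls, string pattern)
    {
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
        List<string> matches = new List<string>();
        foreach (string url in urls)
        {
            if (regex.IsMatch(url))
            {
                matches.Add(url);
            }
        }
        return matches;
    }

    public static void Main()
    {
        // Hypothetical cache contents, for illustration only.
        string[] urls =
        {
            "http://www.example.com/example.html",
            "http://img.example.com/logo.GIF",
            "http://www.example.com/scripts/menu.js",
            "http://www.example.org/index.html"
        };

        // Prints only http://img.example.com/logo.GIF
        foreach (string url in Filter(urls, @"\.example\.com.*\.gif$"))
        {
            Console.WriteLine(url);
        }
    }
}
```

Because the match is unanchored, a bare "." matches every non-empty URL, which is why the last example above dumps the whole cache.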
The underlying WinInet APIs (discussed in more detail below) also support retrieval of cookies and history items via the magic "Cookie:" and "Visited:" strings as follows:
WebCacheTool ls Cookie:
Lists all cookies on the system.
WebCacheTool ls Cookie:.*\.example\.com
Lists cookies from the example.com domain.
WebCacheTool ls Visited:
Lists all of the history items.
info
Prints detailed information about one or more specific files. This command does not support regular expression patterns; it takes exact URLs and prints the same information as ls, in a format that is often easier to read.
Examples:
WebCacheTool info http://www.example.com/example.html http://www.example.com/example.gif
Prints detailed information on the two files, if found.
WebCacheTool info "Cookie:john smith@www.example.com/"
Prints detailed information about john smith's cookie from www.example.com.
rm
Removes the specific file or files from the cache, if found.
Example:
WebCacheTool rm http://www.example.com/example.html
cat
Prints the contents of the specific file or files to standard output. Note that you should make a point of only executing this command on text files!
Example:
WebCacheTool cat http://www.example.com/example.html
help
Prints a short command summary.
WebCacheTool Implementation
WebCacheTool uses managed wrappers around the documented WinInet APIs. Unfortunately, these APIs are somewhat painful to wrap, largely because they use a variable-length structure idiom. So while your first instinct might be to declare the P/Invoke parameters as type INTERNET_CACHE_ENTRY_INFO, what you really need to do is replace those parameters with IntPtrs to globally allocated buffers and take charge of unmarshalling the structure yourself. Of course, that also means you must manage the globally allocated memory yourself, and doing that correctly clutters the code. Here is probably the simplest example, my wrapper for the GetUrlCacheEntryInfo API:
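// Requires: using System; using System.Diagnostics;
//           using System.Runtime.InteropServices;
[DllImport("wininet.dll", SetLastError=true)]
private static extern bool
    GetUrlCacheEntryInfo( string lpszUrlName,
        IntPtr lpCacheEntryInfo,
        ref UInt32 lpdwCacheEntryInfoBufferSize );

public static INTERNET_CACHE_ENTRY_INFO
    GetUrlCacheEntryInfo( string url )
{
    IntPtr buffer = IntPtr.Zero;
    UInt32 structSize = 0;

    // First call: with no buffer and a size of zero, the API is
    // expected to fail with an insufficient-buffer error and report
    // the required size in structSize. The size parameter is
    // in/out, so it is passed by ref rather than out.
    bool apiResult = GetUrlCacheEntryInfo( url,
        buffer, ref structSize );
    CheckLastError( url, true );

    try
    {
        // Second call: retry with a buffer of the reported size.
        buffer = Marshal.AllocHGlobal( (int) structSize );
        apiResult = GetUrlCacheEntryInfo( url,
            buffer, ref structSize );
        if( apiResult )
        {
            return (INTERNET_CACHE_ENTRY_INFO)
                Marshal.PtrToStructure( buffer,
                    typeof( INTERNET_CACHE_ENTRY_INFO ) );
        }
        CheckLastError( url, false );
    }
    finally
    {
        // Compare against IntPtr.Zero; ToInt32() can overflow
        // in a 64-bit process.
        if( buffer != IntPtr.Zero )
        {
            try { Marshal.FreeHGlobal( buffer ); }
            catch{}
        }
    }

    Debug.Assert( false, "We should either early-return" +
        " or throw before we get here" );
    return new INTERNET_CACHE_ENTRY_INFO();
}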
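The buffer-ownership part of this pattern can be exercised without calling WinInet at all. The sketch below is an illustration under stated assumptions rather than code from the tool: SampleInfo is a hypothetical stand-in for INTERNET_CACHE_ENTRY_INFO, and Marshal.StructureToPtr plays the role of the native API filling the buffer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ManualMarshalDemo
{
    // Hypothetical stand-in for a native structure.
    [StructLayout(LayoutKind.Sequential)]
    public struct SampleInfo
    {
        public uint Size;
        public uint HitCount;
    }

    // Allocates an unmanaged buffer, fills it, reads it back as a managed
    // structure, and always releases the buffer.
    public static SampleInfo RoundTrip(SampleInfo input)
    {
        IntPtr buffer = IntPtr.Zero;
        try
        {
            buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(SampleInfo)));

            // Stands in for the native call that would write into the buffer.
            Marshal.StructureToPtr(input, buffer, false);

            // Copy the unmanaged bytes into a managed structure.
            return (SampleInfo)Marshal.PtrToStructure(buffer, typeof(SampleInfo));
        }
        finally
        {
            // IntPtr.Zero is a safe "not allocated" test on 32-bit and 64-bit.
            if (buffer != IntPtr.Zero)
            {
                Marshal.FreeHGlobal(buffer);
            }
        }
    }

    public static void Main()
    {
        SampleInfo original = new SampleInfo();
        original.Size = 1024;
        original.HitCount = 3;

        SampleInfo copy = RoundTrip(original);
        Console.WriteLine(copy.Size + " " + copy.HitCount);
    }
}
```

Returning from inside the try block is safe here because PtrToStructure copies the data into managed memory before the finally block frees the buffer.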
Another aspect that these APIs force you to deal with is the Windows FILETIME structure, which is how the last-accessed, last-modified, and expiration times are exposed in the INTERNET_CACHE_ENTRY_INFO structure. .NET 1.x defines System.Runtime.InteropServices.FILETIME (moved to the System.Runtime.InteropServices.ComTypes namespace in .NET 2.0) to stand in for this. .NET also provides a DateTime.FromFileTime factory method. Unfortunately, as I discuss in more detail in my blog, I find the .NET definitions to be flawed due to their incorrect use of signed types. Thus, I have defined my own FILETIME struct for use in interop:
[StructLayout(LayoutKind.Sequential)]
public struct FILETIME
{
    public UInt32 dwLowDateTime;
    public UInt32 dwHighDateTime;
}
Along with the Win32 APIs FileTimeToSystemTime and SystemTimeToTzSpecificLocalTime:
// Both functions return a Win32 BOOL (32 bits), which marshals
// as bool rather than as the 64-bit long.
[DllImport("kernel32.dll", SetLastError=true)]
public static extern bool FileTimeToSystemTime(ref FILETIME
    FileTime, ref SYSTEMTIME SystemTime);
[DllImport("kernel32.dll", SetLastError=true)]
public static extern bool SystemTimeToTzSpecificLocalTime(IntPtr
    lpTimeZoneInformation, ref SYSTEMTIME lpUniversalTime,
    out SYSTEMTIME lpLocalTime);
we now have a powerful replacement for the .NET FILETIME manipulation routines. Note, however, that date-times are always resolved relative to the current value of Daylight Saving Time, for reasons that are discussed in more detail here.
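To make the signed-type concern concrete, here is a small sketch of what can go wrong when the two 32-bit halves of a FILETIME are combined. It is my illustration of the general pitfall, not code from the tool: the FileTimeDemo class and its methods are hypothetical, and DateTime.FromFileTimeUtc is used only to check the unsigned result.

```csharp
using System;

public static class FileTimeDemo
{
    // Combines two unsigned halves into the 64-bit count of 100-nanosecond
    // intervals since 1601-01-01 UTC.
    public static long CombineUnsigned(uint low, uint high)
    {
        return (long)(((ulong)high << 32) | low);
    }

    // The same computation with signed halves: a low half whose top bit is
    // set becomes negative, and sign extension corrupts the upper 32 bits.
    public static long CombineSigned(int low, int high)
    {
        return ((long)high << 32) | (long)low;
    }

    public static void Main()
    {
        // 1970-01-01 00:00:00 UTC as a FILETIME is 0x019DB1DED53E8000.
        uint low = 0xD53E8000;
        uint high = 0x019DB1DE;

        long good = CombineUnsigned(low, high);
        long bad = CombineSigned(unchecked((int)low), (int)high);

        Console.WriteLine(DateTime.FromFileTimeUtc(good).ToString("u")); // 1970-01-01 00:00:00Z
        Console.WriteLine(bad < 0);                                      // True
    }
}
```

With unsigned fields the naive shift-and-or combination is correct; with signed fields the caller has to remember to mask or cast the low half first.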
Future Directions and Possible Enhancements
- A .NET Framework 2.0 upgrade, in which the ls command would yield each result matching the pattern as it is found rather than collecting them all to return at the end. This would decrease memory consumption and latency. Note that I implemented ls in anticipation of this feature; otherwise, I would have considered violating one of the command design principles by having the command itself process the output.
- Switches that control the format of the output, particularly for the ls command.
- Sorting options for ls output.
- Wildcard and/or pattern support for all commands.
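The first item in this list could look roughly like the following, using a C# 2.0 iterator. This is only a sketch of the idea under my own assumptions: StreamingLsSketch and Match are hypothetical names, and a plain sequence of URL strings stands in for the cache enumeration that the real command would perform through WinInet.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StreamingLsSketch
{
    // Yields each matching entry as soon as it is seen, so the caller can
    // print it immediately instead of waiting for a complete list.
    public static IEnumerable<string> Match(IEnumerable<string> entries, string pattern)
    {
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
        foreach (string entry in entries)
        {
            if (regex.IsMatch(entry))
            {
                yield return entry;
            }
        }
    }

    public static void Main()
    {
        // Hypothetical cache entries, for illustration only.
        string[] entries =
        {
            "http://www.example.com/a.js",
            "http://www.example.com/b.gif",
            "http://www.example.com/c.JS"
        };

        // Prints a.js and c.JS, one at a time, as they are matched.
        foreach (string entry in Match(entries, @"\.js$"))
        {
            Console.WriteLine(entry);
        }
    }
}
```

Because the iterator is lazy, nothing is matched until the caller starts enumerating, and only one entry needs to be held in memory at a time.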
History
- Initial release: 02/14/2006.