Introduction
C# and .NET have been hailed by Microsoft as the Windows programming environment of the future. Just what does that actually mean? Is the programmer shielded so much from Windows that attempting to do anything useful is difficult, if not impossible? Is it just another VB? This article walks through a Windows Shell hook/extension written in C#, showing how easy it is to consume COM interfaces and to deploy the final code as though it were a bona fide COM object.
Hooking into the Shell
One of the simplest forms of shell extension is to hook into every ShellExecuteEx Win32 call. Windows Explorer uses this function for almost everything it does: from the Start->Run dialog to double-clicking a file, everything is invoked through ShellExecuteEx. To let extensions hook into this, the shell uses a chain-of-responsibility pattern and calls every registered COM component that implements the IShellExecuteHook interface. The IShellExecuteHook interface contains just one method:
HRESULT Execute(
    LPSHELLEXECUTEINFO pei
);
To register a concrete implementation of this interface, we need to add the CLSID of our component to the ShellExecuteHooks list, which can be found at HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\ShellExecuteHooks in the Windows Registry.
So, now that we know what the shell expects us to do, how do we do it from C#? First of all, we need to create a C# object which implements IShellExecuteHook. Looking in the C++ include file shlguid.h, we see DEFINE_SHLGUID(IID_IShellExecuteHookW, 0x000214FBL, 0, 0); or, in a more readable form, 000214FB-0000-0000-C000-000000000046. This gives us the GUID of the interface we need to implement. What we do now is create a C# declaration of this interface in our code. We mark this declaration as being defined by COM using the special ComImport attribute. Our C# interface now looks like:
[ComImport, InterfaceType(ComInterfaceType.InterfaceIsIUnknown),
 Guid("000214FB-0000-0000-C000-000000000046")]
public interface IShellExecuteHook {
    [PreserveSig()]
    int Execute(SHELLEXECUTEINFO sei);
}
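Notice the PreserveSig attribute. This stops COM Interop from treating the managed return value as an out parameter and from converting failure HRESULTs into exceptions; instead, the int we return is passed straight back to the caller as the COM HRESULT.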
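As a quick sanity check on the interface GUID quoted earlier, the expansion of the DEFINE_SHLGUID arguments can be reproduced in C#. This is only a sketch: ShellGuidDemo and FromShlGuid are names made up for illustration, and the fixed C0 ... 46 tail is an assumption taken from the readable GUID given above rather than from the header itself.

```csharp
using System;

public static class ShellGuidDemo
{
    // Builds a GUID from the three DEFINE_SHLGUID arguments, followed by
    // the fixed bytes C0 00 00 00 00 00 00 46 (assumed, see lead-in).
    public static Guid FromShlGuid(uint l, ushort w1, ushort w2)
    {
        return new Guid(l, w1, w2, 0xC0, 0, 0, 0, 0, 0, 0, 0x46);
    }

    public static void Main()
    {
        Guid iid = FromShlGuid(0x000214FB, 0, 0);
        // Print in the dashed form used by the Guid attribute.
        Console.WriteLine(iid.ToString("D").ToUpperInvariant());
    }
}
```

The printed string should match the value passed to the Guid attribute on the interface declaration.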
The Execute method takes a single parameter, which is a SHELLEXECUTEINFO structure. After a bit of experimentation, the following C# structure meets our requirements:
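// Managed mirror of the Win32 SHELLEXECUTEINFOW structure.
// Requires: using System; using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Sequential)]
public class SHELLEXECUTEINFO {
    public int cbSize;
    public int fMask;
    public IntPtr hwnd;        // handles and pointers are pointer-sized
    [MarshalAs(UnmanagedType.LPWStr)]
    public string lpVerb;
    [MarshalAs(UnmanagedType.LPWStr)]
    public string lpFile;
    [MarshalAs(UnmanagedType.LPWStr)]
    public string lpParameters;
    [MarshalAs(UnmanagedType.LPWStr)]
    public string lpDirectory;
    public int nShow;
    public IntPtr hInstApp;
    public IntPtr lpIDList;
    [MarshalAs(UnmanagedType.LPWStr)]
    public string lpClass;     // wide string, like the other string fields
    public IntPtr hkeyClass;
    public int dwHotKey;
    public IntPtr hIcon;       // shares storage with hMonitor in the native header
    public IntPtr hProcess;
}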
For our sample, we are only interested in the fields up to lpParameters, so as long as those are laid out correctly, everything will work fine.
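One way to check that the leading fields really are laid out as expected is to ask the marshaller for their offsets. The sketch below is an illustration resting on stated assumptions, not part of the sample: ShellExecuteInfoHead is a hypothetical trimmed copy of the structure containing only the fields up to lpParameters, and it declares hwnd as IntPtr because the native HWND is pointer-sized.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical trimmed copy: only the leading fields of SHELLEXECUTEINFO.
[StructLayout(LayoutKind.Sequential)]
public class ShellExecuteInfoHead
{
    public int cbSize;
    public int fMask;
    public IntPtr hwnd;
    [MarshalAs(UnmanagedType.LPWStr)] public string lpVerb;
    [MarshalAs(UnmanagedType.LPWStr)] public string lpFile;
    [MarshalAs(UnmanagedType.LPWStr)] public string lpParameters;
}

public static class LayoutCheck
{
    // Returns the byte offset of a field in the marshalled (native) layout.
    public static int OffsetOf(string field)
    {
        return Marshal.OffsetOf(typeof(ShellExecuteInfoHead), field).ToInt32();
    }

    public static void Main()
    {
        foreach (string f in new[] { "lpVerb", "lpFile", "lpParameters" })
            Console.WriteLine(f + " at byte offset " + OffsetOf(f));
    }
}
```

The two leading ints occupy 8 bytes and every field after them is pointer-sized, so lpFile should land at 8 + 2 * IntPtr.Size and lpParameters at 8 + 3 * IntPtr.Size on both 32-bit and 64-bit processes.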
We've now got a fully defined interface which is declared to be a COM interface. As a quick test, this should all compile fine. We're still missing a concrete implementation of this interface, so let's declare one:
[Guid("6156C6FC-4DD9-4f82-8200-0446DABB7F35"), ComVisible(true)]
public class DateParser : IShellExecuteHook {
}
Here, we've said that DateParser (our shell extension) implements IShellExecuteHook. The class is visible to COM and has the CLSID specified in the Guid attribute. Filling in the single concrete method is just a case of writing plain C#. For this example, I've created a method which will recognize dates entered into the Start->Run dialog and show a message box with the ISO 8601 equivalent. The code is:
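// HRESULT values: S_OK tells the shell the hook handled the call,
// S_FALSE lets the shell carry on with its normal processing.
private const int S_OK = 0;
private const int S_FALSE = 1;

public int Execute(SHELLEXECUTEINFO sei) {
    try {
        // Rebuild the full command line the user typed.
        string input = sei.lpFile + " " + sei.lpParameters;
        DateTime oTime = DateTime.Parse(input);
        MessageBox.Show(null, "Date '" + input + "' in ISO 8601 is " +
            oTime.ToString("s"), "ISO 8601 Date",
            MessageBoxButtons.OK, MessageBoxIcon.Information);
        return S_OK;
    }
    catch (FormatException) {
        // Not a date: fall through and let the shell handle it.
    }
    catch (Exception e) {
        Console.Error.WriteLine("Unknown exception parsing Date: " + e.ToString());
    }
    return S_FALSE;
}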
The shell treats the first space-delimited word in the command line as the filename and everything after the first space as parameters. We want to parse the whole command line, so we concatenate the filename and parameters together.
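To see the concatenate-and-parse step in isolation, here is a minimal sketch that runs outside the shell. IsoDateDemo and ToIso8601 are hypothetical names; unlike the hook above, it uses DateTime.TryParse with the invariant culture instead of DateTime.Parse with exception handling, so that the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

public static class IsoDateDemo
{
    // Joins the "file" and "parameters" halves the shell hands us and
    // returns the ISO 8601 form, or null when the text is not a date.
    public static string ToIso8601(string file, string parameters)
    {
        string commandLine = (file + " " + parameters).Trim();
        DateTime parsed;
        if (DateTime.TryParse(commandLine, CultureInfo.InvariantCulture,
                              DateTimeStyles.None, out parsed))
        {
            return parsed.ToString("s", CultureInfo.InvariantCulture);
        }
        return null;
    }

    public static void Main()
    {
        // The shell would split "25 December 2001" at the first space.
        Console.WriteLine(ToIso8601("25", "December 2001"));
    }
}
```

A command line that is not a date simply yields null, which corresponds to the hook returning S_FALSE.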
Registration
Now that we have our COM implementation, it needs to be registered. COM components have always needed registering, so this should come as no surprise; but this is .NET, so our component needs to be registered in two places. The way it works is that your COM class is registered in the usual place, under HKCR\CLSID, and the InprocServer32 entry is set to mscoree.dll. In the same way that MSJava and MTS added proxies to intercept COM instantiation, the CLR needs to intercept this too. Once the CLR has control, it reads the Class and Assembly values stored alongside the InprocServer32 entry to find your .NET assembly and instantiate your class.
The Class value has the format <namespace>.<class> (the same format as the default ProgID), and the Assembly value holds the assembly's full name. This is obviously a problem, because nowhere is there a file path to your assembly on disk. To allow the CLR to find your code, the assembly itself needs to be registered too. This registration location is called the Global Assembly Cache, or GAC. It's a location where shared assemblies are placed, and it resides in a series of folders underneath %windir%\assembly. You can inspect this through Windows Explorer and you'll see a nice, friendly list of all the assemblies placed into the GAC. Under the surface (from a command prompt), you can see how they are really stored. The GAC had the codename Fusion during development, and this name still appears in several places, such as DLL names.
Registry Registration
Installing to the registry is pretty easy:
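// Requires: using System.Reflection; using System.Runtime.InteropServices;
Assembly asm = Assembly.GetExecutingAssembly();
RegistrationServices reg = new RegistrationServices();
reg.RegisterAssembly(asm, AssemblyRegistrationFlags.None);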
This instructs the CLR to add the appropriate entries into HKCR\CLSID and to call back into our code at predefined registration functions. We declare our registration function with the ComRegisterFunction attribute, and declare any un-registration function with the ComUnregisterFunction attribute:
[System.Runtime.InteropServices.ComRegisterFunctionAttribute()]
static void RegisterServer(String zRegKey) {
    try {
        RegistryKey root = Registry.LocalMachine;
        // Open the hook list for writing (second argument = writable).
        RegistryKey rk = root.OpenSubKey(@"Software\Microsoft\Windows\" +
            @"CurrentVersion\Explorer\ShellExecuteHooks", true);
        // clsid is not shown in this snippet; it is assumed to be a string
        // field holding our CLSID in registry form,
        // "{6156C6FC-4DD9-4f82-8200-0446DABB7F35}".
        rk.SetValue(clsid, ".Net ISO 8601 Date Parser Shell Extension");
        rk.Close();
    }
    catch(Exception e) {
        System.Console.Error.WriteLine(e.ToString());
    }
}
In addition to the standard HKCR\CLSID registration, we need to register ourselves as a ShellExecuteHook, which we do simply by adding the CLSID as a new value.
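The matching un-registration function is not shown here, but it would plausibly just remove the same value again. The following sketch makes several assumptions: DateParserSketch stands in for the real DateParser class, the ClsidString helper is hypothetical, and the function takes a Type parameter (which the attribute also accepts) so that the CLSID can be read from the Guid attribute.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32;

// Stand-in for the article's DateParser class (assumption for this sketch).
[Guid("6156C6FC-4DD9-4f82-8200-0446DABB7F35"), ComVisible(true)]
public class DateParserSketch
{
    const string HooksKey =
        @"Software\Microsoft\Windows\CurrentVersion\Explorer\ShellExecuteHooks";

    // Hypothetical helper: the type's CLSID in registry form, "{...}".
    public static string ClsidString(Type t)
    {
        GuidAttribute attr =
            (GuidAttribute)Attribute.GetCustomAttribute(t, typeof(GuidAttribute));
        Guid id = attr != null ? new Guid(attr.Value) : t.GUID;
        return id.ToString("B").ToUpperInvariant();
    }

    // Called by the runtime when the assembly is unregistered.
    [ComUnregisterFunction]
    static void UnregisterServer(Type t)
    {
        using (RegistryKey rk = Registry.LocalMachine.OpenSubKey(HooksKey, true))
        {
            // Nothing to do if the key is missing; do not throw if the
            // value has already been removed.
            if (rk != null)
                rk.DeleteValue(ClsidString(t), false);
        }
    }
}
```

Registry value names are not case-sensitive, so the upper-cased CLSID refers to the same value that registration created.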
GAC Install
Installing to the GAC can be done in several ways. The easiest is to simply drag and drop the assembly into the %windir%\assembly folder. Another way is to use the GACUtil.exe tool provided with the .NET SDK. For an end user, neither of these methods is particularly intuitive, so we elect to install it programmatically. Included with a couple of the samples in the SDK is a file called fusioninstall.cs. As the name implies, this provides a couple of functions which use P/Invoke to install an assembly in the GAC. Doing this is as simple as:
if (FusionInstall.AddAssemblyToCache("DateParser.exe") == 0) {
Console.WriteLine("DateParser - shell extension successfully registered");
}
Strong Naming
There is one final point I've missed from GAC installation. To prevent name clashes, the GAC requires each assembly to have a strong name, that is, a name built from a public key together with the assembly's name and version attributes. The current VB.NET IDE provides a nice property page to add a strong name, whereas with C#, it has to be done manually. First of all, a public/private key pair is generated using sn.exe -k, the strong name utility in the SDK. Then, this key pair should be referenced in code using the AssemblyKeyFile attribute:
[assembly: AssemblyKeyFile(@"..\..\KeyFile.snk")]
Finally, registering the built executable is simply a case of double-clicking the EXE. This will register the component, and Windows Explorer will load it into its process space and pass every ShellExecuteEx call to it. Our sample will try to parse every string passed to it. If it detects a date, a MessageBox will appear containing the ISO 8601 standard form of the date. Try typing 25 December 2001 into the Start->Run dialog to see the MessageBox. To un-register the component, run the EXE with /u as a command line switch. This won't unload the assembly from the Explorer process; you will need to kill Explorer from Task Manager to do that.
Conclusion
It's amazing just how short the finished code is, probably shorter than the equivalent ATL C++ code. It shows that C# and .NET have a very good COM interop and legacy integration story, and could well become the preferred means of shell programming, taking over from C++.