Automating Windows Applications

1 Feb 2003
A Windows application that does not export any program interface may be converted to an automation server with COM object(s) injected into the application process.

Introduction

Windows applications that do not export any program interface may nevertheless be converted to automation servers. The term "Automation" in this context means that clients running in separate processes are able to

  • control the application that has been converted to a server,

  • get services from it in synchronous and/or asynchronous (by receiving event notifications) modes.

This is particularly useful for old legacy and third-party applications. Their functionality and user interface may be upgraded, and their services become available to other applications. This may be achieved with COM object(s) injected into the process. The DLL injection technique under Windows NT was described by Jeffrey Richter [1] and has already become almost conventional. Here we move a step further: we embed a COM object into a target application (now "promoted" to a server) and use this component to communicate with the application from outside its process.

Brief Technique Description

Let's assume for simplicity that our target application is running in just one thread. This application has one main frame window having one client (view) window. To automate such an application, the following steps should be taken.

  • A special Loader application injects a Plugin DLL into the target process using a remote thread. This is a worker thread that loads the Plugin DLL, so DllMain() of the Plugin DLL runs in it. After the DLL has been loaded and DllMain() has returned, the thread terminates.
  • The Plugin DLL also contains callback window procedures used to subclass the frame and view windows of the target application. These window procedures include message handlers that add new functionality to the target application. DllMain() of the Plugin DLL performs the actual subclassing.
  • The new window procedure for, say, the frame window has a handler for an additional Plugin-specific Windows message, WM_CREATE_OBJECT. After subclassing the target application's windows, DllMain() posts the WM_CREATE_OBJECT message to the frame window and returns.
  • Upon receiving the WM_CREATE_OBJECT message, the new frame window procedure creates a COM object and calls its method for registration with the Running Object Table (ROT). If asynchronous notification of client(s) is required, the embedded COM object should support the connection point mechanism and define an outgoing (source) interface in its IDL file. It is preferable for the outgoing interface to be IDispatch-based (or dual) so that script clients can implement it.
  • Client applications obtain a proxy of the injected COM object via the ROT, thus gaining control over the target application. To subscribe to the server's event notifications, a client should implement the sink interface and pass ("advise") its pointer to the embedded object.
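
The first step above (injection through a remote thread) can be sketched roughly as follows. This is a minimal illustration of the commonly described technique from [1], not the code of the Loader sample: the function names InjectDll and RemotePathBytes are hypothetical, error handling is reduced to a bool, and the sketch relies on the usual assumption that kernel32.dll is mapped at the same address in the Loader and in the target process (both must also have the same bitness). The Win32 part compiles only on Windows.

```cpp
#include <cstddef>
#include <string>

// Number of bytes to reserve in the target process for the DLL path,
// including the terminating null character. (Hypothetical helper.)
std::size_t RemotePathBytes(const std::wstring& dllPath) {
    return (dllPath.size() + 1) * sizeof(wchar_t);
}

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: loads dllPath into the process with the given ID by
// running LoadLibraryW in a remote thread. DllMain() of the DLL executes in
// that thread as part of the load.
bool InjectDll(DWORD processId, const std::wstring& dllPath) {
    HANDLE process = OpenProcess(
        PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ,
        FALSE, processId);
    if (!process) return false;

    const SIZE_T bytes = RemotePathBytes(dllPath);
    bool ok = false;

    // Copy the DLL path into memory that the remote thread can read.
    void* remotePath = VirtualAllocEx(process, nullptr, bytes,
                                      MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (remotePath &&
        WriteProcessMemory(process, remotePath, dllPath.c_str(), bytes, nullptr)) {
        // Assumption: the local address of LoadLibraryW is valid in the target.
        auto loadLibrary = reinterpret_cast<LPTHREAD_START_ROUTINE>(
            GetProcAddress(GetModuleHandleW(L"kernel32.dll"), "LoadLibraryW"));
        HANDLE thread = loadLibrary
            ? CreateRemoteThread(process, nullptr, 0, loadLibrary,
                                 remotePath, 0, nullptr)
            : nullptr;
        if (thread) {
            // The thread ends once LoadLibraryW (and thus DllMain) returns.
            WaitForSingleObject(thread, INFINITE);
            CloseHandle(thread);
            ok = true;
        }
    }
    if (remotePath) VirtualFreeEx(process, remotePath, 0, MEM_RELEASE);
    CloseHandle(process);
    return ok;
}
#endif
```

Note that a true return value here only means the remote thread ran to completion; a stricter version would also check the thread's exit code to see whether the load itself succeeded.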

To accomplish the automation task, some information about the target application is required, namely the class names of its windows. These may be found with the Spy++ utility, which is part of the Visual Studio installation. Another useful Visual Studio tool is the ROT Viewer. It allows the developer to inspect the ROT and thus to check that the COM object embedded into the target process has been registered properly.

Test Sample

The well-known Notepad text editor is taken as the target application for automation.

The Loader.exe application is responsible for injecting NotepadPlugin.dll into Notepad.exe. Loader finds a running instance of Notepad (or starts a new one if no running instance is available), obtains its process handle, and performs the injection of NotepadPlugin.dll using a remote thread (please refer to [1, 2] and the code sample for details).

NotepadPlugin.dll is the DLL injected into Notepad.exe. Its DllMain() function first finds the frame and view windows of Notepad.exe and then subclasses both of them with the appropriate custom window procedures. As stated above, DllMain() runs not in the Notepad.exe main thread but in an additional thread created remotely by the Loader application. This additional thread vanishes once NotepadPlugin.dll has been completely loaded. Before returning, DllMain() posts the custom WM_CREATE_OBJECT Windows message to the frame window to initiate creation of the COM object for automation and its registration with the ROT. Because the message is handled by the window procedure, this work is done in the Notepad.exe main thread, which owns the window. The WM_CREATE_OBJECT handler of the new frame window procedure initializes COM, creates the NotepadHandler COM object, and calls the appropriate method to register it with the ROT.
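
The subclassing and the posting of the custom message might look roughly like the sketch below. It is an illustration only, not the code of NotepadPlugin.dll: the numeric value of the message, the names kWmCreateObject, PluginFrameProc and SubclassFrameWindow, and the way the frame window handle is obtained are all assumptions, and the COM work in the handler is left as a comment. The Win32 part compiles only on Windows.

```cpp
// Application-private message IDs belong to the WM_APP range 0x8000..0xBFFF.
// The concrete value is an assumption; the article only names the message.
constexpr unsigned int kWmCreateObject = 0x8000 + 1;

#ifdef _WIN32
#include <windows.h>

// Original frame window procedure, saved so that unhandled messages can be
// forwarded to it.
static WNDPROC g_originalFrameProc = nullptr;

// Replacement frame window procedure. It is called in the thread that owns
// the window (the target's main thread), not in the injected worker thread.
static LRESULT CALLBACK PluginFrameProc(HWND hwnd, UINT msg,
                                        WPARAM wp, LPARAM lp) {
    if (msg == kWmCreateObject) {
        // Here the plugin would initialize COM, create the automation
        // object and register it with the ROT (omitted in this sketch).
        return 0;
    }
    // Everything else keeps its original behavior.
    return CallWindowProcW(g_originalFrameProc, hwnd, msg, wp, lp);
}

// Hypothetical helper, meant to be called while the DLL is being loaded.
// 'frame' is the already located frame window of the target application.
bool SubclassFrameWindow(HWND frame) {
    // Save the old procedure first, so that it is available as soon as the
    // new procedure starts receiving messages.
    g_originalFrameProc = reinterpret_cast<WNDPROC>(
        GetWindowLongPtrW(frame, GWLP_WNDPROC));
    if (!g_originalFrameProc) return false;

    if (!SetWindowLongPtrW(frame, GWLP_WNDPROC,
                           reinterpret_cast<LONG_PTR>(&PluginFrameProc))) {
        return false;
    }
    // Posting (not sending) lets the loading thread return immediately; the
    // handler runs later in the window's own thread.
    return PostMessageW(frame, kWmCreateObject, 0, 0) != FALSE;
}
#endif
```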

NotepadHandler.dll, a COM in-process server component, implements the dual interface IHandler, specially tailored for our custom Notepad automation. IHandler has methods to register the object with the ROT and to unregister it. The outgoing (source) dual interface IHandlerEvents is added to the NotepadHandler project (in the file NotepadHandler.idl). The interface contains the method HRESULT SentenceCompleted([in] BSTR bsText). The NotepadHandler component implements the connection point mechanism (the appropriate code may be added by choosing the Implement Connection Point... item in the right-click menu on CHandler in the ClassView tab of the Visual Studio workspace and completing the dialog that follows).

In my scenario (chosen just for illustration), the server (the embedded COM object) collects all characters typed into the Notepad editor from the keyboard (including non-alphabetic ones). As soon as one of the end-of-sentence characters (".", "?", or "!") appears, the server fires the SentenceCompleted event by calling Fire_SentenceCompleted(), thus providing all the event's subscribers with the completed sentence (the content of the appropriate buffer).
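
The buffering logic of this scenario can be illustrated with a small portable sketch. It is not the NotepadHandler code: the class SentenceBuffer and its methods are hypothetical, plain callbacks stand in for IHandlerEvents sinks and the connection point machinery, and it is an assumption that the fired text includes the terminating character.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the server side of the scenario: it collects
// typed characters and notifies all subscribers when a sentence ends.
class SentenceBuffer {
public:
    using Sink = std::function<void(const std::wstring&)>;

    // Plays the role of "advising" a sink on a connection point.
    void Advise(Sink sink) { sinks_.push_back(std::move(sink)); }

    // Call once per typed character (for example, from a WM_CHAR handler).
    void OnChar(wchar_t ch) {
        buffer_.push_back(ch);
        if (ch == L'.' || ch == L'?' || ch == L'!') {
            // Empty the buffer before notifying, so the next sentence
            // starts clean even if a sink feeds characters back in.
            std::wstring sentence;
            sentence.swap(buffer_);
            for (const Sink& sink : sinks_) sink(sentence);
        }
    }

private:
    std::wstring buffer_;
    std::vector<Sink> sinks_;
};
```

A client would subscribe with Advise() and then receive one call per completed sentence; characters typed after the last terminator stay in the buffer until the next one arrives.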

Two variants of clients, for the Win32 and .NET platforms, are discussed below. Each of them supports two ROT registration approaches: ActiveObject and moniker registration. Registration details may be seen with the ROT Viewer tool.

Win32 Client

The AutomationClient.exe application is a sample Win32 automation client. It creates a proxy for the NotepadHandler COM object by using the component's ROT registration and controls the automated instance of Notepad through the IHandler interface implemented by NotepadHandler.

The client application process should implement the IHandlerEvents interface. This may be done in various ways. I chose to create a special COM component (an in-process server), Notification. To construct it, the additional project NotepadHandlerNotification was added to the workspace with the ATL COM AppWizard. Then an ATL object was inserted into the component using the right-click menu. This object implements an additional interface that I called INotification. This interface is not compulsory, but it may be useful for providing the component with data from the client application via appropriate methods (this is how I supply the component with the handle of the client application's main window). The CNotification class implements the IHandlerEvents interface (added with the Implement Interface... right-click menu item and the dialog that follows). The client application class CAdvise contains the advising mechanism. The method CAutomationClientDlg::OnButtonAutomate() contains the code responsible for advising.

Implemented in the Notification component, the SentenceCompleted() method of the IHandlerEvents interface is called by the server. The application main window handle (in this case) is supplied to the Notification component with the SetMainWnd() method of the INotification interface. Having this handle in its possession, the SentenceCompleted() method posts a notification message to the main window of the client application, passing a pointer to a buffer with data received from the server.

By default the ActiveObject registration is used. To switch to moniker registration, definition of _MONIKER should be uncommented in file .\Include\DefineMONIKER.h before compilation.

.NET Client

The AutomationClientNET.exe application is a sample .NET automation client. From the user's perspective it acts similarly to the AutomationClient.exe application. AutomationClientNET.exe uses the classes of the RotHelper assembly to create a .NET Interop proxy for the NotepadHandler COM object. RotHelper and the NotepadHandler Interop (as NOTEPADHANDLERLib) have to be added to the References of the AutomationClientNET project. This is done using the standard VS.NET Add Reference... dialog: RotHelper through the Projects tab, and NOTEPADHANDLERLib by selecting the NotepadHandler Type Library in the COM tab.

By default the ActiveObject registration is used. To switch to moniker registration, the definition of _MONIKER should be uncommented (at the beginning of the file .\AutomationClientNET\NotepadClientForm.cs) before compilation.

Running the Test

A precompiled demo is available for the test sample. Please note that before you run it, the NotepadHandler.dll and NotepadHandlerNotification.dll COM components have to be registered with the regsvr32.exe utility. To register them, run the file Register Components.bat, located in the demo directory. Then one or more copies of AutomationClient.exe and AutomationClientNET.exe may be started.

When the NOTEPAD AUTOMATE button is pressed, the client internally starts the Loader.exe application. The latter starts Notepad and automates it. The word [Automated] appears in the caption of the Notepad main window. Alternatively, you may run Notepad manually before starting AutomationClient.exe. In this case, Loader automates the already running instance of Notepad. As soon as Notepad has been automated, the client subscribes to its Sentence Completed event.

Now you may type some text in the automated Notepad instance and press the Copy Text button of the client application. The last text fragment you typed will appear in the edit box of the client application. To simplify things, only characters and symbols that were actually typed are copied, not copied-and-pasted text. Pressing the Find and Append Menu buttons of the client causes the corresponding Notepad response.

If the user types some characters followed by an end-of-sentence character (".", "?", or "!"), the automated Notepad sends ("fires") the Sentence Completed event to its subscribers (clients) as soon as that character is input. With this event, the content of the appropriate buffer is sent to the subscribers, and the character sequence is displayed in an edit box of every subscribed client application.

On exit from Notepad, a "Bye-Bye..." message box is generated by NotepadPlugin.dll on behalf of Notepad to amuse the user :) .

Compiling the Test

The test sample consists of the NotepadAuto workspace for Visual C++ 6.0 and the AutomationClientNET solution for Visual Studio .NET (the files NotepadAuto.dsw and AutomationClientNET.sln are located in the main directory of the source). The VC++ 6.0 part is completely independent of the .NET solution and may be tested separately.

First, the file NotepadAuto.dsw should be loaded into the VC++ 6.0 IDE and the _Build_All_Projects project should be built. Then, if you are interested in the .NET client, the file AutomationClientNET.sln has to be loaded into VS.NET and its projects built (make sure that references to RotHelper and the NOTEPADHANDLERLib interop are added to the references of the AutomationClientNET project).

The psapi library, which provides the EnumProcesses() function, is required for the system-wide search for processes (EnumWindows(), used for the search for windows, is a user32 function). So, if the psapi.dll file is not installed in your system directory, it should either be copied there from the .\Psapi_library directory or be made loadable in some other way (e.g., by copying it to a directory listed in the PATH environment variable or to the working directory, or with Visual Studio's Tools->Options dialog->Directories). It is also assumed that Notepad.exe is located in your [Windows]\System32 directory. After the above arrangements have been made, you may build the _Build_All_Projects project (the rest are its dependents) and run the sample.

The AutomationClient.exe application should be run to test the Win32 sample. Please note that this sample should be tested outside Visual Studio. The AutomationClientNET.exe application should be run to test the .NET sample. This test may also be performed from within VS.NET.

Some Ways for Further Development

  • User interface (UI) upgrade: The frame and view windows of a target application may be subclassed by an MFC-aware plugin DLL. This allows a considerable UI upgrade, for example adding toolbars to an old-style application.
  • Actions sequence: The client may implement a sequence of commands given to the target application. This sequence could be implemented with some scripting.
  • Configurator: An object embedded into the target process may serve as an "object factory." With its help, other COM objects may be created within the target process to accomplish various specific tasks. The object factory component obtains the data for constructing new objects through its methods and/or the Registry.
  • Part of a distributed system: Automated applications may be included in distributed systems (see, e.g., [3]).

Conclusion

A technique for automating Windows applications that do not export a program interface is presented. Using COM objects in a DLL injected into a target process allows the developer to automate such applications. This approach is useful for upgrading the functionality and GUI of target applications (particularly legacy applications) and for exposing their services to out-of-process clients. The source sample demonstrates such automation for both Win32 (MFC) and .NET (C#) clients.

References

[1] Jeffrey Richter. Advanced Windows. Third edition. Microsoft Press, 1997.
[2] John Peloquin. Remote Library Loading.
[3] Dino Esposito. Add Object Models to Non-COM Apps.

Update

The .NET client has been added. ROT registration of the injected COM object may now be carried out with either ActiveObject or a moniker. The text of the article has been changed to reflect the changes in the code.

Thanks

My thanks to all people who read this article and expressed their opinion and suggestions. My special thanks to Alex Furman for his refinement of RotHelper.Moniker class.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
