Automate the Active Windows Explorer or Internet Explorer Window

20 Oct 2005 | Ms-PL | 7 min read
An article on finding the active IE or Explorer window, or creating a new one, and controlling it.

I have been automating shell windows, mainly Internet Explorer windows, for a long time. Sometimes the WebBrowser control or the MFC class CHtmlView satisfies my needs, but often I end up scratching my head, embedding the WebBrowser control from scratch, and then simulating as many IE behaviors as I can, such as implementing IDocHostUIHandler to enable AutoComplete in the WebBrowser control. A natural alternative is: why not just automate an Internet Explorer window?

A new Internet Explorer window

The simplest way to accomplish this is by calling ShellExecute (Ex), as Paul DiLascia demonstrated in his C++ Q&A article "Browser Detection Revisited, Toolbar Info, IUnknown with COM and MFC":

/// As I've shown in many programs...
ShellExecute(0, _T("open"), pszMyHTMLFile, 
                           0, 0, SW_SHOWNORMAL);

However, this gives me no control over the new window, and the user is left with an IE window after my program closes. To clean up after myself, I first need to find out which window is mine and then take control of it.

Another approach is to create and automate an InternetExplorer object, and close it when necessary. The Microsoft Knowledge Base article "How To Automate Internet Explorer to POST Form Data" describes basically what I want, except for the final cleanup. A simple call to IWebBrowser2::Quit takes care of that.

// Create a new IE instance and show it.
// m_pWebBrowser2 is declared as CComQIPtr<IWebBrowser2>.
HRESULT hr = m_pWebBrowser2.CoCreateInstance(CLSID_InternetExplorer);
// Only touch the object if it was actually created.
if(SUCCEEDED(hr) && m_pWebBrowser2)
{
    hr = m_pWebBrowser2->put_StatusBar(VARIANT_TRUE);
    hr = m_pWebBrowser2->put_ToolBar(VARIANT_TRUE);
    hr = m_pWebBrowser2->put_MenuBar(VARIANT_TRUE);
    hr = m_pWebBrowser2->put_Visible(VARIANT_TRUE);

    // Fall back to a default page if the start address is not a URL.
    if(!::PathIsURL(m_strFileToFind))
        m_strFileToFind=_T("http://blog.joycode.com/jiangsheng");
    COleVariant vaURL( ( LPCTSTR) m_strFileToFind);
    m_pWebBrowser2->Navigate2(
        &vaURL, COleVariant( (long) 0, VT_I4),
        COleVariant((LPCTSTR)NULL, VT_BSTR),
        COleSafeArray(),
        COleVariant((LPCTSTR)NULL, VT_BSTR)
    );
}

void CAutomationDlg::OnDestroy()
{
    //close the IE window created by this 
    //program before exit
    if(m_pWebBrowser2)
    {
        if(m_bOwnIE)
        {
            m_pWebBrowser2->Quit();
            m_bOwnIE=FALSE;
        }
        UnadvisesinkIE();
        m_pWebBrowser2=(LPUNKNOWN)NULL;
    }
    CDialog::OnDestroy();
}

One more question: what if the user closes the new IE window a few seconds before I try to automate it, say in a WM_TIMER handler function? The IE object behind my IWebBrowser2 pointer no longer exists. Fortunately, the program does not crash, because calls on the disconnected object simply fail with an error HRESULT, but it would be better to know when the window closes so that I can avoid unexpected results.

Handling Internet Explorer events

The Internet Explorer object fires the DWebBrowserEvents2::OnQuit event when it terminates. This is the ideal time to release the IWebBrowser2 interface pointer. Because the object is going away, I also stop monitoring its events:

if(m_pWebBrowser2) 
{
    UnadvisesinkIE();
    m_pWebBrowser2=(LPUNKNOWN)NULL;
}
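The ownership rule behind OnDestroy and the OnQuit handler can be summarized in a small portable model. This is only a sketch under stated assumptions: `FakeBrowser` and `BrowserSession` are hypothetical names that stand in for the IE object and the dialog's members (m_pWebBrowser2, m_bOwnIE), and no COM is involved.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the out-of-process browser object.
struct FakeBrowser {
    bool quitCalled = false;
    void Quit() { quitCalled = true; }
};

class BrowserSession {
public:
    // 'owned' is true when this program created the window itself.
    BrowserSession(std::shared_ptr<FakeBrowser> browser, bool owned)
        : browser_(std::move(browser)), owned_(owned)
    {
        assert(browser_ != nullptr);  // a session needs a browser to start with
    }

    // Mirrors the OnQuit handler: the window is going away, so drop
    // the reference and forget ownership.
    void onQuit() {
        browser_.reset();
        owned_ = false;
    }

    // Mirrors OnDestroy: close the window only if we created it and
    // it still exists, then release the reference.
    void shutdown() {
        if (browser_ && owned_) {
            browser_->Quit();
        }
        browser_.reset();
        owned_ = false;
    }

    bool attached() const { return static_cast<bool>(browser_); }

private:
    std::shared_ptr<FakeBrowser> browser_;
    bool owned_;
};
```

In this model, calling shutdown() after onQuit() does nothing, which matches the intent of releasing the pointer as soon as the window reports that it is closing.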

Connect to the current Internet Explorer window

I do not like doing useless work, but I do like getting things right. I do not really care which window I connect to, but since the Microsoft Knowledge Base already has an article named "How to connect to a running instance of Internet Explorer", I suppose something like "How to connect to the current instance of Internet Explorer" would be more useful.

So what is the current instance of Internet Explorer? It is the most recently active IE window. Because Windows brings the active window to the top of the z-order, that window stays above all other IE windows. Therefore, all I have to do is find the IE window that is highest in the z-order. First, though, I need to know which windows are IE windows. After some investigation using Spy++, I assume the window class name of IE windows is "IEFrame", and I write a function to get the window class name of a shell window:

//The ShellWindows object lists both IE and Explorer windows;
//use their window class names to tell them apart.
CString CAutomationDlg::GetWindowClassName(IWebBrowser2* pwb)
{
    TCHAR szClassName[_MAX_PATH];
    ZeroMemory( szClassName, _MAX_PATH * sizeof( TCHAR));
    if (pwb)
    {
        LONG_PTR lwnd=NULL;
        //get_HWND can fail (for example, for a hosted WebBrowser
        //control), so check the result before using the handle.
        if (SUCCEEDED(pwb->get_HWND(&lwnd)) && lwnd)
        {
            HWND hwnd=reinterpret_cast<HWND>(lwnd);
            ::GetClassName( hwnd, szClassName, _MAX_PATH);
        }
    }
    //an empty string means the class name could not be determined
    return szClassName;
}

The rest is simple: enumerate the top-level windows in z-order and take the first one whose window class name is "IEFrame" and which also appears in the shell window list. After that, I play with the IE DHTML Document Object Model (DOM), which is available after the IE window fires the last DocumentComplete event, to confirm that the window was attached successfully:

void CAutomationDlg::DocumentComplete(IDispatch *pDisp, 
                                             VARIANT *URL)
{
    //HTML DOM is available AFTER the 
    //DocumentComplete event is fired.   
    //For more information, please visit KB article
    //"How To Determine When a Page Is 
    //Done Loading in WebBrowser Control"
    //http://support.microsoft.com/kb/q180366/
    CComQIPtr<IUnknown,&IID_IUnknown> pWBUK(m_pWebBrowser2);
    CComQIPtr<IUnknown,&IID_IUnknown> pSenderUK( pDisp);
    USES_CONVERSION;
    TRACE( _T( "Page downloading complete:\r\n"));
    CComBSTR bstrName;
    m_pWebBrowser2->get_LocationName(&bstrName);
    CComBSTR bstrURL;
    m_pWebBrowser2->get_LocationURL(&bstrURL);
    TRACE( _T( "Name:[ %s ]\r\nURL: [ %s ]\r\n"),
    OLE2T(bstrName),
    OLE2T(bstrURL));
    if (pWBUK== pSenderUK)
    {
        CComQIPtr<IDispatch> pHTMLDocDisp;
        m_pWebBrowser2->get_Document(&pHTMLDocDisp);
        CComQIPtr<IHTMLDocument2> pHTMLDoc(pHTMLDocDisp);
        CComQIPtr<IHTMLElementCollection> ecAll;
        CComPtr<IDispatch> pTagLineDisp;
        if(pHTMLDoc)
        {
            CComBSTR bstrNewTitle(_T("Sheng Jiang's Automation Test"));
            pHTMLDoc->put_title(bstrNewTitle);
            pHTMLDoc->get_all(&ecAll);
        }
        if(ecAll)
        {
            ecAll->item(COleVariant(_T("tagline")),
                            COleVariant((long)0),&pTagLineDisp);
        }
        CComQIPtr<IHTMLElement> eTagLine(pTagLineDisp);
        if(eTagLine)
        {
          eTagLine->put_innerText(
            CComBSTR(_T(
              "Command what is yours, conquer what is not. --Kane")));
        }
    }
}

Navigation now takes place in the existing IE window that I attached to.
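The z-order search described in this section can be sketched without any Windows headers. The snippet below is a minimal portable model, not the code from the sample project: `WindowInfo`, `findCurrentWindow`, and the integer handles are hypothetical stand-ins for what a top-level window enumeration, GetClassName, and the ShellWindows collection would supply. It assumes the window list is already ordered from the top of the z-order down.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a top-level window: a handle value and
// the class name that GetClassName would report for it.
struct WindowInfo {
    long handle;
    std::string className;
};

// Returns the handle of the first window (topmost first) whose class
// name matches and which also appears in the shell window list, or 0
// if no such window exists.
long findCurrentWindow(const std::vector<WindowInfo>& zOrderTopFirst,
                       const std::set<long>& shellWindowHandles,
                       const std::string& wantedClass)
{
    assert(!wantedClass.empty());  // an empty class name matches nothing useful
    for (const WindowInfo& w : zOrderTopFirst) {
        if (w.className == wantedClass &&
            shellWindowHandles.count(w.handle) != 0) {
            return w.handle;
        }
    }
    return 0;
}
```

Requiring membership in the shell window list filters out windows that merely share the class name but cannot be automated through IWebBrowser2.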

By-product: connect to the current Windows Explorer window

While examining the window list of the ShellWindows object, I got a by-product: Windows Explorer windows also seem to share a common window class name. Thus the same mechanism works for Windows Explorer windows, with the window class name changed from "IEFrame" to "ExploreWClass". Since there is no DHTML DOM to play with, I tell the Windows Explorer window to browse to an existing path, to show that I have taken over this window.

//show the folder bar
COleVariant clsIDFolderBar(
    _T("{EFA24E64-B078-11d0-89E4-00C04FC9E26E}"));
COleVariant FolderBarShow(VARIANT_TRUE,VT_BOOL);
COleVariant dummy;    
if(m_pWebBrowser2)    
    m_pWebBrowser2->ShowBrowserBar(
         &clsIDFolderBar,&FolderBarShow,&dummy);    
//browse to a given folder    
CComQIPtr<IServiceProvider> psp(m_pWebBrowser2);    
CComPtr<IShellBrowser> psb;     
if(psp)    
    psp->QueryService(SID_STopLevelBrowser,
                 IID_IShellBrowser,(LPVOID*)&psb);    
if(psb)    
{    
    USES_CONVERSION;    
    LPITEMIDLIST pidl=NULL;    
    SFGAOF sfgao;    
    SHParseDisplayName (T2OLE(m_strFileToFind),
                            NULL,&pidl,0, &sfgao);    
    if(pidl==NULL)    
        ::SHGetSpecialFolderLocation(m_hWnd,
                              CSIDL_DRIVES,&pidl);    
    m_pidlToNavigate=NULL;    
    if(pidl)    
    {    
        //if the start address is a folder, then browse it.   
        //otherwise browse to its parent folder, 
        //and select it in the folder view.   
        LPCITEMIDLIST pidlChild=NULL;    
        CComPtr<IShellFolder> psf;    
        HRESULT hr = SHBindToParent(pidl, 
                       IID_IShellFolder, 
                       (LPVOID*)&psf, &pidlChild);    
        if (SUCCEEDED(hr)){    
            SFGAOF rgfInOut=SFGAO_FOLDER;    
            hr=psf->GetAttributesOf(1,&pidlChild,&rgfInOut);    
            if (SUCCEEDED(hr)){    
                m_pidlToNavigate=ILClone(pidl);    
                if(rgfInOut&SFGAO_FOLDER){//this is a folder    
                    psb->BrowseObject(pidl,SBSP_SAMEBROWSER);     
                }    
                else    
                {    
                    //this is a file, browse to the parent folder    
                    LPITEMIDLIST pidlParent=ILClone(pidl);    
                    ::ILRemoveLastID(pidlParent);    
                    psb->BrowseObject( pidlParent, SBSP_SAMEBROWSER);    
                    ILFree(pidlParent);    
                }    
            }    
        }    
        //clean up    
        ILFree(pidl);    
    }    
}

This code is a little wordy because I want to take different actions for files and folders. If you pass the pidl of a file to IShellBrowser::BrowseObject, Windows Explorer asks whether you want to open the file, exactly as if you had typed the file path in the address bar of a Windows Explorer window and pressed Enter. I want to simulate the behavior of "Explorer.exe /select", which selects the file in the folder view, so I put some code in the DocumentComplete event handler:

if(m_pidlToNavigate)
{
    //If the start address is a file, browse to the parent folder
    //and then select it
    CComQIPtr<IServiceProvider> psp(m_pWebBrowser2);
    CComPtr<IShellBrowser> psb;
    CComPtr<IShellView> psv;
    if(psp)
        psp->QueryService(SID_STopLevelBrowser,
                     IID_IShellBrowser,(LPVOID*)&psb);
    if(psb)
        psb->QueryActiveShellView(&psv);
    if(psv)
    {
        LPCITEMIDLIST pidlChild=NULL;
        CComPtr<IShellFolder> psf;
        //ask only for the folder attribute, which is tested below
        SFGAOF rgfInOut=SFGAO_FOLDER;
        HRESULT hr = SHBindToParent(m_pidlToNavigate, 
                         IID_IShellFolder, 
                         (LPVOID*)&psf, &pidlChild);
        if (SUCCEEDED(hr)){
            hr=psf->GetAttributesOf(1,&pidlChild,&rgfInOut);
            if (SUCCEEDED(hr)){
                if((rgfInOut&SFGAO_FOLDER)==0){
                    //a file, select it
                    hr=psv->SelectItem(ILFindLastID(m_pidlToNavigate)
                        ,SVSI_SELECT|SVSI_ENSUREVISIBLE|SVSI_FOCUSED|
                        SVSI_POSITIONITEM);
                }
            }
        }
    }
    //clean up
    ILFree(m_pidlToNavigate);
    m_pidlToNavigate=NULL;
}
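The branching between folders and files can be summarized in a small platform-independent sketch. `NavigationPlan` and `planNavigation` are hypothetical names, plain path strings stand in for PIDLs, and the caller is assumed to know already whether the target is a folder (the code above obtains that from IShellFolder::GetAttributesOf).

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: which folder to browse to, and which item
// (if any) to select once the folder view has finished loading.
struct NavigationPlan {
    std::string folderToBrowse;
    std::string itemToSelect;   // empty when nothing should be selected
};

// Mirrors the decision in the article: browse a folder directly, but
// for a file browse to its parent and remember the file name so that
// it can be selected later, as the DocumentComplete handler does.
NavigationPlan planNavigation(const std::string& path, bool isFolder)
{
    assert(!path.empty());
    NavigationPlan plan;
    if (isFolder) {
        plan.folderToBrowse = path;
        return plan;
    }
    // Split at the last path separator; accept both '\\' and '/'.
    const std::string::size_type pos = path.find_last_of("\\/");
    if (pos == std::string::npos) {
        // No parent component: there is no folder to browse to.
        plan.itemToSelect = path;
        return plan;
    }
    // Note: a file directly under a drive root yields "C:" rather
    // than "C:\"; PIDL-based code has no such edge case.
    plan.folderToBrowse = path.substr(0, pos);
    plan.itemToSelect = path.substr(pos + 1);
    return plan;
}
```

Splitting the work into "where to browse" and "what to select afterwards" reflects the fact that the selection can only happen once the folder view exists, which is why the real code defers it to DocumentComplete.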

A new Windows Explorer window

Let's apply what we have learned to the original problem. Since I can attach to the current Windows Explorer window in almost the same way as to the current Internet Explorer window, can I create and automate a new Windows Explorer window the way I create and automate a new Internet Explorer window? To my surprise, the answer is no. There is no class ID for Windows Explorer with which to create such a COM object. I can still create an IE window, navigate to a folder, and show the Folder Explorer Bar so that it looks like a Windows Explorer window, but I cannot change its window class name "IEFrame", so it is difficult to distinguish it from other IE windows that display HTML pages and Active Documents.

OK, if I cannot create it the COM way, I can still try the traditional way. I can create an explorer.exe process, look for its main window as Paul DiLascia showed in his article "Get the Main Window, Get EXE Name", and send the undocumented message WM_GETISHELLBROWSER (WM_USER+7) to get the IShellBrowser interface of the new window:

//start the new process
STARTUPINFO si;
PROCESS_INFORMATION pi;
ZeroMemory( &si, sizeof(si) );
si.cb = sizeof(si);
ZeroMemory( &pi, sizeof(pi) );
//CreateProcess may modify the command line, 
//so pass a writable buffer instead of a string literal.
TCHAR szCmdLine[] = _T("explorer.exe");
// Start the child process. 
if( CreateProcess( NULL, // No module name (use command line). 
    szCmdLine,     // Command line. 
    NULL,          // Process handle not inheritable. 
    NULL,          // Thread handle not inheritable. 
    FALSE,         // Set handle inheritance to FALSE. 
    0,             // No creation flags. 
    NULL,          // Use parent's environment block. 
    NULL,          // Use parent's starting directory. 
    &si,           // Pointer to STARTUPINFO structure.
    &pi )          // Pointer to PROCESS_INFORMATION structure.
)
{
    //wait a graceful time 
    //so the window is created and is ready to answer messages.
    ::WaitForInputIdle(pi.hProcess,1000);
    //remember the process ID (not the handle); 
    //EnumWindowsProc compares it with each window's process ID.
    m_hExplorerProcess=pi.dwProcessId;
    EnumWindows(EnumWindowsProc,(LPARAM)this);
    //the handles are no longer needed
    ::CloseHandle(pi.hThread);
    ::CloseHandle(pi.hProcess);
}

BOOL CALLBACK CAutomationDlg::EnumWindowsProc(
                                 HWND hwnd,LPARAM lParam)
{
    CAutomationDlg* pdlg=(CAutomationDlg*)lParam;
    DWORD pidwin;
    GetWindowThreadProcessId(hwnd, &pidwin);
    if (pidwin==pdlg->m_hExplorerProcess)
    {
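        //WM_USER+7 is the undocumented WM_GETISHELLBROWSER message.
        //The returned pointer is neither AddRef'ed nor marshaled, so
        //it should only be trusted inside the process that owns hwnd.
        IShellBrowser* psb=
          (IShellBrowser*)::SendMessage(hwnd,WM_USER+7,0,0);
        CComQIPtr<IWebBrowser2> pwb(psb);
        return FALSE;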
    }
    return TRUE;
}

Oops, this does not catch the window on my computer either. What happened? In my folder option page of Windows Explorer, the "Open each folder in the same window" option is selected, so the new Windows Explorer window is created in an existing Windows Explorer process. Seems like a dead end.

Wait, I have another object in my hand: the ShellWindows object. It can give me a list of shell windows, including every Windows Explorer window and its corresponding IWebBrowser2 interface, which is a door to its IShellBrowser interface. Now I need two shell window lists, one taken before creating the explorer.exe process and one taken right after; then I compare them to find the new shell window.

m_pShellWindows.CoCreateInstance(CLSID_ShellWindows);
if(m_pShellWindows)
{
    //get the list of running IE windows
    //using the ShellWindows collection
    //For more information, please visit 
    //http://support.microsoft.com/kb/176792
    long lCount=0;
    m_pShellWindows->get_Count(&lCount);
    for(long i=0;i<lCount;i++)
    {
        CComPtr<IDispatch> pdispShellWindow;
        m_pShellWindows->Item(COleVariant(i),
                                 &pdispShellWindow);
        if(pdispShellWindow)
        {
            m_listShellWindows.AddTail(
                new CComQIPtrIDispatch(pdispShellWindow));
        }
    }
}

//later, after the new window has been created:
//enumerate through the new shell window list
long lCount=0;
m_pShellWindows->get_Count(&lCount);
for(long i=0;i<lCount;i++)
{
    //search the new window
    //using the ShellWindows collection
    //For more information, please visit 
    //http://support.microsoft.com/kb/176792
    BOOL bFound=FALSE;
    CComPtr<IDispatch> pdispShellWindow;
    m_pShellWindows->Item(COleVariant(i),
                           &pdispShellWindow);
    //search it in the old shell window list
    POSITION pos=m_listShellWindows.GetHeadPosition();
    while(pos)
    {
        CComQIPtrIDispatch* pDispatch=
                      m_listShellWindows.GetNext(pos);
        if(pDispatch&&pdispShellWindow.p==pDispatch->p)
        {
            bFound=TRUE;break;    
        }
    }
    if(!bFound)//new window found
    {
        //attach to it
        m_pWebBrowser2=pdispShellWindow;
        m_bOwnIE=TRUE;
        //sink for the Quit and DocumentComplete events
        AdviseSinkIE();
        NavigateToSamplePage(FALSE);
    }
}
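The before/after comparison boils down to a set difference. The following is a minimal portable sketch in which integer IDs stand in for the IDispatch pointers held in m_listShellWindows; `findNewWindows` is a hypothetical helper, not part of the article's source.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns every entry of 'after' that is absent from 'before',
// preserving the order in which the entries appear in 'after'.
std::vector<long> findNewWindows(const std::vector<long>& before,
                                 const std::vector<long>& after)
{
    // Index the old snapshot once so each lookup is cheap.
    const std::set<long> known(before.begin(), before.end());
    std::vector<long> added;
    for (long id : after) {
        if (known.count(id) == 0) {
            added.push_back(id);
        }
    }
    assert(added.size() <= after.size());  // sanity check on the result
    return added;
}
```

Returning all new entries instead of stopping at the first one leaves the caller free to decide what to do if several windows appeared between the two snapshots.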

Wait a second. What do I mean by "right after creating the explorer.exe process"? One second after calling the CreateProcess function? Or maybe two? In fact, the ShellWindows object fires a WindowRegistered event after each shell window is created, so I put my comparison in its event handler.

//sink DShellWindowsEvents events
LPUNKNOWN pUnkSink = GetIDispatch(FALSE);
m_pShellWindows.CoCreateInstance(CLSID_ShellWindows);
AfxConnectionAdvise((LPUNKNOWN)m_pShellWindows, 
              DIID_DShellWindowsEvents,pUnkSink,
              FALSE,&m_dwCookieShellWindows);

void CAutomationDlg::WindowRegistered(long lCookie) 
{
    //ok, a new shell window is created
    if(m_pShellWindows)
    {
        //enumerate through the new shell window list
        long lCount=0;
        m_pShellWindows->get_Count(&lCount);
        for(long i=0;i<lCount;i++)
        {
            //search the new window
            //using the ShellWindows collection
            //For more information, please visit 
            //http://support.microsoft.com/kb/176792
            BOOL bFound=FALSE;
            CComPtr<IDispatch> pdispShellWindow;
            m_pShellWindows->Item(COleVariant(i),
                                       &pdispShellWindow);
            //search it in the old shell window list
            POSITION pos=m_listShellWindows.GetHeadPosition();
            while(pos)
            {
                CComQIPtrIDispatch* pDispatch=
                              m_listShellWindows.GetNext(pos);
                if(pDispatch&&pdispShellWindow.p==pDispatch->p)
                {
                    bFound=TRUE;break;    
                }
            }
            if(!bFound)//new window 
            {
                //attach to it
                m_pWebBrowser2=pdispShellWindow;
                m_bOwnIE=TRUE;
                //sink for the Quit and DocumentComplete events
                AdviseSinkIE();
                NavigateToSamplePage(FALSE);
            }
        }
        //clean up
        if(m_dwCookieShellWindows!= 0)
        {
            LPUNKNOWN pUnkSink = GetIDispatch(FALSE);
            AfxConnectionUnadvise((LPUNKNOWN)m_pShellWindows, 
                            DIID_DShellWindowsEvents, pUnkSink, 
                            FALSE, m_dwCookieShellWindows);
            m_dwCookieShellWindows= 0;
        }
        POSITION pos=m_listShellWindows.GetHeadPosition();
        while(pos)
        {
            CComQIPtrIDispatch* pDispatch=
                          m_listShellWindows.GetNext(pos);
            delete    pDispatch;
        }
        m_listShellWindows.RemoveAll();
        m_pShellWindows=(LPUNKNOWN)NULL;
    }
}

Why not Browser Helper Objects?

Since the new window is not in my process, the interprocess marshalling penalty of COM calls is high. If your automation consists of very many COM calls, you may need to move your code in-process, for example by writing a Browser Helper Object (BHO). However, a BHO is loaded by every instance of both Windows Explorer and Internet Explorer, and I don't want to slow down the whole system just to clean up my mess. Some people have actually used this technology to connect to the current instance of Internet Explorer.

Known issues

The ShellWindows object is not available if the default explorer.exe process has been killed or was never launched. A BHO can be an alternative in such cases.

Conclusion

There is a large chunk of head-scratching code here. In addition, it takes some time to get used to the mix of COM and Windows API function calls. I hope you find this article useful and are not confused by my code. Automating Internet Explorer and Windows Explorer windows can save you a lot of time, since you avoid having to simulate the default behaviors of the system, and it gives end users a familiar look.

Compile instructions for VC6 users

A newer version of the Microsoft Platform SDK is required for the new shell APIs used in the source code. Some of these shell APIs can be replaced by other functions.

History

  • October 20th, 2005 - Initial release.

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)