|
I know this question may be out of the scope of this article, but I'll ask it anyway. I have scanned for all elements in an HTML document. I then automatically handle all relevant events for input type controls (text boxes, radio and checkbox buttons, pulldown option lists, etc.). Events are on-click, on-keypress, on-select. Now I want to get events for an ActiveX control (an HTML object). Does anyone know how to do this?
|
|
|
|
|
I haven't done this myself, but there is an IHTMLObjectElement interface and, from what I read in MSDN, you can get the IDispatch of the nested ActiveX control by calling its get_object(IDispatch** p) function. That gives you access to the ActiveX control.
If you find a better (or the right) way, please post it here; I'd be glad to know too.
Philip Patrick
"Two beer or not two beer?" (Shakesbeer)
Web-site: www.saintopatrick.com
|
|
|
|
|
Here is the code that I use to register event handling for any <object> tag in an HTML document (ActiveX controls):
HRESULT hr = m_pIE->get_Document(&pDisp);
IHTMLDocument3* pDoc;
IHTMLElement* pElem = NULL;
IHTMLElementCollection* pColl;
hr = pDisp->QueryInterface(IID_IHTMLDocument3,(void**)&pDoc);
BSTR str = ::SysAllocString(L"OBJECT");
hr = pDoc->getElementsByTagName(str, &pColl);
long len;
hr = pColl->get_length(&len);
for (long x = 0; x < len; x++)
{
    COleVariant index(x);
    COleVariant index2((long)0);
    IDispatch* spDispatch;
    hr = pColl->item(index, index2, &spDispatch);
    IHTMLObjectElement* spTempObjEl;
    hr = spDispatch->QueryInterface(IID_IHTMLObjectElement,(void**)&spTempObjEl);
    spDispatch->Release();
    spTempObjEl->get_object(&spDispatch);
    LPUNKNOWN pUnkSink = m_pEvent->GetIDispatch(TRUE);
    BOOL bAdvised = AfxConnectionAdvise(spDispatch,DIID__ISliderEvents,pUnkSink, FALSE, &m_cookie);
}
In the code, I assume that all <object> tags represent Slider ActiveX controls (there are lots of others like UpDown, etc.). What I don't understand is how to declare the code to trap certain events (on_mousedrag, on_mouseclick, etc.) to callbacks that will figure out which of the many slider controls in the document is generating the event. Any ideas?
Also, how would you rewrite the code above using smart pointers?
Thanks,
|
|
|
|
|
... in theory, because I've never done this before.
The OBJECT element is also a regular HTML element, so you can get its IHTMLElement interface by calling QueryInterface. IHTMLElement has methods like put_onclick(), to which you pass a VARIANT with your IDispatch inside.
Another way is to get IHTMLElement2 and use its attachEvent(BSTR eventName, IDispatch*, VARIANT_BOOL* pResult). It looks the same to me, just a different route.
Now, how to determine who actually fired the event. I'd guess like this:
When your function is called for some event, you can get an IHTMLEventObj interface from IHTMLWindow2 (the get_event() function).
This event object has a get_srcElement(IHTMLElement** p) function; call it and you'll have your element and can read any of its attributes.
TBiker wrote:
Also, how would you rewrite the code above using smart pointers?
See this post, where I talked about smart pointers. Your code would look like this:
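IDispatchPtr pDocDisp;
HRESULT hr = m_pIE->get_Document(&pDocDisp); // get_Document returns an IDispatch
MSHTML::IHTMLDocument3Ptr pDoc = pDocDisp; // the assignment calls QueryInterface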
MSHTML::IHTMLElementPtr pElem = NULL;
MSHTML::IHTMLObjectElementPtr spTempObjEl;
MSHTML::IHTMLElementCollectionPtr pColl = pDoc->getElementsByTagName(L"OBJECT");
IDispatchPtr pDisp;
for (long x = 0; x < pColl->length; x++)
{
    spTempObjEl = pColl->item(x, (long)0);
    spTempObjEl->get_object(&pDisp);
    pElem = spTempObjEl;
    LPUNKNOWN pUnkSink = m_pEvent->GetIDispatch(TRUE);
    BOOL bAdvised = AfxConnectionAdvise(pDisp,DIID__ISliderEvents,pUnkSink, FALSE, &m_cookie);
}
Philip Patrick
|
|
|
|
|
Great suggestion! I implemented what you suggested and found that it's not enough just to call attachEvent. That call (or put_onxxxxx) registers a dispatch pointer for a particular event, but it does not enable the event. Events are enabled by adding this code:
HRESULT hr;
IConnectionPointContainer* pCPC = NULL;
IConnectionPoint* pCP = NULL;
DWORD dwCookie;
// Check that this is a connectable object.
hr = pElem->QueryInterface(IID_IConnectionPointContainer, (void**)&pCPC);
// Find the connection point.
hr = pCPC->FindConnectionPoint(DIID_HTMLElementEvents2, &pCP);
// Advise the connection point.
// pUnk is the dispatch pointer you used in attachEvent
hr = pCP->Advise(pUnk, &dwCookie);
When you are finished with events, disable them with a call to pCP->Unadvise(dwCookie);
Thanks for your help! I will wrap this code up and submit it for others to use.
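One way to make sure every Advise is matched by an Unadvise is to tie the cookie to an object's lifetime. The following is only a rough, portable sketch of that idea: ToyConnectionPoint and AdviseGuard are made-up names that stand in for IConnectionPoint and a wrapper around it, and nothing here uses the real COM headers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Toy stand-in for IConnectionPoint (an assumption, not the real interface):
// Advise hands out a cookie, Unadvise forgets the sink stored under it.
class ToyConnectionPoint {
public:
    unsigned long Advise(int sinkId) {
        unsigned long cookie = ++next_;
        sinks_[cookie] = sinkId;
        return cookie;
    }
    void Unadvise(unsigned long cookie) { sinks_.erase(cookie); }
    std::size_t ActiveSinks() const { return sinks_.size(); }
private:
    unsigned long next_ = 0;
    std::map<unsigned long, int> sinks_;
};

// Hypothetical RAII guard: advises in the constructor and unadvises in the
// destructor, so the cookie cannot be forgotten on an early return.
class AdviseGuard {
public:
    AdviseGuard(ToyConnectionPoint& cp, int sinkId)
        : cp_(cp), cookie_(cp.Advise(sinkId)) {}
    ~AdviseGuard() { cp_.Unadvise(cookie_); }
    AdviseGuard(const AdviseGuard&) = delete;
    AdviseGuard& operator=(const AdviseGuard&) = delete;
private:
    ToyConnectionPoint& cp_;
    unsigned long cookie_;
};

// Number of connected sinks while a guard is still alive.
std::size_t activeWhileGuarded() {
    ToyConnectionPoint cp;
    AdviseGuard guard(cp, 1);
    return cp.ActiveSinks();
}

// Number of connected sinks after the guard has gone out of scope.
std::size_t activeAfterGuard() {
    ToyConnectionPoint cp;
    {
        AdviseGuard guard(cp, 1);
    }
    return cp.ActiveSinks();
}
```

With real COM the guard would hold the IConnectionPoint pointer and the DWORD cookie instead of these toy types, but the pairing stays the same.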
|
|
|
|
|
I only know it in theory, I mean only from reading MSDN and such, but I'll be glad to implement it sometime.
TBiker wrote:
I will wrap this code up and submit it for others to use.
If you remember me when you submit your article, please post about it here so I don't miss it. I'd be glad to see it working and to use it too.
lol
Philip Patrick
|
|
|
|
|
Well, not so great...
After testing some more, it turns out the IHTMLElement2 interface does not control ALL elements as Microsoft may claim (or it's a real possibility that it does, but I haven't a clue how it's done). The problem is that each element has its own set of connection interfaces (IDispatch, HTMLElementEvents2, HTMLInputElementEvents2, etc.), and HTMLElementEvents2 is not always available for certain element types. I discussed this with Microsoft (yep, used up one of my precious support calls) and they don't seem too knowledgeable either. They suggest creating a separate event sink for each ActiveX control. The problem with this is managing a huge number of event class instances and determining which event belongs to which element. So I'm still researching this problem. Any ideas would be appreciated.
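For what it's worth, the bookkeeping for "one sink per control" can stay fairly small if every sink only remembers which control it belongs to and forwards everything to one shared handler. Below is a rough, portable sketch of that idea. TaggedSink, SinkRegistry, the "slider1"/"slider2" ids and the event names are all made up for illustration; a real sink would implement IDispatch and call the handler from Invoke.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical shared callback: receives (control id, event name).
using SharedHandler =
    std::function<void(const std::string&, const std::string&)>;

// One small sink per control. It stores only the control's id and forwards
// every event to the shared handler together with that id.
class TaggedSink {
public:
    TaggedSink(std::string controlId, SharedHandler handler)
        : controlId_(std::move(controlId)), handler_(std::move(handler)) {}
    // Stand-in for the place where IDispatch::Invoke would be handled.
    void OnEvent(const std::string& eventName) const {
        handler_(controlId_, eventName);
    }
private:
    std::string controlId_;
    SharedHandler handler_;
};

// Owns all sinks, so they can be created inside the <object> loop and
// destroyed together when the document goes away.
class SinkRegistry {
public:
    explicit SinkRegistry(SharedHandler handler)
        : handler_(std::move(handler)) {}
    TaggedSink& Add(const std::string& controlId) {
        sinks_.push_back(
            std::unique_ptr<TaggedSink>(new TaggedSink(controlId, handler_)));
        return *sinks_.back();
    }
    std::size_t Count() const { return sinks_.size(); }
private:
    SharedHandler handler_;
    std::vector<std::unique_ptr<TaggedSink>> sinks_;
};

// Small demo: two sinks fire events and the shared handler records the source.
std::vector<std::string> demoSinkRouting() {
    std::vector<std::string> log;
    SinkRegistry registry(
        [&log](const std::string& id, const std::string& ev) {
            log.push_back(id + ":" + ev);
        });
    TaggedSink& first = registry.Add("slider1");
    TaggedSink& second = registry.Add("slider2");
    second.OnEvent("Change");
    first.OnEvent("Click");
    return log;
}
```

The id could just as well be the element's id attribute or its index in the collection; the point is only that the sink instance, not the event, carries the identity.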
|
|
|
|
|
Wow! You spent one of those 4-per-year calls to Microsoft? Cool
Well, yeah, I've heard about that, separate classes/instances for every element. I can't try it myself; I have a big project on my neck right now, LMAO
Philip Patrick
|
|
|
|
|
I wrote the same code and ran it under two different versions of IE, ie5.01P2 and ie6.02. Everything is okay under v5.01, but it fails under 6.02, and I don't know why.
My code is as follows:
MSHTML::IHTMLElementCollection * pAllLink=NULL;
pDoc->get_links(&pAllLink);
if (pAllLink!=NULL)
{
    LONG lLinkLen;
    pAllLink->get_length(&lLinkLen);
    VARIANT varIndex;
    varIndex.vt = VT_UINT;
    IDispatch* pDisp;
    IHTMLElement *pLink;
    BSTR bstrLinkAddress,bstrLinkTitle;
    int iAddType=-1;
    for (int i=0; i<lLinkLen;i++)
    {
        iAddType=-1;
        varIndex.lVal = i;
        pDisp = pAllLink->item(varIndex,(long)0);
        pDisp->QueryInterface(IID_IHTMLElement,(void **)&pLink);
        pLink->toString(&bstrLinkAddress);
        CString cstrLinkAddress(bstrLinkAddress);
        CString cstrTempLinkAddress;
        cstrTempLinkAddress = m_cstrURL + CString("#");
        if ((cstrLinkAddress.CompareNoCase(cstrTempLinkAddress) == 0) || (cstrLinkAddress.CompareNoCase(CString("about:blank#")) == 0))
        {
            VARIANT pLinkVariant;
            pLink->get_onclick(&pLinkVariant);
            if (pLinkVariant.vt != VT_NULL)
            {
                bstrLinkAddress = pLinkVariant.bstrVal;
                cstrLinkAddress = CString(bstrLinkAddress);
                iAddType = 0;
            }
            else
            {
                iAddType = 6;
            }
        }
        if (iAddType == -1)
        {
            ParseURL(cstrLinkAddress,&iAddType);
        }
        if (iAddType == 4) //Text page and same directorys only...
        {
            CObject *cObj;
            if (g_LinkList.Lookup(LPCTSTR(cstrLinkAddress),(CObject *&)cObj)==0)
            {
                g_LinkList.SetAt(LPCTSTR(cstrLinkAddress),NULL);
                pLink->get_innerText(&bstrLinkTitle);
                m_link.csaAddress.Add(cstrLinkAddress);
                m_link.csaTitle.Add(CString(bstrLinkTitle));
                m_link.cbaAddressType.Add((BYTE)iAddType);
            }
        }
        SysFreeString(bstrLinkTitle);
        SysFreeString(bstrLinkAddress);
    }
    pAllLink->Release();
}
It throws an exception saying the memory can't be accessed...
Can anyone help me? Thanks a lot.
|
|
|
|
|
On which line are you getting the exception?
Also, you have some leaks. If you are not using smart pointers, you have to release all interfaces yourself.
I suggest using smart pointers; this will also simplify the code.
Also, as I see it, you want to get the links in a document. According to MSDN, the get_links() function will return only links that HAVE a name and/or id.
Philip Patrick
|
|
|
|
|
Oh, thanks for your reply... First, I'm not sure which line throws the exception. I tried debugging line by line: sometimes it is thrown on the "SysFreeString" line, and sometimes inside my "ParseURL" function. It never happens if I do the parsing the "ParseString" way shown at http://www.codeguru.com/ieprogram/HTMLParsing.html, but that approach causes a memory leak.
Second, I don't know how to use smart pointers. I try to release every interface myself; if the Release call throws an exception, does that mean it's a smart pointer? Also, if I try MSHTML::IHTMLElementCollection2Ptr, I don't know how to call its get_links function, so I have to use MSHTML::IHTMLElementCollection. Can you give me some more suggestions, please?
Last, as you noticed, I want to get the links in a document. If get_links can only return links that have a name and/or id, how can I get the others? Do you know?
Oh, yes, there's one more important question. When I parse HTML the ParseString way shown at http://www.codeguru.com/ieprogram/HTMLParsing.html, it never opens new IE windows. But when I try the way you prefer, in a multithreaded subroutine, it opens more and more IE windows and I have to close them one by one with the mouse. I can't stand it; can you help with this problem too? Thanks for your help.
|
|
|
|
|
Well, about smart pointers in MSHTML.
When you use a smart pointer, it manages memory and interface releasing by itself. As a bonus, you get a QueryInterface call built into the smart pointer. So the code that you have:
IDispatch* pDisp;
IHTMLElement *pLink;
pDisp = pAllLink->item(varIndex,(long)0);
pDisp->QueryInterface(IID_IHTMLElement,(void **)&pLink);
will look like this:
MSHTML::IHTMLElementPtr pLink;
pLink = pAllLink->item(varIndex, (long)0);
What is going on here? Well, the item() function still returns an IDispatch, but when you assign it to a smart pointer (pLink here), the smart pointer calls QueryInterface() internally and obtains a pointer to its own interface. So if pLink is not NULL afterwards, QueryInterface() succeeded.
Also, a smart pointer is a simple class that wraps an interface, so when it goes out of scope (the end of the function here), it calls Release() in its destructor. Do not call Release() yourself; if you want to release the interface held by a smart pointer, just assign NULL to it: pLink = NULL.
That's it about smart pointers. I personally don't like long messages, so I'll continue in the next one.
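To make the release rules above concrete, here is a small portable sketch. It is not the real _com_ptr_t: ToyUnknown and ToyComPtr are made-up stand-ins that only count references, and the toy object does not delete itself, so the count can be inspected afterwards.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a COM object (an assumption): it only counts references.
struct ToyUnknown {
    int refs = 1;              // the creator holds the first reference
    void AddRef() { ++refs; }
    void Release() { --refs; }
};

// Toy smart pointer: AddRef when it takes a pointer, Release when it lets go.
template <typename T>
class ToyComPtr {
public:
    ToyComPtr() = default;
    explicit ToyComPtr(T* p) : p_(p) { if (p_) p_->AddRef(); }
    ToyComPtr(const ToyComPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    ToyComPtr& operator=(const ToyComPtr& other) {
        if (this != &other) {
            if (other.p_) other.p_->AddRef();
            reset();
            p_ = other.p_;
        }
        return *this;
    }
    // Assigning null releases the held interface, like "pLink = NULL".
    ToyComPtr& operator=(std::nullptr_t) { reset(); return *this; }
    ~ToyComPtr() { reset(); }  // going out of scope releases automatically
    T* operator->() const { return p_; }
    explicit operator bool() const { return p_ != nullptr; }
private:
    void reset() { if (p_) { p_->Release(); p_ = nullptr; } }
    T* p_ = nullptr;
};

// Reference count left after a smart pointer has gone out of scope.
int refsAfterScopeExit() {
    ToyUnknown obj;
    {
        ToyComPtr<ToyUnknown> sp(&obj);   // count goes up by one
    }                                     // destructor gives it back
    return obj.refs;
}

// Reference count left after assigning null to the smart pointer.
int refsAfterNullAssign() {
    ToyUnknown obj;
    ToyComPtr<ToyUnknown> sp(&obj);
    sp = nullptr;                         // releases once; destructor is a no-op
    return obj.refs;
}
```

Calling Release() by hand on top of this would take the count one lower than it should be, which is why the manual call should be left out.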
Philip Patrick
|
|
|
|
|
Well done. With your help, I have changed all of my source code to smart pointers, and it's much thinner than before. Thanks a lot. I'm waiting for your further reply; after that, I will run a full test of everything...
|
|
|
|
|
OK, about functions.
As I remember, all MSHTML smart pointers have the same function names as the raw interfaces, plus their own wrapper functions, which I suggest using (but you don't have to, of course). The naming convention for these functions is usually like this:
If you have a function get_links(), the smart pointer's version will be called Getlinks(), and so on. I'm not quite sure it is documented in MSDN; I used VisualAssist to get their names.
Also, in smart pointers all properties are already declared (as _bstr_t), so you don't have to use the toString() method or anything else. For a link, you can get its href attribute directly like this:
CString cstrLinkAddress = (LPCTSTR)pLink->href;
First, this class, _bstr_t, is very useful: like a smart pointer, it releases the memory it uses, so you don't have to call SysFreeString(). Second, it has its own conversion routines, from ANSI to Unicode and vice versa; just remember to apply an (LPCTSTR) cast when you assign it to a CString.
Now about links.
If you are working with IE5+, it is quite easy to get the links. Convert your MSHTML::IHTMLDocument2Ptr to MSHTML::IHTMLDocument3Ptr and you'll have a getElementsByTagName() function, which returns an MSHTML::IHTMLElementCollectionPtr for the specified tag. The code will look like this (pDoc2 is an IHTMLDocument2Ptr here):
CString csLinkHref;
MSHTML::IHTMLAnchorElementPtr pAnchor;
MSHTML::IHTMLDocument3Ptr pDoc3 = pDoc2;
MSHTML::IHTMLElementCollectionPtr pCollection = pDoc3->getElementsByTagName(L"A");
for(long i=0; i<pCollection->length; i++){
    pAnchor = pCollection->item(i, (long)0);
    csLinkHref = (LPCTSTR)pAnchor->href;
}
Look at the demo for this article. What I'm doing there is getting the list of links; I guess this is exactly what you want.
-----------
Now about the new windows. This depends on how you are opening the document. Post the code where you get the pointer to pDoc, and also where you write to it the way I did in the article.
Philip Patrick
|
|
|
|
|
Thanks again. I read your reply twice and I understand what you said. Indeed, I had tried using MSHTML::IHTMLAnchorElementPtr to get all the links, as shown in your demo project, but there are two problems I can't solve that way. First, I want to get the inner text of each link, as I showed in the message above, like this:
pLink->get_innerText(&bstrLinkTitle),
and I'm afraid I can't get it your way.
Second, there are many links that look like this:
<a href="#" onclick="javaScript:location.href=...">...</a>
In my raw approach I can get this with the get_onclick function, but there's no get_onclick function if I use MSHTML::IHTMLAnchorElementPtr.
As for the code where I get the pointer to pDoc, I will show it to you in the next message. Thank you.
|
|
|
|
|
Hmm, well, you can use MSHTML::IHTMLElementPtr instead of the Anchor one. Do just what you do now, but with a smart pointer instead of the raw interface. Remember that smart pointers have the same functions as the interfaces, plus more. So if the interface has get_onclick(), the smart pointer will have it too.
You can combine both Anchor and Element. Simply assign Anchor = Element or vice versa and you'll have both interfaces, so you won't have to parse the tag manually.
Of course, this is your choice. I just hate parsing strings by myself; that's why I like MSHTML so much.
Philip Patrick
|
|
|
|
|
Thank you, I will try it tomorrow, and if I still have any questions, I will post them all here. Thanks a lot.
|
|
|
|
|
I ran into the same problem too.
MSHTML::IHTMLAnchorElementPtr pAnchor;
...
CString strTitle = pAnchor->name;
ASSERT(strTitle != _T(""));
...
I can't get the link title. Why?
I can't figure out what is going on.
Could someone please explain?
Thanks.
|
|
|
|
|
You are trying to get the name attribute, not the title. To get the title you should use MSHTML::IHTMLElementPtr.
BTW, why do I see Chinese? lol
Philip Patrick
|
|
|
|
|
Thank you, Philip Patrick.
I was hoping to get IHTMLAnchorElementPtr->name.
I am Chinese. My English is poor. Sorry.
Actually, I want to use the IE SDK to parse the links in a page, but unfortunately it is too hard.
|
|
|
|
|
You can try the following:
MSHTML::IHTMLAnchorElementPtr pAnchor;
_bstr_t bstrName = pAnchor->Getname();
This gets you its name. If you want its inner text instead, you can try:
MSHTML::IHTMLElementPtr pElem;
pElem = pAnchor;
_bstr_t bstrInnerText = pElem->GetinnerText();
//............
|
|
|
|
|
Let me continue by showing you my code. First, I want to say that I ran my program before against the same website addresses and it never opened new IE windows. In that version I didn't use smart pointers, as I showed you in the messages above, and I also didn't include the following:
#include <comdef.h>
#include <mshtml.h>
#pragma warning(disable : 4146) //see Q231931 for explanation
#import <mshtml.tlb> no_auto_exclude
but only the file <mshtml.h> instead.
The download code is the same in both versions. First I call InternetOpen to create a session, then AfxParseURLEx to parse and check the URL I want to download, then InternetConnect to connect to that URL, and then HttpOpenRequest and HttpSendRequest if everything has succeeded so far. Finally, I call HttpQueryInfo to check whether the returned status code is 200; if it is, I call InternetReadFile to get the content and save it to a variable, and then close everything I opened so that nothing leaks. These steps work fine when I run without any smart pointers and include only <mshtml.h>, so I don't think they should open any new IE windows in the way you prefer.
Second, let's assume I have downloaded one URL into a CString variable called m_cstrContent. I parse this content as follows:
MSHTML::IHTMLDocument2Ptr pDoc;
hr = CoCreateInstance(CLSID_HTMLDocument, NULL, CLSCTX_INPROC_SERVER,
    IID_IHTMLDocument2, (void**)&pDoc);
if (FAILED(hr))
{
    return iRet;
}
SAFEARRAY* psa = SafeArrayCreateVector(VT_VARIANT, 0, 1);
VARIANT *param;
_bstr_t bsData = (LPCTSTR)m_cstrContent;
hr = SafeArrayAccessData(psa, (LPVOID*)&param);
param->vt = VT_BSTR;
param->bstrVal = (BSTR)bsData;
hr = pDoc->write(psa);
if (FAILED(hr))
{
    return iRet;
}
hr = pDoc->close();
if (FAILED(hr))
{
    return iRet;
}
SafeArrayDestroy(psa);
iRet = SUCCESS;
//get title
BSTR bstrTitle;
pDoc->get_title(&bstrTitle);
m_cstrTitle = CString(bstrTitle);
SysFreeString(bstrTitle);
//get content text
MSHTML::IHTMLElementCollectionPtr pAll;
pDoc->get_all(&pAll);
if (pAll!=NULL)
{
    VARIANT varIndexAll;
    varIndexAll.vt = VT_UINT;
    varIndexAll.lVal = 0;
    MSHTML::IHTMLElementPtr pElemText;
    pElemText = pAll->item(varIndexAll,(long)0);
    BSTR bstrContentText;
    pElemText->get_outerText(&bstrContentText);
    m_cstrText = CString(bstrContentText);
    SysFreeString(bstrContentText);
}
//get links
MSHTML::IHTMLElementCollectionPtr pAllLink;
pDoc->get_links(&pAllLink);
if (pAllLink!=NULL)
{
    LONG lLinkLen;
    pAllLink->get_length(&lLinkLen);
    VARIANT varIndex;
    varIndex.vt = VT_UINT;
    MSHTML::IHTMLElementPtr pLink;
    BSTR bstrLinkAddress,bstrLinkTitle;
    int iAddType=-1;
    for (int i=0; i<lLinkLen;i++)
    {
        iAddType=-1;
        varIndex.lVal = i;
        pLink = pAllLink->item(varIndex,(long)0);
        bstrLinkAddress = pLink->toString();
        CString cstrLinkAddress(bstrLinkAddress);
        CString cstrTempLinkAddress;
        cstrTempLinkAddress = m_cstrURL + CString("#");
        if ((cstrLinkAddress.CompareNoCase(cstrTempLinkAddress) == 0) || (cstrLinkAddress.CompareNoCase(CString("about:blank#")) == 0))
        {
            VARIANT pLinkVariant;
            pLink->get_onclick(&pLinkVariant);
            if (pLinkVariant.vt != VT_NULL)
            {
                bstrLinkAddress = pLinkVariant.bstrVal;
                cstrLinkAddress = CString(bstrLinkAddress);
                iAddType = 0;
            }
            else
            {
                iAddType = 6;
            }
        }
        if (iAddType == -1)
        {
            ParseURL(cstrLinkAddress,&iAddType);
        }
        if (iAddType == 4) //Text page and same directorys only...
        {
            CObject *cObj;
            WaitForSingleObject(ghLinkEvent,INFINITE);
            if (g_LinkList.Lookup(LPCTSTR(cstrLinkAddress),(CObject *&)cObj)==0)
            {
                g_LinkList.SetAt(LPCTSTR(cstrLinkAddress),NULL);
                pLink->get_innerText(&bstrLinkTitle);
                m_link.csaAddress.Add(cstrLinkAddress);
                m_link.csaTitle.Add(CString(bstrLinkTitle));
                m_link.cbaAddressType.Add((BYTE)iAddType);
            }
            SetEvent(ghLinkEvent);
        }
        SysFreeString(bstrLinkTitle);
        SysFreeString(bstrLinkAddress);
    }
}
With your help, I will take your good ideas and change some of my code, such as using the _bstr_t class and so on, but I need some time to debug. It's too late here; I'm afraid I can only do that tomorrow.
So far, the most important thing is to stop new IE windows from opening when I download or parse pages. Thanks.
|
|
|
|
|
Great discovery. I failed to do this 3 months ago.
|
|
|
|
|
I created a demo project, which is a simple dialog application. All it does is take an HTML file and display all of its links in a ListBox.
Remember that you need the Platform SDK in order to compile the demo.
Philip Patrick
|
|
|
|
|
If you have any questions, or find a bug, just let me know.
Philip Patrick
|
|
|
|
|