|
Well, IMG has no "href" attribute; it has "src" instead.
And the SPAN tag has neither href nor src; it is a text-formatting tag.
Philip Patrick
"Two beer or not two beer?" (Shakesbeer)
Web-site: www.saintopatrick.com
|
|
|
|
|
About the second question: you can try GetoffsetLeft, GetoffsetTop, GetoffsetHeight and GetoffsetWidth to get the element's rectangle. I'm not sure it will work, but it's worth a try...
|
|
|
|
|
I tried it and got the real width and height, but the left and top values are relative to the container around the element.
Thank you
Chagit
|
|
|
|
|
Oh, then try MSHTML::IHTMLElement2Ptr and its GetclientHeight function and the related ones; maybe they can help. I'm not sure about it, but it's worth a try....
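A note on the position question: offsetLeft and offsetTop are measured relative to the element's offsetParent rather than the page, which matches the behaviour described above. One common approach is to walk the offsetParent chain and add up the offsets. The sketch below shows only that arithmetic in portable C++. The Element struct is a hypothetical stand-in for IHTMLElement (with MSHTML the values would come from get_offsetLeft, get_offsetTop and get_offsetParent), and scrolling and borders are ignored.

```cpp
#include <cassert>

// Hypothetical stand-in for IHTMLElement: only the offset properties are modelled.
struct Element {
    long offsetLeft;              // x relative to offsetParent
    long offsetTop;               // y relative to offsetParent
    const Element* offsetParent;  // nullptr at the top of the chain
};

struct Point { long x; long y; };

// Sum the offsets up the offsetParent chain to get page-relative coordinates.
Point AbsolutePosition(const Element* e) {
    Point p = {0, 0};
    for (; e != nullptr; e = e->offsetParent) {
        p.x += e->offsetLeft;
        p.y += e->offsetTop;
    }
    return p;
}

// Small self-check: a cell at (5, 7) inside a table at (100, 50) inside the body.
void CheckAbsolutePosition() {
    Element body  = {0, 0, nullptr};
    Element table = {100, 50, &body};
    Element cell  = {5, 7, &table};
    Point p = AbsolutePosition(&cell);
    assert(p.x == 105 && p.y == 57);
}
```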
|
|
|
|
|
About my source code: you can find it at
http://www.codeproject.com/useritems/parse_html.asp?df=100&forumid=3219&select=108390#xx108390xx
The only difference is that I have changed all raw pointers to smart pointers. But when I parse the content of the URL
http://www.163.net
it pops up an error warning saying that a runtime error has occurred. I don't want any message boxes or IE windows to pop up at run time. What should I do? Thanks for your help...
What's more, this runtime error occurs on this very line:
hr = pDoc->write(psa);
Thanks again...
Supplement:
When I try the demo project with the source code of this web page, http://www.163.net, it pops up the same error box too. In addition, I wrapped the pDoc->write call in an exception handler:
try
{
    hr = pDoc->write(psa);
}
catch(...)
{
    //...........
}
But it catches nothing. Why, why, why???
|
|
|
|
|
I'm looking forward to your reply, thanks a lot...
|
|
|
|
|
Yep, one sec, I'm building an app just like yours to debug it myself.
Philip Patrick
|
|
|
|
|
OK, I see what the problem is and what is going on. The problem is not the smart pointers, of course, but the different way the document does its job. When you use IMarkupServices (as in the article you mentioned), it only parses the given HTML code and does not process it, whereas write() actually executes all the script inside it while parsing the HTML.
I'll figure out how to eliminate this and post it here.
BTW, you can use smart pointers with Markup Services too. The only reason I wrote the article is the "BODY" tag bug, which I needed in my application...
Well... I'm working, lol
Philip Patrick
|
|
|
|
|
Yes, as you said above, there are many document.write scripts on that page, and that's why it pops up the error warning. But I'm afraid I can't use ParseString and smart pointers together: smart pointers do not work under IE v6.02, and ParseString leads to a memory leak under IE v5.01p2, which is really a big bug.
I'm waiting for your solution now, thank you...
|
|
|
|
|
What's more, I found the reason for the new IE windows. It's almost the same as the reason for the runtime error message box: when we call pDoc->write, all scripts are executed, and some of them open new IE windows. I don't know whether we can solve this problem or not. Can you give me some more suggestions? Thank you...
|
|
|
|
|
So far, the only way I have found to eliminate popups is to replace "window.open" throughout the HTML with something like "javascript.void" (or whatever you want). I do this just before pDoc->write(), using the Replace() function of CString.
This is a weird way, though, since "window" is the default object and can be omitted in JavaScript.
Still looking for a better way.
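The replacement step can be sketched with the standard library instead of CString::Replace(). ReplaceAll and NeutralizeWindowOpen are hypothetical helper names, and "javascript.void" is simply the placeholder text mentioned above. The demo also shows the weakness already noted: only the literal text "window.open" is caught, so a bare open(...) call slips through.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `text` with `to`; returns the number of replacements.
std::size_t ReplaceAll(std::string& text, const std::string& from, const std::string& to) {
    if (from.empty()) return 0;
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.length(), to);
        pos += to.length();  // continue searching after the inserted text
        ++count;
    }
    return count;
}

// Hypothetical helper: neutralize literal "window.open" before the HTML is passed to write().
std::string NeutralizeWindowOpen(std::string html) {
    ReplaceAll(html, "window.open", "javascript.void");
    return html;
}

void ReplaceDemo() {
    std::string html = "<script>window.open('ad.html'); open('other.html');</script>";
    std::string safe = NeutralizeWindowOpen(html);
    assert(safe.find("window.open") == std::string::npos);
    // The bare open() call is NOT caught by this approach.
    assert(safe.find("open('other.html')") != std::string::npos);
}
```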
Philip Patrick
|
|
|
|
|
Wow, I have tried this way before, but I can see it's not a good way. There are many ways for a page to open new IE windows: window.open is one, document.location.href (target = _blank) is another, then there is <META HTTP-EQUIV="Refresh" CONTENT="[secs];URL=[RefreshUrl]">, and lots of other ways. I can't list them all.
Secondly, some scripts depend on other documents, such as .js files or the parent document, so these scripts cannot be executed successfully. I'm afraid it will pop up more and more message boxes, and if I run my program in multithreaded mode, it will drive me mad....
By and large, we can't eliminate popups by replacing strings in the HTML. I suspect there is a better way to solve this problem; Microsoft can do it, maybe...
Anyway, thanks for your help.... I would like to find another solution; maybe we can work on it together. If you have a better idea, please let me know, thank you...
|
|
|
|
|
|
I know this question may be out of the scope of this article, but I'll ask it anyway. I have scanned for all elements in an HTML document. I then automatically handle all relevant events for input type controls (text boxes, radio and checkbox buttons, pulldown option lists, etc.). Events are on-click, on-keypress, on-select. Now I want to get events for an ActiveX control (an HTML object). Does anyone know how to do this?
|
|
|
|
|
I haven't done this, but there is an IHTMLObjectElement interface and, as I understand from MSDN, you can get the IDispatch of the nested ActiveX control by calling its get_object(IDispatch** p) function. So you can gain access to that ActiveX control.
If you find a better or the right way, please post it here, I'd be glad to know too.
Philip Patrick
|
|
|
|
|
Here is the code that I use to register event handling for any <object> tag in an HTML document (ActiveX controls):
HRESULT hr = m_pIE->get_Document(&pDisp);
IHTMLDocument3* pDoc;
IHTMLElement* pElem = NULL;
IHTMLElementCollection* pColl;
hr = pDisp->QueryInterface(IID_IHTMLDocument3,(void**)&pDoc);
BSTR str = ::SysAllocString(L"OBJECT");
hr = pDoc->getElementsByTagName(str, &pColl);
long len;
hr = pColl->get_length(&len);
for (long x = 0; x < len; x++)
{
    COleVariant index(x);
    COleVariant index2((long)0);
    IDispatch* spDispatch;
    hr = pColl->item(index, index2, &spDispatch);
    IHTMLObjectElement* spTempObjEl;
    hr = spDispatch->QueryInterface(IID_IHTMLObjectElement,(void**)&spTempObjEl);
    spDispatch->Release();
    spTempObjEl->get_object(&spDispatch);
    LPUNKNOWN pUnkSink = m_pEvent->GetIDispatch(TRUE);
    BOOL bAdvised = AfxConnectionAdvise(spDispatch,DIID__ISliderEvents,pUnkSink, FALSE, &m_cookie);
}
In the code, I assume that all <object> tags represent Slider ActiveX controls (there are lots of others like UpDown, etc.). What I don't understand is how to declare the code to trap certain events (on_mousedrag, on_mouseclick, etc.) to callbacks that will figure out which of the many slider controls in the document is generating the event. Any ideas?
Also, how would you rewrite the code above using smart pointers?
Thanks,
|
|
|
|
|
... in theory, 'cause I've never done this before.
The OBJECT element is also a regular HTML element, so you can get its IHTMLElement interface by calling QueryInterface. IHTMLElement has methods like put_onclick(), to which you can pass a VARIANT with your IDispatch inside.
Another way is to get IHTMLElement2 and use its attachEvent(BSTR eventName, IDispatch*, VARIANT_BOOL* pResult). It looks the same to me, just another way.
Now, how to determine who actually fired the event. I guess this way:
When your function is called by some event, you can get an IHTMLEventObj interface from IHTMLWindow2 (the get_event() function).
Then call this event object's get_srcElement(IHTMLElement** p) function, and you'll have your element and can obtain any of its attributes.
TBiker wrote:
Also, how would you rewrite the code above using smart pointers?
See this post, where I talked about smart pointers. Your code will look like:
IDispatchPtr pDisp;
HRESULT hr = m_pIE->get_Document(&pDisp);
// get_Document() hands back an IDispatch; this assignment queries it for IHTMLDocument3
MSHTML::IHTMLDocument3Ptr pDoc = pDisp;
MSHTML::IHTMLElementPtr pElem = NULL;
MSHTML::IHTMLObjectElementPtr spTempObjEl;
MSHTML::IHTMLElementCollectionPtr pColl = pDoc->getElementsByTagName(L"OBJECT");
for (long x = 0; x < pColl->length; x++)
{
    spTempObjEl = pColl->item(x, (long)0);
    spTempObjEl->get_object(&pDisp);
    // the same object seen as a regular HTML element
    pElem = spTempObjEl;
    LPUNKNOWN pUnkSink = m_pEvent->GetIDispatch(TRUE);
    BOOL bAdvised = AfxConnectionAdvise(pDisp,DIID__ISliderEvents,pUnkSink, FALSE, &m_cookie);
}
Philip Patrick
|
|
|
|
|
Great suggestion! I implemented what you suggested and found out that it's not enough just to call attachEvent. This call (or put_onxxxxx) registers a dispatch pointer for a particular event, but it does not enable the event. Events are enabled by adding this code:
HRESULT hr;
IConnectionPointContainer* pCPC = NULL;
IConnectionPoint* pCP = NULL;
DWORD dwCookie;
// Check that this is a connectable object.
hr = pElem->QueryInterface(IID_IConnectionPointContainer, (void**)&pCPC);
// Find the connection point.
hr = pCPC->FindConnectionPoint(DIID_HTMLElementEvents2, &pCP);
// Advise the connection point.
// pUnk is the dispatch pointer you used in attachEvent
hr = pCP->Advise(pUnk, &dwCookie);
When you are finished with events, disable them by calling pCP->Unadvise(dwCookie);
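Since every Advise() needs a matching Unadvise(), one way to keep the two paired is a small RAII guard. This is only a sketch. AdviseGuard is a hypothetical helper, written as a template so that it compiles without the COM headers, and FakeConnectionPoint stands in for IConnectionPoint and mimics nothing but the Advise/Unadvise pair.

```cpp
#include <cassert>

// Hypothetical RAII helper: calls Advise() on construction and Unadvise() on destruction.
// CP must provide Advise(Sink*, unsigned long*) and Unadvise(unsigned long), both returning
// a negative value on failure (like an HRESULT).
template <typename CP, typename Sink>
class AdviseGuard {
public:
    AdviseGuard(CP* cp, Sink* sink) : cp_(cp), cookie_(0), advised_(false) {
        advised_ = cp_ != nullptr && cp_->Advise(sink, &cookie_) >= 0;
    }
    ~AdviseGuard() {
        if (advised_) cp_->Unadvise(cookie_);
    }
    AdviseGuard(const AdviseGuard&) = delete;             // one guard per connection
    AdviseGuard& operator=(const AdviseGuard&) = delete;
    bool advised() const { return advised_; }
private:
    CP* cp_;
    unsigned long cookie_;
    bool advised_;
};

// Stand-ins used only to exercise the guard.
struct FakeSink {};
struct FakeConnectionPoint {
    int active;
    FakeConnectionPoint() : active(0) {}
    long Advise(FakeSink*, unsigned long* cookie) { *cookie = 42; ++active; return 0; }
    long Unadvise(unsigned long cookie) { if (cookie == 42) --active; return 0; }
};

void AdviseGuardDemo() {
    FakeConnectionPoint cp;
    FakeSink sink;
    {
        AdviseGuard<FakeConnectionPoint, FakeSink> guard(&cp, &sink);
        assert(guard.advised() && cp.active == 1);
    }  // guard destroyed here: Unadvise() runs
    assert(cp.active == 0);
}
```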
Thanks for your help! I will wrap this code up and submit it for others to use.
|
|
|
|
|
I know it only in theory, I mean only from reading MSDN and stuff, but I'll be glad to implement it sometime.
TBiker wrote:
I will wrap this code up and submit it for others to use.
If you remember me when you submit your article, please post about it here so I won't miss it. I'd be glad to see it working and to use it too.
lol
Philip Patrick
|
|
|
|
|
Well, not so great...
After testing some more, it turns out the IHTMLElement2 interface does not control ALL elements as Microsoft may claim (or there is a real possibility that it does, but I haven't a clue how it's done). The problem is that each element has its own set of connection interfaces (IDispatch, HTMLElementEvents2, HTMLInputElementEvents2, etc.), and HTMLElementEvents2 is not always available for certain element types. I discussed this with Microsoft (yep, used up one of my precious support calls) and they don't seem too knowledgeable either. They suggest creating separate event sinks for each ActiveX control. The problem with this is managing a huge number of event class instances and determining which event belongs to which element. So I'm still researching this problem. Any ideas would be appreciated.
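One way to keep the "separate sink per control" approach manageable is to make each sink a thin object that remembers which element it was attached to and forwards everything to a single handler. The sketch below shows only that bookkeeping in portable C++. ControlSink, EventRouter and the string element IDs are hypothetical names, and a real sink would be a COM object that receives events through IDispatch::Invoke rather than the plain OnEvent method used here.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One shared handler receives (element id, event name) for every control.
typedef std::function<void(const std::string&, const std::string&)> EventHandler;

// Thin per-control sink: remembers its element and forwards to the shared handler.
class ControlSink {
public:
    ControlSink(std::string elementId, EventHandler handler)
        : elementId_(std::move(elementId)), handler_(std::move(handler)) {}
    // In a real sink this would be reached from IDispatch::Invoke.
    void OnEvent(const std::string& eventName) const { handler_(elementId_, eventName); }
private:
    std::string elementId_;
    EventHandler handler_;
};

// Owns all sinks so that their lifetime is managed in one place.
class EventRouter {
public:
    explicit EventRouter(EventHandler handler) : handler_(std::move(handler)) {}
    ControlSink* CreateSink(const std::string& elementId) {
        sinks_.push_back(std::unique_ptr<ControlSink>(new ControlSink(elementId, handler_)));
        return sinks_.back().get();
    }
    std::size_t SinkCount() const { return sinks_.size(); }
private:
    EventHandler handler_;
    std::vector<std::unique_ptr<ControlSink>> sinks_;
};

void RouterDemo() {
    std::vector<std::string> log;
    EventRouter router([&log](const std::string& id, const std::string& ev) {
        log.push_back(id + ":" + ev);
    });
    ControlSink* first = router.CreateSink("slider1");
    ControlSink* second = router.CreateSink("slider2");
    second->OnEvent("Change");
    first->OnEvent("Click");
    assert(log.size() == 2 && log[0] == "slider2:Change" && log[1] == "slider1:Click");
}
```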
|
|
|
|
|
Wow! You spent one of those 4-per-year calls to Microsoft? Cool
Well, yeah, I've heard about it, about separate classes/instances, etc. for every element. I can't try this myself; I have a big project on my neck right now, LMAO
Philip Patrick
|
|
|
|
|
I wrote the same code and ran it under two different versions of IE, IE 5.01P2 and IE 6.02. Everything is okay under v5.01, but it fails under 6.02, and I don't know why.
My code is as follows:
MSHTML::IHTMLElementCollection * pAllLink=NULL;
pDoc->get_links(&pAllLink);
if (pAllLink!=NULL)
{
    LONG lLinkLen;
    pAllLink->get_length(&lLinkLen);
    VARIANT varIndex;
    varIndex.vt = VT_UINT;
    IDispatch* pDisp;
    IHTMLElement *pLink;
    BSTR bstrLinkAddress,bstrLinkTitle;
    int iAddType=-1;
    for (int i=0; i<lLinkLen;i++)
    {
        iAddType=-1;
        varIndex.lVal = i;
        pDisp = pAllLink->item(varIndex,(long)0);
        pDisp->QueryInterface(IID_IHTMLElement,(void **)&pLink);
        pLink->toString(&bstrLinkAddress);
        CString cstrLinkAddress(bstrLinkAddress);
        CString cstrTempLinkAddress;
        cstrTempLinkAddress = m_cstrURL + CString("#");
        if ((cstrLinkAddress.CompareNoCase(cstrTempLinkAddress) == 0) || (cstrLinkAddress.CompareNoCase(CString("about:blank#")) == 0))
        {
            VARIANT pLinkVariant;
            pLink->get_onclick(&pLinkVariant);
            if (pLinkVariant.vt != VT_NULL)
            {
                bstrLinkAddress = pLinkVariant.bstrVal;
                cstrLinkAddress = CString(bstrLinkAddress);
                iAddType = 0;
            }
            else
            {
                iAddType = 6;
            }
        }
        if (iAddType == -1)
        {
            ParseURL(cstrLinkAddress,&iAddType);
        }
        if (iAddType == 4) //Text pages and same directories only...
        {
            CObject *cObj;
            if (g_LinkList.Lookup(LPCTSTR(cstrLinkAddress),(CObject *&)cObj)==0)
            {
                g_LinkList.SetAt(LPCTSTR(cstrLinkAddress),NULL);
                pLink->get_innerText(&bstrLinkTitle);
                m_link.csaAddress.Add(cstrLinkAddress);
                m_link.csaTitle.Add(CString(bstrLinkTitle));
                m_link.cbaAddressType.Add((BYTE)iAddType);
            }
        }
        SysFreeString(bstrLinkTitle);
        SysFreeString(bstrLinkAddress);
    }
    pAllLink->Release();
}
It throws an exception saying that the memory can't be accessed...
Can anyone help me? Thanks a lot....
|
|
|
|
|
On which line are you getting the exception?
Also, you have some leaks. If you are not using smart pointers, you have to release all interfaces yourself.
I suggest using smart pointers; this will also simplify the code.
Also, as I see, you want to get the links in the document. According to MSDN, get_links() returns the A elements that have an HREF attribute plus all AREA elements; it is the anchors collection (get_anchors()) that is limited to A elements that HAVE a name and/or id.
Philip Patrick
|
|
|
|
|
Oh, thanks for your reply... First, I'm not sure on which line I get the exception. When I debug line by line, sometimes it throws an exception on the "SysFreeString" line, and sometimes in my function called "ParseURL". It never happens if I parse in the "ParseString" way shown on the website http://www.codeguru.com/ieprogram/HTMLParsing.html, but that way causes a memory leak.
Second, I don't know how to use smart pointers, so I try to release all interfaces myself. If the Release call throws an exception, that shows it is a smart pointer, right? And if I try MSHTML::IHTMLElementCollection2Ptr, I don't know how to call its get_links function, so I have to use MSHTML::IHTMLElementCollection. Can you give me some more suggestions, please?
And last, as you see, I want to get the links in the document. If get_links does not return every link, how can I get the others? Do you know?
Oh, yes, there's one more important question I want to ask you. When I parse HTML in the ParseString way shown on http://www.codeguru.com/ieprogram/HTMLParsing.html, it never opens new IE windows, but when I try the way you prefer, in a multithreaded subroutine, it opens more and more IE windows, and I have to close them one by one with the mouse. I can't stand it. Can you solve this problem too? Thanks for your help.....
|
|
|
|
|
Well. About smart pointers in MSHTML.
When you use a smart pointer, it manages memory and interface releasing by itself. As a bonus, you get a QueryInterface call built into the smart pointer. So the code that you have:
IDispatch* pDisp;
IHTMLElement *pLink;
pDisp = pAllLink->item(varIndex,(long)0);
pDisp->QueryInterface(IID_IHTMLElement,(void **)&pLink);
will look like this:
MSHTML::IHTMLElementPtr pLink;
pLink = pAllLink->item(varIndex, (long)0);
What is going on here? Well, the item() function still returns an IDispatch, but when you assign it to a smart pointer (pLink here), it calls QueryInterface() internally and obtains a pointer to this interface. So if you see that pLink is not NULL, it means that QueryInterface() succeeded.
Also, a smart pointer is a simple class (that wraps an interface), so when it goes out of scope (the end of the function here), it calls Release() in its destructor. Do not do it yourself; if you want to release the interface held by a smart pointer, just assign NULL to it: pLink = NULL
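To make the mechanism concrete, here is a toy model of what such a smart pointer does, written in portable C++ so that it runs without the COM headers. Everything in it is hypothetical: Unknown, Dispatch, Element, Image, LinkObject and SmartPtr are made-up names, and a string-keyed Query() takes the place of QueryInterface() with IIDs. The real class generated by #import is more elaborate, but the query-on-assignment and release-on-destruction pattern is the same idea.

```cpp
#include <cassert>
#include <cstring>

int g_liveObjects = 0;  // number of toy objects currently alive

// Toy stand-in for IUnknown: reference counting plus a string-keyed QueryInterface.
struct Unknown {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
    virtual void* Query(const char* name) = 0;  // AddRef'ed pointer, or nullptr if unsupported
    virtual ~Unknown() {}
};
struct Dispatch : Unknown  { static const char* Name() { return "Dispatch"; } };
struct Element  : Dispatch { static const char* Name() { return "Element"; } };
struct Image    : Dispatch { static const char* Name() { return "Image"; } };

// An object that supports Dispatch and Element, but not Image.
class LinkObject : public Element {
public:
    LinkObject() : refs_(1) { ++g_liveObjects; }  // starts with the creator's reference
    ~LinkObject() { --g_liveObjects; }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long r = --refs_;
        if (r == 0) delete this;
        return r;
    }
    void* Query(const char* name) override {
        if (std::strcmp(name, "Element") == 0)  { AddRef(); return static_cast<Element*>(this); }
        if (std::strcmp(name, "Dispatch") == 0) { AddRef(); return static_cast<Dispatch*>(this); }
        return nullptr;
    }
private:
    unsigned long refs_;
};

template <typename T>
class SmartPtr {
public:
    SmartPtr() : p_(nullptr) {}
    SmartPtr(Unknown* other) : p_(nullptr) { Assign(other); }  // queries for T
    SmartPtr(const SmartPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    ~SmartPtr() { Reset(); }                                   // Release() on scope exit
    SmartPtr& operator=(Unknown* other) { Assign(other); return *this; }
    SmartPtr& operator=(const SmartPtr& other) {
        if (other.p_) other.p_->AddRef();
        Reset();
        p_ = other.p_;
        return *this;
    }
    T* get() const { return p_; }
    T* operator->() const { return p_; }
private:
    void Reset() { if (p_) { p_->Release(); p_ = nullptr; } }
    void Assign(Unknown* other) {
        T* queried = other ? static_cast<T*>(other->Query(T::Name())) : nullptr;
        Reset();
        p_ = queried;
    }
    T* p_;
};

void SmartPtrDemo() {
    Dispatch* raw = new LinkObject();   // plays the IDispatch* returned by item()
    {
        SmartPtr<Element> link = raw;   // Query("Element") succeeds
        SmartPtr<Image> image = raw;    // Query("Image") fails, pointer stays null
        assert(link.get() != nullptr);
        assert(image.get() == nullptr);
        raw->Release();                 // drop the creator's reference
        assert(g_liveObjects == 1);     // still alive: link holds a reference
    }                                   // link's destructor calls Release()
    assert(g_liveObjects == 0);
}
```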
That's it about smart pointers. I personally don't like long messages, so I'll continue in the next one.
Philip Patrick
|
|
|
|
|