|
Okay. So I really should have something like this?
[Node1][EOL]"\n"[/EOL]
[Data2]Here is data[/Data2][EOL]"\n"[/EOL]
[/Node1][EOL]"\n"[/EOL]
where the EOL text nodes are effectively "hidden" because they only show up as the newlines?
I guess that makes sense, but it sure isn't what I thought it should be. This, of course, is only of use if I really do need to look at or modify the file in an editor. If I just use the web browser to view the file, none of this matters. And, if I only access the file through the application, then none of it matters.
Thanks,
Dave
"You can say that again." -- Dept. of Redundancy Dept.
|
|
|
|
|
First, my apologies for not being a better writer; I am not getting the story across very well. An element is a node, but a node is not necessarily an element. A node could be an element, but it could also be text, a processing instruction, a comment, etc.
In your example:
David Chamberlain wrote:
[Node1][EOL]"\n"[/EOL]
[Data2]Here is data[/Data2][EOL]"\n"[/EOL]
[/Node1][EOL]"\n"[/EOL]
you have added child elements to the element called Node1. What is missing are the child nodes that are not elements (the text nodes), not more elements. I guess it has been too many months since I stepped through the MS DOM model.
What I did to finally get a better feel for what was going on in the MS DOM model was to create a simple dialog app that, when a button was pressed, created an MS DOM instance and read in an XML file. I then added code that found the root element of the document and sent it to a function that got the list of child nodes and looked at what types they were. When I found a node that was an element, I recursed back into the function. It helped me see all of the items that existed. I am not sure whether I have this saved or not.
I left it there and concluded that, for my needs, the class I had written worked fine and I would use it for any manipulation of XML files. I do use the MS DOM to read in files, as well as some of the Apache code.
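The node walk described above can be sketched in a few lines. This is only an illustration using Python's standard xml.dom.minidom rather than MSXML, and the names walk, TYPE_NAMES and SAMPLE are made up for the example. It lists every child node with its depth and type, recursing only into elements:

```python
from xml.dom import minidom, Node

# Readable names for the node types this sketch expects to meet.
TYPE_NAMES = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}

def walk(node, depth=0, rows=None):
    """Collect (depth, type, name-or-text) for each child node, recursing into elements."""
    if rows is None:
        rows = []
    for child in node.childNodes:
        kind = TYPE_NAMES.get(child.nodeType, "other")
        if child.nodeType == Node.ELEMENT_NODE:
            rows.append((depth, kind, child.tagName))
            walk(child, depth + 1, rows)
        else:
            # Text, comment and processing-instruction nodes carry their content in .data.
            rows.append((depth, kind, getattr(child, "data", "")))
    return rows

# Hypothetical sample: the newlines between the tags become text nodes.
SAMPLE = "<Node1>\n<Data2>Here is data</Data2>\n</Node1>"

doc = minidom.parseString(SAMPLE)
rows = walk(doc.documentElement)
for depth, kind, value in rows:
    print(depth, kind, repr(value))
```

With this sample, the newlines on either side of Data2 show up as text children of Node1, alongside the one element child.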
Good ideas are not adopted automatically.
They must be driven into practice with courageous patience. -Admiral Rickover. ...
|
|
|
|
|
Let me first say that I really appreciate your help, even though what appears to be a simple matter has become quite complicated. Thanks to MS, I'm sure.
So, without creating new child elements, I should just create additional text nodes, ending up like this:
[Node1] (Node)
"\n" (Text, child 1 of Node1)
[Data2] (Node, child 2 of Node1)
"Here is data" (Text, child 1 of Data2)
[/Data2]
"\n" (Text, child 3 of Node1)
[/Node1]
"\n" (Text, child ? of parent-of-Node1)
While I hate the vocabulary of "nodes" and "elements," the only real difference I could see was that "elements" allow access to attributes while "nodes" do not. Either one can have children.
Dave
|
|
|
|
|
In general yes to adding the text nodes.
Neville's comment about some control options appears to be correct, and I just had not noticed. With the following code I at one time received all of the nodes, and now I do not receive the nodes that contain only white space, i.e. exactly the point you made about missing the formatting!
It is gone now.
Hopefully this is a start.
The first function initializes the process and reads in a specific file.
The second function then steps through it in two different ways.
If you experiment with putting non-white-space text data between the elements, I think you will see what I meant.
I am using Win2k with MSXML 4 and tried this out on WinMe, also with MSXML 4. Previously I had run something similar but with only MSXML 3 installed. Whether that is the difference or not, I cannot say.
Take Care
void CMsDomTestDlg::OnButtonread()
{
    row = 0;
    CComVariant varFileName = (LPCSTR)"ourtest.xml";
    VARIANT_BOOL varOkay;
    HRESULT hr;
    IXMLDOMDocument *pXML = NULL;
    hr = CoCreateInstance(CLSID_DOMDocument, NULL, CLSCTX_INPROC_SERVER,
                          IID_IXMLDOMDocument2, (void**)&pXML);
    ASSERT(SUCCEEDED(hr) && pXML != NULL);
    hr = pXML->load(varFileName, &varOkay);
    IXMLDOMElement *pRoot = NULL;
    if (SUCCEEDED(hr))
    {
        hr = pXML->get_documentElement(&pRoot);
        if (SUCCEEDED(hr) && pRoot != NULL)
        {
            // Walk the tree starting at the root element.
            LoadChildren((IXMLDOMNode*)pRoot, 0);
            pRoot->Release();
        }
        else
        {
            m_NodeGrid.SetItemText(row, 0, "Model Not Read In");
        }
    }
    pXML->Release();
    Invalidate(TRUE);
}

// Lists every child of pNode in the grid (depth, node type, text, base name)
// and recurses into children that are elements.
void CMsDomTestDlg::LoadChildren(IXMLDOMNode* pNode, int depth)
{
    HRESULT hr;
    IXMLDOMNode *child = NULL;
    BOOL Method1 = FALSE;   // TRUE: firstChild/nextSibling, FALSE: childNodes list
    CString data;
    CString td;
    CComBSTR txt;
    long listlen, listpos;
    DOMNodeType type;
    IXMLDOMNodeList *childlist = NULL;
    if (Method1)
    {
        // Method 1: step through the siblings one at a time.
        hr = pNode->get_firstChild(&child);
        while (SUCCEEDED(hr) && child != NULL)
        {
            row++;
            data.Format("%d", depth);
            m_NodeGrid.SetItemText(row, 0, data);
            hr = child->get_nodeType(&type);
            data.Format("%d", type);
            m_NodeGrid.SetItemText(row, 1, data);
            txt.Empty();
            hr = child->get_text(&txt);
            if (SUCCEEDED(hr))
            {
                td = txt;
                data.Format("%s length of %d", (LPCTSTR)td, td.GetLength());
                m_NodeGrid.SetItemText(row, 2, data);
            }
            else
            {
                data = "No Text Data";
                m_NodeGrid.SetItemText(row, 2, data);
            }
            txt.Empty();
            hr = child->get_baseName(&txt);
            if (SUCCEEDED(hr))
            {
                data = txt;
                m_NodeGrid.SetItemText(row, 3, data);
            }
            else
            {
                data = "No Base Name";
                m_NodeGrid.SetItemText(row, 3, data);
            }
            if (type == NODE_ELEMENT)
            {
                LoadChildren(child, depth + 1);
            }
            // Fetch the next sibling before releasing the current node.
            IXMLDOMNode *next = NULL;
            hr = child->get_nextSibling(&next);
            child->Release();
            child = next;
        }
    }
    else
    {
        // Method 2: ask for the whole child list and index into it.
        hr = pNode->get_childNodes(&childlist);
        if (SUCCEEDED(hr) && childlist != NULL)
        {
            childlist->get_length(&listlen);
            for (listpos = 0; listpos < listlen; listpos++)
            {
                hr = childlist->get_item(listpos, &child);
                if (SUCCEEDED(hr) && child != NULL)
                {
                    row++;
                    data.Format("%d", depth);
                    m_NodeGrid.SetItemText(row, 0, data);
                    hr = child->get_nodeType(&type);
                    data.Format("%d", type);
                    m_NodeGrid.SetItemText(row, 1, data);
                    txt.Empty();
                    hr = child->get_text(&txt);
                    if (SUCCEEDED(hr))
                    {
                        data = txt;
                        m_NodeGrid.SetItemText(row, 2, data);
                    }
                    else
                    {
                        data = "No Text Data";
                        m_NodeGrid.SetItemText(row, 2, data);
                    }
                    txt.Empty();
                    hr = child->get_baseName(&txt);
                    if (SUCCEEDED(hr))
                    {
                        data = txt;
                        m_NodeGrid.SetItemText(row, 3, data);
                    }
                    else
                    {
                        data = "No Base Name";
                        m_NodeGrid.SetItemText(row, 3, data);
                    }
                    if (type == NODE_ELEMENT)
                    {
                        LoadChildren(child, depth + 1);
                    }
                    child->Release();
                }
            }
            childlist->Release();
        }
    }
}
|
|
|
|
|
From the MSXML 4 documentation:
When a text file is opened with the xmlDoc.load method or the xmlDoc.loadXML method (where xmlDoc is an XML DOM document), the parser strips most white space from the file, unless specifically directed otherwise. The parser notes within each node whether one or more spaces, tabs, newlines, or carriage returns follow the node in the text by setting a flag. This method is efficient, reducing both the size of each XML file and the number of calculations required to redisplay the XML in a browser. However, because this information is lost, an XML document stored in this manner can lose formatting information in its content. Tabs, in particular, can be lost, because they are not formally recognized in the default mode as anything but white space.
hr = pXML->put_preserveWhiteSpace(VARIANT_TRUE);
Place this before the load.
And now the white spaces are back.
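To make the effect of that flag concrete, here is a rough analogy in Python's standard xml.dom.minidom. minidom has no preserveWhiteSpace switch and always keeps white space, so the helper drop_whitespace_only_text below is a made-up stand-in for what the quoted documentation describes the MSXML parser doing by default. It is a sketch of the idea, not of MSXML's implementation:

```python
from xml.dom import minidom, Node

def drop_whitespace_only_text(node):
    """Remove text children that contain nothing but white space, recursively."""
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            drop_whitespace_only_text(child)

# Hypothetical sample with a newline and a tab used purely as formatting.
SAMPLE = "<Node1>\n\t<Data2>Here is data</Data2>\n</Node1>"

# "Preserved" tree: the formatting survives as text nodes around Data2.
preserved = minidom.parseString(SAMPLE)

# "Stripped" tree: the same document after the white-space-only nodes are dropped.
stripped = minidom.parseString(SAMPLE)
drop_whitespace_only_text(stripped.documentElement)

print(len(preserved.documentElement.childNodes))
print(len(stripped.documentElement.childNodes))
print(stripped.documentElement.toxml())
```

Saving the stripped tree writes everything on one line, which is the loss of formatting the documentation warns about.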
|
|
|
|
|
Michael A. Barnhart wrote:
However, because this information is lost, an XML document stored in this manner can lose formatting information in its content.
First, I sure didn't expect such investigation into this seemingly trivial matter, but I certainly appreciate all the input and help.
Apparently, the preserve white space option is on by default. I had created an XML file in the Visual Studio IDE in order to plan out the structure and content of the file that would eventually be manipulated and maintained by the application program. Once I had that file, I would call the 'load' function, and then let the application do its thing, one operation being the creation of new nodes, as described in the earlier posts. At the end of execution, and calling 'save', I would then load the file back into the IDE to see what happened, and to check that the application created the new nodes properly.
At that point, what I was seeing was the same file as I had originally created, with all the 'formatting' (spaces, tabs, and newlines) still intact, but the new nodes would appear on a single line. They would be in the proper location in the file, in terms of being after the last child of the node being added to, but there were no newlines.
Therefore, although I haven't updated the implementation yet, I believe that the previous suggestion about adding text nodes with newlines (and spaces or tabs if I decide to add those too) will indeed place those into the file and will be preserved upon subsequent 'load' and 'save' operations, by default, even without calling 'preserve white space'.
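As a sketch of that plan (adding a newline text node along with each new element), here is the idea in Python's standard xml.dom.minidom rather than MSXML. The helper name append_on_own_line and the Data3 element are invented for the illustration:

```python
from xml.dom import minidom

# Hypothetical starting file: each child element sits on its own line.
SAMPLE = "<Node1>\n<Data2>Here is data</Data2>\n</Node1>"

def append_on_own_line(doc, parent, tag, text):
    """Append <tag>text</tag> to parent, followed by a newline text node."""
    element = doc.createElement(tag)
    element.appendChild(doc.createTextNode(text))
    parent.appendChild(element)
    # The newline is an ordinary text node, a sibling of the new element.
    parent.appendChild(doc.createTextNode("\n"))
    return element

doc = minidom.parseString(SAMPLE)
root = doc.documentElement
append_on_own_line(doc, root, "Data3", "More data")
print(root.toxml())
```

Because the existing last child of Node1 is already a newline text node, the new element starts on its own line, and the extra text node puts the closing tag on the next one.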
This particular application is running on Win98 with msxml3, although I plan to update that to msxml4 for the speed and memory considerations.
I also appreciate your code, as seeing how things are done is the best teacher. But, unfortunately, and probably as no surprise, that raises a few more questions.
Based on one of the XML sample projects on CP, I am using the #import [msxml3.dll] in the header file. While it seems to me that the following should be equivalent, one worked and one did not. While I am not familiar with the intricacies of COM, I went with the one that worked.
(1) IXMLDOMNode *pNode;
    pNode = m_pXmlDoc->selectSingleNode ("StartTag");
(2) IXMLDOMNodePtr pNode;
    pNode = m_pXmlDoc->selectSingleNode ("StartTag");
According to the contents of the generated .tlh file, the selectSingleNode function returns an IXMLDOMNodePtr, and option 1 bombs. While I don't really understand the internal difference, by following the .tlh contents I was able to get all of the function calls and return values to be of the proper type and operate without bombing. I guess there is not really a question here, other than: does this really make any difference, or am I going down a wrong path?
Thanks again for all the help.
Dave
|
|
|
|
|
Dave,
I think you are well along the right path. Good luck
David Chamberlain wrote:
Apparently, the preserve white space option is on by default.
Dave,
As I said earlier, my code worked differently a while back. I now have MSXML4 installed, and the default appears to not include the white spaces. So I would go ahead and add that line in, unless your memory is much better than mine.
And go with what works. In COM you have interfaces, and IXMLDOMNodePtr is a smart pointer wrapped around an interface pointer, so it is not the same thing as a raw IXMLDOMNode pointer. As far as I can tell, the smart pointer returned by the call releases the node when the temporary goes away, which leaves the raw pointer in your option 1 dangling. Don't we love this.
I needed a little refresher, especially given what I learned about the differences between 3 and 4.
Take Care and have a nice day. Mike
|
|
|
|
|
I have 18000 items and it takes about 10 minutes to parse my xml-file, but only a few seconds if I comment out line 6.
What am I doing wrong? How can I speed up the following?
1. Set oXMLNodeList = oXMLElement.selectNodes("data/item")
2. Put #fTestFile, , "Count: " & oXMLNodeList.Length & vbCrLf
3. For Each oItemY In oXMLNodeList
4. Put #fTestFile, , vbCrLf
5. For Each oItemX In oItemY.childNodes
6. Put #fTestFile, , " ;" & oItemX.nodeTypedValue
7. Next oItemY
8. Next oItemY
|
|
|
|
|
I do not work with VB, so this is somewhat of an outside observation and could be a worthless comment, but:
I would have added a line 4.5:
Set oYList = oItemY.childNodes (assuming this is valid in VB).
Potentially you are copying the data over and over; I have seen this have an impact.
Then line 5 would be For Each oItemX In oYList
Line 7 should be Next oItemX
By commenting out line 6 you never actually access the data.
|
|
|
|
|
I found this page a few days ago and feel it is an interesting example that shows transformations without usage of what I see (and use) typically employing the
|
|
|
|
|
Given the following xml:
<para>The <link url="earth.htm">earth</link> rotates.</para>
Is there a way to use XSLT to write the content of the <para> element like this:
<p>The <a href="earth.htm">earth</a> rotates.</p>
|
|
|
|
|
Something along the lines of:
<xsl:template match="link">
  <a>
    <xsl:attribute name="href"><xsl:value-of select="@url" /></xsl:attribute>
    <xsl:value-of select="."/>
  </a>
</xsl:template>
I think. It's been a while since I have done XSLT.
--
David Wengier
Sonork ID: 100.14177 - Ch00k
|
|
|
|
|
That part works if you only want to show links. What I was wondering was whether there was a simple way to put the link inside the rest of the text, i.e. in my example the result I wanted was <p>The <a href="earth.htm">earth</a> rotates.</p>
|
|
|
|
|
MarSCoZa wrote:
<para>The <link url="earth.htm">earth</link> rotates.</para>
Technically that kind of XML is mixed content: the para element holds both character data and a child element.
e.g.
<para>
  The
  <link url="earth.htm">earth</link>
  rotates.
</para>
The "The" and "rotates." PCDATA sections are "floating" alongside the link element.
This is well-formed, and a DTD can declare it, but only in the mixed-content form, e.g. <!ELEMENT para (#PCDATA | link)*>. The DTD cannot pin down the order or the number of the text runs and child elements.
XSLT copes with mixed content without any trouble, and you can use the following XSL to transform it.
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/">
    <html>
      <head>
      </head>
      <body>
        <xsl:apply-templates />
      </body>
    </html>
  </xsl:template>
  <!-- Wrap each para in a p; apply-templates copies the text and hands link to the template below. -->
  <xsl:template match="para">
    <p><xsl:apply-templates /></p>
  </xsl:template>
  <xsl:template match="link">
    <a>
      <xsl:attribute name="href">
        <xsl:value-of select="@url" />
      </xsl:attribute>
      <xsl:value-of select="."/>
    </a>
  </xsl:template>
</xsl:stylesheet>
The key is templates. I used to hate them and thought they were these daft, never used bits of XSL. Until I figured out how they worked and went "oooohh yes!"
Enjoy!
regards,
Paul Watson
Bluegrass
Cape Town, South Africa
The greatest thing you'll ever learn is just to love, and to be loved in return - Moulin Rouge
|
|
|
|
|
Paul Watson wrote:
The key is templates. I used to hate them and thought they were these daft, never used bits of XSL. Until I figured out how they worked and went "oooohh yes!"
I've never used XSLT for XML->HTML, but with XML->XML, templates are the god you bow down to and pray. It boggles my mind the thought of NOT using template matching in XSLT.
|
|
|
|
|
I have an XML document which I need to turn into a new format, where a small portion of it gets put in <static> tags, the rest does not. There is a LOT of XML on this page.
I've got the first half done, easy enough. Now I want to write a filter where, instead of saying 'include these tags and children', I say 'EXCLUDE these tags and children, include everything else'. How do I do that?
Christian
The tragedy of cyberspace - that so much can travel so far, and yet mean so little.
"I'm thinking of getting married for companionship and so I have someone to cook and clean." - Martin Marvinski, 6/3/2002
|
|
|
|
|
Which language and parser are you using to process the XML?
|
|
|
|
|
I'm using XSL, I got it working this afternoon - thanks.
Christian
|
|
|
|
|
Christian Graus wrote:
I got it working this afternoon - thanks.
How did you do it? I might run into that problem one day and need to know how, thanks
regards,
Paul Watson
Bluegrass
Cape Town, South Africa
"The greatest thing you will ever learn is to love, and be loved in return" - Moulin Rouge
Sonork ID: 100.9903 Stormfront
|
|
|
|
|
My first pass would have been to use the count(tag) function and only process if it returned 0.
|
|
|
|
|
The trick is that a more generic filter will only include items that were not caught by a more specific filter. I wrote specific filters for the tags I needed, then a generic filter, and it excluded all the items that the specific filters caught.
If that's not clear LMK, I can post the code from work tomorrow.
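For readers who want the "exclude these tags and children, include everything else" idea in runnable form, here is a sketch in Python's standard xml.etree.ElementTree. It is not the XSL described above: it swaps template priority for an explicit set of excluded tag names, and the function names and the sample document are invented for the illustration.

```python
import copy
import xml.etree.ElementTree as ET

def without_tags(element, excluded):
    """Return a deep copy of element with every descendant whose tag is in excluded removed."""
    clone = copy.deepcopy(element)
    _prune(clone, excluded)
    return clone

def _prune(element, excluded):
    # Iterate over a snapshot so children can be removed while looping.
    for child in list(element):
        if child.tag in excluded:
            # Note: ElementTree drops the child's tail text along with the child.
            element.remove(child)
        else:
            _prune(child, excluded)

# Hypothetical input: the static parts are handled elsewhere, so drop them here.
SAMPLE = (
    "<page>"
    "<static><title>T</title></static>"
    "<body><p>Keep</p><static>Drop</static></body>"
    "</page>"
)

source = ET.fromstring(SAMPLE)
result = without_tags(source, {"static"})
print(ET.tostring(result, encoding="unicode"))
```

The original tree is left untouched, so the same input can still feed the pass that handles the excluded tags.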
Christian
|
|
|
|
|
Anybody know how to use Voice XML ?
What do I need, how do I set it up ?
Any info would be useful.
Users.
Can't live with 'em, can't kill em!
|
|
|
|
|
VoiceXML is just a standard. Go here and grab the specs. Then grab the parser and language of your choice and start writing a processor.
Of course, there might be ones out there. I'm going by the work we did about a year ago for a copy that used VoiceXML. We had to write our own processor.
J
|
|
|
|
|
I need to validate an XML document against a schema, but I'm not having much luck finding a complete example in MSDN.
This is the pseudo code of what I want to do:
XMLValidater xVal = new XMLValidater("my dtd, etc...");
XMLDocument xDoc = new XMLDocument("my XML doc");
if (xVal.Validate(xDoc))
Help would be cool.
Cheers,
Simon
X-5 452 rules.
|
|
|
|
|
I haven't looked at it but isn't XmlValidatingReader what you need?
|
|
|
|
|