Introduction
This article discusses a new feature of the C++ Elmax XML Library: writing XML files with Linq-To-XML style node creation. Currently, there are no plans to implement this feature for C# Elmax; C# users can use .NET Linq-To-XML to achieve the same result. Readers who want to learn more about the Elmax XML library may read this tutorial article and the documentation, but neither is required to understand this article. The intended audience is XML library authors who may be interested in implementing Linq-To-XML node creation in their own XML libraries. C++ programmers who work primarily in native C++ may not be familiar with the Linq-To-XML node creation syntax or with what it does. Simply put, Linq-To-XML node creation is a natural way to create nodes with code that is structurally identical to the resulting XML. To illustrate the point, here is a .NET C# Linq-To-XML node creation snippet that adds one movie's information to a Movies element.
using System.Xml.Linq;
XElement movies = new XElement("Movies");
movies.Add(
new XElement("Movie",
new XAttribute("Name", "Transformers: Dark of the Moon"),
new XAttribute("Year", "2011"),
new XAttribute("RunningTime", 157.ToString()),
new XElement("Director", "Michael Bay"),
new XElement("Stars",
new XElement("Actor", "Shia LaBeouf"),
new XElement("Actress", "Rosie Huntington-Whiteley")
),
new XElement("DVD",
new XElement("Price", "25.00"),
new XElement("Discount", (0.1).ToString())
),
new XElement("BluRay",
new XElement("Price", "36.00"),
new XElement("Discount", (0.1).ToString())
)
)
);
XDocument doc = new XDocument(
new XDeclaration("1.0", "utf-8", ""),
movies);
doc.Save(@"C:\Temp\Movies1.xml");
For the reader's information, the Visual Studio IDE automatically indents C# Linq-To-XML node creation code when you press the Enter key. The Movies1.xml output looks similar to what is displayed below.
<?xml version="1.0" encoding="utf-8"?>
<Movies>
<Movie Name="Transformers: Dark of the Moon" Year="2011" RunningTime="157">
<Director>Michael Bay</Director>
<Stars>
<Actor>Shia LaBeouf</Actor>
<Actress>Rosie Huntington-Whiteley</Actress>
</Stars>
<DVD>
<Price>25.00</Price>
<Discount>0.1</Discount>
</DVD>
<BluRay>
<Price>36.00</Price>
<Discount>0.1</Discount>
</BluRay>
</Movie>
</Movies>
It is not difficult to visualize what the XML will look like from the C# code. In the next section, we compare the new Linq-To-XML node creation with the original Elmax node creation.
Comparison of the New Linq-To-XML and the Old Elmax Node Creation
By now, readers are probably eager to see the Linq-To-XML syntax for C++. Without further delay, the code is shown below.
using namespace Elmax;
NewElement movies(L"Movies");
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Transformers: Dark of the Moon"),
NewAttribute(L"Year", L"2011"),
NewAttribute(L"RunningTime", ToStr(157)),
NewElement(L"Director", L"Michael Bay"),
NewElement(L"Stars",
NewElement(L"Actor", L"Shia LaBeouf"),
NewElement(L"Actress", L"Rosie Huntington-Whiteley")
),
NewElement(L"DVD",
NewElement(L"Price", L"25.00"),
NewElement(L"Discount", ToStr(0.1))
),
NewElement(L"BluRay",
NewElement(L"Price", L"36.00"),
NewElement(L"Discount", ToStr(0.1))
)
)
);
movies.Save(L"C:\\Temp\\Movies2.xml", L"1.0", true);
As the reader may notice, the C++ syntax does not allocate the elements on the heap with the new keyword, unlike the C# version; in other words, the element objects are allocated on the stack. C# Linq-To-XML allocates its elements on the heap, where they must later be reclaimed by the garbage collector, which costs performance and memory. Elements allocated on the stack do not have this problem because they are popped off the stack as soon as they go out of scope.
Underneath the surface, memory is still allocated on the heap to construct the internal tree structure. The internal tree structure is then converted to MS XML DOM elements recursively in the Save method. Just before the Save method returns, the internal tree structure is destroyed. A user who wants to retain the tree structure, either for another Save call or to append it to a larger tree structure, can specify false for the discard argument (the default value is true) in the Save method.
bool Save(
const std::wstring& file,
const std::wstring& xmlVersion,
bool utf8,
bool discard = true);
bool PrettySave(
const std::wstring& file,
const std::wstring& xmlVersion,
bool utf8,
const std::wstring& indent = L" ",
bool discard = true);
By now, the reader may be curious to know how the original Elmax node creation stacks up against the new Linq-To-XML node creation syntax. The example below shows how to save the same XML content (to Movies3.xml this time) using the original Elmax code.
MSXML2::IXMLDOMDocumentPtr pDoc;
HRESULT hr = CreateAndInitDom(pDoc);
if (SUCCEEDED(hr))
{
using namespace Elmax;
Element root;
root.SetConverter(NORMAL_CONV);
root.SetDomDoc(pDoc);
Element movies = root[L"Movies"];
Element movie = movies[L"Movie"].CreateNew();
movie.Attribute(L"Name") = L"Transformers: Dark of the Moon";
movie.Attribute(L"Year") = L"2011";
movie.Attribute(L"RunningTime") = 157;
movie[L"Director"] = L"Michael Bay";
movie[L"Stars|Actor"] = L"Shia LaBeouf";
movie[L"Stars|Actress"] = L"Rosie Huntington-Whiteley";
movie[L"DVD|Price"] = L"25.00";
movie[L"DVD|Discount"] = 0.1;
movie[L"BluRay|Price"] = L"36.00";
movie[L"BluRay|Discount"] = 0.1;
SaveXml(L"C:\\Temp\\Movies3.xml", L"1.0", true);
}
As the reader can see, it is hard to discern the structure of the XML by casually glancing at the original Elmax node creation code.
How the Library is Written
Surprisingly, the Linq-To-XML node creation library code is very simple and can be written in a couple of hours. To create nodes using the new syntax, we use the NewElement, NewAttribute, NewCData and NewComment classes. These new classes are derived from the NewNode class and do most of their useful work in their constructors.
This is the code listing for the declaration of the NewElement class.
class NewElement : public NewNode
{
public:
~NewElement(void);
NewElement operator[](LPCWSTR name);
NewElement operator[](LPCSTR name);
bool Exists() { return GetPtr()!=NULL; }
NewElement(const NewElement& other);
NewElement& operator=(const NewElement& other);
NewElement();
NewElement(const std::wstring& name);
NewElement(const std::wstring& name,
const std::wstring& sValue);
NewElement(const std::wstring& name, NewNode& node1);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3,
NewNode& node4);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3,
NewNode& node4, NewNode& node5);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3,
NewNode& node4, NewNode& node5,
NewNode& node6);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3,
NewNode& node4, NewNode& node5,
NewNode& node6, NewNode& node7);
NewElement(const std::wstring& name, NewNode& node1,
NewNode& node2, NewNode& node3,
NewNode& node4, NewNode& node5,
NewNode& node6, NewNode& node7,
NewNode& node8);
NewElement Add(NewNode& node1);
NewElement Add(NewNode& node1, NewNode& node2);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5, NewNode& node6);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5, NewNode& node6,
NewNode& node7);
NewElement Add(NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5, NewNode& node6,
NewNode& node7, NewNode& node8);
bool Save(MSXML2::IXMLDOMDocumentPtr& ptrDoc,
const std::wstring& file, bool discard = true);
bool PrettySave(MSXML2::IXMLDOMDocumentPtr& ptrDoc,
const std::wstring& file, bool discard = true);
bool Append(NewTreeNode* child);
private:
NewElement Find(const std::wstring& names);
NewElement FindFirstChild(const std::wstring& name);
};
The definition of the overloaded constructor which takes 8 NewNode parameters is listed here.
NewElement::NewElement(const std::wstring& name,
NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5, NewNode& node6,
NewNode& node7, NewNode& node8)
{
Init();
NewTreeNode* ptr = GetPtr();
if(ptr)
{
ptr->xmltype = XML_ELEMENT;
ptr->pName = name;
NewTreeNode* tmpPtr = node1.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node2.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node3.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node4.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node5.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node6.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node7.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node8.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
}
}
The definition of the overloaded Add method with 8 NewNode parameters is listed here.
NewElement NewElement::Add(
NewNode& node1, NewNode& node2,
NewNode& node3, NewNode& node4,
NewNode& node5, NewNode& node6,
NewNode& node7, NewNode& node8)
{
NewTreeNode* ptr = GetPtr();
if(ptr)
{
NewTreeNode* tmpPtr = node1.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node2.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node3.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node4.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node5.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node6.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node7.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
tmpPtr = node8.GetPtr();
if(tmpPtr!=NULL)
Append(tmpPtr);
}
return *this;
}
As you can see, the NewElement constructors and Add methods do nothing except append the nodes to the vector. Below is the code listing for the declaration of the NewAttribute class and the definition of its only constructor.
class NewAttribute : public NewNode
{
public:
NewAttribute(const std::wstring& name,
const std::wstring& sValue);
~NewAttribute(void);
};
NewAttribute::NewAttribute(const std::wstring& name,
const std::wstring& sValue)
{
Init();
NewTreeNode* ptr = GetPtr();
if(ptr)
{
ptr->xmltype = XML_ATTRIBUTE;
ptr->pName = name;
ptr->pValue = sValue;
}
}
This is the code listing for the declaration of the NewCData class and the definition of its constructor.
class NewCData : public NewNode
{
public:
NewCData(const std::wstring& sValue);
~NewCData(void);
};
NewCData::NewCData(const std::wstring& sValue)
{
Init();
NewTreeNode* ptr = GetPtr();
if(ptr)
{
ptr->xmltype = XML_CDATA;
ptr->pValue = sValue;
}
}
This is the code listing for the declaration of the NewComment class and the definition of its constructor.
class NewComment : public NewNode
{
public:
NewComment(const std::wstring& sValue);
~NewComment(void);
};
NewComment::NewComment(const std::wstring& sValue)
{
Init();
NewTreeNode* ptr = GetPtr();
if(ptr)
{
ptr->xmltype = XML_COMMENT;
ptr->pValue = sValue;
}
}
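To make the constructor-driven pattern concrete, here is a minimal, self-contained sketch. All names in it (SketchTreeNode, SketchNode, SketchAttribute, SketchElement, FreeTree) are hypothetical and are not part of Elmax. It only mirrors the idea described above: each handle allocates one tree node in its constructor, and an element constructor appends the tree nodes of the handles passed to it. One deliberate difference is that the sketch takes const references, so that temporaries bind in standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical heap-allocated tree node (a simplified NewTreeNode).
struct SketchTreeNode
{
    std::wstring name;
    std::wstring value;
    bool isAttribute;
    std::vector<SketchTreeNode*> children;
};

// Base handle: a single pointer and no ownership semantics (like NewNode).
class SketchNode
{
public:
    SketchTreeNode* GetPtr() const { return ptr; }
protected:
    SketchNode(const std::wstring& name, const std::wstring& value, bool isAttribute)
        : ptr(new SketchTreeNode())
    {
        ptr->name = name;
        ptr->value = value;
        ptr->isAttribute = isAttribute;
    }
    // Link the child's tree node under this handle's tree node.
    void Append(const SketchNode& child) { ptr->children.push_back(child.GetPtr()); }
private:
    SketchTreeNode* ptr;
};

class SketchAttribute : public SketchNode
{
public:
    SketchAttribute(const std::wstring& name, const std::wstring& value)
        : SketchNode(name, value, true) {}
};

class SketchElement : public SketchNode
{
public:
    explicit SketchElement(const std::wstring& name, const std::wstring& value = L"")
        : SketchNode(name, value, false) {}
    // All the work happens in the constructors: they only append children.
    SketchElement(const std::wstring& name, const SketchNode& n1)
        : SketchNode(name, L"", false) { Append(n1); }
    SketchElement(const std::wstring& name, const SketchNode& n1, const SketchNode& n2)
        : SketchNode(name, L"", false) { Append(n1); Append(n2); }
};

// Recursively free a tree built by the handles above.
void FreeTree(SketchTreeNode* node)
{
    if (node == nullptr) return;
    for (std::size_t i = 0; i < node->children.size(); ++i)
        FreeTree(node->children[i]);
    delete node;
}
```

The handles passed as arguments are temporaries that vanish at the end of the expression, but the tree nodes they created stay linked under the parent until FreeTree is called on the root.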
The reader may ask why the author chose to create new classes instead of modifying the old classes such as Element, Attribute, CData and Comment. The reason is that the original classes contain many data members; constructing these classes repeatedly on the stack and popping them off again would seriously hurt performance. As you can see from the listings above, the new classes declare no data members of their own. That is because their only data member is ptr, which lives in their base class, NewNode.
class NewNode
{
public:
NewNode(void);
~NewNode(void);
NewTreeNode* GetPtr() const {return ptr;}
void SetPtr(NewTreeNode* src) { ptr = src; }
void Init();
void Discard();
private:
NewTreeNode* ptr;
};
ptr is a pointer to NewTreeNode. I had intended to name this tree structure TreeNode, but in Visual C++ 10 that name clashes with another TreeNode class defined in the Visual C++ libraries.
enum XMLTYPE
{
XML_NONE,
XML_ELEMENT,
XML_ATTRIBUTE,
XML_COMMENT,
XML_CDATA
};
class NewTreeNode
{
public:
NewTreeNode(void);
~NewTreeNode(void);
std::vector<NewTreeNode*> vec;
std::wstring pName;
std::wstring pValue;
XMLTYPE xmltype;
static bool Traverse(MSXML2::IXMLDOMDocumentPtr& ptrDoc,
MSXML2::IXMLDOMNodePtr& parent, NewTreeNode* pNode);
void Delete();
};
NewTreeNode has a Traverse method, which creates MS XML DOM elements as it traverses the tree recursively, and a Delete method, which deletes the tree structure recursively. Allocating and deallocating NewNode/NewElement objects on the stack is only a matter of pushing and popping 32-bit or 64-bit pointers. Contrast this with pushing and popping the heavy-duty Element class, which contains the many data members listed below. Note that although the pointer is popped whenever a NewNode object goes out of scope, the tree data it points to lives on until it is saved to a file on disk (or explicitly discarded).
class Element
{
private:
BaseConverter* m_pIConverter;
std::wstring m_strTemp;
std::string m_asciiStrTemp;
std::wstring m_strNonExistingParent;
MSXML2::IXMLDOMDocumentPtr m_ptrDoc;
MSXML2::IXMLDOMNodePtr m_ptrNode;
bool m_bDeleted;
std::wstring m_strName;
bool m_bValid;
bool m_bRoot;
};
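The size difference can be checked with a small sketch. The classes below are hypothetical stand-ins and are not Elmax types: ThinHandle holds a single pointer, like NewNode, while HeavyElement carries several strings and flags, loosely modeled on the Element members listed above. The exact sizes depend on the compiler and the standard library, so the sketch only asserts the relationship between them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the tree data that lives on the heap.
struct TreeData
{
    std::wstring name;
    std::wstring value;
};

// Hypothetical stand-in for NewNode: its only data member is a pointer.
class ThinHandle
{
public:
    explicit ThinHandle(TreeData* p = nullptr) : ptr(p) {}
    TreeData* GetPtr() const { return ptr; }
private:
    TreeData* ptr;
};

// Hypothetical stand-in for a heavier class, loosely modeled on Element.
class HeavyElement
{
public:
    HeavyElement() : converter(nullptr), deleted(false), valid(false), root(false) {}
private:
    void* converter;
    std::wstring strTemp;
    std::string asciiStrTemp;
    std::wstring nonExistingParent;
    std::wstring name;
    bool deleted;
    bool valid;
    bool root;
};

// Copying a ThinHandle copies one pointer; the tree data is shared, not cloned.
inline bool SharesTree(const ThinHandle& a, const ThinHandle& b)
{
    assert(a.GetPtr() != nullptr);
    return a.GetPtr() == b.GetPtr();
}

static_assert(sizeof(ThinHandle) == sizeof(void*), "handle is pointer-sized");
static_assert(sizeof(HeavyElement) > sizeof(ThinHandle), "heavy class is larger");
```

Passing or returning a ThinHandle by value therefore costs about as much as passing a raw pointer, which is the property the new classes rely on.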
The source code of the recursive Traverse and Delete methods is provided for the reader's perusal.
bool NewElement::Traverse(NewTreeNode& node, CUnicodeFile& uf, bool utf8)
{
if(node.xmltype==XML_ELEMENT)
{
WriteStartElement(uf, utf8, node.pName);
bool attrWritten = false;
for(size_t i=0;i<node.vec.size(); ++i)
{
NewTreeNode* node1 = node.vec[i];
if(node1->xmltype==XML_ATTRIBUTE)
{
std::wstring str = L" ";
str += node1->pName + L"=\"";
str += EscapeXML(node1->pValue);
str += L"\"";
Write(uf, utf8, str);
continue;
}
else
{
if(attrWritten == false)
{
Write(uf, utf8, L">");
attrWritten = true;
}
}
Traverse(*node1, uf, utf8);
}
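// Close the start tag if no child node has closed it yet
// (the element has no children, or only attribute children).
if(attrWritten == false)
Write(uf, utf8, L">");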
if(node.pValue.empty()==false)
{
std::wstring str = EscapeXML(node.pValue);
Write(uf, utf8, str);
}
WriteEndElement(uf, utf8, node.pName);
}
else if(node.xmltype==XML_COMMENT)
{
std::wstring str = L"<!--";
str += node.pValue;
str += L"-->";
Write(uf, utf8, str);
}
else if(node.xmltype==XML_CDATA)
{
std::wstring str = L"<![CDATA[";
str += node.pValue;
str += L"]]>";
Write(uf, utf8, str);
}
return true;
}
void NewTreeNode::Delete()
{
for(size_t i=0;i<vec.size();++i)
vec.at(i)->Delete();
vec.clear();
delete this;
}
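The same two recursions can be tried out without MS XML or file I/O. The following is a minimal sketch under stated assumptions: MiniNode, EscapeText, Serialize and DeleteTree are hypothetical names, the output goes to a std::wstring instead of a file, and attributes are collected in a separate pass so that they may appear anywhere among the children. It is not the Elmax implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum NodeType { NODE_ELEMENT, NODE_ATTRIBUTE, NODE_COMMENT, NODE_CDATA };

// Hypothetical, simplified counterpart of NewTreeNode.
struct MiniNode
{
    NodeType type;
    std::wstring name;
    std::wstring value;
    std::vector<MiniNode*> children;

    MiniNode(NodeType t, const std::wstring& n, const std::wstring& v)
        : type(t), name(n), value(v) {}
};

// Replace the five characters that XML reserves in text and attribute values.
std::wstring EscapeText(const std::wstring& in)
{
    std::wstring out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        switch (in[i])
        {
        case L'&':  out += L"&amp;";  break;
        case L'<':  out += L"&lt;";   break;
        case L'>':  out += L"&gt;";   break;
        case L'"':  out += L"&quot;"; break;
        case L'\'': out += L"&apos;"; break;
        default:    out += in[i];     break;
        }
    }
    return out;
}

// Depth-first serialization: attributes first, then ">", then child nodes.
void Serialize(const MiniNode& node, std::wstring& out)
{
    if (node.type == NODE_ELEMENT)
    {
        assert(!node.name.empty());
        out += L"<" + node.name;
        for (std::size_t i = 0; i < node.children.size(); ++i)
        {
            const MiniNode* c = node.children[i];
            if (c->type == NODE_ATTRIBUTE)
                out += L" " + c->name + L"=\"" + EscapeText(c->value) + L"\"";
        }
        out += L">";
        for (std::size_t i = 0; i < node.children.size(); ++i)
            if (node.children[i]->type != NODE_ATTRIBUTE)
                Serialize(*node.children[i], out);
        out += EscapeText(node.value);
        out += L"</" + node.name + L">";
    }
    else if (node.type == NODE_COMMENT)
        out += L"<!--" + node.value + L"-->";
    else if (node.type == NODE_CDATA)
        out += L"<![CDATA[" + node.value + L"]]>";
}

// Post-order deletion: free the children before the node itself.
void DeleteTree(MiniNode* node)
{
    if (node == nullptr) return;
    for (std::size_t i = 0; i < node->children.size(); ++i)
        DeleteTree(node->children[i]);
    delete node;
}
```

Because the start tag is always closed after the attribute pass, an element that has only attribute children still produces well-formed output.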
How About Linq-To-XML Query?
While Elmax does not support Linq-To-XML style queries, it has a powerful query mechanism based on lambdas (anonymous functions) that decide which elements to fetch. Let me acquaint you with some of the Elmax query mechanisms.
Elmax has AsCollection and GetCollection methods, which fetch a collection of siblings of the same name and a collection of children of the same name, respectively. Both have an overloaded version which takes an additional lambda as a predicate to filter the elements you want.
typedef std::vector< Element > collection_t;
collection_t AsCollection();
template<typename Predicate>
collection_t AsCollection(Predicate pred);
collection_t GetCollection(const std::wstring& name);
template<typename Predicate>
collection_t GetCollection(const std::wstring& name,
Predicate pred);
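The idea behind the predicate overloads can be shown with plain data. This is a minimal sketch, assuming a hypothetical Item struct in place of Element and a free function in place of the member function; only the shape of the signature follows the declarations above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an XML element with a name and a numeric value.
struct Item
{
    std::wstring name;
    double price;
};

typedef std::vector<Item> items_t;

// Return the children called 'name' for which 'pred' returns true.
template<typename Predicate>
items_t GetCollection(const items_t& children, const std::wstring& name, Predicate pred)
{
    assert(!name.empty());
    items_t result;
    std::copy_if(children.begin(), children.end(), std::back_inserter(result),
        [&](const Item& item) { return item.name == name && pred(item); });
    return result;
}
```

A caller would pass a lambda such as [](const Item& b) { return b.price < 13.0; } to keep only the cheaper books.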
Elmax provides the HyperElement class, which allows joining elements with other elements that satisfy certain criteria. For example, in a Books application, each Book element under the main Books section is joined with the Author element (through AuthorID) under the main Authors section to retrieve the author name for the books. The Books section and the Authors section are 2 separate sections. A sample of the XML is provided below.
<?xml version="1.0" encoding="UTF-16"?>
<All>
<Version>1</Version>
<Books>
<Book ISBN="1111-1111-1111">
<Title>2001: A Space Odyssey</Title>
<Price>12.990000</Price>
<AuthorID>111</AuthorID>
</Book>
<Book ISBN="2222-2222-2222">
<Title>Rendezvous with Rama</Title>
<Price>15.000000</Price>
<AuthorID>111</AuthorID>
</Book>
<Book ISBN="3333-3333-3333">
<Title>Foundation</Title>
<Price>10.000000</Price>
<AuthorID>222</AuthorID>
</Book>
<Book ISBN="4444-4444-4444">
<Title>Currents of Space</Title>
<Price>11.900000</Price>
<AuthorID>222</AuthorID>
</Book>
<Book ISBN="5555-5555-5555">
<Title>Pebbles in the Sky</Title>
<Price>14.000000</Price>
<AuthorID>222</AuthorID>
</Book>
</Books>
<Authors>
<Author Name="Arthur C. Clark" AuthorID="111">
<Bio>Sci-Fic author!</Bio>
</Author>
<Author Name="Isaac Asimov" AuthorID="222">
<Bio>Sci-Fic author!</Bio>
</Author>
</Authors>
</All>
This is the HyperElement class with a lambda in action!
auto vec = HyperElement::JoinOneToMany(
authors.GetCollection(L"Author"), books.GetCollection(L"Book"),
[](Elmax::Element x, Elmax::Element y)->bool
{
if(x.Attribute("AuthorID").GetString("a") ==
y[L"AuthorID"].GetString("b") )
{
return true;
}
return false;
});
for(size_t i=0; i< vec.size(); ++i)
{
dp.Print(L"List of books by {0}\n",
vec[i].first.Attribute(L"Name").GetString(""));
dp.Print(L"=======================================\n");
for(size_t j=0; j< vec[i].second.size(); ++j)
{
dp.Print(L"{0}\n",
vec[i].second[j][L"Title"].GetString("None"));
}
dp.Print(L"\n");
}
This is the output. For more information on HyperElement, please refer to the Elmax documentation.
List of books by Arthur C. Clark
=============================================
2001: A Space Odyssey
Rendezvous with Rama
List of books by Isaac Asimov
=============================================
Foundation
Currents of Space
Pebbles in the Sky
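For library authors who want to offer a similar join, a one-to-many join driven by a predicate can be sketched generically as below. This is not the HyperElement source; the function only borrows the JoinOneToMany name, and AuthorRec and BookRec are hypothetical records. Leaving out items with no match is a choice made for this sketch, not a documented Elmax behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Pair every item of 'ones' with all items of 'many' that satisfy 'pred'.
// Items of 'ones' without any match are left out of the result.
template<typename One, typename Many, typename Predicate>
std::vector<std::pair<One, std::vector<Many> > >
JoinOneToMany(const std::vector<One>& ones, const std::vector<Many>& many, Predicate pred)
{
    std::vector<std::pair<One, std::vector<Many> > > result;
    for (std::size_t i = 0; i < ones.size(); ++i)
    {
        std::vector<Many> matches;
        for (std::size_t j = 0; j < many.size(); ++j)
            if (pred(ones[i], many[j]))
                matches.push_back(many[j]);
        if (!matches.empty())
            result.push_back(std::make_pair(ones[i], matches));
    }
    return result;
}

// Hypothetical plain records standing in for the Author and Book elements.
struct AuthorRec { int id; std::wstring name; };
struct BookRec   { int authorId; std::wstring title; };
```

The result has the same first/second shape that the printing loop above iterates over: first is the author, second is the vector of that author's books.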
In addition to these 2 query methods, Elmax supports XPath expressions through its various SelectNode methods.
Adding Beyond 16 Nodes
The overloaded constructors and Add methods of NewElement range from taking 1 NewNode object to a maximum of 16 NewNode objects. What if the user needs to add more than 16 nodes (say, 17) to an element? Answer: he/she can chain calls to the Add method, because Add returns the element itself (through *this). Let me show you an example of adding 32 sub-elements to an element without using a for-loop. In practice, a for-loop is the preferred method for adding more than 16 elements.
NewElement hollywood(L"Hollywood");
hollywood.Add(
NewElement(L"Stars",
NewElement(L"Actor", L"Johnny Depp"),
NewElement(L"Actor", L"Brad Pitt"),
NewElement(L"Actor", L"Leonardo DiCaprio"),
NewElement(L"Actor", L"Will Smith"),
NewElement(L"Actor", L"George Clooney"),
NewElement(L"Actor", L"Tom Cruise"),
NewElement(L"Actor", L"Matt Damon"),
NewElement(L"Actor", L"Orlando Bloom"),
NewElement(L"Actor", L"Bruce Willis"),
NewElement(L"Actor", L"Steve Carell"),
NewElement(L"Actress", L"Jennifer Aniston"),
NewElement(L"Actress", L"Jessica Alba"),
NewElement(L"Actress", L"Halle Berry"),
NewElement(L"Actress", L"Angelina Jolie"),
NewElement(L"Actress", L"Sandra Bullock"),
NewElement(L"Actress", L"Reese Witherspoon")
).Add(
NewElement(L"Actress", L"Jennifer Garner"),
NewElement(L"Actress", L"Julia Roberts"),
NewElement(L"Actress", L"Gwyneth Paltrow"),
NewElement(L"Actress", L"Meg Ryan"),
NewElement(L"Actress", L"Hillary Swank"),
NewElement(L"Actress", L"Uma Thurman"),
NewElement(L"Actress", L"Keira Knightley"),
NewElement(L"Actress", L"Meryl Streep"),
NewElement(L"Actress", L"Cameron Diaz"),
NewElement(L"Actress", L"Salma Hayek"),
NewElement(L"Actress", L"Penelope Cruz"),
NewElement(L"Actress", L"Nicole Kidman"),
NewElement(L"Actress", L"Michelle Pfeiffer"),
NewElement(L"Actress", L"Drew Barrymore"),
NewElement(L"Actress", L"Jennifer Lopez"),
NewElement(L"Actress", L"Catherine Zeta-Jones")
)
);
hollywood.Save(L"C:\\Temp\\Stars.xml", L"1.0", true);
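Since the for-loop approach is the one recommended in practice, here is a minimal sketch of it. MiniElement and MakeList are hypothetical and are not the Elmax classes; MiniElement simply accumulates its children as text, without escaping, so that the example stays self-contained. Its Add method returns a reference to the element, which is also what makes chained Add calls possible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature element; it is NOT the Elmax NewElement class.
class MiniElement
{
public:
    explicit MiniElement(const std::wstring& name, const std::wstring& value = L"")
        : m_name(name), m_body(value), m_count(0) {}

    // Returning *this is what allows calls to be chained: a.Add(x).Add(y).
    MiniElement& Add(const MiniElement& child)
    {
        m_body += child.ToXml();
        ++m_count;
        return *this;
    }

    // Serialize this element and everything added so far (no escaping here).
    std::wstring ToXml() const
    {
        return L"<" + m_name + L">" + m_body + L"</" + m_name + L">";
    }

    std::size_t ChildCount() const { return m_count; }

private:
    std::wstring m_name;
    std::wstring m_body;
    std::size_t m_count;
};

// Add any number of same-named children in a loop instead of one long argument list.
MiniElement MakeList(const std::wstring& parentName, const std::wstring& childName,
                     const std::vector<std::wstring>& values)
{
    assert(!parentName.empty() && !childName.empty());
    MiniElement parent(parentName);
    for (std::size_t i = 0; i < values.size(); ++i)
        parent.Add(MiniElement(childName, values[i]));
    return parent;
}
```

With a loop, the number of children is limited only by the size of the input vector, not by the number of overloads the library provides.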
There is another way to add more than 16 elements: there is an overloaded Add method which takes a lambda. This method is only available in Visual C++ 11. In earlier versions of Visual C++ (such as Visual C++ 10), the method is disabled by a _MSC_VER check because lambda support in Visual C++ 10 is partially broken. Below is the definition of the Add method.
NewElement Add(auto func(NewElement& parent)->void)
{
func(*this);
return *this;
}
Below is an example of how to add more than 16 elements using a lambda. Note: the parent argument actually refers to the hollywood element.
using namespace Elmax;
NewElement hollywood(L"Hollywood");
hollywood.Add([](Elmax::NewElement &parent)->void {
using namespace Elmax;
NewElement elem = NewElement(L"Stars");
elem.Add(NewElement(L"Actor", L"Johnny Depp"));
elem.Add(NewElement(L"Actor", L"Brad Pitt"));
elem.Add(NewElement(L"Actor", L"Leonardo DiCaprio"));
elem.Add(NewElement(L"Actor", L"Will Smith"));
elem.Add(NewElement(L"Actor", L"George Clooney"));
elem.Add(NewElement(L"Actor", L"Tom Cruise"));
elem.Add(NewElement(L"Actor", L"Matt Damon"));
elem.Add(NewElement(L"Actor", L"Orlando Bloom"));
elem.Add(NewElement(L"Actor", L"Bruce Willis"));
elem.Add(NewElement(L"Actor", L"Steve Carell"));
elem.Add(NewElement(L"Actress", L"Jennifer Aniston"));
elem.Add(NewElement(L"Actress", L"Jessica Alba"));
elem.Add(NewElement(L"Actress", L"Halle Berry"));
elem.Add(NewElement(L"Actress", L"Angelina Jolie"));
elem.Add(NewElement(L"Actress", L"Sandra Bullock"));
elem.Add(NewElement(L"Actress", L"Reese Witherspoon"));
elem.Add(NewElement(L"Actress", L"Jennifer Garner"));
elem.Add(NewElement(L"Actress", L"Julia Roberts"));
elem.Add(NewElement(L"Actress", L"Gwyneth Paltrow"));
elem.Add(NewElement(L"Actress", L"Meg Ryan"));
elem.Add(NewElement(L"Actress", L"Hillary Swank"));
elem.Add(NewElement(L"Actress", L"Uma Thurman"));
elem.Add(NewElement(L"Actress", L"Keira Knightley"));
elem.Add(NewElement(L"Actress", L"Meryl Streep"));
elem.Add(NewElement(L"Actress", L"Cameron Diaz"));
elem.Add(NewElement(L"Actress", L"Salma Hayek"));
elem.Add(NewElement(L"Actress", L"Penelope Cruz"));
elem.Add(NewElement(L"Actress", L"Nicole Kidman"));
elem.Add(NewElement(L"Actress", L"Michelle Pfeiffer"));
elem.Add(NewElement(L"Actress", L"Drew Barrymore"));
elem.Add(NewElement(L"Actress", L"Jennifer Lopez"));
elem.Add(NewElement(L"Actress", L"Catherine Zeta-Jones"));
parent.Add(elem);
});
hollywood.Save(L"C:\\Temp\\Stars.xml", L"1.0", true);
This is what Stars.xml looks like after saving.
<?xml version="1.0" encoding="UTF-8"?>
<Hollywood>
<Stars>
<Actor>Johnny Depp</Actor>
<Actor>Brad Pitt</Actor>
<Actor>Leonardo DiCaprio</Actor>
<Actor>Will Smith</Actor>
<Actor>George Clooney</Actor>
<Actor>Tom Cruise</Actor>
<Actor>Matt Damon</Actor>
<Actor>Orlando Bloom</Actor>
<Actor>Bruce Willis</Actor>
<Actor>Steve Carell</Actor>
<Actress>Jennifer Aniston</Actress>
<Actress>Jessica Alba</Actress>
<Actress>Halle Berry</Actress>
<Actress>Angelina Jolie</Actress>
<Actress>Sandra Bullock</Actress>
<Actress>Reese Witherspoon</Actress>
<Actress>Jennifer Garner</Actress>
<Actress>Julia Roberts</Actress>
<Actress>Gwyneth Paltrow</Actress>
<Actress>Meg Ryan</Actress>
<Actress>Hillary Swank</Actress>
<Actress>Uma Thurman</Actress>
<Actress>Keira Knightley</Actress>
<Actress>Meryl Streep</Actress>
<Actress>Cameron Diaz</Actress>
<Actress>Salma Hayek</Actress>
<Actress>Penelope Cruz</Actress>
<Actress>Nicole Kidman</Actress>
<Actress>Michelle Pfeiffer</Actress>
<Actress>Drew Barrymore</Actress>
<Actress>Jennifer Lopez</Actress>
<Actress>Catherine Zeta-Jones</Actress>
</Stars>
</Hollywood>
Memory Leak Prevention
If you construct a NewElement object and its children without saving, you will have a memory leak. The Save method deletes the internal tree structure after saving, so a user who decides not to save must call the Discard method to delete the internal tree structure instead. Users need to be careful here to avoid memory leaks. I chose not to use smart pointers to store the tree structure, for performance and memory reasons; I am not fond of the idea of using smart pointers in my code.
Points of Interest (SAX and ORM)
I am currently writing the SAX version of Elmax, along with an article titled "The XML SAX Article that Programmers Should (not) be Reading", a sequel to the original Elmax DOM article titled "The XML Parsing Article that Should (not) be Written". For readers who are not familiar with it, SAX stands for Simple API for XML. When reading from a file, SAX reads one node at a time; when writing to a file, it writes one node at a time. SAX does not store the XML nodes in a tree structure like XML DOM does, so its memory requirement for reading a file is minimal compared to DOM. The Reader and Writer classes of the SAX version are kept similar to the Elmax DOM version wherever possible. For the SAX writer class, the Linq-To-XML node creation syntax is the same except for 1 additional requirement.
- For elements which are not created within the scope of a constructor or Add method call, WriteEndElement needs to be called on them. Every XML element has a start tag (e.g., <Book>) and an end tag (e.g., </Book>), unless it has no content and is written as an empty-element tag (e.g., <Book />). The reason for this requirement is that the SAX library has no way of knowing when the user has stopped adding child elements and wants to close the element.
This is how the SAX version of the movie code looks, with the WriteEndElement call.
using namespace Elmax::Writer;
NewElement movies(L"Movies");
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Transformers: Dark of the Moon"),
NewAttribute(L"Year", L"2011"),
NewAttribute(L"RunningTime", ToStr(157)),
NewElement(L"Director", L"Michael Bay"),
NewElement(L"Stars",
NewElement(L"Actor", L"Shia LaBeouf"),
NewElement(L"Actress", L"Rosie Huntington-Whiteley")
),
NewElement(L"DVD",
NewElement(L"Price", L"25.00"),
NewElement(L"Discount", ToStr(0.1))
),
NewElement(L"BluRay",
NewElement(L"Price", L"36.00"),
NewElement(L"Discount", ToStr(0.1))
)
)
);
movies.WriteEndElement();
movies.Save(L"C:\\Temp\\Movies4.xml", L"1.0", true);
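Why a forward-only writer needs an explicit end call can be seen in a few lines. The sketch below is a hypothetical MiniStreamWriter, not the Elmax::Writer API: it emits each start tag immediately and keeps a stack of open element names, so the only way it can learn that an element is finished is for the caller to say so.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical forward-only writer; it is NOT the Elmax::Writer API.
class MiniStreamWriter
{
public:
    // Emit the start tag immediately and remember the name for later.
    void WriteStartElement(const std::wstring& name)
    {
        m_out += L"<" + name + L">";
        m_open.push_back(name);
    }

    // Emit text content as-is (no escaping in this sketch).
    void WriteText(const std::wstring& text) { m_out += text; }

    // Close the most recently opened element; the caller decides when.
    void WriteEndElement()
    {
        assert(!m_open.empty());
        m_out += L"</" + m_open.back() + L">";
        m_open.pop_back();
    }

    std::size_t OpenCount() const { return m_open.size(); }
    const std::wstring& Output() const { return m_out; }

private:
    std::wstring m_out;               // everything written so far
    std::vector<std::wstring> m_open; // names of elements still awaiting an end tag
};
```

Until WriteEndElement is called for the outermost element, OpenCount stays above zero and the output is not yet well-formed XML, which mirrors the requirement described above.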
So what is the rationale for keeping the DOM and SAX syntax similar? The reasons are twofold. First, the user does not need to learn a new syntax or a totally new library to use SAX, so the learning curve is lower. Second, I am writing an XML Object Relational Mapping (ORM) library using Elmax; if I keep the 2 syntaxes similar, the ORM code generators for DOM and SAX Elmax will be similar to write as well, which saves me some coding effort.
Uncommon Pitfall
I do not know if it is just me: when I use Linq-To-XML node creation, I have a few times made the mistake of using names containing whitespace for my elements and attributes. According to the XML specification, names containing whitespace are not allowed. I rarely make this mistake with other, traditional ways of creating XML. This is perhaps because the Linq-To-XML syntax 'mixes' names and values together, whereas with a traditional XML API I always know whether I am specifying a name or a value. If you have problems getting the XML out, please check whether any of your element or attribute names contains whitespace. I am pointing this out in case any readers share the same level of intelligence as the author.
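A library could catch this pitfall early with a quick check in its element and attribute constructors. The helper below is a hypothetical sketch, not an Elmax function, and it is only a sanity check for the whitespace mistake described above, not a full validator for the XML Name production.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Returns true if 'name' is empty or contains any whitespace character.
// This is only a quick sanity check, not a full XML Name validator.
bool HasWhitespaceOrEmpty(const std::wstring& name)
{
    if (name.empty())
        return true;
    for (std::size_t i = 0; i < name.size(); ++i)
        if (std::iswspace(static_cast<std::wint_t>(name[i])))
            return true;
    return false;
}
```

For example, a name such as L"Running Time" would be flagged, while L"RunningTime" would pass.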
Conclusion
We have looked at the different syntaxes of .NET C# Linq-To-XML, C++ Elmax Linq-To-XML and the original C++ Elmax way of node creation. We have briefly discussed the internal workings of C++ Elmax Linq-To-XML node creation. We have also looked at ways to reduce memory consumption and avoid memory leaks. Lastly, I want to leave you with the full code listings of each node creation method, each adding the information of 4 movies and saving it to XML. Elmax is hosted at Codeplex; you can always get the latest version there. Any constructive feedback on the article, good or bad, is welcome.
Thank you for reading!
Code Listing
.NET C# Linq-To-XML node creation
XElement movies = new XElement("Movies");
movies.Add(
new XElement("Movie",
new XAttribute("Name", "Transformers: Dark of the Moon"),
new XAttribute("Year", "2011"),
new XAttribute("RunningTime", 157.ToString()),
new XElement("Director", "Michael Bay"),
new XElement("Stars",
new XElement("Actor", "Shia LaBeouf"),
new XElement("Actress", "Rosie Huntington-Whiteley")
),
new XElement("DVD",
new XElement("Price", "25.00"),
new XElement("Discount", (0.1).ToString())
),
new XElement("BluRay",
new XElement("Price", "36.00"),
new XElement("Discount", (0.1).ToString())
)
)
);
movies.Add(
new XElement("Movie",
new XAttribute("Name", "Taken"),
new XAttribute("Year", "2008"),
new XAttribute("RunningTime", 93.ToString()),
new XElement("Director", "Pierre Morel"),
new XElement("Stars",
new XElement("Actor", "Liam Neeson"),
new XElement("Actress", "Maggie Grace")
),
new XElement("DVD",
new XElement("Price", "20.00"),
new XElement("Discount", (0.2).ToString())
),
new XElement("BluRay",
new XElement("Price", "28.00"),
new XElement("Discount", (0.2).ToString())
)
)
);
movies.Add(
new XElement("Movie",
new XAttribute("Name", "Devil"),
new XAttribute("Year", "2010"),
new XAttribute("RunningTime", 80.ToString()),
new XElement("Director", "John Erick Dowdle"),
new XElement("Stars",
new XElement("Actor", "Chris Messina"),
new XElement("Actor", "Bokeem Woodbine"),
new XElement("Actress", "Caroline Dhavernas")
),
new XElement("DVD",
new XElement("Price", "19.00"),
new XElement("Discount", (0.1).ToString())
),
new XElement("BluRay",
new XElement("Price", "26.00"),
new XElement("Discount", (0.2).ToString())
)
)
);
movies.Add(
new XElement("Movie",
new XAttribute("Name", "Pan's Labyrinth"),
new XAttribute("Year", "2006"),
new XAttribute("RunningTime", 119.ToString()),
new XElement("Director", "Guillermo del Toro"),
new XElement("Stars",
new XElement("Actor", "Sergi López"),
new XElement("Actress", "Ivana Baquero"),
new XElement("Actress", "Ariadna Gil")
),
new XElement("DVD",
new XElement("Price", "21.00"),
new XElement("Discount", (0.2).ToString())
),
new XElement("BluRay",
new XElement("Price", "27.00"),
new XElement("Discount", (0.2).ToString())
)
)
);
XDocument doc = new XDocument(
new XDeclaration("1.0", "utf-8", ""),
movies);
doc.Save(@"C:\Temp\Movies1.xml");
Elmax Linq-To-XML node creation
using namespace Elmax;
NewElement movies(L"Movies");
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Transformers: Dark of the Moon"),
NewAttribute(L"Year", L"2011"),
NewAttribute(L"RunningTime", ToStr(157)),
NewElement(L"Director", L"Michael Bay"),
NewElement(L"Stars",
NewElement(L"Actor", L"Shia LaBeouf"),
NewElement(L"Actress", L"Rosie Huntington-Whiteley")
),
NewElement(L"DVD",
NewElement(L"Price", L"25.00"),
NewElement(L"Discount", ToStr(0.1))
),
NewElement(L"BluRay",
NewElement(L"Price", L"36.00"),
NewElement(L"Discount", ToStr(0.1))
)
)
);
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Taken"),
NewAttribute(L"Year", L"2008"),
NewAttribute(L"RunningTime", ToStr(93)),
NewElement(L"Director", L"Pierre Morel"),
NewElement(L"Stars",
NewElement(L"Actor", L"Liam Neeson"),
NewElement(L"Actress", L"Maggie Grace")
),
NewElement(L"DVD",
NewElement(L"Price", L"20.00"),
NewElement(L"Discount", ToStr(0.2))
),
NewElement(L"BluRay",
NewElement(L"Price", L"28.00"),
NewElement(L"Discount", ToStr(0.2))
)
)
);
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Devil"),
NewAttribute(L"Year", L"2010"),
NewAttribute(L"RunningTime", ToStr(80)),
NewElement(L"Director", L"John Erick Dowdle"),
NewElement(L"Stars",
NewElement(L"Actor", L"Chris Messina"),
NewElement(L"Actor", L"Bokeem Woodbine"),
NewElement(L"Actress", L"Caroline Dhavernas")
),
NewElement(L"DVD",
NewElement(L"Price", L"19.00"),
NewElement(L"Discount", ToStr(0.1))
),
NewElement(L"BluRay",
NewElement(L"Price", L"26.00"),
NewElement(L"Discount", ToStr(0.2))
)
)
);
movies.Add(
NewElement(L"Movie",
NewAttribute(L"Name", L"Pan's Labyrinth"),
NewAttribute(L"Year", L"2006"),
NewAttribute(L"RunningTime", ToStr(119)),
NewElement(L"Director", L"Guillermo del Toro"),
NewElement(L"Stars",
NewElement(L"Actor", L"Sergi López"),
NewElement(L"Actress", L"Ivana Baquero"),
NewElement(L"Actress", L"Ariadna Gil")
),
NewElement(L"DVD",
NewElement(L"Price", L"21.00"),
NewElement(L"Discount", ToStr(0.2))
),
NewElement(L"BluRay",
NewElement(L"Price", L"27.00"),
NewElement(L"Discount", ToStr(0.2))
)
)
);
movies.Save(L"C:\\Temp\\Movies2.xml", L"1.0", true);
Elmax node creation
MSXML2::IXMLDOMDocumentPtr pDoc;
HRESULT hr = CreateAndInitDom(pDoc);
if (SUCCEEDED(hr))
{
using namespace Elmax;
Element root;
root.SetConverter(NORMAL_CONV);
root.SetDomDoc(pDoc);
Element movies = root[L"Movies"];
Element movie = movies[L"Movie"].CreateNew();
movie.Attribute(L"Name") = L"Transformers: Dark of the Moon";
movie.Attribute(L"Year") = L"2011";
movie.Attribute(L"RunningTime") = 157;
movie[L"Director"] = L"Michael Bay";
movie[L"Stars|Actor"] = L"Shia LaBeouf";
movie[L"Stars|Actress"] = L"Rosie Huntington-Whiteley";
movie[L"DVD|Price"] = L"25.00";
movie[L"DVD|Discount"] = 0.1;
movie[L"BluRay|Price"] = L"36.00";
movie[L"BluRay|Discount"] = 0.1;
movie = movies[L"Movie"].CreateNew();
movie.Attribute(L"Name") = L"Taken";
movie.Attribute(L"Year") = L"2008";
movie.Attribute(L"RunningTime") = 93;
movie[L"Director"] = L"Pierre Morel";
movie[L"Stars|Actor"] = L"Liam Neeson";
movie[L"Stars|Actress"] = L"Maggie Grace";
movie[L"DVD|Price"] = L"20.00";
movie[L"DVD|Discount"] = 0.2;
movie[L"BluRay|Price"] = L"28.00";
movie[L"BluRay|Discount"] = 0.2;
movie = movies[L"Movie"].CreateNew();
movie.Attribute(L"Name") = L"Devil";
movie.Attribute(L"Year") = L"2010";
movie.Attribute(L"RunningTime") = 80;
movie[L"Director"] = L"John Erick Dowdle";
movie[L"Stars|Actor"] = L"Chris Messina";
// CreateNew() appends a second Actor element instead of overwriting the first one.
movie[L"Stars|Actor"].CreateNew() = L"Bokeem Woodbine";
movie[L"Stars|Actress"] = L"Caroline Dhavernas";
movie[L"DVD|Price"] = L"19.00";
movie[L"DVD|Discount"] = 0.1;
movie[L"BluRay|Price"] = L"26.00";
movie[L"BluRay|Discount"] = 0.2;
movie = movies[L"Movie"].CreateNew();
movie.Attribute(L"Name") = L"Pan's Labyrinth";
movie.Attribute(L"Year") = L"2006";
movie.Attribute(L"RunningTime") = 119;
movie[L"Director"] = L"Guillermo del Toro";
movie[L"Stars|Actor"] = L"Sergi López";
movie[L"Stars|Actress"] = L"Ivana Baquero";
movie[L"Stars|Actress"].CreateNew() = L"Ariadna Gil";
movie[L"DVD|Price"] = L"21.00";
movie[L"DVD|Discount"] = 0.2;
movie[L"BluRay|Price"] = L"27.00";
movie[L"BluRay|Discount"] = 0.2;
SaveXml(pDoc, L"C:\\Temp\\Movies3.xml");
}
This is the XML output.
<?xml version="1.0" encoding="utf-8"?>
<Movies>
<Movie Name="Transformers: Dark of the Moon" Year="2011" RunningTime="157">
<Director>Michael Bay</Director>
<Stars>
<Actor>Shia LaBeouf</Actor>
<Actress>Rosie Huntington-Whiteley</Actress>
</Stars>
<DVD>
<Price>25.00</Price>
<Discount>0.1</Discount>
</DVD>
<BluRay>
<Price>36.00</Price>
<Discount>0.1</Discount>
</BluRay>
</Movie>
<Movie Name="Taken" Year="2008" RunningTime="93">
<Director>Pierre Morel</Director>
<Stars>
<Actor>Liam Neeson</Actor>
<Actress>Maggie Grace</Actress>
</Stars>
<DVD>
<Price>20.00</Price>
<Discount>0.2</Discount>
</DVD>
<BluRay>
<Price>28.00</Price>
<Discount>0.2</Discount>
</BluRay>
</Movie>
<Movie Name="Devil" Year="2010" RunningTime="80">
<Director>John Erick Dowdle</Director>
<Stars>
<Actor>Chris Messina</Actor>
<Actor>Bokeem Woodbine</Actor>
<Actress>Caroline Dhavernas</Actress>
</Stars>
<DVD>
<Price>19.00</Price>
<Discount>0.1</Discount>
</DVD>
<BluRay>
<Price>26.00</Price>
<Discount>0.2</Discount>
</BluRay>
</Movie>
<Movie Name="Pan's Labyrinth" Year="2006" RunningTime="119">
<Director>Guillermo del Toro</Director>
<Stars>
<Actor>Sergi López</Actor>
<Actress>Ivana Baquero</Actress>
<Actress>Ariadna Gil</Actress>
</Stars>
<DVD>
<Price>21.00</Price>
<Discount>0.2</Discount>
</DVD>
<BluRay>
<Price>27.00</Price>
<Discount>0.2</Discount>
</BluRay>
</Movie>
</Movies>
History
- 2012-06-06 : Added another method for adding more than 16 elements, described in the "Adding Beyond 16 Nodes" section.
- 2012-04-10 : Updated the source code (to version 0.84 beta) to include PJ Arends' RootElement class and the Elmax.h header.
- 2012-04-09 : Updated the source code (to version 0.83 beta) to include PJ Arends' fix for the missing closing tag (when an element has no child elements) in the PrettySave and Save methods.
- 2011-11-04 : Updated the source code (to version 0.82 beta) to stop using MSXML for saving, which reduces memory requirements and improves performance. Fixed the PrettySave method, as the previous version taken from an MSDN forum did not work. Removed the memory consumption section from the article.
- 2011-10-20 : Initial Release