Library usage
In this section, we'll look at how to use the Elmax library to create, read, update and delete (CRUD) elements, attributes, CData sections and comments. As you can see from the previous code sample, Elmax makes use of the Microsoft XML DOM library. That's because I do not wish to re-create all that XML functionality, XPath for instance. Since Elmax depends on Microsoft XML, which in turn depends on COM, we have to call CoInitialize(NULL); to initialize the COM runtime at the start of the application and call CoUninitialize(); to uninitialize it before the application ends. Elmax is an abstraction over the DOM; however, it does not seek to replicate all the functionality of the DOM. For example, the programmer cannot use Elmax to read element siblings. In the Elmax model, the element is a first-class citizen. Attributes, CData sections and comments are children of an element! This is different from the DOM, where they are nodes in their own right. I designed CData sections and comments to be children of an element because they are not identifiable by name or ID.
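One way to keep the CoInitialize/CoUninitialize pair balanced is a small scope guard. The sketch below is only an illustration: ComScope is a hypothetical helper that is not part of Elmax, and the stand-in functions in the #else branch are assumptions that exist purely so the snippet compiles on non-Windows machines.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <objbase.h>   // CoInitialize, CoUninitialize, SUCCEEDED
#else
// Stand-ins (assumption) so the sketch compiles off Windows; they do no real work.
typedef long HRESULT;
inline HRESULT CoInitialize(void*) { return 0; }
inline void CoUninitialize() {}
inline bool SUCCEEDED(HRESULT hr) { return hr >= 0; }
#endif

// Hypothetical helper, not part of Elmax: pairs CoInitialize with CoUninitialize.
class ComScope
{
public:
    ComScope() : m_ok(SUCCEEDED(::CoInitialize(NULL))) {}
    // Only a successful CoInitialize must be balanced by CoUninitialize.
    ~ComScope() { if (m_ok) ::CoUninitialize(); }
    bool Ok() const { return m_ok; }
private:
    ComScope(const ComScope&);            // non-copyable
    ComScope& operator=(const ComScope&); // non-copyable
    bool m_ok;
};
```

Declaring a ComScope as the first local variable in main (or in a worker thread function) means it is destroyed last, after the DOM document and any Element objects declared later in the same scope.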
Element creation
Element all = root[L"All"];
all[L"Version"] = 1;
Element books = all[L"Books"].CreateNew();
Typically, we use CreateNew to create elements. There is also a Create method. The difference is that Create will not create the elements if they already exist. Notice that I did not use Create or CreateNew to create the All and Version elements? That's because they are created automatically when I assign a value to the last element in the chain. Note that when you call CreateNew repeatedly, only the last element in the chain gets created again. Let me show you a code example to explain this.
root[L"aa"][L"bb"][L"cc"].CreateNew();
root[L"aa"][L"bb"][L"cc"].CreateNew();
root[L"aa"][L"bb"][L"cc"].CreateNew();
In the 1st CreateNew call, elements "aa", "bb" and "cc" are created. In each subsequent call, only element cc is created. This is the resultant XML created (and indented for easy reading).
<aa>
  <bb>
    <cc/>
    <cc/>
    <cc/>
  </bb>
</aa>
Create and CreateNew have an optional parameter to specify the namespace URI. If your element belongs to a namespace, you must create it explicitly using Create or CreateNew; you cannot rely on value assignment to create it automatically. More on this later. Note: if you call any Element instance method other than Create, CreateNew, the setters and the accessors when the element does not exist, Elmax will raise an exception! When do we use Create instead of CreateNew? One possible scenario is an application that loads an XML file, edits it and saves it. In the edit stage, it does not need to check whether an element exists in the original XML file before assigning to it or adding nodes: call Create, which creates the element if it does not exist and otherwise does nothing.
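To make the difference between Create and CreateNew concrete, here is a toy model in plain standard C++. It is not Elmax code and does not touch MSXML: Node, Resolve, Create and CreateNew are hypothetical names, and the toy treats its root as a plain container. It only mimics the behaviour described above, where missing ancestors are created once and only the last name in the path is duplicated by CreateNew.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an element tree; none of this is Elmax code.
struct Node
{
    std::wstring name;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(const std::wstring& n) : name(n) {}

    // Return the first child with this name, or nullptr if there is none.
    Node* Find(const std::wstring& n) const
    {
        for (const auto& c : children)
            if (c->name == n) return c.get();
        return nullptr;
    }

    // Always append a new child, even if one with the same name exists.
    Node* Append(const std::wstring& n)
    {
        children.push_back(std::unique_ptr<Node>(new Node(n)));
        return children.back().get();
    }
};

// Walk the path, creating missing ancestors. For the last name, reuse an
// existing node unless forceNew is set.
Node* Resolve(Node& root, const std::vector<std::wstring>& path, bool forceNew)
{
    Node* cur = &root;
    for (std::size_t i = 0; i < path.size(); ++i)
    {
        const bool last = (i + 1 == path.size());
        Node* next = (last && forceNew) ? nullptr : cur->Find(path[i]);
        cur = next ? next : cur->Append(path[i]);
    }
    return cur;
}

Node* Create(Node& root, const std::vector<std::wstring>& path)
{
    return Resolve(root, path, false);
}

Node* CreateNew(Node& root, const std::vector<std::wstring>& path)
{
    return Resolve(root, path, true);
}
```

Calling the toy CreateNew three times with the path aa, bb, cc leaves one "aa", one "bb" and three "cc" nodes, while a following Create call with the same path adds nothing.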
Element Removal
using namespace Elmax;
Element elem;
Element elemChild = elem[L"Child"];
// do processing
elem.RemoveNode(elemChild); // Remove its child node.
elem.RemoveNode(); // Remove itself from DOM.
Note: in the current version, the AddNode method can only add a node which has previously been removed.
Element Value Assignment
In the beginning of the article, I showed how to create elements and assign a value to the last element at the same time. I'll repeat that code snippet here.
Elmax::Element root;
root.SetDomDoc(pDoc); // An empty DOM doc is initialized beforehand.
root[L"Books"][L"Book"][L"Price"] = 12.99f;
It turns out that this example is dangerous, as the overloaded assignment operator that gets used is chosen by the compiler from the type of the value. What if you mean to assign a float but assign an integer instead, just because you forgot to add a ".0" and append an 'f' to the value? Not much harm in this case, I suppose. Still, in all scenarios it is better to use the setter method to assign the value explicitly.
Elmax::Element root;
root.SetDomDoc(pDoc); // An empty DOM doc is initialized beforehand.
root[L"Books"][L"Book"][L"Price"].SetFloat(12.99f);
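The point about the overloaded assignment operator can be seen in isolation with a tiny stand-in class. Value and OverloadFor below are hypothetical names, not Elmax's Element; the sketch only records which overload the compiler picks for a given literal.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in with overloaded assignment; not Elmax's Element.
struct Value
{
    std::string chosen; // records which overload ran

    Value& operator=(int)    { chosen = "int";    return *this; }
    Value& operator=(float)  { chosen = "float";  return *this; }
    Value& operator=(double) { chosen = "double"; return *this; }
};

// Assign v to a Value and report the overload the compiler selected.
template <typename T>
std::string OverloadFor(T v)
{
    Value x;
    x = v;
    return x.chosen;
}
```

In this sketch, 12.99f selects the float overload, 12.99 (no suffix) selects double, and 12 selects int. A named setter such as SetFloat removes that dependence on how the literal happens to be written.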
Here is the list of setter methods available.
bool SetBool(bool val);
bool SetChar(char val);
bool SetShort(short val);
bool SetInt32(int val);
bool SetInt64(__int64 val);
bool SetUChar(unsigned char val);
bool SetUShort(unsigned short val);
bool SetUInt32(unsigned int val);
bool SetUInt64(unsigned __int64 val);
bool SetFloat(float val);
bool SetDouble(double val);
bool SetString(const std::wstring& val);
bool SetString(const std::string& val);
bool SetGUID(const GUID& val);
bool SetDate(const Elmax::Date& val);
bool SetDateTime(const Elmax::DateAndTime& val);
Element Value Reading
In the beginning of the article, I showed how to read a value from an element. I'll repeat the code snippet here.
Elmax::Element root;
root.SetDomDoc(pDoc); // An XML file is read into the DOM doc beforehand.
Elmax::Element elemPrice = root[L"Books"][L"Book"][L"Price"];
if(elemPrice.Exists())
float price = elemPrice;
This is the more correct version, using the GetFloat accessor to specify a default value.
Elmax::Element root;
root.SetDomDoc(pDoc); // An XML file is read into the DOM doc beforehand.
Elmax::Element elemPrice = root[L"Books"][L"Book"][L"Price"];
if(elemPrice.Exists())
float price = elemPrice.GetFloat(10.0f);
price will get the default value of 10.0f if the value does not exist or is invalid, whereas in the prior example it will get 0.0f because no default value is specified. By default, however, Elmax cannot tell that the string value is an improper float in textual form unless you use regular expressions to validate it. Set REGEX_CONV instead of NORMAL_CONV in the root element to use the regular expression type converter. As an alternative, you can use a schema or DTD to validate your XML before parsing it with Elmax. To learn about schema or DTD validation, please consult MSDN.
Elmax::Element root;
root.SetConverter(REGEX_CONV);
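To illustrate the idea of validating text with a regular expression before converting it, here is a minimal sketch in standard C++. ToFloatOr and its pattern are assumptions made for this example; they are not Elmax's REGEX_CONV implementation, whose exact rules may differ.

```cpp
#include <cassert>
#include <cwchar>
#include <regex>
#include <string>

// Hypothetical helper, not Elmax's converter: return defaultVal unless the
// whole string is a plain decimal float.
float ToFloatOr(const std::wstring& text, float defaultVal)
{
    // Optional sign, digits with an optional fraction (or a bare fraction),
    // optional exponent, optional surrounding whitespace.
    static const std::wregex pattern(
        L"\\s*[+-]?(\\d+(\\.\\d*)?|\\.\\d+)([eE][+-]?\\d+)?\\s*");

    // regex_match succeeds only if the entire string fits the pattern.
    if (!std::regex_match(text, pattern))
        return defaultVal;
    return std::wcstof(text.c_str(), nullptr);
}
```

Without the regular expression check, std::wcstof on its own would read 12.5 out of "12.5abc" and silently ignore the trailing garbage, which is the kind of improper value the default is meant to catch.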
This is the declaration of SetConverter method.
//! Set the type converter pointer
void SetConverter(CONVERTER conv, IConverter* pConv=NULL);
To use your own custom type converter, set the optional pConv pointer.
Elmax::Element root;
root.SetConverter(CUSTOM_CONV, pCustomTypeConv);
You are responsible for deleting pCustomTypeConv if it is allocated on the heap. There are locale type converters in Elmax, but they are not tested at this point because I am not sure how to test them: in Asia, number representations are the same across countries, unlike in Europe. As a tip to readers who might be modifying Elmax, remember to run all 252 unit tests after modification to make sure you did not break anything. The unit tests can only be run in Visual Studio 2010. Below is a list of the value accessors available.
bool GetBool(bool defaultVal) const;
char GetChar(char defaultVal) const;
short GetShort(short defaultVal) const;
int GetInt32(int defaultVal) const;
__int64 GetInt64(__int64 defaultVal) const;
unsigned char GetUChar(unsigned char defaultVal) const;
unsigned short GetUShort(unsigned short defaultVal) const;
unsigned int GetUInt32(unsigned int defaultVal) const;
unsigned __int64 GetUInt64(unsigned __int64 defaultVal) const;
float GetFloat(float defaultVal) const;
double GetDouble(double defaultVal) const;
std::wstring GetString(const std::wstring& defaultVal) const;
std::string GetString(const std::string& defaultVal) const;
GUID GetGUID(const GUID& defaultVal) const;
Elmax::Date GetDate(const Elmax::Date& defaultVal) const;
Elmax::DateAndTime GetDateTime(
const Elmax::DateAndTime& defaultVal) const;
For GetBool, "true", "yes", "ok" and "1" evaluate to true, while "false", "no", "cancel" and "0" evaluate to false. The comparison is not case-sensitive.
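The boolean rules above can be sketched in a few lines of standard C++. ToBoolOr is a hypothetical helper written for this illustration, not the Elmax source, and falling back to the default for unrecognised text is my assumption.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical helper, not Elmax source: map the tokens listed above to bool,
// ignoring case.
bool ToBoolOr(const std::wstring& text, bool defaultVal)
{
    std::wstring lower;
    for (std::wstring::size_type i = 0; i < text.size(); ++i)
        lower += static_cast<wchar_t>(std::towlower(text[i]));

    if (lower == L"true" || lower == L"yes" || lower == L"ok" || lower == L"1")
        return true;
    if (lower == L"false" || lower == L"no" || lower == L"cancel" || lower == L"0")
        return false;
    return defaultVal; // unrecognised text falls back to the default (assumption)
}
```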
Namespace
To create an element under a namespace URI, "http://www.yahoo.com", see below,
using namespace Elmax;
Element all = root[L"All"];
all[L"Version"] = 1;
Element books = all[L"Books"].CreateNew();
Element book1 = books[L"Book"].CreateNew(L"http://www.yahoo.com");
The XML output is as below,
<?xml version="1.0" encoding="UTF-8"?>
<All>
<Version>1</Version>
<Books>
<Book xmlns="http://www.yahoo.com"/>
</Books>
</All>
To create a bunch of elements and an attribute under a namespace URI, see below,
using namespace Elmax;
Element all = root[L"All"];
all[L"Version"] = 1;
Element books = all[L"Books"].CreateNew();
Element book1 = books[L"Yahoo:Book"].CreateNew(L"http://www.yahoo.com");
book1.Attribute(L"Yahoo:ISBN").Create(L"http://www.yahoo.com");
book1.Attribute(L"Yahoo:ISBN") = L"1111-1111-1111";
book1[L"Yahoo:Title"].Create(L"http://www.yahoo.com");
book1[L"Yahoo:Title"] = L"How not to program!";
book1[L"Yahoo:Price"].Create(L"http://www.yahoo.com");
book1[L"Yahoo:Price"] = 12.99f;
book1[L"Yahoo:Desc"].Create(L"http://www.yahoo.com");
book1[L"Yahoo:Desc"] = L"Learn how not to program from the industry's "
L"worst programmers! Treat it as inverse education."; // adjacent literals are concatenated
book1[L"Yahoo:AuthorID"].Create(L"http://www.yahoo.com");
book1[L"Yahoo:AuthorID"] = 111;
The XML output is as below,
<All>
<Version>1</Version>
<Books>
<Yahoo:Book xmlns:Yahoo="http://www.yahoo.com"
Yahoo:ISBN="1111-1111-1111">
<Yahoo:Title>How not to program!</Yahoo:Title>
<Yahoo:Price>12.990000</Yahoo:Price>
<Yahoo:Desc>Learn how not to program from the industry's
worst programmers! Treat it as inverse education.</Yahoo:Desc>
<Yahoo:AuthorID>111</Yahoo:AuthorID>
</Yahoo:Book>
</Books>
</All>
Enumerating same elements
You can use the AsCollection method to get siblings with the same name in a vector.
using namespace Elmax;
Element root;
root.SetConverter(NORMAL_CONV);
root.SetDomDoc(pDoc);
Element elem1 = root[L"aa|bb|cc"].CreateNew();
elem1.SetInt32(11);
Element elem2 = root[L"aa|bb|cc"].CreateNew();
elem2.SetInt32(22);
Element elem3 = root[L"aa|bb|cc"].CreateNew();
elem3.SetInt32(33);
Element::collection_t vec = root[L"aa"][L"bb"][L"cc"].AsCollection();
for(size_t i=0;i<vec.size(); ++i)
{
int n = vec.at(i).GetInt32(10);
}
This overloaded form (below) of AsCollection is faster as it does not create a temporary vector before returning.
bool AsCollection(const std::wstring& name, collection_t& vec);
Enumerating same child elements
You can use the GetCollection method to get children with the same name in a vector.
using namespace Elmax;
Element root;
root.SetConverter(NORMAL_CONV);
root.SetDomDoc(pDoc);
Element elem1 = root[L"aa|bb|cc"].CreateNew();
elem1.SetInt32(11);
Element elem2 = root[L"aa|bb|cc"].CreateNew();
elem2.SetInt32(22);
Element elem3 = root[L"aa|bb|cc"].CreateNew();
elem3.SetInt32(33);
Element::collection_t vec = root[L"aa"][L"bb"].GetCollection(L"cc");
for(size_t i=0;i<vec.size(); ++i)
{
int n = vec.at(i).GetInt32(10);
}
This overloaded form (below) of GetCollection is faster as it does not create a temporary vector before returning.
bool GetCollection(const std::wstring& name, collection_t& vec);
Query number of children
To query the number of children for each name, you can use QueryChildrenNum method.
using namespace Elmax;
Element root;
root.SetConverter(NORMAL_CONV);
root.SetDomDoc(pDoc);
Element elem1 = root[L"aa|bb|qq"].CreateNew();
elem1.SetInt32(11);
Element elem2 = root[L"aa|bb|cc"].CreateNew();
elem2.SetInt32(22);
Element elem3 = root[L"aa|bb|cc"].CreateNew();
elem3.SetInt32(33);
Element elem4 = root[L"aa|bb|qq"].CreateNew();
elem4.SetInt32(44);
Element elem5 = root[L"aa|bb|cc"].CreateNew();
elem5.SetInt32(55);
Element::available_child_t acmap =
root[L"aa"][L"bb"].QueryChildrenNum();
assert(acmap[L"cc"] == (unsigned int)(3));
assert(acmap[L"qq"] == (unsigned int)(2));
There is also an overloaded form (below) of QueryChildrenNum which does not create a temporary map before returning. Note: QueryChildrenNum can only count elements, not attributes, CData sections or comments.
typedef std::map< std::wstring, size_t > available_child_t;
bool QueryChildrenNum(available_child_t& children);
Shortcut to avoid temporary element creation
In the previous enumeration example, I used
Elmax::Element elem1 = root[L"aa|bb|cc"].CreateNew();
instead of
Elmax::Element elem1 = root[L"aa"][L"bb"][L"cc"].CreateNew();
because the 2nd form creates temporary elements "aa" and "bb" on the stack which are not used. The 1st form saves some tedious typing and returns only 1 element from the overloaded [] operator, not to mention that it is faster too. '\\' and '/' can be used as delimiters as well. To speed up the code below, which uses temporaries excessively,
if(root[L"aa"][L"bb"][L"cc"][L"dd"][L"ee"].Exists())
{
root[L"aa"][L"bb"][L"cc"][L"dd"][L"ee"][L"Title"] = L"Beer jokes";
root[L"aa"][L"bb"][L"cc"][L"dd"][L"ee"][L"Author"] = L"The joker";
root[L"aa"][L"bb"][L"cc"][L"dd"][L"ee"][L"Price"] = 10.0f;
}
you can assign it to an Element variable and use that variable instead.
Elmax::Element elem1 = root[L"aa|bb|cc|dd|ee"];
if(elem1.Exists())
{
elem1[L"Title"] = L"Beer jokes";
elem1[L"Author"] = L"The joker";
elem1[L"Price"] = 10.0f;
}
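The delimited form used in this section amounts to splitting one string into element names. The sketch below shows one way that could be done in standard C++; SplitPath is a hypothetical helper, not the code Elmax uses inside its [] operator, and skipping empty segments is my assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not Elmax source: split a path on '|', '\\' or '/'.
std::vector<std::wstring> SplitPath(const std::wstring& path)
{
    std::vector<std::wstring> names;
    std::wstring::size_type start = 0;
    while (start <= path.size())
    {
        std::wstring::size_type end = path.find_first_of(L"|\\/", start);
        if (end == std::wstring::npos)
            end = path.size();
        if (end > start) // skip empty segments such as the one in "aa||bb"
            names.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return names;
}
```

Splitting once and walking the resulting names is what lets a single [] call stand in for a chain of them.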
Root element
The root element is created when you call SetDomDoc on the element. You should know by now that the [] operator is used to access a child element. For the root element, however, the [] operator refers to the root itself and checks whether its name corresponds to the name given in the [] operator.
Element root;
root.SetDomDoc(pDoc);
Element elem1 = root[L"aa|bb|cc"];
The "aa" element in the above example actually refers to the root, not a child of the root. If SetDomDoc() has not been called on an element, then "aa" refers to its child. When using the [] operator, please remember to prefix the (wide) string literal with 'L', e.g. elem[L"Hello"], or else you will get a strange, unhelpful error. Elements are created directly or indirectly from the root. For example, the root creates the "aa" element, and the "aa" element has the ability to create other elements. If you instantiate an element that does not come from the root, it cannot create elements. This is a limitation of the MS XML DOM, in which only the DOM document can create nodes. Elements created directly or indirectly from the root receive its DOM document, and thus the ability to create elements.
Shared State in Multithreading
You might be using different Elmax Element objects in different threads without sharing them across threads. However, Element has static type converter objects which are shared by all Element objects. To overcome this problem, allocate a new type converter and use that in the root. Remember to delete the converter after use.
using namespace Elmax;
Element root;
root.SetDomDoc(pDoc);
RegexConverter* pRegex = new RegexConverter();
root.SetConverter(CUSTOM_CONV, pRegex);
By the way, remember to call CoInitialize/CoUninitialize in your worker threads!
Save File Contents in XML
You can call SetFileContents to save a file's binary contents in Base64 format in the Element. You can choose to save its file name and file length in attributes if you intend to save the contents back to a file with the same name on disk. We need to save the original file length as well because GetFileContents sometimes reports a longer length after the Base64 conversion!
bool SetFileContents(const std::wstring& filepath,
bool bSaveFilename,
bool bSaveFileLength);
We use GetFileContents to get the file contents back from the Base64 conversion. filename is written only if you specified saving the file name in SetFileContents. length is the length of the returned character array, not the saved file length attribute.
char* GetFileContents(std::wstring& filename, size_t& length);
Attribute
To create an attribute (if it does not exist) and assign a string to it, see the example below.
book1.Attribute(L"ISBN") = L"1111-1111-1111";
To create an attribute with a namespace URI and assign a string to it, you have to create it explicitly.
book1.Attribute(L"Yahoo:ISBN").Create(L"http://www.yahoo.com");
book1.Attribute(L"Yahoo:ISBN") = L"1111-1111-1111";
To delete an attribute, use the Delete method.
book1.Attribute(L"ISBN").Delete();
To find out whether an attribute with the name exists, use the Exists method.
bool bExists = book1.Attribute(L"ISBN").Exists();
The Attribute setters and accessors are the same as Element's, and they use the same type converter.
Comments
For your information, XML comments come in the form of <!--My example comments here-->. Below are a bunch of operations you can use with comments.
using namespace Elmax;
Element elem = root[L"aa"][L"bb"][L"cc"].CreateNew();
elem.AddComment(L"Can you see me?"); // add a new comment!
Comment comment = elem.GetComment(0); // get comment at 0 index
comment.Update(L"Can you hear me?"); // update the comment
comment.Delete(); // Delete this comment node!
You can get a vector of Comment objects which are children of the element, using GetCommentCollection method.
CData section
For your information, an XML CData section comes in the form of <![CDATA[" <IgnoredInCDataSection/> "]]>. A CData section typically contains data which is not parsed by the parser, therefore it can contain < and > and other characters that are invalid in text. Some programmers prefer to store such data in Base64 format (see the next section). Below are a bunch of operations you can use with CData sections.
using namespace Elmax;
Element elem = root[L"aa"][L"bb"][L"cc"].CreateNew();
elem.AddCData(L"<<>>"); // add a new CData section!
CData cdata = elem.GetCData(0); // get CData section at 0 index
cdata.Update(L">><<"); // update the CData section
cdata.Delete(); // Delete this CData section node!
You can get a vector of CData sections which are children of the element, using GetCDataCollection method.
Base64
Some programmers prefer to store binary data in Base64 format under 1 element, instead of in a CData section, so that it is easy to identify and find. The downside is that Base64 takes up more space and the data conversion takes time. The code example shows how to convert to Base64 before assignment, and how to convert back from Base64 to binary data after reading.
Elmax::Element elem1;
string strNormal = "@#$^*_+-|\\~<>"; // the backslash must be escaped in a string literal
// Assigning base64 data
elem1 = Element::ConvToBase64(strNormal.c_str(), strNormal.length());
// Reading base64 data
wstring strBase64 = elem1.GetString(L"ABC");
size_t len = 0;
// Get the length required
Element::ConvFromBase64(strBase64, NULL, len);
char* p = new char[len+1];
memset(p, 0, len+1);
Element::ConvFromBase64(strBase64, p, len);
// process p here (not shown). Remember to release it with delete[] p.
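To put a number on the extra space Base64 needs, here is a small standalone sketch. ToBase64 and Base64Length are hypothetical names; this is a minimal standard-alphabet encoder written for illustration, not the Base64 class that Elmax uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encoded length for n input bytes: every 3 bytes (rounded up) become 4 characters.
std::size_t Base64Length(std::size_t n)
{
    return 4 * ((n + 2) / 3);
}

// Minimal standard-alphabet Base64 encoder with '=' padding; a sketch only.
std::string ToBase64(const std::string& in)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(Base64Length(in.size()));
    for (std::size_t i = 0; i < in.size(); i += 3)
    {
        const std::size_t left = in.size() - i; // bytes remaining from position i
        // Pack up to three bytes into one 24-bit group.
        unsigned long group = static_cast<unsigned char>(in[i]) << 16;
        if (left > 1) group |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (left > 2) group |= static_cast<unsigned char>(in[i + 2]);
        // Emit four 6-bit symbols, padding with '=' where input ran out.
        out += tbl[(group >> 18) & 0x3F];
        out += tbl[(group >> 12) & 0x3F];
        out += (left > 1) ? tbl[(group >> 6) & 0x3F] : '=';
        out += (left > 2) ? tbl[group & 0x3F] : '=';
    }
    return out;
}
```

A 13-byte string therefore encodes to 20 characters, roughly a one-third increase before any XML overhead is counted.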
C++0x move constructor
The Elmax library defines some C++0x move constructors and move assignment operators. To build the library in Visual Studio versions prior to 2010, you have to hide them by defining _HAS_CPP0X to be 0 in stdafx.h.
What is in the Elmax name?
The abstraction model and the library are named "Elmax" because there is an 'X', 'M' and 'L' in "Elmax". <whisper>I can tell you the real reason but you must not tell anyone, else I have to eliminate you from this world! The reason is the author likes to crack jokes in real life. But all his jokes are deemed by everyone to be lame and cold. In Chinese, a cold joke means a joke which is not funny or laughable at all! If you rearrange the letters in "Elmax", you get "LameX" which refers to the author!</whisper>
What's next?
In the next article, XML parsing is going to get even easier! That is, parsing is eliminated; the programmer does not have to do the XML parsing himself/herself! XML parsing is done automatically, along the lines of Object Relational Mapping (ORM). I personally don't see the need for the programmer to do XML parsing. Just pass in a specially formatted structure (or structures) with an XML file and the library will fill in the structure for you! Just kidding! There is no way I'll have time for this as my part-time undergrad course is starting soon!
Thanks for reading!
Bug reports
For bug reports and feature requests, please file them here. When you file a bug report, please include the sample code and XML file (if any) needed to reproduce the bug. The current Elmax is at version 0.6 Beta. Its CodePlex site is located at http://elmax.codeplex.com/
History
22/12/2010 : 1st release
References
Base64 conversion class used in Elmax is from Jan Raddatz's article on Codeguru: BASE 64 Decoding and Encoding Class