Simple XML Parsing on WinCE 4.2 Using C++ and MSXML 3.0

This article will show you how to parse XML files on WinCE using MSXML 3.0, an XML parser from Microsoft.

Okay, there are many articles around showing you how to parse XML, but not that many for WinCE using Visual C++. A lot of examples are for the .NET environment using Visual Basic .NET and C#. I'd like to share my experience here with parsing XML on Windows CE 4.2. The classes I present can be run on any device that has WinCE 4.2 with MSXML 3.0 installed.

Please note that I am using MSXML 3.0 as of writing this article; this is the latest version supported by WinCE 4.2. WinXP and Win2K support version 4.0, which includes updated interfaces and supports more XPath expressions.

The prerequisites for this article are that you have some experience with XML and the ways in which it can be manipulated using DOM or SAX. This article uses the former. Also, some experience of COM would be helpful.

A Brief Overview of DOM (Document Object Model)

DOM presents an XML document as a tree structure, just like a file hierarchy in Windows Explorer. The tree has a single root node; the root has child nodes, and each of those can in turn be a parent with child nodes of its own. I think you get the picture. You can refer to these nodes as elements, with other elements embedded inside them. These elements contain text and attributes that are manipulated through the DOM tree. The contents of these elements can be modified or deleted, and new elements can be created.

MSXML—Microsoft's XML Parser

MSXML is based on COM; it comes with Internet Explorer 5 and above. This component has many functions that help you traverse the XML document, access nodes within it, delete these nodes, update these nodes, insert nodes, and more. It is worth noting that MSXML also supports XSLT and XPath. I won't be using these technologies in this article, but just so you know, these are supported.

That was a brief description of DOM and MSXML. If you need to know more, the Internet is a great resource that has many articles on the aforementioned topics.

Initialising MSXML

Now, on to some code. The first thing you need to do to use MSXML is to initialise it. Remember, I mentioned earlier that this is a COM component, so you first need to initialise COM:

HRESULT hr = ::CoInitializeEx( NULL, COINIT_MULTITHREADED );

This will return S_OK if all is well.
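Comparing against S_OK alone can be too strict. COM reports success through any non-negative HRESULT, and CoInitializeEx returns S_FALSE when COM has already been initialised on the calling thread. The sketch below mirrors how the SUCCEEDED and FAILED macros classify a result. It is portable C++ for illustration only: the HResult typedef, the k-prefixed constants, and the Succeeded and Failed helpers are stand-ins defined here, not the Windows definitions themselves. In real code, use the macros from the Windows headers.

```cpp
#include <cstdint>

// Stand-in for the Windows HRESULT type: a signed 32-bit value whose
// top bit is set for failures.
typedef std::int32_t HResult;

// Well-known values, reproduced here so the sketch compiles anywhere.
const HResult kS_OK    = 0;                                    // plain success
const HResult kS_FALSE = 1;                                    // success with a caveat
const HResult kE_FAIL  = static_cast<HResult>( 0x80004005u );  // unspecified failure

// Equivalent of the SUCCEEDED macro: any non-negative value is a success.
inline bool Succeeded( HResult hr ) { return hr >= 0; }

// Equivalent of the FAILED macro: any negative value is a failure.
inline bool Failed( HResult hr ) { return hr < 0; }
```

With this in mind, a check such as if( FAILED( hr ) ) after the CoInitializeEx call accepts both S_OK and S_FALSE, whereas a test for hr == S_OK would wrongly reject the second case.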

You now need to create an instance of the MSXML COM object. I do the following for this:

hr = m_iXMLDoc.CoCreateInstance( __uuidof( DOMDocument ) );

So, what is m_iXMLDoc, I hear you ask? It is of type IXMLDOMDocument and represents the top level of the XML document. It is worth pointing out now that WinCE 4.2 only supports MSXML 3.0. Microsoft Windows XP, for example, supports MSXML 4.0; there, you could use the newer interfaces it provides, which support additional methods.

I have wrapped m_iXMLDoc up in an ATL smart pointer. This is done to avoid having to release the object myself (I may forget!); thus, you have:

CComPtr<IXMLDOMDocument> m_iXMLDoc;

If the CoCreateInstance call succeeds, it will return S_OK.

The next bit of code looks quite odd but is needed only if you are using Pocket PC:

CComQIPtr<IObjectSafety,&IID_IObjectSafety> iSafety( m_iXMLDoc );
if ( iSafety )
{
   DWORD dwSupported, dwEnabled;
   iSafety->GetInterfaceSafetyOptions( IID_IXMLDOMDocument,
                                       &dwSupported, &dwEnabled );
   iSafety->SetInterfaceSafetyOptions( IID_IXMLDOMDocument,
                                       dwSupported, 0 );
}

This was taken off the Internet. I can't remember from where, so apologies to the person who wrote this, but without it, things don't seem to work; it is needed to mark the MSXML control as safe.

Loading the XML

You now have initialised COM and created the MSXML object; this in turn now lets us use the functionality supplied by the MSXML object. I am now going to load a very basic XML file. It looks like the following:

<CustomerList>
   <Customers>
      <customer name="OnLineGolf" tag="OLG"/>
      <customer name="BetterGolf" tag="BG"/>
   </Customers>
</CustomerList>

It's very simple. There isn't a lot to this, just a couple of elements and a couple of attributes. So, let's load this document. This is done with the following code:

VARIANT_BOOL bSuccess = VARIANT_FALSE;
hr = m_iXMLDoc->load( CComVariant ( szXMLFile ), &bSuccess );

szXMLFile is the name of the XML file you want to load. Please note that this could just as easily be a file that resides on a Web server; in that case, you would specify a URL instead of a local path. bSuccess will contain VARIANT_TRUE if all is well.
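One detail worth knowing is that VARIANT_BOOL is a 16-bit integer, not a C++ bool, and its "true" value is -1 rather than 1. The portable sketch below reproduces that representation to show why comparing the result against C++ true gives the wrong answer. VariantBool, kVariantTrue, kVariantFalse, LoadSucceeded, and NaiveLoadSucceeded are stand-in names defined here for illustration; in real code, the Windows headers supply VARIANT_BOOL, VARIANT_TRUE, and VARIANT_FALSE.

```cpp
// Stand-ins for the Windows definitions of VARIANT_BOOL and its values.
typedef short VariantBool;
const VariantBool kVariantTrue  = -1;
const VariantBool kVariantFalse = 0;

// Safe test: anything other than the false value counts as success.
inline bool LoadSucceeded( VariantBool result )
{
   return result != kVariantFalse;
}

// Unsafe test, shown only to illustrate the pitfall: 'true' promotes
// to the integer 1, which never equals -1.
inline bool NaiveLoadSucceeded( VariantBool result )
{
   return result == true;
}
```

The practical upshot is to test the load result with bSuccess != VARIANT_FALSE (or bSuccess == VARIANT_TRUE) and, ideally, to check the returned HRESULT as well.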

Now, before moving on to the next bit of code, I need to introduce you to a useful function I wrote:

void CCEXML::DisplayChildren( IXMLDOMElement* pParent )

This function is going to traverse an element/node recursively. It looks like this:

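void CCEXML::DisplayChildren( IXMLDOMElement* pParent )
{
   if( pParent == NULL )    // Nothing to visit
   {
      return;
   }

   DisplayChild( pParent );

   // Walk the children in document order. Smart pointers release each
   // node automatically, and no state is shared between calls.
   CComPtr<IXMLDOMNode> pChild;
   pParent->get_firstChild( &pChild );
   while( pChild != NULL )
   {
      // Only recurse into element nodes; the query yields NULL for
      // text nodes, comments, and so on.
      CComQIPtr<IXMLDOMElement,&IID_IXMLDOMElement> pElement( pChild );
      if( pElement )
         DisplayChildren( pElement );

      CComPtr<IXMLDOMNode> pNext;
      pChild->get_nextSibling( &pNext );
      pChild = pNext;
   }
}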

Within this function is another important function:

void CCustomerXML::DisplayChild(IXMLDOMElement* pChild)

In the base class, this is declared as a pure virtual function. This means any class that derives from the base class needs to implement this function. More on this later.
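The design here is that the base class owns the recursion while the derived class decides what to do with each element. A minimal, portable sketch of that arrangement follows. The Element struct and the TreeWalker and NameCollector classes are hypothetical names used only for illustration; they are not the classes from this article, which walk IXMLDOMElement objects instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element.
struct Element
{
   std::string name;
   std::vector<Element> children;
};

// Base class: owns the depth-first walk, defers per-element work.
class TreeWalker
{
public:
   virtual ~TreeWalker() {}

   // Visits 'parent' first, then each of its children in document order.
   void WalkChildren( const Element& parent )
   {
      VisitElement( parent );
      for( std::size_t i = 0; i < parent.children.size(); ++i )
         WalkChildren( parent.children[i] );
   }

protected:
   // Pure virtual hook: every derived class must supply this.
   virtual void VisitElement( const Element& element ) = 0;
};

// Derived class: records the name of every element it is shown.
class NameCollector : public TreeWalker
{
public:
   std::vector<std::string> names;

protected:
   virtual void VisitElement( const Element& element )
   {
      names.push_back( element.name );
   }
};
```

Because the hook is pure virtual, TreeWalker cannot be instantiated on its own, and the compiler rejects any derived class that forgets to implement it.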

Okay, back to the code. If the document has been loaded successfully, you can start to traverse it. This is done with the following piece of code:

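CComPtr<IXMLDOMElement> iRooterElm;
hr = m_iXMLDoc->get_documentElement( &iRooterElm );
if( FAILED( hr ) || iRooterElm == NULL )    // Empty xml file
{
   MessageBox( NULL, L"Empty document!", L"Error Loading XML",
               MB_ICONSTOP );
   return FALSE;
}

// Smart pointers release the list and each node automatically
CComPtr<IXMLDOMNodeList> List;
hr = iRooterElm->get_childNodes( &List );
if( FAILED( hr ) || List == NULL )
   return FALSE;

long Amount = 0;
List->get_length( &Amount );
for( long i = 0; i < Amount; i++ )
{
   CComPtr<IXMLDOMNode> iNode;
   List->get_item( i, &iNode );

   // Query for the element interface rather than casting the pointer
   CComQIPtr<IXMLDOMElement,&IID_IXMLDOMElement> iElement( iNode );
   if( iElement )
      DisplayChildren( iElement );
}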

By using this and the DisplayChildren function, the whole of the document is traversed.

Manipulating the XML

Up to now, you have created the functionality to traverse the whole XML document, but what about manipulating the document? You don't just want to traverse the document. You actually want to do something with it!

Remember the DisplayChild function I mentioned briefly earlier? For those who have forgotten, it is used within the DisplayChildren function and it is a pure virtual function that needs to be implemented. This function does the manipulation; hence the use of a pure virtual function that allows the user to do what they want with the element passed to it. The following DisplayChild function is a demonstration for the XML I showed you earlier in this article. Because the XML was about customers, I have created a class called CCustomerXML that derives from your main XML class. This is how I manipulate the XML file:

void CCustomerXML::DisplayChild(IXMLDOMElement* pChild)
{

   CComBSTR nodeType, nodeName;    // Freed automatically on scope exit
   pChild->get_nodeTypeString( &nodeType );
   pChild->get_nodeName( &nodeName );

   if( nodeName.m_str != NULL && wcscmp( nodeName, L"customer" ) == 0 )
   {
      CString strAttrib = ParseXML( pChild );
      if( !strAttrib.IsEmpty() )
      {
         CString strAtt;
         int item = 0;
         do
         {
            strAtt = GetAttributes( strAttrib );
            if( !strAtt.IsEmpty() )
            {
               switch( item )
               {
               case 0:    // customer name
                  AfxMessageBox( strAtt );
                  break;
               case 1:    // customer tag
                  AfxMessageBox( strAtt );

                  break;
               }
               item ++;
            }
         }
         while( !strAtt.IsEmpty() );
      }
   }
}

The first check is to see whether this node is actually called "customer." If it is, you start to manipulate the XML. You use another helper function, called ParseXML, here. The function is shown below:

CString CCEXML::ParseXML(IXMLDOMElement *node)
{
   CString strValue;

   // Smart pointers release the map and each attribute node for us
   CComPtr<IXMLDOMNamedNodeMap> namedNodeMap;
   HRESULT hr = node->get_attributes( &namedNodeMap );
   if ( SUCCEEDED( hr ) && namedNodeMap != NULL ) {
      long listLength = 0;
      hr = namedNodeMap->get_length( &listLength );
      for(long i = 0; i < listLength; i++) {
         CComPtr<IXMLDOMNode> listItem;
         hr = namedNodeMap->get_item( i, &listItem );
         if ( FAILED( hr ) || listItem == NULL )
            continue;

         // node value ie. release="v1.0" value is v1.0
         CComVariant nodeVal;
         hr = listItem->get_nodeValue( &nodeVal );
         if ( SUCCEEDED( hr ) && nodeVal.vt == VT_BSTR ) {
            // Append this attribute's value followed by a comma
            strValue += CString( nodeVal.bstrVal ) + L",";
         }
      }
   }

   return strValue;
}

Briefly, this function returns the values of the attributes associated with the node as one comma-separated string. In the example, it would return "OnLineGolf,OLG," for the first customer (note that every value, including the last, is followed by a comma). The code then uses another helper function, GetAttributes, which returns each token within this string in turn. Thus, it would first return "OnLineGolf" and then, the next time around, "OLG."
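The body of GetAttributes is not shown in this article, so the following is only a guess at how such a helper could behave, written as portable C++ with std::wstring in place of CString. NextAttribute is a hypothetical name. The one assumption drawn from the calling code is that each call consumes the first token from the string passed in by reference and that an empty result signals the end.

```cpp
#include <string>

// Hypothetical stand-in for GetAttributes: removes the first
// comma-terminated token from 'list' and returns it. Returns an empty
// string once nothing is left.
std::wstring NextAttribute( std::wstring& list )
{
   const std::wstring::size_type comma = list.find( L',' );
   if( comma == std::wstring::npos )
   {
      // No separator left: hand back whatever remains and empty the list.
      std::wstring last = list;
      list.clear();
      return last;
   }

   std::wstring token = list.substr( 0, comma );
   list.erase( 0, comma + 1 );    // Drop the token and its comma
   return token;
}
```

One limitation of using an empty string as the end marker is that an attribute with an empty value would stop the loop early, and a value that itself contains a comma would be split in two. A real implementation may want a different separator or a list type instead.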

In my example function, I have a switch statement to know which attribute of the element I am dealing with. Remember, this function is written for the XML supplied, so I know the attributes are "name" and "tag." If I added another attribute to the XML, I would just have another case statement for it. In my example function, I just display the element attribute to the screen, but in a real app, you would store them someplace or do something with them.

Fitting It All Together

I have supplied a base class in this article for download. To use it, you will need to derive another class from it and, as mentioned, implement the DisplayChild function. The example I have given in this article should give you a start on how to do this. The base class is called CCEXML. A simple derived class would look like the following:

class CCustomerXML : public CCEXML
{
public:
   CCustomerXML();
   virtual ~CCustomerXML();

   void DisplayChild( IXMLDOMElement* pChild );
};

All you need to do now is implement the DisplayChild function to match your needs. To start traversing the XML, you just need to make the following call:

InitialiseAndParse( XMLFilename );

This function is found in the main base class. I have shown the main base class below just for information:

class CCEXML
{
public:
   CCEXML();
   virtual ~CCEXML();

   BOOL InitialiseAndParse( LPCTSTR szXMLFile );
   CComPtr<IXMLDOMDocument> CreateEmptyDOMDocument();

   // This is a pure virtual function that needs to be written by
   // the derived class
   virtual void DisplayChild( IXMLDOMElement* pChild ) = 0;

   CString ParseXML( IXMLDOMElement *node );
   CString GetAttributes( CString& strAttribs );

   HRESULT DASetAttribute( BSTR bstrName, BSTR bstrValue,
                           IXMLDOMNode* pNode,
                           IXMLDOMDocument* pDocument );
   HRESULT DAAddChild( BSTR name, int nType, IXMLDOMNode** pOut,
                       IXMLDOMNode* pNode,
                       IXMLDOMDocument* pDocument );

   IXMLDOMDocument* GetDocument() { return m_iXMLDoc; }

private:
   void DisplayChildren( IXMLDOMElement* pParent );

protected:
   CComPtr<IXMLDOMDocument> m_iXMLDoc;
};

There are a couple of other functions within here that I have not mentioned; these are DASetAttribute and DAAddChild. What these functions do is pretty self-explanatory. The former sets a node's attribute to a given value, while the latter adds an element to a given node. You should study these functions; they are very useful! Here is a quick example of adding a node and creating an XML file:

CComVariant vType( NODE_ELEMENT );
CComPtr<IXMLDOMNode> pNoddy, pDetailsNode, pOut, pCust;

// First, create an Empty document
CComPtr<IXMLDOMDocument> m_iXMLDoc = CreateEmptyDOMDocument();

// Create a node and append to document
// Root node
m_iXMLDoc->createNode( vType, CComBSTR( L"Customers" ),
                       CComBSTR( L"" ), &pNoddy );
m_iXMLDoc->appendChild( pNoddy, &pDetailsNode );

// Add a Details element beneath the root, then a Customer beneath that
CComBSTR name( L"Details" );
DAAddChild( name, NODE_ELEMENT, &pOut, pDetailsNode, m_iXMLDoc );
name = L"Customer";
DAAddChild( name, NODE_ELEMENT, &pCust, pOut, m_iXMLDoc );
// add attribute to customer
CComBSTR bstrCustomer( L"OnLineGolf" );
DASetAttribute( CComBSTR( L"Name" ), bstrCustomer, pCust, m_iXMLDoc );

// Save the XML
m_iXMLDoc->save( CComVariant( szSaveXMLFile ) );

This will create the following XML:

<Customers>
   <Details>
      <Customer Name="OnLineGolf"/>
   </Details>
</Customers>

You can see that the DAAddChild and DASetAttribute functions are very useful and easy to use.
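To make the build-then-save flow easier to picture without a device to hand, here is a small portable sketch that builds the same three-element tree in memory and writes it out as text. It does not use MSXML at all: XmlElement, EscapeAttribute, and Serialize are hypothetical names for illustration, and the output carries no indentation, so it is not a claim about the exact bytes that MSXML's save produces.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM element.
struct XmlElement
{
   std::string name;
   std::vector< std::pair<std::string, std::string> > attributes;
   std::vector<XmlElement> children;
};

// Escapes the characters that may not appear raw in a double-quoted
// attribute value.
std::string EscapeAttribute( const std::string& value )
{
   std::string out;
   for( std::size_t i = 0; i < value.size(); ++i )
   {
      switch( value[i] )
      {
      case '&':  out += "&amp;";  break;
      case '<':  out += "&lt;";   break;
      case '"':  out += "&quot;"; break;
      default:   out += value[i]; break;
      }
   }
   return out;
}

// Writes an element and its subtree. Childless elements use the
// self-closing form; all others get a matching end tag.
std::string Serialize( const XmlElement& element )
{
   std::string xml = "<" + element.name;
   for( std::size_t i = 0; i < element.attributes.size(); ++i )
   {
      xml += " " + element.attributes[i].first + "=\"" +
             EscapeAttribute( element.attributes[i].second ) + "\"";
   }

   if( element.children.empty() )
      return xml + "/>";

   xml += ">";
   for( std::size_t i = 0; i < element.children.size(); ++i )
      xml += Serialize( element.children[i] );
   return xml + "</" + element.name + ">";
}
```

Note the two tag forms: an element with no children is written with the self-closing form, such as the Customer element here, while an element with children needs a separate end tag in which the slash comes first.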

Summary

I have only scratched the surface of parsing XML with MSXML 3.0 on WinCE. There is a lot more you can do with MSXML than I've demonstrated. For example, you may want to use XPath expressions to manipulate your XML. This is a great way to speed up your XML traversing: let XPath do all the work for you. You may want to insert nodes into the DOM tree or transform a node; the list goes on. The best advice here is to look at the help file that comes with MSXML; it is a great reference with examples of how to do this type of thing.



About the Author

Steve Green

I'm a Software Engineer with a company that writes diagnostic applications for vehicles. In my spare time (what I have of it!), I love playing golf and football and spending as much time as I can with my lovely baby son, David.
