Introduction to Using the XML DOM from Visual C++


Assumptions About the Reader

This article assumes that you are familiar with the basics of what XML is and what it can be used for. If you are new to XML, I would suggest reading one of the many fine tutorials on the subject first and then returning to this document.

Introducing the XML Document Object Model (DOM)

The XML Document Object Model, or DOM, is a powerful and robust programmatic interface that not only enables you to programmatically load and parse an XML file, or document, but can also be used to traverse XML data. Using certain objects in the DOM, you can even manipulate that data and then save your changes back to the XML document. A comprehensive look at all of the DOM's functionality would be impossible in the space provided here. However, in this article, we'll hit the high notes of using the DOM to load an XML document and to iterate through its elements.

The key to understanding how to use the DOM is realizing that the DOM exposes (through its COM interface) XML documents as a hierarchical tree of nodes. As an example, take a look at the following sample XML document.

<?xml version="1.0"?>
<autos>
  <manufacturer name="Chevrolet">
    <make name="Corvette">
      <model>2000 Convertible</model>
      <price currency="usd">60,000</price>
      <horsePower>420</horsePower>
      <fuelCapacity units="gallons">18.5</fuelCapacity>
    </make>
  </manufacturer>
  <manufacturer name="Mazda">
    <make name="RX-7">
      <model>test model</model>
      <price currency="usd">30,000</price>
      <horsePower>350</horsePower>
      <fuelCapacity units="gallons">15.5</fuelCapacity>
    </make>
  </manufacturer>
</autos>
The DOM would interpret this document as follows:
  • <autos> - This is a NODE_ELEMENT (more on this later) and is referred to as the documentElement.
  • <manufacturer>, <make>, <model>, <price>, <horsePower> and <fuelCapacity> - Each one of these is also a NODE_ELEMENT. However, please note that only the top-level NODE_ELEMENT, or root node, is referred to as the documentElement.
  • currency="usd", units="gallons" - When a NODE_ELEMENT contains an attribute/value pair like this, the attribute itself is a NODE_ATTRIBUTE and its value is held in a NODE_TEXT. The text between an element's opening and closing tags (such as 60,000) is likewise a NODE_TEXT.
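To make the node vocabulary concrete, here is a small, self-contained sketch of how one line of the sample document, <price currency="usd">60,000</price>, breaks down into nodes. The Node struct, the NodeKind enum and the MakeNode and MakePriceElement helpers are hypothetical stand-ins invented for this illustration. They are not MSXML types, and the sketch assumes (as the W3C DOM describes) that an attribute's value hangs off the attribute as a text child.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in types for illustration only; NOT the MSXML interfaces.
// The numeric values match NODE_ELEMENT, NODE_ATTRIBUTE and NODE_TEXT.
enum NodeKind { KIND_ELEMENT = 1, KIND_ATTRIBUTE = 2, KIND_TEXT = 3 };

struct Node
{
    NodeKind kind;
    std::string name;              // tag or attribute name; "#text" for text nodes
    std::string value;             // only meaningful for text nodes
    std::vector<Node> attributes;  // attribute nodes (elements only)
    std::vector<Node> children;    // child elements and text nodes
};

// Builds a node with no attributes and no children.
Node MakeNode(NodeKind kind, const std::string& name, const std::string& value)
{
    Node n;
    n.kind = kind;
    n.name = name;
    n.value = value;
    return n;
}

// Models <price currency="usd">60,000</price>.
Node MakePriceElement()
{
    Node price = MakeNode(KIND_ELEMENT, "price", "");

    // The attribute is its own node; its value is a text child.
    Node currency = MakeNode(KIND_ATTRIBUTE, "currency", "");
    currency.children.push_back(MakeNode(KIND_TEXT, "#text", "usd"));
    price.attributes.push_back(currency);

    // The element's content is a text child of the element.
    price.children.push_back(MakeNode(KIND_TEXT, "#text", "60,000"));
    return price;
}

// Usage: one element, one attribute, two text nodes.
void CheckPriceElement()
{
    Node price = MakePriceElement();
    assert(price.kind == KIND_ELEMENT);
    assert(price.attributes.size() == 1);
    assert(price.children.size() == 1);
    assert(price.children[0].kind == KIND_TEXT);
}
```

The point to take away is that the text 60,000 is not a property of the price element but a separate child node, which matters once we start walking the tree.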
As you will see shortly, there are a number of COM components that are part of the XML DOM. Here's a list of some of the more interesting components and their purpose.
  • XMLDOMDocument - The top node of the XML document tree
  • XMLDOMNode - Represents any single node in the XML document tree
  • XMLDOMNodeList - A collection of XMLDOMNode objects (for example, a node's children)
  • XMLDOMNamedNodeMap - A collection of a node's attributes

Accessing IE5's XML Support from Visual C++

I'm a firm believer in a tutorial-style, "let's walk through the code" approach, so let's get started seeing just what the DOM can do for us by cranking up the Visual C++ development environment and writing some code to load an XML document and navigate through its elements.

Create the Visual C++ Project

While we can do this utilizing MFC or ATL, we'll keep things simple (for me at least :) and use MFC. Therefore, perform the following steps to create the test project and incorporate IE5 XML support into your application.
  1. Create a new Visual C++ project called XMLDOMFromVC.
  2. In the MFC AppWizard, define the project as being a dialog-based application.
  3. Once the AppWizard has completed its work, add a call to initialize OLE support by inserting a call to ::AfxOleInit in the application class' InitInstance function. Assuming you named your project the same as mine, your code should now look like this (note the ::AfxOleInit call):
    BOOL CXMLDOMFromVCApp::InitInstance()
    {
     // initialize OLE/COM before any of the dialog's code runs
     ::AfxOleInit();
    
     AfxEnableControlContainer();
    
     // .. other code (including the dialog's DoModal call)
    
     // Since the dialog has been closed, return FALSE so that we exit the
     //  application, rather than start the application's message pump.
     return FALSE;
    }
    
  4. At this point, you'll need to import the Microsoft XML Parser typelib (OLE type library). The simplest way to do this is to use the C++ #import directive. Simply open your project's stdafx.h file and add the following lines before the file's closing #endif directive.
    #import <msxml.dll> named_guids
    using namespace MSXML;
    
  5. At this point, we can start declaring some variables to use with the DOM. Open your dialog class' header file (XMLDOMFromVCDlg.h) and add the following smart pointer member variables, where the IXMLDOMDocumentPtr is the pointer to the XML document itself and the IXMLDOMElementPtr is a pointer to the XML document root (as explained above).
    IXMLDOMDocumentPtr m_plDomDocument;
    IXMLDOMElementPtr m_pDocRoot;
    
  6. Once you've declared the XML smart pointers, insert the following code in your dialog class' OnInitDialog member function (just before the return statement). This code simply initializes the COM runtime and sets up your XML document smart pointer (m_plDomDocument).
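    // Initialize COM (::AfxOleInit has already done this for the main
    // thread, so this call is redundant there)
    ::CoInitialize(NULL);
    
    HRESULT hr = m_plDomDocument.CreateInstance(CLSID_DOMDocument);
    if (FAILED(hr))
    {
     _com_error er(hr);
     AfxMessageBox(er.ErrorMessage());
     EndDialog(1);
     // don't fall through to code that uses the NULL document pointer
     return TRUE;
    }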
    

Loading an XML Document

Now that you've done the preliminary work to include XML support in your Visual C++ application, let's do something useful like actually loading an XML document. To do that, simply add the following code to your dialog's OnInitDialog member function (just after the initialization code entered above). I've sprinkled comments through the code to explain what I'm doing each step of the way.
// specify xml file name
CString strFileName (_T("XMLDOMFromVC.xml"));

// convert xml file name string to something COM can handle (BSTR);
// passing false makes the _bstr_t take ownership of the allocated BSTR
// instead of copying it (a plain assignment would leak the original)
_bstr_t bstrFileName(strFileName.AllocSysString(), false);

// call the IXMLDOMDocumentPtr's load function to load the XML document
variant_t vResult;
vResult = m_plDomDocument->load(bstrFileName);
if ((bool)vResult) // success!
{
 // now that the document is loaded, we need to initialize the root pointer
 m_pDocRoot = m_plDomDocument->documentElement;
 AfxMessageBox(_T("Document loaded successfully!"));
}
else
{
 AfxMessageBox(_T("Document FAILED to load!"));
}
Don't believe it's that easy? Add the following call to have the contents of your entire XML document displayed in a message box.
AfxMessageBox(m_plDomDocument->xml);
Now, build and run the application and you should see results similar to Figure 1.


Figure 1: Loading and displaying an XML document can be done from Visual C++ with just a few lines of code using the DOM.

Ok. Ok. This doesn't really count as reading through an XML document, but I wanted to show you that you had successfully loaded a document and that you can easily get the entire document's contents with a single line of code. In the next section, we'll see how to manually iterate through XML elements.
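One detail worth keeping in mind when you test the result of load: COM reports success as a VARIANT_BOOL, whose "true" value (VARIANT_TRUE) is -1, while the Win32 TRUE macro is 1. The loading code gets away with its check because it casts the result to bool first. The sketch below shows why comparing a raw VARIANT_BOOL against TRUE is fragile. So that it builds anywhere, it uses local stand-in names (VariantBool, kVariantTrue, kVariantFalse, kWin32True) that are my own and merely mirror the values in the Windows headers; in a real project you would use VARIANT_BOOL, VARIANT_TRUE and VARIANT_FALSE directly.

```cpp
#include <cassert>

// Stand-ins that mirror the Windows header values; assumptions made for
// portability of this sketch, not names that exist in the Windows SDK.
typedef short VariantBool;             // models VARIANT_BOOL
const VariantBool kVariantTrue  = -1;  // models VARIANT_TRUE
const VariantBool kVariantFalse = 0;   // models VARIANT_FALSE
const int kWin32True = 1;              // models the Win32 TRUE macro

// Fragile: compares the raw COM boolean with Win32 TRUE (1).
bool LoadedFragile(VariantBool result)
{
    return result == kWin32True;
}

// Robust: anything other than VARIANT_FALSE counts as success.
bool LoadedRobust(VariantBool result)
{
    return result != kVariantFalse;
}

// Usage: a successful load is misread by the fragile check.
void CheckVariantBoolHandling()
{
    assert(!LoadedFragile(kVariantTrue));  // success reported as failure
    assert(LoadedRobust(kVariantTrue));
    assert(!LoadedRobust(kVariantFalse));
}
```

If you ever store the return value of load in a VARIANT_BOOL rather than a variant_t, compare it against VARIANT_TRUE or VARIANT_FALSE, never against TRUE.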

Iterating Through an XML Document

In this section, we'll learn about a couple of properties that you'll use quite often when iterating through a document's elements: IXMLDOMNodePtr::firstChild and IXMLDOMNodePtr::nextSibling.

The following recursive function shows a way by which you can do this quite easily. In fact, if you insert this code into the dialog's OK button handler, it will display each element in your document:
void CXMLDOMFromVCDlg::OnOK() 
{
 // send the root to the DisplayChildren function
 DisplayChildren(m_pDocRoot);
}

void CXMLDOMFromVCDlg::DisplayChildren(IXMLDOMNodePtr pParent)
{
 // display the current node's name
 DisplayChild(pParent);

 // simple for loop to get all children
 for (IXMLDOMNodePtr pChild = pParent->firstChild;
      NULL != pChild;
      pChild = pChild->nextSibling)
 {
  // for each child, call this function so that we get 
  // its children as well
  DisplayChildren(pChild);
 }
}

void CXMLDOMFromVCDlg::DisplayChild(IXMLDOMNodePtr pChild)
{
 AfxMessageBox(pChild->nodeName);
}
If you were to build and run the project at this point, you would definitely notice something peculiar. The first few message boxes will appear as you might expect: the first one displays the value "autos", followed by "manufacturer", then "make" and finally "model". However, at that point (after the message box displaying the value "model") things will get a little strange. Instead of a message box displaying the value "price", the value "#text" will be displayed! The reason for this is simple.

Let's look at an excerpt from the XML document:

  ...
  <manufacturer name="Chevrolet">
    <make name="Corvette">
      <model>2000 Convertible</model>
      <price currency="usd">60,000</price>
      <horsePower>420</horsePower>
      <fuelCapacity units="gallons">18.5</fuelCapacity>
    </make>
  </manufacturer>
  ...
As you can see in the <model> line above, a value follows the opening <model> tag. These "values" are still treated as nodes in XML when using the IXMLDOMNodePtr::firstChild and IXMLDOMNodePtr::nextSibling properties. So how do you know what type of node you have?

By using the IXMLDOMNodePtr::nodeType property. Simply modify your dialog's CXMLDOMFromVCDlg::DisplayChild member function so that it tests the node type, as shown below. When you've done that and run the code, you will see the expected values instead of the literal "#text".

void CXMLDOMFromVCDlg::DisplayChild(IXMLDOMNodePtr pChild)
{
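 // qualifying the constant with MSXML:: avoids "ambiguous symbol" errors
 // that readers reported with newer compilers and Platform SDK headers
 if (MSXML::NODE_TEXT == pChild->nodeType)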
 {
  AfxMessageBox(pChild->text);
 }
 else
 {
  AfxMessageBox(pChild->nodeName);
 }
}
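If you'd like to see the order of the walk without clicking through a stack of message boxes, the traversal can be sketched against a tiny in-memory tree. Everything here (SketchNode, SketchTree, Describe, CollectLabels and WalkSampleMake) is a hypothetical stand-in I made up for illustration; it only mimics the firstChild / nextSibling shape of the real nodes and is not the MSXML API.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical miniature node; values match NODE_ELEMENT and NODE_TEXT.
enum SketchNodeType { SKETCH_ELEMENT = 1, SKETCH_TEXT = 3 };

struct SketchNode
{
    SketchNodeType type;
    std::string name;         // "#text" for text nodes
    std::string text;         // value of a text node
    SketchNode* firstChild;
    SketchNode* nextSibling;
};

// Owns the nodes; std::deque keeps element addresses stable on push_back.
class SketchTree
{
public:
    SketchNode* AddNode(SketchNode* parent, SketchNodeType type,
                        const std::string& name, const std::string& text)
    {
        SketchNode node = { type, name, text, nullptr, nullptr };
        m_nodes.push_back(node);
        SketchNode* added = &m_nodes.back();
        if (parent != nullptr)
        {
            if (parent->firstChild == nullptr)
            {
                parent->firstChild = added;
            }
            else
            {
                // append after the parent's last existing child
                SketchNode* last = parent->firstChild;
                while (last->nextSibling != nullptr)
                    last = last->nextSibling;
                last->nextSibling = added;
            }
        }
        return added;
    }

private:
    std::deque<SketchNode> m_nodes;
};

// Same decision as DisplayChild: text value for text nodes, name otherwise.
std::string Describe(const SketchNode* node)
{
    return (SKETCH_TEXT == node->type) ? node->text : node->name;
}

// Same shape as DisplayChildren: visit the node, then recurse into children.
void CollectLabels(const SketchNode* parent, std::vector<std::string>& labels)
{
    labels.push_back(Describe(parent));
    for (const SketchNode* child = parent->firstChild;
         child != nullptr;
         child = child->nextSibling)
    {
        CollectLabels(child, labels);
    }
}

// Models <make><model>2000 Convertible</model><price>60,000</price></make>.
std::vector<std::string> WalkSampleMake()
{
    SketchTree tree;
    SketchNode* make = tree.AddNode(nullptr, SKETCH_ELEMENT, "make", "");
    SketchNode* model = tree.AddNode(make, SKETCH_ELEMENT, "model", "");
    tree.AddNode(model, SKETCH_TEXT, "#text", "2000 Convertible");
    SketchNode* price = tree.AddNode(make, SKETCH_ELEMENT, "price", "");
    tree.AddNode(price, SKETCH_TEXT, "#text", "60,000");

    std::vector<std::string> labels;
    CollectLabels(make, labels);
    return labels;
}
```

Calling WalkSampleMake returns the labels make, model, 2000 Convertible, price and 60,000, in that order. Each text node is visited immediately after the element that owns it, which is exactly where the "#text" message box popped up earlier.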
You no doubt also noted the "magic" constant used above (NODE_TEXT). All the node types are defined with an enum in the msxml.tlh file that was generated with the #import directive you used earlier. This enum structure is listed below:
enum tagDOMNodeType
{
    NODE_INVALID = 0,
    NODE_ELEMENT = 1,
    NODE_ATTRIBUTE = 2,
    NODE_TEXT = 3,
    NODE_CDATA_SECTION = 4,
    NODE_ENTITY_REFERENCE = 5,
    NODE_ENTITY = 6,
    NODE_PROCESSING_INSTRUCTION = 7,
    NODE_COMMENT = 8,
    NODE_DOCUMENT = 9,
    NODE_DOCUMENT_TYPE = 10,
    NODE_DOCUMENT_FRAGMENT = 11,
    NODE_NOTATION = 12
};
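While debugging a traversal, it can be handy to show a node's type by name rather than by number. Here is a minimal sketch of such a helper. NodeTypeName is a hypothetical function of my own, and the enum is copied into a sketch namespace only so that the example builds on its own; in the real project you would switch on the enum from msxml.tlh rather than redefining it.

```cpp
#include <cassert>
#include <string>

namespace sketch
{
    // Copy of the values listed above, for a standalone build only.
    enum DOMNodeType
    {
        NODE_INVALID = 0,
        NODE_ELEMENT = 1,
        NODE_ATTRIBUTE = 2,
        NODE_TEXT = 3,
        NODE_CDATA_SECTION = 4,
        NODE_ENTITY_REFERENCE = 5,
        NODE_ENTITY = 6,
        NODE_PROCESSING_INSTRUCTION = 7,
        NODE_COMMENT = 8,
        NODE_DOCUMENT = 9,
        NODE_DOCUMENT_TYPE = 10,
        NODE_DOCUMENT_FRAGMENT = 11,
        NODE_NOTATION = 12
    };

    // Hypothetical helper: maps a node type value to a printable name.
    const char* NodeTypeName(int nodeType)
    {
        switch (nodeType)
        {
        case NODE_INVALID:                return "NODE_INVALID";
        case NODE_ELEMENT:                return "NODE_ELEMENT";
        case NODE_ATTRIBUTE:              return "NODE_ATTRIBUTE";
        case NODE_TEXT:                   return "NODE_TEXT";
        case NODE_CDATA_SECTION:          return "NODE_CDATA_SECTION";
        case NODE_ENTITY_REFERENCE:       return "NODE_ENTITY_REFERENCE";
        case NODE_ENTITY:                 return "NODE_ENTITY";
        case NODE_PROCESSING_INSTRUCTION: return "NODE_PROCESSING_INSTRUCTION";
        case NODE_COMMENT:                return "NODE_COMMENT";
        case NODE_DOCUMENT:               return "NODE_DOCUMENT";
        case NODE_DOCUMENT_TYPE:          return "NODE_DOCUMENT_TYPE";
        case NODE_DOCUMENT_FRAGMENT:      return "NODE_DOCUMENT_FRAGMENT";
        case NODE_NOTATION:               return "NODE_NOTATION";
        default:                          return "unknown";
        }
    }

    // Usage: the two types this article deals with.
    inline void CheckNodeTypeNames()
    {
        assert(std::string(NodeTypeName(NODE_ELEMENT)) == "NODE_ELEMENT");
        assert(std::string(NodeTypeName(NODE_TEXT)) == "NODE_TEXT");
    }
}
```

In the dialog, you could pass pChild->nodeType to a helper like this inside DisplayChild to see which kinds of nodes your own documents produce.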

Summary

In this article, you discovered the XML DOM and learned how to access its features from Visual C++ / COM. The demo we built illustrated the following basic DOM functions:
  • Loading an XML document
  • Iterating through a document's nodes
  • Determining a node's type
  • Displaying NODE_TEXT node values
There is obviously much more to the DOM than what you've seen here, but hopefully what you've learned will whet your appetite to dig into the documentation and to see all the great things you can do with XML documents using the DOM.

Downloads

Download demo project - 15 Kb


About the Author

Tom Archer - MSFT

I am a Program Manager and Content Strategist for the Microsoft MSDN Online team managing the Windows Vista and Visual C++ developer centers. Before being employed at Microsoft, I was awarded MVP status for the Visual C++ product. A 20+ year veteran of programming with various languages - C++, C, Assembler, RPG III/400, PL/I, etc. - I've also written many technical books (Inside C#, Extending MFC Applications with the .NET Framework, Visual C++.NET Bible, etc.) and 100+ online articles.

Comments

  • retrieving attributes

    Posted by dacky on 09/26/2009 09:52am

    How do we retrieve the attributes of a xml file? Please help, thanks

  • good

    Posted by bmplongxing on 11/15/2007 04:01am

    Very nice!

  • Resolved the Issues in VC++ 7.1

    Posted by njolly on 10/12/2007 07:55am

    At last, after a lot of R&D, the issues are resolved.

    Workaround: I tried modifying the #import statement with msxml.dll, msxml2.dll, msxml3.dll and msxml4.dll, and the namespace with MSXML and MSXML2.

    Solution: you can use msxml with the MSXML namespace declaration like:

    #import named_guids
    using namespace MSXML;

    This will work in VC++ 7.1 only if you use the namespace name at each place where you use the XML variable declarations etc., like:

    void DisplayXmlDoc(MSXML::IXMLDOMNodePtr pRootNode);
    void DisplayChild(MSXML::IXMLDOMNodePtr pChildNode);
    MSXML::IXMLDOMDocumentPtr m_pXDomDocument;
    MSXML::IXMLDOMElementPtr m_pXDomRootElement;

    You should even use the namespace name with the NODE_TEXT constant, like:

    MSXML::NODE_TEXT

    Now it works fine. Thanks! It is a really nice article that helped me today, and now it is working with VC++ 7.1 also.

  • Errors in VC++ 7.1, pls. update it.

    Posted by njolly on 10/12/2007 03:06am

    I check it by using MSXML2 instead of MSXML but it doesn't work. It is compiling well and working perfectly fine in VC++ 6.0 but it's not working in VC++ 7.1. If remove the code line "using namespace MSXML" it compiles. But i am facing other error when i am using XDOM functions like : "m_pXmlDomDocument.load() function does not take 1 arguments". but in function-tool-tip it's showing only 1 parameter. Pls. update the article...

  • build error c2872

    Posted by yachli on 11/01/2006 12:19am

    I cannot build it and the error is XMLDOMFromVCDlg.cpp(230) : error C2872: 'NODE_TEXT' : ambiguous symbol how can I do?

    • try msxml3.dll

      Posted by yachli on 11/01/2006 12:31am

      then modify it like this #import using namespace MSXML2; and replace all MSXMl:: to MSXML2:: error occured: releaseminsize\msxml3.tli(49) : error C2872: 'DOMNodeType' : ambiguous symbol same error as my other codes.what should i do?

    • it's ok now

      Posted by yachli on 11/01/2006 12:26am

      I modify NODE_TEXT to MSXML::NODE_TEXT, everything is ok. it's a great sample,thank you!

  • Excellent Article -- update for VC++ 7.1?

    Posted by KentB on 12/07/2004 03:17pm

    This article is excellent. It compiles and runs with no problem on Visual C++ version 6.0. With Version 7.1 (part of Visual Studio .NET 2003), there are several "ambiguous symbol" errors when compiling -- perhaps because some of these symbols are now defined by the compiler or included SDK. It would be great if the included code was updated for VC++ 7.1.

    • Need to use namespace

      Posted by Tom Archer on 05/03/2005 11:24am

      I recently needed to use some of this code for the first time in a very long time and this person's comments fixed the compiler errors you're getting (as a result of the Platform SDK): http://www.codeguru.com/Cpp/misc/misc/comments.php/c3707/?thread=3099

  • Nice article

    Posted by rodon on 10/14/2004 04:58pm

    Well written. Thanks.

    • Possible update and MC++ article

      Posted by Tom Archer on 10/14/2004 06:26pm

      Thanks Eric. I'm thinking of updating this article and even writing a MC++ version. Would you find that useful?

  • Getting rid of compiler errors under VC7.1

    Posted by kingbolete on 08/19/2004 04:28pm

    When compiling under VC++ .Net03 use
    
    #import  named_guids
    using namespace MSXML2;
     
    as the only import.  Change any MSXML:: to MSXML2::.  
    To get rid of ambiuous symbols, prefix them with MSXML2::.
