XMLite: Simple DOM-Based XML Parser



Environment: VC6

Why XMLite?

In a previous project, I needed a simple XML parser. I was working with a Jabber server and, because I was short on time, I used a Jabber client library called jabbercom, a Win32 COM-based DLL module. The Jabber protocol (http://www.jabber.org) is based on XML, but the XML parser in that library was not complete enough for my project: it could not handle Korean text (and perhaps other languages), it did no escape-character processing, and it had no support for encoding or decoding entities. I had to replace the XML parser engine, but I could not use MSXML or expat, because they are either heavy to install or hard to use. So I decided to write XMLite. It is not a full-featured XML parser, but it is simple and small, and I hope it will help someone.

Using XMLite

XMLite has two main data structures, ‘XNode’ and ‘XAttr’. XNode represents an XML element node and XAttr represents an XML attribute node. Each XNode holds its child XNodes and its own attribute list (XAttrs). The source code is short and simple, so I hope you will find it easy to read and easy to put to use.
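The relationship between the two structures can be pictured with a small standard-C++ sketch. This is not XMLite's actual source: SketchAttr, SketchNode, AppendChild, FindAttr, and ChildsNamed are hypothetical names used only for illustration, and std::string stands in for CString.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for XAttr: one name/value pair.
struct SketchAttr {
    std::string name;
    std::string value;
};

// Hypothetical stand-in for XNode: an element that owns its
// attribute list and its child elements.
struct SketchNode {
    std::string name;                                  // tag name
    std::string value;                                 // text content
    std::vector<SketchAttr> attrs;                     // own attributes
    std::vector<std::unique_ptr<SketchNode>> childs;   // child elements

    // Append a child element and return a pointer to it.
    SketchNode* AppendChild(const std::string& childName,
                            const std::string& text = "") {
        childs.push_back(std::unique_ptr<SketchNode>(new SketchNode()));
        childs.back()->name = childName;
        childs.back()->value = text;
        return childs.back().get();
    }

    // Return the value of the named attribute, or "" if it is absent.
    std::string FindAttr(const std::string& attrName) const {
        for (const SketchAttr& a : attrs)
            if (a.name == attrName) return a.value;
        return "";
    }

    // Collect the direct children whose tag name matches.
    std::vector<const SketchNode*> ChildsNamed(const std::string& tag) const {
        std::vector<const SketchNode*> found;
        for (const auto& child : childs)
            if (child->name == tag) found.push_back(child.get());
        return found;
    }
};
```

In this sketch each node owns its children through std::unique_ptr, so destroying the root releases the whole tree. That ownership model is a choice made for the example and says nothing about how XMLite manages memory.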

1. XML parsing

XMLite parses plain text that contains a single root XML element, as shown below.

  CString sxml;
  sxml = _T("\
<TAddress desc='book of bro'>\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddress>");

  XNode xml;
  xml.Load( sxml );

  AfxMessageBox(xml.GetXML());

The result is shown in the screenshot at the top of the article.

2. Traversing the parsed XML

  CString sxml;
  sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");

  XNode xml;
  xml.Load( sxml );

  int i;
  XNodes childs;

  // Traversing the children of the DOM tree
  // Method 1: GetChildCount() and GetChild()
  // Result: TPerson, TPerson, TPerson, TInformation
  LPXNode child;
  for( i = 0 ; i < xml.GetChildCount(); i++)
  {
    child = xml.GetChild(i);
    AfxMessageBox( child->GetXML() );
  }

  // Method 2: XNodes and GetChilds() (same result
  // as with method 1)
  // Result: TPerson, TPerson, TPerson, TInformation
  childs = xml.GetChilds();
  for( i = 0 ; i < (int)childs.size(); i++)
    AfxMessageBox( childs[i]->GetXML() );

  // Method 3: select children by tag name with GetChilds()
  // (the name must match the tag in the XML above)
  // Result: TPerson, TPerson, TPerson
  childs = xml.GetChilds(_T("TPerson") );
  for( i = 0 ; i < (int)childs.size(); i++)
  {
    AfxMessageBox( childs[i]->GetXML() );
  }

  // Method 4: get an attribute value of a child
  // Result: 3
  AfxMessageBox( xml.GetChildAttrValue( _T("TInformation"),
                                        _T("count") ) );
  int count = XStr2Int( xml.GetChildAttrValue( _T("TInformation"),
                                               _T("count") ));
  ASSERT( count == 3 );

3. Modifying the DOM

  CString sxml;
  sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");

  XNode xml;
  xml.Load( sxml );

  // remove the 'bro' node (the first child)
  LPXNode child_bro = xml.GetChild(0);
  xml.RemoveChild( child_bro );

  AfxMessageBox(xml.GetXML());

Result: the 'bro' node has been removed.

<TAddressBook description='book of bro' >
  <TPerson type='friend' >
    <Name>Baik,Ji Hoon</Name>
    <Nick>bjh</Nick>
  </TPerson>
  <TPerson type='friend' >
    <Name>Bak,Gun Joo</Name>
    <Nick>dichter</Nick>
  </TPerson>
  <TInformation count='3' />
</TAddressBook>

4. Error handling

XMLite reports parse errors, although its error handling is not complete.

  CString serror_xml;
  serror_xml = _T("<XML>\
<NoCloseTag type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick>\
</XML>");

  XNode xml;
  PARSEINFO pi;
  xml.Load( serror_xml, &pi );

  if( pi.error_occur )    // did a parse error occur?
  {
    // result: '<NoCloseTag> ... </XML>' is not well-formed.
    AfxMessageBox( pi.error_string );
    AfxMessageBox( xml.GetXML()  );
  }
  else
    ASSERT(FALSE);

The error message is:

'<NoCloseTag> ... </XML>' is not well-formed.
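A parser can catch this kind of error by remembering which tags are still open. The sketch below is not XMLite's algorithm, only a minimal stack-based check under simplifying assumptions: no comments, CDATA sections, or processing instructions, and no '>' characters inside attribute values. CheckTags is a hypothetical helper, and its message format merely imitates the one shown above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns "" if every opening tag is closed in the right order,
// otherwise a short error message. Hypothetical helper.
std::string CheckTags(const std::string& xml) {
    std::vector<std::string> open;   // names of tags not yet closed
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        const std::size_t end = xml.find('>', pos);
        if (end == std::string::npos)
            return "unterminated tag";
        const std::string body = xml.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (body.empty())
            return "empty tag";

        if (body[0] == '/') {                        // closing tag
            const std::string name = body.substr(1);
            if (open.empty())
                return "unexpected closing tag </" + name + ">";
            if (open.back() != name)
                return "'<" + open.back() + "> ... </" + name +
                       ">' is not well-formed.";
            open.pop_back();
        } else if (body.back() != '/') {             // opening tag
            // The tag name ends at the first whitespace character.
            open.push_back(body.substr(0, body.find_first_of(" \t\r\n")));
        }
        // Self-closing tags such as <E/> need no bookkeeping.
    }
    if (!open.empty())
        return "'<" + open.back() + ">' is never closed.";
    return "";
}
```

Fed the malformed input from the example above, this check stops at </XML> because the innermost open tag is still NoCloseTag.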

5. Entities and escape characters

XMLite supports escape processing. The escape character is '\', as it is in C/C++. It also supports entity processing. The entity table is shown below:

Special character   Special meaning   Entity encoding
<                   Begins a tag      &lt;
>                   Ends a tag        &gt;
"                   Quotation mark    &quot;
'                   Apostrophe        &apos;
&                   Ampersand         &amp;
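On the parsing side, entity processing amounts to a table-driven substitution. The following is a minimal sketch in standard C++, not XMLite's code: EntityRef, kEntities, and DecodeEntities are hypothetical names, and std::string stands in for CString.

```cpp
#include <cstddef>
#include <string>

// One row of the entity table: the character and its encoded form.
struct EntityRef {
    char ch;
    const char* ref;
};

// The five entities from the table above.
static const EntityRef kEntities[] = {
    { '<',  "&lt;"   },
    { '>',  "&gt;"   },
    { '"',  "&quot;" },
    { '\'', "&apos;" },
    { '&',  "&amp;"  },
};

// Replace every known entity reference with its character in a
// single left-to-right pass. Unknown references are copied as-is.
std::string DecodeEntities(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const EntityRef& e : kEntities) {
                const std::string ref = e.ref;
                if (text.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;
                    i += ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched)
            out += text[i++];
    }
    return out;
}
```

Because the text is scanned only once, the output of a substitution is never decoded again, so "&amp;lt;" becomes "&lt;" rather than "<".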

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='<\\'asdf\\\">'>asdf</TAG>\
</XML>");

  XNode xml;
  PARSEINFO pi;
  xml.Load( sxml, &pi );

  AfxMessageBox( xml.GetXML()  );

Result:

<XML>
  <TAG attr='<'asdf">' >asdf</TAG>
</XML>
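To show how a backslash escape lets a quote character survive inside a quoted attribute value, here is a small sketch in standard C++. It is not XMLite's implementation; ReadQuotedValue is a hypothetical helper, and std::string stands in for CString.

```cpp
#include <cstddef>
#include <string>

// Reads a quoted value starting at text[pos], which must be ' or ".
// A backslash makes the next character literal, so an escaped quote
// does not end the value. On success, stores the unescaped value in
// `out`, advances `pos` past the closing quote, and returns true.
bool ReadQuotedValue(const std::string& text, std::size_t& pos,
                     std::string& out) {
    if (pos >= text.size() || (text[pos] != '\'' && text[pos] != '"'))
        return false;
    const char quote = text[pos];
    std::string value;
    for (std::size_t i = pos + 1; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\\' && i + 1 < text.size()) {
            value += text[++i];      // keep the escaped character literally
        } else if (c == quote) {
            out = value;             // closing quote found
            pos = i + 1;
            return true;
        } else {
            value += c;
        }
    }
    return false;                    // no closing quote
}
```

Applied to the attribute text from the example above, the escaped apostrophe and the escaped quotation mark both end up inside the value, and scanning resumes at the '>' that closes the tag.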

6. Configuring parsing and display

XMLite can trim whitespace from values while parsing. When generating output, it inserts newlines by default, and this can be turned off.

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='   qwer      '>        asdf       </TAG>\
</XML>");

  XNode xml;

  xml.Load( sxml );
  AfxMessageBox( xml.GetXML()  );

  PARSEINFO pi;
  pi.trim_value = true;    // trim value
  xml.Load( sxml, &pi );
  AfxMessageBox( xml.GetXML()  );

  DISP_OPT opt;
  opt.newline = false;    // no new line

  AfxMessageBox( xml.GetXML( &opt )  );

Result before (default options):

<XML>
  <TAG attr='   qwer      ' >        asdf       </TAG>
</XML>

Result after (trim_value = true, newline = false):

<XML><TAG attr='qwer' >asdf</TAG></XML>
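The effect of trimming on a single value can be sketched in a few lines of standard C++. This only illustrates the idea and assumes that trimming means stripping leading and trailing spaces, tabs, and line breaks; TrimValue is a hypothetical helper, not an XMLite function.

```cpp
#include <cstddef>
#include <string>

// Strip leading and trailing whitespace; inner whitespace is kept.
std::string TrimValue(const std::string& s) {
    const char* const ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";                               // all whitespace
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```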

7. Custom entity table

XMLite lets you replace the default entity table with your own, for special-purpose parsing and display.

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='&asdf>'></TAG>\
</XML>");

  // customized entity list
  static const XENTITY entity_table[] = {
    { '<', _T("&lt;"), 4 } ,
    { '&', _T("&amp;"), 5 }
  };
  XENTITYS entities( (LPXENTITY)entity_table, 2 ) ;

  PARSEINFO pi;
  XNode xml;
  pi.entity_value = true;       // force to use custom entities
  pi.entities = &entities;
  xml.Load( sxml, &pi );
  AfxMessageBox( xml.GetXML()  );

  DISP_OPT opt;
  opt.entities = &entities;
  opt.reference_value = true;   // force to use custom entities

  AfxMessageBox( xml.GetXML( &opt )  );
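On the display side, a custom table simply limits which characters get encoded. The sketch below illustrates that with standard C++ only; EntityTable and EncodeWith are hypothetical names and do not reflect how XMLite implements XENTITYS.

```cpp
#include <string>
#include <utility>
#include <vector>

// A custom entity table: (character, encoded form) pairs.
using EntityTable = std::vector<std::pair<char, std::string>>;

// Encode only the characters listed in `table`; everything else is
// copied through unchanged.
std::string EncodeWith(const EntityTable& table, const std::string& text) {
    std::string out;
    for (char c : text) {
        bool replaced = false;
        for (const auto& e : table) {
            if (e.first == c) {
                out += e.second;
                replaced = true;
                break;
            }
        }
        if (!replaced)
            out += c;
    }
    return out;
}
```

With a two-entry table like the one in the example above ('<' and '&' only), the '>' in the attribute value is left alone while the '&' is written as an entity.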

Downloads


Download demo project – 41 Kb


Download source – 7 Kb
