XMLite: Simple DOM-Based XML Parser




Environment: VC6

Why XMLite?

In my previous project, I needed a simple XML parser. I was working with a Jabber server and, because I was short on time, I used a Jabber client library called jabbercom, a Win32 COM-based DLL module. The Jabber protocol (http://www.jabber.org) is based on XML, but the XML parser in that library was not complete enough for my project. First, it could not handle Korean text (and probably other languages, too), and it had no escape-character processing and no entity encoding or decoding. I had to replace the XML parser engine, but I couldn't use MSXML or expat; for my purposes they were too heavy to install or too hard to use. So, I decided to create XMLite. It is not a full-featured XML parser, but it is simple and small, so I hope it will help someone.

Using XMLite

XMLite has two main data structures, 'XNode' and 'XAttr'. XNode represents an XML element node and XAttr represents an XML attribute node. Each XNode holds its child XNodes and its own attribute list, XAttrs. The source code is short and simple, so I hope you will find it easy to read and easy to use in your own project.
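As a rough mental model of that structure, here is a minimal sketch in modern standard C++ (C++14, so not VC6). The names Attr, Node, append_child, and attr_value are made up for illustration; they are not XMLite's real declarations, which work with CString and LPXNode pointers as the examples below show.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for XAttr: one name/value pair.
struct Attr {
    std::string name;
    std::string value;
};

// Hypothetical stand-in for XNode: an element with a name, optional text,
// its own attribute list, and a list of child elements that it owns.
struct Node {
    std::string name;
    std::string value;
    std::vector<Attr> attrs;
    std::vector<std::unique_ptr<Node>> children;

    // Append a child element and return a non-owning pointer to it.
    Node* append_child(const std::string& child_name,
                       const std::string& text = "") {
        children.push_back(std::make_unique<Node>());
        Node* child = children.back().get();
        child->name = child_name;
        child->value = text;
        return child;
    }

    // Return the value of the named attribute, or "" if it is absent.
    std::string attr_value(const std::string& attr_name) const {
        for (const Attr& a : attrs)
            if (a.name == attr_name) return a.value;
        return "";
    }
};
```

The point of the sketch is only the shape of the tree: every element owns its attributes and its children, so the whole document hangs off a single root node.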

1. XML parsing

XMLite parses plain text that contains a single root XML element, as shown below.

  CString sxml;
  sxml = _T("\
<TAddress desc='book of bro'>\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddress>");

  XNode xml;
  xml.Load( sxml );

  AfxMessageBox(xml.GetXML());

The message box displays the parsed XML (see the screenshot at the top of the article).

2. Traversing the parsed XML

  CString sxml;
  sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");

  XNode xml;
  xml.Load( sxml );

  int i;
  XNodes childs;

  // Traversing the children of the DOM tree
  // Method 1: GetChildCount() and GetChild()
  // Result: TPerson, TPerson, TPerson, TInformation
  LPXNode child;
  for( i = 0 ; i < xml.GetChildCount(); i++)
  {
    child = xml.GetChild(i);
    AfxMessageBox( child->GetXML() );
  }

  // Method 2: XNodes and GetChilds() (same result
  // as with method 1)
  // Result: TPerson, TPerson, TPerson, TInformation
  childs = xml.GetChilds();
  for( i = 0 ; i < (int)childs.size(); i++)
    AfxMessageBox( childs[i]->GetXML() );

  // Method 3: select children by name with GetChilds()
  // Result: TPerson, TPerson, TPerson
  childs = xml.GetChilds(_T("TPerson") );
  for( i = 0 ; i < (int)childs.size(); i++)
  {
    AfxMessageBox( childs[i]->GetXML() );
  }

  // Method 4: get an attribute value of a child
  // Result: 3
  AfxMessageBox( xml.GetChildAttrValue( _T("TInformation"),
                                        _T("count") ) );
  int count = XStr2Int( xml.GetChildAttrValue( _T("TInformation"),
                                               _T("count") ));
  ASSERT( count == 3 );
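Selecting children by name amounts to a filter over the child list. The following standard C++ sketch shows the idea under the assumption that names are compared exactly; Child and select_children are hypothetical names used only for illustration and are not part of XMLite.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a child element: just its tag name and text.
struct Child {
    std::string name;
    std::string text;
};

// Return pointers to the children whose tag name equals `name` exactly.
// An empty `name` selects every child, like a call without a name filter.
std::vector<const Child*> select_children(const std::vector<Child>& children,
                                          const std::string& name = "") {
    std::vector<const Child*> selected;
    for (const Child& c : children)
        if (name.empty() || c.name == name) selected.push_back(&c);
    return selected;
}
```

With the four children of the address book above, this sketch returns all four when called without a name and only the three TPerson entries when called with "TPerson".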

3. Modifying the DOM

  CString sxml;
  sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");

  XNode xml;
  xml.Load( sxml );

  // Remove the 'bro' node (the first TPerson child)
  LPXNode child_bro = xml.GetChild(0);
  xml.RemoveChild( child_bro );

  AfxMessageBox(xml.GetXML());

Result: the 'bro' node has been removed.

<TAddressBook description='book of bro' >
  <TPerson type='friend' >
    <Name>Baik,Ji Hoon</Name>
    <Nick>bjh</Nick>
  </TPerson>
  <TPerson type='friend' >
    <Name>Bak,Gun Joo</Name>
    <Nick>dichter</Nick>
  </TPerson>
  <TInformation count='3' />
</TAddressBook>
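Conceptually, removing a child means finding it in the parent's child list, detaching it, and freeing its subtree. The sketch below shows that in standard C++ (C++14) with owning pointers; Node and remove_child are illustrative names and make no claim about how XMLite's RemoveChild is written.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Hypothetical element type: a name plus the children it owns.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Detach and destroy `target` if it is a direct child of `parent`.
// Returns true when a child was removed, false when it was not found.
bool remove_child(Node& parent, const Node* target) {
    auto it = std::find_if(
        parent.children.begin(), parent.children.end(),
        [target](const std::unique_ptr<Node>& c) { return c.get() == target; });
    if (it == parent.children.end()) return false;
    parent.children.erase(it);  // the unique_ptr frees the whole subtree
    return true;
}
```

After a successful call the pointer that was passed in is dangling, so a caller should not use it again; the same caution is sensible for an LPXNode that has been handed to a remove function.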

4. Error handling

XMLite reports XML parsing errors, but its error handling is not complete.

  CString serror_xml;
  serror_xml = _T("<XML>\
<NoCloseTag type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick>\
</XML>");

  XNode xml;
  PARSEINFO pi;
  xml.Load( serror_xml, &pi );

  if( pi.error_occur )    // did a parse error occur?
  {
    //result: '<NoCloseTag> ... </XML>' is not well-formed.
    AfxMessageBox( pi.error_string );
    AfxMessageBox( xml.GetXML()  );
  }
  else
    ASSERT(FALSE);

Then, the result is:

'<NoCloseTag> ... </XML>' is not well-formed.
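A common way to detect this kind of error is to keep a stack of open tag names and compare each closing tag against the top of the stack. The following standard C++ sketch shows that idea only; it is not XMLite's implementation. The function name check_tags and its messages are made up, and the sketch ignores comments, CDATA, and '>' characters inside attribute values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns "" when every opening tag has a matching closing tag, otherwise
// a short message describing the first problem found.
std::string check_tags(const std::string& xml) {
    std::vector<std::string> open;  // names of tags that are still open
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        std::size_t end = xml.find('>', pos);
        if (end == std::string::npos) return "unterminated tag";
        std::string tag = xml.substr(pos + 1, end - pos - 1);  // between < and >
        pos = end + 1;
        if (tag.empty()) return "empty tag";
        if (tag[0] == '/') {                         // closing tag
            std::string name = tag.substr(1);
            if (open.empty()) return "unexpected </" + name + ">";
            if (open.back() != name)
                return "<" + open.back() + "> closed by </" + name + ">";
            open.pop_back();
        } else if (tag.back() != '/') {              // opening tag, not <x/>
            open.push_back(tag.substr(0, tag.find_first_of(" \t\r\n")));
        }
    }
    if (!open.empty()) return "<" + open.back() + "> is never closed";
    return std::string();
}
```

For the broken input used above, the sketch stops at the closing XML tag because the innermost open element at that point is still NoCloseTag.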

5. Entity and escape char test

XMLite handles escape characters. The escape character is '\', as it is in C/C++. It also handles entities. The entity table is shown below:

Special character   Special meaning   Entity encoding
<                   Begins a tag      &lt;
>                   Ends a tag        &gt;
"                   Quotation mark    &quot;
'                   Apostrophe        &apos;
&                   Ampersand         &amp;

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='<\\'asdf\\\">'>asdf</TAG>\
</XML>");

  XNode xml;
  PARSEINFO pi;
  xml.Load( sxml, &pi );

  AfxMessageBox( xml.GetXML()  );

Result:

<XML>
  <TAG attr='&lt;&apos;asdf&quot;&gt;' >asdf</TAG>
</XML>
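Decoding works in the opposite direction from the entity table: each known reference in the text is replaced by the character it stands for. The sketch below shows this in standard C++ for the five entities listed above; Entity, kDefaultEntities, and decode_entities are hypothetical names, and this is not XMLite's own code.

```cpp
#include <cstddef>
#include <string>

// One row of the entity table: a character and its reference.
struct Entity {
    char ch;
    const char* ref;
};

// The five predefined XML entities.
static const Entity kDefaultEntities[] = {
    {'&', "&amp;"}, {'<', "&lt;"}, {'>', "&gt;"},
    {'"', "&quot;"}, {'\'', "&apos;"},
};

// Replace each known entity reference in `text` with its character.
// Unknown references are copied through unchanged. The input is scanned
// once, so "&amp;lt;" becomes "&lt;" and is not decoded a second time.
std::string decode_entities(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const Entity& e : kDefaultEntities) {
                const std::string ref = e.ref;
                if (text.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;
                    i += ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += text[i++];
    }
    return out;
}
```

Scanning in a single pass is a deliberate choice here: decoding repeatedly until nothing changes would corrupt text that legitimately contains an encoded ampersand.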

6. Configuring parsing and display

XMLite can trim whitespace from values while parsing, and by default it inserts new lines when it displays XML. Both behaviors are configurable.

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='   qwer      '>        asdf       </TAG>\
</XML>");

  XNode xml;

  xml.Load( sxml );
  AfxMessageBox( xml.GetXML()  );

  PARSEINFO pi;
  pi.trim_value = true;    // trim value
  xml.Load( sxml, &pi );
  AfxMessageBox( xml.GetXML()  );

  DISP_OPT opt;
  opt.newline = false;    // no new line

  AfxMessageBox( xml.GetXML( &opt )  );

Result: before (default parsing and display)

<XML>
  <TAG attr='   qwer      ' >        asdf       </TAG>
</XML>

Result: after (trimmed values, no new lines)

<XML><TAG attr='qwer' >asdf</TAG></XML>
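Trimming a value comes down to dropping leading and trailing whitespace while leaving the inner text alone. Here is a small standard C++ sketch of that operation; the function name trim and the exact set of whitespace characters (space, tab, CR, LF) are assumptions for illustration, not taken from XMLite.

```cpp
#include <cstddef>
#include <string>

// Return `s` without leading and trailing whitespace.
// Whitespace inside the value is preserved.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";  // empty or all whitespace
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

Applied to the attribute and element values of the example, this gives "qwer" and "asdf", which matches the trimmed output shown above.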

7. Custom entity table

XMLite lets you customize the entity table for special parsing and display needs. You define your own entity table and pass it to the parser and to the display options.

  CString sxml;
  sxml = _T("<XML>\
<TAG attr='&asdf>'></TAG>\
</XML>");

  // customized entity list
  static const XENTITY entity_table[] = {
    { '<', _T("&lt;"), 4 } ,
    { '&', _T("&amp;"), 5 }
  };
  XENTITYS entities( (LPXENTITY)entity_table, 2 ) ;

  PARSEINFO pi;
  XNode xml;
  pi.entity_value = true;       // force to use custom entities
  pi.entities = &entities;
  xml.Load( sxml, &pi );
  AfxMessageBox( xml.GetXML()  );

  DISP_OPT opt;
  opt.entities = &entities;
  opt.reference_value = true;   // force to use custom entities

  AfxMessageBox( xml.GetXML( &opt )  );
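To see what a reduced entity table means for output, the following standard C++ sketch encodes a string with a caller-supplied table. Entity and encode_entities are hypothetical names for illustration; the sketch does not reproduce XMLite's XENTITYS class.

```cpp
#include <string>
#include <vector>

// One row of a custom entity table: a character and its reference.
struct Entity {
    char ch;
    std::string ref;
};

// Replace every character that appears in `table` with its reference.
// Characters that are not in the table are copied unchanged.
std::string encode_entities(const std::string& text,
                            const std::vector<Entity>& table) {
    std::string out;
    for (char c : text) {
        bool replaced = false;
        for (const Entity& e : table) {
            if (e.ch == c) {
                out += e.ref;
                replaced = true;
                break;
            }
        }
        if (!replaced) out += c;
    }
    return out;
}
```

With a table that holds only the entries for '<' and '&', this sketch turns "&asdf>" into "&amp;asdf>": the ampersand is encoded, while '>' passes through because the custom table does not list it.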

Downloads

Download demo project - 41 Kb
Download source - 7 Kb


Comments

  • http://www.codeproject.com/cpp/xmlite.asp - updated...

    Posted by Legacy on 01/20/2004 12:00am

    Originally posted by: bro

    Thanks for regard, All!!
    I posted xmlite on codeproject , too.
    some bug is fixed about '\' and other stuff..
    http://www.codeproject.com/cpp/xmlite.asp

    it's little difficult to update xmlite on codeguru..
    thank you !!

  • visual studio.net support

    Posted by Legacy on 01/18/2004 12:00am

    Originally posted by: stofke

    Hello,

    I get a lot of errors while trying to compile on visual studio.net...

    Four times this error occurs:

    c:\Documents and Settings\stofke\Bureaublad\XMLite\XMLite.cpp(840): error C2451: conditional expression of type 'std::vector<_Ty,_Ax>::iterator' is illegal
    with
    [
    _Ty=LPXNode,
    _Ax=std::allocator<LPXNode >
    ]
    No user-defined-conversion operator available that can perform this conversion, or the operator cannot be called

    I guess it has something to do with STL issues...
    But i can't find a solution.

    Does anybody now what to do ? :)

    thx in advance
    Kristof Leroux

  • How to store '\' in value

    Posted by Legacy on 12/22/2003 12:00am

    Originally posted by: StXh

    In c/c++ the '\' is escape char so '\\' mean '\'. But in XMlite, '\\' mean null. How to store '\' in value, like path name "c:\program files\XMLiteTest" ?

  • The Ill-defined transformation.

    Posted by Legacy on 05/22/2003 12:00am

    Originally posted by: AVB

    Help to do so that given code worked ok.
    Problemma, in that that after repeated conservation in file and boot, file constantly increases.
    Beside most knowledges comes short

  • Why wrong works??

    Posted by Legacy on 04/05/2003 12:00am

    Originally posted by: AVB

    //Why wrong works??
    void CTestXMLiteDlg::OnButton1()
    {
    XNode xml;
    LPXNode lpxml;
    lpxml=xml.AppendChild("1","first");
    lpxml=lpxml->AppendChild("2","second");
    lpxml=lpxml->AppendChild("3","third");
    AfxMessageBox(xml.GetXML());
    CString sxml=xml.GetXML();
    if( xml.Load( sxml ) == NULL )
    {
    AfxMessageBox(_T("error"));
    return;
    }
    AfxMessageBox(xml.GetXML());
    }

  • Great!

    Posted by Legacy on 03/28/2003 12:00am

    Originally posted by: Magallo

    Simple and clean. I really appreciate that! Great Job! Thank you so much!

  • It is very thing which I want for long time

    Posted by Legacy on 01/06/2003 12:00am

    Originally posted by: Ha Hyoung Wook

    I respect you. very good contribution.

    It will used very useful to my program.

    thanks for your contribution .


