An XML Collection

Environment: Visual C/C++

Introduction

Have you ever wanted a simple MFC-based collection for handling an ANSI XML file? Here is one possible solution.

CXXMLFile : (Class) eXtended XML File

This class reads a file while requiring very little information about it. It assumes the HTML/XML conventions for symbols and tags. The <? ... ?> tag is currently ignored, which means that no codepage support is available (this keeps the code simpler).
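The skipping of <? ... ?> can be pictured with the short fragment below. It is only a sketch in standard C++, not the class's own code: SkipDeclaration is a hypothetical helper, and std::string stands in for CString so that the fragment compiles without MFC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of CXXMLFile): if `pos` points at "<?",
// return the index just past the matching "?>". If `pos` does not point at
// "<?", return `pos` unchanged. Returns std::string::npos when the
// declaration is never closed.
std::size_t SkipDeclaration(const std::string& xml, std::size_t pos) {
  assert(pos <= xml.size());
  if (xml.compare(pos, 2, "<?") != 0) return pos;           // not a declaration
  const std::size_t end = xml.find("?>", pos + 2);
  if (end == std::string::npos) return std::string::npos;   // unclosed "<?"
  return end + 2;                                            // first char after "?>"
}
```

A caller would invoke this before looking for the next tag and treat std::string::npos as the "Tag <? unclosed" error case.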

When CXXMLFile reads a file, it creates a tag tree with three types of nodes, as shown in this illustration:





Legend:

  • CElementPart. Abstract base class for all node types. It holds the common members m_Text, m_Parent, and m_Type.
  • CText. Derived from CElementPart. It represents a piece of text and can be surrounded by nodes of any kind. Symbol replacement is applied here, for example &nbsp; in HTML code.
  • CComment. A comment of the form <!--something-->. It is not affected by symbol replacement.
  • CElement. A tag. It has three properties: m_Text (the tag name), AttributeToValue (a string map from attribute name to attribute value), and, by inheritance, a CList of child nodes.

Consequently, only a CElement node can have child nodes.
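The node hierarchy described in the legend can be sketched in standard C++ as follows. This is an illustration under assumptions, not the class's implementation: Node, Element, Text, and Comment are hypothetical stand-ins for CElementPart, CElement, CText, and CComment, and std::vector with std::unique_ptr replaces the inherited CList.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for CElementPart: the data shared by every node type.
struct Node {
  enum class Type { Element, Text, Comment };
  Type type;
  std::string text;        // tag name, text content, or comment body
  Node* parent = nullptr;  // non-owning link to the enclosing element
  explicit Node(Type t) : type(t) {}
  virtual ~Node() = default;
};

// Stand-in for CElement: the only node type that owns children.
struct Element : Node {
  std::map<std::string, std::string> attributes;  // attribute name -> value
  std::vector<std::unique_ptr<Node>> children;
  Element() : Node(Type::Element) {}

  // Append a child and record this element as its parent.
  Node* Add(std::unique_ptr<Node> child) {
    assert(child != nullptr);
    child->parent = this;
    children.push_back(std::move(child));
    return children.back().get();
  }
};

// Stand-ins for CText and CComment: leaf nodes that only carry text.
struct Text : Node { Text() : Node(Type::Text) {} };
struct Comment : Node { Comment() : Node(Type::Comment) {} };
```

Destroying an Element releases its whole subtree, which is the job the CElement destructor does by hand in the MFC version.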

CXXMLFile Class Reference

List of all members.

Public Methods

CXXMLFile ()
Constructor.

virtual ~CXXMLFile ()
Destructor.

void RemoveAll ()
Delete all entries and make a default root node.

void AddSymbol (CString coded, CString decoded)
Adds a symbol.

void DefaultSymbols ()
Set default symbols.

void ClearSymbols ()
Clear symbols.

void AddOpenTag (CString tag)
Adds an open tag (for example, HTML <br>).

bool Write ()
Writes XML file.

CString GetFile ()
Get XML filename.

void SetFile (CString filename)
Set XML filename.

bool Read ()
Reads the XML file.

CElementPart * Root ()
Gets the root. (Create one if it's empty.)

CElementPart * AddElement (CElementPart *Parent)
Adds a node type element.

CElementPart * AddComment (CElementPart *Parent, CString text)
Adds a comment node.

CElementPart * AddText (CElementPart *Parent, CString text)
Adds a text node.

void SetText (CElementPart *node, CString text)
Sets text property in a node.

void GetText (CElementPart *node, CString &text)
Gets text property in a node.

bool IsElement (CElementPart *node)
Determines whether the node is an element.

bool IsComment (CElementPart *node)
Determines whether the node is a comment.

bool IsText (CElementPart *node)
Determines whether the node is text.

CMapStringToString * GetElementAttrMap (CElementPart *node)
Returns a pointer to the attribute map of the element.

bool BuildChildList (CElementPart *node, CList< CElementPart *, CElementPart * > &l)
Builds a list of child nodes.
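AddOpenTag, listed above, registers tag names such as HTML <br> that never receive a closing tag. The class lowercases the name and keeps it in a map that is used as a set. The fragment below is a minimal sketch of that idea in standard C++ rather than MFC; OpenTags and its members are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the open-tag table: a case-insensitive set of
// tag names that are written and read without a closing tag.
class OpenTags {
 public:
  void Add(std::string tag) {
    assert(!tag.empty());
    tags_.insert(Lower(std::move(tag)));
  }
  bool Contains(std::string tag) const {
    return tags_.count(Lower(std::move(tag))) != 0;
  }

 private:
  // Lowercase a copy so that "BR", "Br", and "br" all compare equal.
  static std::string Lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    return s;
  }
  std::set<std::string> tags_;
};
```

A real std::set avoids the unused second value that the map-as-set approach has to carry.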

 

Public Attributes

CStringList m_ErrorList

Private Methods

void Decodify (CString &html)
Decodes text using the symbol table.

void Codify (CString &html)
Encodes text using the symbol table.

void WritePart (CStdioFile *f, CElementPart *p, int Depth, bool bNoIdent=false)
Writes an XML node to a file (used by Write only).

Private Attributes

CElementPart * m_Root
CMapStringToString m_Symbols
CString m_Filename
CMapStringToString m_OpenTags
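The symbol table behind AddSymbol, Codify, and Decodify maps a coded form (such as &lt;) to its decoded form (such as <). The following is a minimal sketch of that translation in standard C++, assuming std::map and std::string in place of CMapStringToString and CString; SymbolTable and Translate are hypothetical names, not members of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// coded -> decoded, for example "&lt;" -> "<" (the role m_Symbols plays).
using SymbolTable = std::map<std::string, std::string>;

// Translate `s` in one left-to-right pass. With decode == true every coded
// symbol becomes its decoded form; with decode == false the mapping runs in
// the other direction. Output is built separately, so replaced text is
// never scanned again.
std::string Translate(const SymbolTable& symbols, const std::string& s,
                      bool decode) {
  std::string out;
  std::size_t pos = 0;
  assert(pos <= s.size());
  while (pos < s.size()) {
    bool matched = false;
    for (const auto& entry : symbols) {
      const std::string& from = decode ? entry.first : entry.second;
      const std::string& to = decode ? entry.second : entry.first;
      if (!from.empty() && s.compare(pos, from.size(), from) == 0) {
        out += to;
        pos += from.size();
        matched = true;
        break;
      }
    }
    if (!matched) out += s[pos++];  // ordinary character: copy it through
  }
  return out;
}
```

Note that once &nbsp; is registered for the space character, encoding turns every space into &nbsp;, which is why the example later in this article calls ClearSymbols() before Write().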

Header


// XXMLFile.h: interface for the CXXMLFile class.
/////////////////////////////////////////////////////////////////

#if !defined(AFX_XXMLFILE_H__F5E3CD25_0B84_4191_A1A7_B3669180DFFA__INCLUDED_)
#define AFX_XXMLFILE_H__F5E3CD25_0B84_4191_A1A7_B3669180DFFA__INCLUDED_

#if _MSC_VER > 1000
#pragma once
#endif // _MSC_VER > 1000

#include <afxtempl.h>


//*****************************************************************//
/*!  \class CXXMLFile
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief A class to work with XML files as tree collections.
 */
class CXXMLFile  {
public:
  CXXMLFile();
  virtual ~CXXMLFile();
public:
  // Abstract node
  class CElementPart{
  public:
    CElementPart* m_Parent;
    CString m_Text;
    enum TType { TElement, TText, TComment } m_Type;
  public:
    // Start detached; the owner sets m_Parent when the node is added
    CElementPart() : m_Parent(NULL) {}
    virtual ~CElementPart() {}
  };

  // Element node
  class CElement : public CElementPart, 
                   public CList<CElementPart*, CElementPart*>{
  public:
    CElement(){ m_Type=TElement; };
    virtual ~CElement();
  public:
    CMapStringToString AttributeToValue;
  };

  // Text node
  class CText : public CElementPart{
  public:
    CText(){ m_Type=TText; };
  };

  // Comment node
  class CComment : public CElementPart{
  public:
    CComment(){ m_Type=TComment; };
  };
public:
  // The error list, after executing Write() or Read() (if any)
  CStringList m_ErrorList;

  // Symbol table functions
  void RemoveAll();
  void AddSymbol(CString coded, CString decoded);
  void DefaultSymbols();
  void ClearSymbols();
  void AddOpenTag(CString tag);

  // File manipulation routines
  bool Write();
  CString GetFile();
  void SetFile(CString filename);
  bool Read();

  // Tree manipulation routines
  CElementPart* Root();
  CElementPart* AddElement(CElementPart* Parent);
  CElementPart* AddComment(CElementPart* Parent, CString text);
  CElementPart* AddText(CElementPart* Parent, CString text);
  void SetText(CElementPart* node, CString text);
  void GetText(CElementPart* node, CString &text);
  bool IsElement(CElementPart* node);
  bool IsComment(CElementPart* node);
  bool IsText(CElementPart* node);
  CMapStringToString* GetElementAttrMap(CElementPart* node);
  bool BuildChildList( CElementPart* node, 
                       CList<CElementPart*,
                       CElementPart*> &l);
private:
  // The XML tree
  CElementPart* m_Root;

  // Symbol codification functions
  void Decodify(CString &html);
  void Codify(CString &html);
  CMapStringToString m_Symbols; // encoded to decoded

  // Part management
  void WritePart( CStdioFile *f,
                  CElementPart * p,
                  int Depth,
                  bool bNoIdent=false);

  // Filename
  CString m_Filename;

  // Open tags map (acts like a set, so second value is ignored)
  CMapStringToString m_OpenTags;
};

#endif   // !defined(
   // AFX_XXMLFILE_H__F5E3CD25_0B84_4191_A1A7_B3669180DFFA__INCLUDED_)

Source


// XXMLFile.cpp: implementation of the CXXMLFile class.
//
//////////////////////////////////////////////////////////////////////

#include "stdafx.h"
#include "XXMLFile.h"

#ifdef _DEBUG
#undef THIS_FILE
static char THIS_FILE[]=__FILE__;
#define new DEBUG_NEW
#endif

//////////////////////////////////////////////////////////////////////
// Construction/Destruction
//////////////////////////////////////////////////////////////////////


//******************************************************************//
/*!  \fn CXXMLFile::CXXMLFile()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Constructor
 */
CXXMLFile::CXXMLFile()
{
  // The root is null
  m_Root=NULL;

  // clean tables, and create a default root node
  RemoveAll();

  // set default symbols and open tags
  DefaultSymbols();
}


//******************************************************************//
/*!  \fn CXXMLFile::~CXXMLFile()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Destructor
 */
CXXMLFile::~CXXMLFile()
{
  // clean up root (and then, the full tree)
  if(m_Root!=NULL) delete m_Root;
}


//******************************************************************//
/*!  \fn CXXMLFile::CElement::~CElement()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Element destructor (clean up the list of elements)
 */
CXXMLFile::CElement::~CElement(){
  // You must clean up the possible list of elements
  while(!IsEmpty()) delete RemoveHead();
};


//******************************************************************//
/*!  \fn bool IsSeparator(char ch)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Tells whether a character is an XML separator (white space).
 */
static bool IsSeparator(char ch){
  switch (ch){
  case ' ': return true;
  case '\t': return true;
  case '\r': return true;
  case '\n': return true;
  default: return false;
  };
};

//******************************************************************//
/*!  \fn bool HopSeparators(CString &html, 
 *                          int &pos,int &FileRow)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief For internal use (Read()). It skips separators in 
 * a string by advancing an index, counting line feeds in FileRow.
 */
static bool HopSeparators( CString &html,
                           int &pos,
                           int &FileRow){
  while( (pos<html.GetLength()) &&
         (IsSeparator(html.GetAt(pos))) ){
    if(html.GetAt(pos)=='\n') 
      FileRow++; // by reference, so the caller's line counter advances
    pos++;
  }
  // false means the end of the string was reached
  return pos<html.GetLength();
};

//******************************************************************//
/*!  \fn int CountChars(CString s,char ch)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief A brute-force character counter.
 */
static int CountChars(CString s,char ch){
  return s.Replace(CString()+ch,"");
};

//******************************************************************//
/*!  \fn int FindChars(CString &html, int pos, LPCTSTR chars)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief For internal use (Read()). It sequentially looks for any
 * character of a given set, starting at a given position in a string.
 */
static int FindChars(CString &html, int pos, LPCTSTR chars){
  int tmp=pos;
  CString seps = chars;
  while(tmp<html.GetLength()) {
    if(seps.FindOneOf( CString(html.GetAt(tmp)) )!=-1){
      return tmp;
    }
    tmp++;
  }
  return -1;
};


//******************************************************************//
/*!  \fn bool CXXMLFile::Read()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief The read XML function.
 */
bool CXXMLFile::Read()
{
  CString filename = m_Filename;
  int FileRow=1;

  // Init Tree
  if(m_Root!=NULL){
    delete m_Root;
    m_Root=NULL; // so the new root gets a NULL parent, not a dangling one
  }
  CElementPart** crut = &m_Root;
  CElement* Element = new CElement();
  Element->m_Parent=(*crut);
  (*crut)=Element;
  ((CElement*)(*crut))->m_Text = "?root?";

  // Clean up errors
  m_ErrorList.RemoveAll();

  // Read the whole text file into memory in one pass (the fastest way)
  CString html;
  TRY{
    // The CFile constructor throws a CFileException if the file cannot
    // be opened, so no NULL check is needed; a stack object also avoids
    // leaking the file if a later call throws.
    CFile f(filename,
            CFile::modeRead|CFile::shareDenyNone);
    int len = (int)f.GetLength();
    char * buf = html.GetBufferSetLength(len);
    f.SeekToBegin();
    f.Read(buf,len);
    f.Close();
    html.ReleaseBuffer(len);
    html.FreeExtra();
    html.Replace("\r\n","\n");
  }CATCH_ALL(e){
    m_ErrorList.AddTail("Error: cannot read file.");
    return false;
  }END_CATCH_ALL

  int p1,p2,p3,p4;
  p1=p2=p3=p4=0;
  while(true){
    if(p1>=html.GetLength()) return true;
    p2 = html.Find('<',p1);
    if(p2==-1){
     m_ErrorList.AddTail("Warning at line "+CString(FileRow)+
                         ": There's some text before EOF, ignoring.");
      return true;
    } 
    CString text = html.Mid(p1,p2-p1);
    Decodify(text);
    if(!text.IsEmpty()){
      if( (*crut)==NULL ){
        m_ErrorList.AddTail("Warning at line "+CString(FileRow)+
 ": No tag active but text found, must be at the start, ignoring.");
      } else {
        CText * t = new CText();
        t->m_Text = text;
        t->m_Parent = (*crut); // link the text node to its element
        ((CElement*)(*crut))->AddTail( t );
        FileRow+=CountChars(text,'\n');
      }
    } 

    // tag part
    p1=p2+1;
    if(p1>=html.GetLength()){ 
      m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                          ": Tag started but EOF found.");
      return false;
    } else
    if(html.Mid(p1,3)=="!--"){ // comment
      p1=p1+3;
      p2 = html.Find("-->",p1);
      // Check for the closing mark before extracting the comment body
      if(p2==-1){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Comment tag unclosed.");
        return false;
      }
      CString text = html.Mid(p1,p2-p1);
      if( (*crut)==NULL ){
        m_ErrorList.AddTail("Warning at line "+CString(FileRow)+
                        ": No tag active but comment found, ignoring.");
      } else {
        CComment * t = new CComment();
        t->m_Text = text;
        t->m_Parent = (*crut); // link the comment node to its element
        ((CElement*)(*crut))->AddTail( t );
        FileRow+=CountChars(text,'\n');
      }
      p1=p2+3;
    } else 
    if(html.GetAt(p1)=='?'){ // ?xml or something to avoid
      p2 = html.Find("?>",p1+1);
      if(p2==-1){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Tag <? unclosed.");
        return false;
      } else {
        p1=p2+2;
        continue;
      }
    } else 
    if(html.GetAt(p1)=='/'){ // an end tag
      p2 = html.Find('>',p1);
      if(p2==-1){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Tag unclosed.");
        return false;
      }
      p1++;
      CString tagname = html.Mid(p1,p2-p1);
      tagname.TrimLeft();
      tagname.TrimRight();
      if( (*crut)==NULL ){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Closing tag when no tag???.");
        return false;
      }
      if( (*crut)->m_Text!=tagname ){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Closing tag differs from open tag.");
        return false;
      }
      (*crut)=(*crut)->m_Parent;
      p1=p2+1;
    } else { // a start tag
      // < _ tag ...
      if(!HopSeparators(html,p1,FileRow)){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Unexpected EOF.");
        return false;
      }
      // < tag _ ...
      p2=FindChars(html,p1," \t\r\n>");
      if(p2==-1){
        m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                            ": Unexpected EOF.");
        return false;
      }
      CString tag = html.Mid(p1,p2-p1);
      CElement* Element = new CElement();
      Element->m_Parent=(*crut);
      ((CElement*)(*crut))->AddTail( ((CElementPart*)Element) );
      (*crut)=Element;
      ((CElement*)(*crut))->m_Text = tag;
      p1=p2;
      while(true){
        // _ >... 
        // _ value="...
        if(!HopSeparators(html,p1,FileRow)){
          m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                              ": Unexpected EOF.");
          return false;
        }
        if(html.GetAt(p1)=='>'){
          // _ >... 
          p1+=1;
          tag.MakeLower(); // tag string will be no longer used
          CString value;
          if(m_OpenTags.Lookup(tag,value)){
            // It's an open tag, that means that it 
            // will never be closed
            // Eg: <br> in HTML
            (*crut)=(*crut)->m_Parent; // closing node
          }
          break;
        };
        if(html.Mid(p1,2)=="/>"){
          // _ />... 
          // This kind of tag means that no close tag is expected.
          p1+=2;
          (*crut)=(*crut)->m_Parent; // collapse node
          break;
        };
        // _ value="...
        p2=html.Find("=\"",p1);
        if(p2==-1){
          m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                              ": Unexpected attribute form.");
          return false;
        }
        CString valname = html.Mid(p1,p2-p1);
        valname.TrimLeft();
        valname.TrimRight();
        p1=p2+2;
        CString value;
        while(true){
          if(p1>=html.GetLength()){
            m_ErrorList.AddTail("Error at line "+CString(FileRow)+
                                ": Unexpected EOF.");
            return false;
          } else
          if(html.GetAt(p1)=='\"'){
            p1++;
            break;
          } else
          if(html.Mid(p1,2)=="\\\""){
            value+="\"";
            p1+=2;
          } else {
            value+=html.GetAt(p1);
            p1++;
          }
        }
        Element->AttributeToValue[valname]=value;
      }
    }
  }
}


//******************************************************************//
/*!  \fn void CXXMLFile::SetFile(CString filename)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Set XML filename.
 */
void CXXMLFile::SetFile(CString filename)
{
  m_Filename=filename;
}


//******************************************************************//
/*!  \fn CString CXXMLFile::GetFile()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Get XML filename.
 */
CString CXXMLFile::GetFile()
{
  return m_Filename;
}


//******************************************************************//
/*!  \fn void CXXMLFile::WritePart(CStdioFile *f,
 *                                 CXXMLFile::CElementPart * p,
 *                                 int Depth,
 *                                 bool bNoIdent){
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Writes an XML node to a file (used by Write only).
 *
 *   \param f The target file.
 *   \param p The tree node.
 *   \param Depth The node depth (for indentation).
 *   \param bNoIdent Boolean value used to cancel indentation 
 * if the node is alone, that is, without children and siblings.
 */
void CXXMLFile::WritePart( CStdioFile *f,
                           CElementPart * p,
                           int Depth,
                           bool bNoIdent){
  int j;
  if(p->m_Type==CElementPart::TElement){
    {// Write indentation
      // Avoid line-feed at the beginning of the file
      if(f->GetPosition()!=0) 
        f->WriteString("\n"); 
      // the indentation with spaces
      for(j=0;j<Depth;j++) 
        f->WriteString(" ");
    }
    // Write tag header
    f->WriteString("<"+p->m_Text);

    // Write values
    POSITION pos = ((CElement*)p)->
                      AttributeToValue.GetStartPosition();
    while(pos!=NULL){
      CString AtName,AtValue;
      ((CElement*)p)->AttributeToValue.GetNextAssoc(pos,
                                                       AtName,
                                                       AtValue);
      f->WriteString(" "+AtName+"=\""+AtValue+"\"");
    };

    // If the Element does not have subelements, sub-texts, or 
    // sub-comments write it closed
    if(((CElement*)p)->IsEmpty()) {
      // Is an open tag (like HTML <br>)?
      CString tag;
      tag = p->m_Text;
      tag.MakeLower();
      CString value;
      if(m_OpenTags.Lookup(tag,value)){
        // It's an open tag, that means that it will never be closed
        // Eg: <br> in HTML
        f->WriteString(">");
      } else {
        // Not an open tag
        f->WriteString("/>");
      }
    } else {
      f->WriteString(">");

      // Optimization: suppress indentation when the element has a single child.
      bool NoIdent = false;
      if(((CElement*)p)->GetCount()==1)
        NoIdent=true;

      // For each element...
      pos = ((CElement*)p)->GetHeadPosition();
      while(pos!=NULL){
        CElementPart*e = ((CElement*)p)->GetAt(pos);
        WritePart(f,e,Depth+1,NoIdent);
        ((CElement*)p)->GetNext(pos);
      }

      // Optimization: suppress indentation when the element has a single child.
      if(!NoIdent){
        {// Write indentation
          // Avoid line-feed at the beginning of the file
          if(f->GetPosition()!=0) 
            f->WriteString("\n"); 
          // the indentation with spaces
          for(j=0;j<Depth;j++) 
            f->WriteString(" ");
        }      
      }

      // Close tag
      f->WriteString("</"+p->m_Text+">");
    }
  } else 
  if(p->m_Type==CElementPart::TComment){
    // Write comment AS-IS
    {// Write indentation
      // Avoid line-feed at the beginning of the file
      if(f->GetPosition()!=0) 
        f->WriteString("\n"); 
      // the indentation with spaces
      for(j=0;j<Depth;j++) 
        f->WriteString(" ");
    }
    f->WriteString("<!--"+p->m_Text+"-->");
  } else 
  if(p->m_Type==CElementPart::TText){
    // Write text only if it is not blank (a blank text may hold only separators)
    CString empty_string = p->m_Text;
    empty_string.Replace('\n',' ');
    empty_string.Replace('\r',' ');
    empty_string.Replace('\t',' ');
    while(0!=empty_string.Replace("  "," "));
    if((!empty_string.IsEmpty())&&(empty_string!=" ")) {
      if(!bNoIdent){
        {// Write indentation
          // Avoid line-feed at the beginning of the file
          if(f->GetPosition()!=0) 
            f->WriteString("\n"); 
          // the indentation with spaces
          for(j=0;j<Depth;j++) 
            f->WriteString(" ");
        }
      }

      // The string must be written with symbol codification
      CString text = p->m_Text;
      Codify(text);
      f->WriteString(text);
    }
  } 
};


//******************************************************************//
/*!  \fn bool CXXMLFile::Write()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Writes XML file.
 */
bool CXXMLFile::Write()
{
  m_ErrorList.RemoveAll();

  // check tree
  if(m_Root==NULL) {
    m_ErrorList.AddTail("Error: NULL tree.");
    return false;
  }
  if(m_Root->m_Type!=CElementPart::TElement) {
    m_ErrorList.AddTail("Error: tree root is not an Element.");

    return false; // root must be TElement
  }
  if(((CElement*)m_Root)->m_Text!="?root?") {
    m_ErrorList.AddTail("Error: tree root is not named ?root?");
    return false; // bad-written root
  }

  // Open the output file. Open() returns FALSE on failure, whereas the
  // CStdioFile constructor would throw instead of yielding a NULL pointer.
  CStdioFile f;
  if(!f.Open(m_Filename,
             CFile::modeCreate|CFile::modeWrite|
             CFile::shareDenyNone|CFile::typeText)){
    m_ErrorList.AddTail("Error: cannot open '"+m_Filename+
                        "' for writing.");
    return false; 
  } 

  CElement * root = ((CElement*)m_Root);
  POSITION pos = root->GetHeadPosition();
  while(pos!=NULL){
    CElementPart * p = root->GetAt(pos);
    WritePart(&f,p,0);
    root->GetNext(pos);
  }

  f.Close();

  return true;
}


//******************************************************************//
/*!  \fn void CXXMLFile::DefaultSymbols()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Set default symbols.
 */
void CXXMLFile::DefaultSymbols()
{
  AddSymbol("&lt;","<");
  AddSymbol("&gt;",">");
  AddSymbol("&quot;","\"");
  AddSymbol("&nbsp;"," ");
  AddSymbol("&apos;","'");
  AddSymbol("&amp;","&");
  m_OpenTags["br"]; // operator[] inserts the key; the value is unused
}


//******************************************************************//
/*!  \fn void CXXMLFile::ClearSymbols()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Clear symbols.
 */
void CXXMLFile::ClearSymbols()
{
  // Clear symbol table
  m_Symbols.RemoveAll();

  // Remove open tags
  m_OpenTags.RemoveAll();
}


//******************************************************************//
/*!  \fn void CXXMLFile::AddSymbol(CString coded, CString decoded)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Adds a symbol.
 */
void CXXMLFile::AddSymbol(CString coded, CString decoded)
{
  m_Symbols[coded]=decoded;
}


//******************************************************************//
/*!  \fn void CXXMLFile::RemoveAll()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Delete all entries and make a default root node.
 */
void CXXMLFile::RemoveAll()
{
  // Clear symbol table and open tags
  ClearSymbols();

  // Clear tree and create the default main node
  if(m_Root!=NULL){
    delete m_Root;
    m_Root=NULL; // so the new root gets a NULL parent, not a dangling one
  }
  CElementPart** crut = &m_Root;
  CElement* Element = new CElement();
  Element->m_Parent=(*crut);
  (*crut)=Element;
  ((CElement*)(*crut))->m_Text = "?root?";
}


//******************************************************************//
/*!  \fn void CXXMLFile::Codify(CString &html)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Codify using symbol table.
 */
void CXXMLFile::Codify(CString &html)
{
  int pos = 0;
  while(pos<html.GetLength()){
    POSITION p = m_Symbols.GetStartPosition();
    while(p!=NULL){
      CString coded, decoded;
      m_Symbols.GetNextAssoc(p,coded,decoded);
      if((pos+decoded.GetLength())<=html.GetLength()){
        if(html.Mid(pos,decoded.GetLength())==decoded){
          html = html.Left(pos) + coded + 
                 html.Mid(pos+decoded.GetLength());
          pos = pos + coded.GetLength()-1;
          break;
        }
      }
    }
    pos++;
  }
}


//******************************************************************//
/*!  \fn void CXXMLFile::Decodify(CString &html)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Decodify using symbol table.
 */
void CXXMLFile::Decodify(CString &html)
{
  int pos = 0;
  while(pos<html.GetLength()){
    POSITION p = m_Symbols.GetStartPosition();
    while(p!=NULL){
      CString coded, decoded;
      m_Symbols.GetNextAssoc(p,coded,decoded);
      if((pos+coded.GetLength())<=html.GetLength()){
        if(html.Mid(pos,coded.GetLength())==coded){
          html = html.Left(pos) + decoded + 
                 html.Mid(pos+coded.GetLength());
          pos = pos + decoded.GetLength()-1;
          break;
        }
      }
    }
    pos++;
  }
}

//******************************************************************//
/*!  \fn void CXXMLFile::AddOpenTag(CString tag)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Adds an open tag. (HTML <br>)
 */
void CXXMLFile::AddOpenTag(CString tag){
  tag.MakeLower();
  m_OpenTags[tag];
}

//******************************************************************//
/*!  \fn CXXMLFile::CElementPart* CXXMLFile::Root()
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Gets the root. (Create one if it's empty).
 */
CXXMLFile::CElementPart* CXXMLFile::Root(){
  if(m_Root!=NULL) 
    return (CXXMLFile::CElementPart*)m_Root;
  CElementPart** crut = &m_Root;
  CElement* Element = new CElement();
  Element->m_Parent=(*crut);
  (*crut)=Element;
  ((CElement*)(*crut))->m_Text = "?root?";
  return Element;
}

//******************************************************************//
/*!  \fn CXXMLFile::CElementPart* 
                        CXXMLFile::AddElement(CElementPart* Parent)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Adds a node type element
 */
CXXMLFile::CElementPart* CXXMLFile::AddElement(CElementPart* Parent){
  // Only TElement nodes support children
  if(Parent->m_Type!=CXXMLFile::CElementPart::TElement)
    return NULL;
  CXXMLFile::CElement * elem = (CXXMLFile::CElement*)Parent;  
  CXXMLFile::CElement * new_elem = new CElement();
  new_elem->m_Parent = Parent; // link the new node to its parent
  elem->AddTail( ((CXXMLFile::CElementPart*)new_elem) );
  return new_elem;
}


//******************************************************************//
/*!  \fn void CXXMLFile::SetText(CElementPart* node, CString text)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Sets text property in a node.
 */
void CXXMLFile::SetText(CElementPart* node, CString text){
  node->m_Text = text;
};

//******************************************************************//
/*!  \fn void CXXMLFile::GetText(CElementPart* node, 
                                 CString &text)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Gets text property in a node.
 */
void CXXMLFile::GetText(CElementPart* node, CString &text){
  text = node->m_Text;
};

//******************************************************************//
/*!  \fn CXXMLFile::CElementPart* 
           CXXMLFile::AddComment(CElementPart* Parent, CString text)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Adds a comment node
 */
CXXMLFile::CElementPart* CXXMLFile::AddComment(CElementPart* Parent,
                                               CString text){
  // Only TElement nodes support children
  if(Parent->m_Type!=CXXMLFile::CElementPart::TElement)
    return NULL;
  CXXMLFile::CElement * elem = (CXXMLFile::CElement*)Parent;  
  CXXMLFile::CComment * new_elem = new CComment();
  new_elem->m_Text = text;     // store the comment body
  new_elem->m_Parent = Parent; // link the new node to its parent
  elem->AddTail( ((CXXMLFile::CElementPart*)new_elem) );
  return new_elem;
};

//******************************************************************//
/*!  \fn CXXMLFile::CElementPart* 
           CXXMLFile::AddText(CElementPart* Parent, CString text)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Adds a text node
 */
CXXMLFile::CElementPart* CXXMLFile::AddText(CElementPart* Parent,
                                            CString text){
  // Only TElement nodes support children
  if(Parent->m_Type!=CXXMLFile::CElementPart::TElement)
    return NULL;
  CXXMLFile::CElement * elem = (CXXMLFile::CElement*)Parent;  
  CXXMLFile::CText * new_elem = new CText();
  new_elem->m_Text = text;     // store the text content
  new_elem->m_Parent = Parent; // link the new node to its parent
  elem->AddTail( ((CXXMLFile::CElementPart*)new_elem) );
  return new_elem;
}

//******************************************************************//
/*!  \fn bool CXXMLFile::IsElement(CElementPart* node)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Determines whether the node is an element.
 */
bool CXXMLFile::IsElement(CElementPart* node)
{
  return (node->m_Type==CXXMLFile::CElementPart::TElement);
}

//******************************************************************//
/*!  \fn bool CXXMLFile::IsComment(CElementPart* node)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Determines whether the node is a comment.
 */
bool CXXMLFile::IsComment(CElementPart* node)
{
  return (node->m_Type==CXXMLFile::CElementPart::TComment);
}

//******************************************************************//
/*!  \fn bool CXXMLFile::IsText(CElementPart* node)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Determines whether the node is text.
 */
bool CXXMLFile::IsText(CElementPart* node)
{
  return (node->m_Type==CXXMLFile::CElementPart::TText);
}

//******************************************************************//
/*!  \fn CMapStringToString* 
                CXXMLFile::GetElementAttrMap(CElementPart* node)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Returns a pointer to the attribute map of the element.
 */
CMapStringToString* CXXMLFile::GetElementAttrMap(CElementPart* node)
{
  // Only TElement nodes have attributes
  if(node->m_Type!=CXXMLFile::CElementPart::TElement)
    return NULL;
  CXXMLFile::CElement * elem = (CXXMLFile::CElement*)node;  
  return &elem->AttributeToValue;
}

//******************************************************************//
/*!  \fn bool CXXMLFile::BuildChildList(CElementPart* node,
                                        CList<CElementPart*,
                                        CElementPart*> &l)
 *   \author Manuel Lucas Viqas Livschitz (MLVL)
 *   \date 04/05/2002
 *   \brief Builds a list of child nodes.
 */
bool CXXMLFile::BuildChildList(CElementPart* node,
                               CList<CElementPart*,
                               CElementPart*> &l)
{
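  // Only TElement nodes support children
  if(node->m_Type!=CXXMLFile::CElementPart::TElement)
    return false;
  l.RemoveAll();
  // Copy each child pointer (not the node itself) into the output list
  CXXMLFile::CElement * elem = (CXXMLFile::CElement*)node;
  POSITION pos = elem->GetHeadPosition();
  while(pos!=NULL)
    l.AddTail( elem->GetNext(pos) );
  return true;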
}

Example

The simplest example loads an XML file into a CXXMLFile collection and then writes the collection back out as an XML file.

We will run the test with books.xml (from the MSDN XML SDK).
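The write half of such a round trip can be pictured as a recursive serializer. The fragment below is a simplified sketch in standard C++, not the WritePart routine itself: Elem and WriteElem are hypothetical names, attribute values and text are written without symbol encoding, and the line-break placement is simplified compared to the class.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Simplified element: a name, attributes, optional text, child elements.
struct Elem {
  std::string name;
  std::map<std::string, std::string> attrs;
  std::string text;  // ignored in this sketch when children are present
  std::vector<Elem> children;
};

// Serialize `e` at the given depth, one space of indentation per level.
void WriteElem(std::ostream& out, const Elem& e, int depth) {
  assert(depth >= 0);
  const std::string indent(static_cast<std::size_t>(depth), ' ');
  out << indent << '<' << e.name;
  for (const auto& a : e.attrs)
    out << ' ' << a.first << "=\"" << a.second << '"';
  if (e.text.empty() && e.children.empty()) {
    out << "/>\n";  // nothing inside: write the tag closed
    return;
  }
  out << '>';
  if (e.children.empty()) {
    out << e.text;  // a lone text child stays on the same line
  } else {
    out << '\n';
    for (const auto& c : e.children) WriteElem(out, c, depth + 1);
    out << indent;
  }
  out << "</" << e.name << ">\n";
}
```

Keeping a lone text child on the same line as its tags mirrors the purpose of the bNoIdent parameter in WritePart.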

Source Code (extract)

// XMLTest.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "XMLTest.h"

....

// read-write test
    CXXMLFile xml;
    xml.SetFile("books.xml");
    bool bok = xml.Read();

    if(!bok){
      POSITION pos = xml.m_ErrorList.GetHeadPosition();
      while(pos!=NULL){
        printf(xml.m_ErrorList.GetAt(pos)+"\n");
        xml.m_ErrorList.GetNext(pos);
      }
    } else {
      xml.SetFile("books_out.xml");
      xml.ClearSymbols(); // no HTML symbols (to avoid converting
                          // spaces into &nbsp;)
      bok = xml.Write();
      if(!bok){
        POSITION pos = xml.m_ErrorList.GetHeadPosition();
        while(pos!=NULL){
          printf("%s\n", (LPCTSTR)xml.m_ErrorList.GetAt(pos)); // never use the message itself as the format string
          xml.m_ErrorList.GetNext(pos);
        }
      }
    }
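
The call to ClearSymbols() matters because text nodes go through symbol replacement when they are written. The sketch below shows the general idea in standard C++; it is an illustration under assumptions, not the class's actual implementation. SymbolTable and EncodeText are hypothetical names, and the (coded, decoded) pair order simply follows the AddSymbol signature.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol table: (coded, decoded) pairs, e.g. ("&nbsp;", " ").
using SymbolTable = std::vector<std::pair<std::string, std::string>>;

// Replace every occurrence of each decoded string with its coded form.
std::string EncodeText(const std::string& text, const SymbolTable& symbols) {
    std::string out = text;
    for (const auto& symbol : symbols) {
        const std::string& coded = symbol.first;
        const std::string& decoded = symbol.second;
        if (decoded.empty()) continue;  // an empty key would never advance
        std::string::size_type pos = 0;
        while ((pos = out.find(decoded, pos)) != std::string::npos) {
            out.replace(pos, decoded.size(), coded);
            pos += coded.size();        // continue after the inserted symbol
        }
    }
    return out;
}
```

With a table holding only ("&nbsp;", " "), the title "Midnight Rain" would be written as "Midnight&nbsp;Rain"; with an empty table, which is the situation after ClearSymbols(), the text passes through unchanged. In a scheme like this the order of entries matters whenever one coded form contains another entry's decoded string.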

Result

Books.xml
<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk107">
      <author>Thurman, Paula</author>
      <title>Splish Splash</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2000-12-06</publish_date>
      <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
   </book>
   <book id="bk109">
      <author>Kress, Peter</author>
      <title>Paradox Lost</title>
      <genre>Science Fiction</genre>
      <price>6.95</price>
      <publish_date>2000-11-02</publish_date>
      <description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description>
   </book>
   <book id="bk111">
      <author>O'Brien, Tim</author>
      <title>MSXML3: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2000-12-01</publish_date>
      <description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description>
   </book>
   <book id="bk112">
      <author>Galos, Mike</author>
      <title>Visual Studio 7: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>49.95</price>
      <publish_date>2001-04-16</publish_date>
      <description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description>
   </book>
</catalog>
 
Books_out.xml
<catalog>
 <book id="bk101">
  <author>Gambardella, Matthew</author>
  <title>XML Developer's Guide</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
  <description>An in-depth look at creating applications 
      with XML.</description>
 </book>
 <book id="bk102">
  <author>Ralls, Kim</author>
  <title>Midnight Rain</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-12-16</publish_date>
  <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
 </book>
 <book id="bk103">
  <author>Corets, Eva</author>
  <title>Maeve Ascendant</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-11-17</publish_date>
  <description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description>
 </book>
 <book id="bk104">
  <author>Corets, Eva</author>
  <title>Oberon's Legacy</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2001-03-10</publish_date>
  <description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description>
 </book>
 <book id="bk105">
  <author>Corets, Eva</author>
  <title>The Sundered Grail</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2001-09-10</publish_date>
  <description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description>
 </book>
 <book id="bk106">
  <author>Randall, Cynthia</author>
  <title>Lover Birds</title>
  <genre>Romance</genre>
  <price>4.95</price>
  <publish_date>2000-09-02</publish_date>
  <description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description>
 </book>
 <book id="bk107">
  <author>Thurman, Paula</author>
  <title>Splish Splash</title>
  <genre>Romance</genre>
  <price>4.95</price>
  <publish_date>2000-11-02</publish_date>
  <description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description>
 </book>
 <book id="bk108">
  <author>Knorr, Stefan</author>
  <title>Creepy Crawlies</title>
  <genre>Horror</genre>
  <price>4.95</price>
  <publish_date>2000-12-06</publish_date>
  <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
 </book>
 <book id="bk109">
  <author>Kress, Peter</author>
  <title>Paradox Lost</title>
  <genre>Science Fiction</genre>
  <price>6.95</price>
  <publish_date>2000-11-02</publish_date>
  <description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description>
 </book>
 <book id="bk110">
  <author>O'Brien, Tim</author>
  <title>Microsoft .NET: The Programming Bible</title>
  <genre>Computer</genre>
  <price>36.95</price>
  <publish_date>2000-12-09</publish_date>
  <description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description>
 </book>
 <book id="bk111">
  <author>O'Brien, Tim</author>
  <title>MSXML3: A Comprehensive Guide</title>
  <genre>Computer</genre>
  <price>36.95</price>
  <publish_date>2000-12-01</publish_date>
  <description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description>
 </book>
 <book id="bk112">
  <author>Galos, Mike</author>
  <title>Visual Studio 7: A Comprehensive Guide</title>
  <genre>Computer</genre>
  <price>49.95</price>
  <publish_date>2001-04-16</publish_date>
  <description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description>
 </book>
</catalog>
 

Downloads

CXXMLFile class
Read-Write Sample


Comments

  • CXMLParser::Read() is very slow with an 800K XML file

    Posted by gomyway on 01/07/2005 04:02am

    CString operations are slow when the string is large. Is it possible to read and parse the file in small pieces? Great work anyway.

  • How to retrieve items?

    Posted by Legacy on 10/05/2003 12:00am

    Originally posted by: kitarolivier

    How can I "scan" the tree to get elements?
    I get the root, and then what? How can I get the child and next items?

    Thanks.

    • See the sample code

      Posted by gomyway on 01/07/2005 04:04am

      CXMLElement * root = ((CXMLElement*)xml.Root());
      CXMLElement * ptr = root->GetRootOf("NeuralNetwork");

      if(ptr==NULL) return;

      POSITION pos = ptr->GetHeadPosition();
      // Read hidden nodes
      int jj=loj;
      while(pos!=NULL){
        CXMLElementPart * pHidden = ptr->GetAt(pos);
        ptr->GetNext(pos);

  • int-to-string conversion missing...

    Posted by Legacy on 08/07/2003 12:00am

    Originally posted by: Leonhardt Wille

    Hi there,
    I get a compilation error that there's no constructor for CString(int).
    I used a small inline function for the conversion:
    (lines 177ff., bool CXXMLFile::Read())

    __forceinline CString itos(int i)
    {
      CString csRet;
      csRet.Format("%d",i);
      return csRet;
    }

    p.s. I forgot to say that your work is great

  • Re: Encoding

    Posted by Legacy on 07/09/2002 12:00am

    Originally posted by: M.L. Viñas Livschitz

    The encoding system was left out to keep the class simple. The encoding/decoding system is open source and very easy for me to use, but it would make the code very confusing. In addition, CXXMLFile represents an XML tree in memory, which is not practical with large data. But as I said, this code is for people who need a very simple class.

  • try #import <msxml3.dll>

    Posted by Legacy on 05/17/2002 12:00am

    Originally posted by: Theos

    Try:

    --------------------
    #import <msxml3.dll>
    using namespace MSXML2;
    --------------------
    All you need is the MSXML SDK.
    You can access, modify, and search elements.
