This is the second article of the tutorial. Because it is highly coupled with the first one, I recommend reading the first part before going any further.
Stating the Problem
In the first article, you saw what SAX is, what the Microsoft COM implementation of SAX looks like, and how you can write a simple parser for an XML document. I pointed out a couple of times there that you can register only one handler of each type (content handler, error handler, or DTD handler) at a time. Although this may suffice for some applications, it is a drawback for others.
Take a look at the XML document you used in the first article.
<?xml version="1.0" encoding="utf-8"?>
<store>
  <book isbn="10000001">
    <title>The Lord Of The Rings</title>
    <author>J.R.R. Tolkien</author>
  </book>
  <book>
    <title>Maitreyi</title>
    <author>Mircea Eliade</author>
  </book>
  <cd>
    <title>The Wall</title>
    <artist>Pink Floyd</artist>
    <track length="3:40">Another Brick in the Wall</track>
    <track length="5:33">Mother</track>
  </cd>
  <cd>
    <title>Come on Over</title>
    <artist>Shania Twain</artist>
    <track length="4:40">From This Moment On</track>
    <track length="3:33">You're Still The One</track>
  </cd>
</store>
You assumed that you had a store where books and CDs are sold and that this XML document is a catalog of all items in the store. In the first article, you wrote an application using SAX that parsed this document, created lists of books and CDs, and displayed them on the console. However, a single content handler processed both the books and the CDs.
Now you want more. You want to create a more complex application with separate components to handle the books and CDs. Moreover, you want a third component to handle only tracks, which means that tracks would be handled by two components at the same time. This scenario is not possible with the default implementation of SAX, so you’ll have to make it work.
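To make the goal concrete, here is a minimal, portable sketch of one way several components could receive the same event. It is an illustration only and not necessarily the design this tutorial arrives at. ElementHandler, CountingHandler, Dispatcher, and dispatchSample are hypothetical names, and the interface is a simplified stand-in for ISAXContentHandler that carries just the element name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a SAX content handler.
struct ElementHandler {
    virtual ~ElementHandler() = default;
    virtual void startElement(const std::string& name) = 0;
};

// Counts the start-element events for the element names it cares about.
class CountingHandler : public ElementHandler {
    std::vector<std::string> wanted_;
    int count_ = 0;

public:
    explicit CountingHandler(std::vector<std::string> wanted)
        : wanted_(std::move(wanted)) {}

    void startElement(const std::string& name) override {
        if (std::find(wanted_.begin(), wanted_.end(), name) != wanted_.end())
            ++count_;
    }

    int count() const { return count_; }
};

// The single handler the parser would know about; it forwards every
// event to all registered handlers (held as non-owning pointers).
class Dispatcher : public ElementHandler {
    std::vector<ElementHandler*> targets_;

public:
    void add(ElementHandler* handler) {
        assert(handler != nullptr);
        targets_.push_back(handler);
    }

    void startElement(const std::string& name) override {
        for (ElementHandler* handler : targets_)
            handler->startElement(name);
    }
};

// Simulates a few events: the CD component and the track component
// both see every <track> element.
std::pair<int, int> dispatchSample() {
    CountingHandler cdHandler(std::vector<std::string>{"cd", "track"});
    CountingHandler trackHandler(std::vector<std::string>{"track"});

    Dispatcher dispatcher;
    dispatcher.add(&cdHandler);
    dispatcher.add(&trackHandler);

    const std::vector<std::string> events = {"store", "book", "cd", "track", "track"};
    for (const std::string& name : events)
        dispatcher.startElement(name);

    return {cdHandler.count(), trackHandler.count()};
}
```

In this sketch the parser still sees exactly one content handler, which respects the one-handler-per-type restriction, while each component decides for itself which elements it reacts to.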
Approaching the Problem
The solution you will see is based on the sample code provided in the first article. The classes Book, CD, Track, Store, and SAXErrorHandlerImpl are the same as in the first article. SAXContentHandlerImpl, which was the base class for your former content handler and continues to be so, is also unchanged except for one new method, GetAttributeValue. As explained in the first article, this method is used to extract the value of an element’s attribute.
You will start by writing a specialized content handler for each category of information you are interested in. Thus, you will have three new content handlers: SAXBooks for books, SAXCDs for CDs, and SAXTracks for tracks.
class SAXBooks : public SAXContentHandlerImpl
{
    std::vector<Book> books;
    std::stack<void*> elements;
    bool hasText;
    long m_RefCount;

public:
    SAXBooks(void);
    virtual ~SAXBooks(void);

    unsigned long __stdcall AddRef(void)
    {
        return InterlockedIncrement(&m_RefCount);
    }

    unsigned long __stdcall Release(void)
    {
        long nRefCount = 0;
        nRefCount = InterlockedDecrement(&m_RefCount);
        if (nRefCount == 0) delete this;
        return nRefCount;
    }

    std::vector<Book> GetBooks() const { return books; }

    virtual HRESULT STDMETHODCALLTYPE startElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName,
        ISAXAttributes __RPC_FAR *pAttributes);

    virtual HRESULT STDMETHODCALLTYPE endElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName);

    virtual HRESULT STDMETHODCALLTYPE characters(
        wchar_t __RPC_FAR *pwchChars, int cchChars);
};

class SAXTracks : public SAXContentHandlerImpl
{
    std::vector<Track> tracks;
    std::stack<void*> elements;
    bool hasText;
    long m_RefCount;

public:
    SAXTracks(void);
    virtual ~SAXTracks(void);

    unsigned long __stdcall AddRef(void)
    {
        return InterlockedIncrement(&m_RefCount);
    }

    unsigned long __stdcall Release(void)
    {
        long nRefCount = 0;
        nRefCount = InterlockedDecrement(&m_RefCount);
        if (nRefCount == 0) delete this;
        return nRefCount;
    }

    std::vector<Track> GetTracks() const { return tracks; }

    virtual HRESULT STDMETHODCALLTYPE startElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName,
        ISAXAttributes __RPC_FAR *pAttributes);

    virtual HRESULT STDMETHODCALLTYPE endElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName);

    virtual HRESULT STDMETHODCALLTYPE characters(
        wchar_t __RPC_FAR *pwchChars, int cchChars);
};

class SAXCDs : public SAXContentHandlerImpl
{
    std::vector<CD> cds;
    std::stack<void*> elements;
    bool hasText;
    long m_RefCount;

public:
    SAXCDs(void);
    virtual ~SAXCDs(void);

    unsigned long __stdcall AddRef(void)
    {
        return InterlockedIncrement(&m_RefCount);
    }

    unsigned long __stdcall Release(void)
    {
        long nRefCount = 0;
        nRefCount = InterlockedDecrement(&m_RefCount);
        if (nRefCount == 0) delete this;
        return nRefCount;
    }

    std::vector<CD> GetCDs() const { return cds; }

    virtual HRESULT STDMETHODCALLTYPE startElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName,
        ISAXAttributes __RPC_FAR *pAttributes);

    virtual HRESULT STDMETHODCALLTYPE endElement(
        wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        wchar_t __RPC_FAR *pwchRawName, int cchRawName);

    virtual HRESULT STDMETHODCALLTYPE characters(
        wchar_t __RPC_FAR *pwchChars, int cchChars);
};
They follow exactly the same logic as the SAXStore content handler from the first article. As elements are encountered, they are pushed onto a stack, and they are popped off again when the matching end-element event is fired. If you still haven’t read the first article, now is the time to do so.
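The stack-based logic can be pictured with a small, portable sketch. It is not the MSXML code from this tutorial: BookCollector and collectSampleBook are hypothetical names, Book is a simplified stand-in for the article's class, and the stack holds element names as strings rather than the void* entries used by the real handlers.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the article's Book class.
struct Book {
    std::string title;
    std::string author;
};

// Portable sketch of a stack-driven handler; event methods take plain
// strings instead of the wchar_t buffers MSXML passes.
class BookCollector {
    std::vector<Book> books_;
    std::stack<std::string> elements_;  // names of the currently open elements
    Book current_;
    bool inBook_ = false;

public:
    void startElement(const std::string& name) {
        if (name == "book") {
            current_ = Book{};  // begin a fresh record
            inBook_ = true;
        }
        elements_.push(name);
    }

    void characters(const std::string& text) {
        if (!inBook_ || elements_.empty()) return;
        // The top of the stack tells which element this text belongs to.
        const std::string& open = elements_.top();
        if (open == "title") current_.title += text;
        else if (open == "author") current_.author += text;
    }

    void endElement(const std::string& name) {
        // In well-formed XML the closing element matches the top of the stack.
        assert(!elements_.empty() && elements_.top() == name);
        elements_.pop();
        if (name == "book") {
            books_.push_back(current_);  // the record is complete
            inBook_ = false;
        }
    }

    const std::vector<Book>& books() const { return books_; }
};

// Feeds the handler the events a parser would fire for one <book> element.
std::vector<Book> collectSampleBook() {
    BookCollector handler;
    handler.startElement("store");
    handler.startElement("book");
    handler.startElement("title");
    handler.characters("Maitreyi");
    handler.endElement("title");
    handler.startElement("author");
    handler.characters("Mircea Eliade");
    handler.endElement("author");
    handler.endElement("book");
    handler.endElement("store");
    return handler.books();
}
```

The text is appended rather than assigned in characters because a SAX parser may deliver the content of one element in several chunks.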