Environment: Internet Explorer (IE) 5, Visual C++ 6 (SP3)
There is no longer any need to introduce XML. A year ago it was still possible to bump into a colleague who had not heard of XML. Today it has become a household name.
What has happened is that XML is being promoted incredibly rapidly from yet another Internet format to the de facto lingua franca of interoperability. Huge investments are being made in XML-based technologies to support the growing need for platform-neutral data interchange in B2B (Business to Business) and EAI (Enterprise Application Integration), among others. This means that many a developer who was quietly enjoying the benefits of a hard-won proficiency in C++ will find himself propelled into the unfamiliar, ruthless world of XML.
Schemas (or maybe schemata is the correct plural) are at the core of XML technology. Their role is to provide metadata that describes the semantics of the data contained within XML documents (often referred to as business documents). Schemas are to replace the DTDs inherited from the SGML origins of XML. They are much easier to write than DTDs, and by using XML syntax they make parsing simpler. The trouble is that, until the W3C finalizes the XML Schema standard, there is no real standard to work with.
Tools, on the other hand, were needed yesterday (sound familiar? ;-). So vendors implement temporary proprietary standards, pledging to move to the W3C one once it is released. This means that we are going to have to live with that legacy, but that's another story...
Microsoft's current standard for XML schemas is XDR (XML-Data Reduced). It corresponds roughly to a subset of the W3C proposal in its current shape. While much more legible than DTDs, XDR schemas are not easy reading either. If you need to drill down into any real-life example, a tool will come as a relief.
XDRMDIVW
XML is about data and data structure. Rendering occurs by applying stylesheets to XML. These stylesheets usually live in XSL files. By including an <?xml-stylesheet?> processing instruction at the top of an XML file, you specify what sort of rendering you want for your XML data.
Now let's drill down into XDRMDIVW. As you may have guessed, it stands for XDR MDI Viewer. It is a straightforward MFC MDI application that uses CHtmlView to visualize XML documents.
The stylesheet used (xdr-schema.xsl, included in the zip file) can be downloaded from microsoft.com. It renders an XDR schema as a hyperlinked cross-reference of its parts and makes it more straightforward to understand how these parts relate to each other.
When you open an XML file (for some reason many XDR schemas come with a .xml extension), the program creates a temporary file, inserts a <?xml-stylesheet type="text/xsl" href="xdr-schema.xsl"?> processing instruction at the top of it, and then appends the contents of the source XML file. The temporary file is renamed with an .xml extension and loaded into the CHtmlView using the Navigate2() method.
To avoid cluttering up the disk with temporary files, they are removed when the CDocument is closed (OnCloseDocument).
Side Issues
I've used the CHyperLink class by Giancarlo Iovino to implement the hyperlinks in the About box. I do recommend it; it was simply plug-and-play!
The program also illustrates two useful tricks:
- Using the res: protocol with the WebBrowser control to load an HTML resource
- Subclassing the MDI client so as to be able to display an HTML page in this valuable piece of screen real estate.
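For the first trick, the res: protocol addresses a resource embedded in an executable or DLL with a URL of the form res://module/resource. The following is a minimal sketch of building such a URL; the helper name makeResUrl is hypothetical, the article does not show how XDRMDIVW itself builds the string, and the result would then be handed to the browser control's navigation method.

```cpp
#include <string>

// Hypothetical helper: builds a res: URL for a named HTML resource
// stored in the given module (EXE or DLL).
std::string makeResUrl(const std::string& modulePath,
                       const std::string& resourceName)
{
    return "res://" + modulePath + "/" + resourceName;
}
```

For example, with an assumed module path of C:\Tools\xdrmdivw.exe and an assumed resource named about.htm, the helper yields res://C:\Tools\xdrmdivw.exe/about.htm.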
Known Limitations
I've striven to keep the code as clean as possible, but the overall functionality could do with some polishing up, such as:
- File Open Dialog multiselection
- Better looking and properly updated navigation buttons
- Proposing more stylesheets (let me know if you have one !)
If you have suggestions or comments, I will be more than happy to read them and collaborate with you to make this program more useful.
Downloads
Download demo - 21 Kb (extract BOTH files in the same directory)
Download source - 61 Kb