As organizations move to integrate information from distributed applications, they face many challenges in extracting data held in proprietary environments. This article defines a general framework that simplifies the communication and transformation of data between applications.
The various approaches to Enterprise Application Integration (EAI) have evolved from defining interfaces to utilizing a variety of middleware technologies such as Message Brokers, J2EE Application Servers, COM, and CORBA. However, these technologies only partially address the challenges that organizations are facing today.
EAI projects differ from most projects that organizations have undertaken in the past. They do not introduce change that is isolated to an individual application or business area. Instead, EAI projects force change upon many application and business areas and require a coordinated approach among groups in an enterprise that used to handle their application and infrastructure needs largely independently.
XML as an Interface
Disparate distributed applications need a common format through which to communicate with one another. Because XML is not specific to any proprietary platform, it fits this role well. XML is not tied to a limited set of tags defined by proprietary vendors; its tags describe the data they enclose, so a document carries its own metadata. This allows each industry to develop its own tag sets to meet its unique needs. The following is sample XML data for customer details:
<customer cust_id="3790">
  <cust_name>Steve</cust_name>
  <address>
    <block>7432 Silver</block>
    <city>Columbia</city>
    <zip>89131</zip>
    <phone>2345678</phone>
    <mobile>4320659</mobile>
    <date>01/06/1999</date>
  </address>
</customer>
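As a rough illustration of how an application might consume the customer record above, the following Python sketch reads it with the standard library's `xml.etree.ElementTree` module. The function name `read_customer` and the flat dictionary layout are assumptions made for this example, not part of the article's framework.

```python
import xml.etree.ElementTree as ET

# The customer record from the article, kept as a string for the example.
CUSTOMER_XML = """
<customer cust_id="3790">
  <cust_name>Steve</cust_name>
  <address>
    <block>7432 Silver</block>
    <city>Columbia</city>
    <zip>89131</zip>
    <phone>2345678</phone>
    <mobile>4320659</mobile>
    <date>01/06/1999</date>
  </address>
</customer>
"""


def read_customer(xml_text):
    """Return the customer record as a flat dictionary (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    record = {
        "cust_id": root.get("cust_id"),          # attribute on <customer>
        "cust_name": root.findtext("cust_name"),  # text of a direct child
    }
    # Every child of <address> becomes one key/value pair.
    for child in root.find("address"):
        record[child.tag] = child.text
    return record


customer = read_customer(CUSTOMER_XML)
print(customer["cust_name"], customer["city"], customer["zip"])
# prints: Steve Columbia 89131
```

Because the tag names travel with the data, the reader needs no knowledge of column positions or record layouts in the originating system.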
XML supports both forward and backward data compatibility. It is good at describing hierarchical data in a standard way. The data can be specified separately from its structure, and the structure can be defined by using either a DTD or an XML schema.
A DTD lists the elements that may appear in an XML document and the relationships among them. An XML schema defines the contents and semantics in addition to the element relationships. The following DTD fragment declares the customer and address elements from the customer data above:
<!ELEMENT customer (cust_name, address)>
<!ATTLIST customer cust_id CDATA #REQUIRED>
<!ELEMENT address (block, city, zip, phone+, fax*, mobile?)>
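To make the occurrence indicators in the address declaration concrete (`+` for one or more, `*` for zero or more, `?` for optional), here is a minimal Python sketch that restates that content model as a regular expression over child tag names. It is not a validating parser (`xml.etree.ElementTree` does not validate against DTDs); the names `ADDRESS_MODEL` and `address_is_valid` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content model of <address> as printed in the DTD:
#   (block, city, zip, phone+, fax*, mobile?)
# Each child tag is written as "name " so the sequence can be matched as text.
ADDRESS_MODEL = re.compile(r"block city zip (?:phone )+(?:fax )*(?:mobile )?")


def address_is_valid(address_xml):
    """Check the order and count of the children of one <address> element."""
    address = ET.fromstring(address_xml)
    child_sequence = "".join(child.tag + " " for child in address)
    return ADDRESS_MODEL.fullmatch(child_sequence) is not None


ok = address_is_valid(
    "<address><block>7432 Silver</block><city>Columbia</city>"
    "<zip>89131</zip><phone>2345678</phone><mobile>4320659</mobile></address>"
)
no_phone = address_is_valid(
    "<address><block>7432 Silver</block><city>Columbia</city>"
    "<zip>89131</zip></address>"
)
print(ok, no_phone)
# prints: True False
```

Under this content model, an address must list block, city, and zip in that order and carry at least one phone. A child that the declaration does not name, such as the `<date>` element in the sample record, would be rejected by the same check.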
XSL handles the transformation of data between different applications. It is a language that can transform, filter, and sort XML data, define parts of an XML document, format XML data based on data values, and output XML data to different media such as screens, paper, or voice. The following figure best represents the relationship between XML and XSL:
Enterprise Application Integration Using XML
Enterprise portals give users a single point of access to a variety of topics aggregated from within the enterprise, as well as from suppliers, trading partners, and the Web. This information is held in structured data stores. Because XML is extensible and allows new mark-up languages to be defined as needed, it is well suited to representing this information.
Organizations that are interested in aggregating disparate data using XML are faced with purchasing or developing custom "connector" applications that convert legacy formats to XML. These applications must perform three functions: extraction through proprietary interfaces, transformation into XML, and packaging for transmission.
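The three connector functions named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pipe-delimited legacy line, the field list, the `<message>` envelope element, and the function names `extract`, `transform`, and `package` are all invented for the example and do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Hypothetical legacy export: one pipe-delimited line per customer.
LEGACY_FIELDS = ["cust_id", "cust_name", "block", "city", "zip", "phone"]


def extract(legacy_line):
    """Step 1: pull the raw values out of the proprietary format."""
    values = legacy_line.rstrip("\n").split("|")
    if len(values) != len(LEGACY_FIELDS):
        raise ValueError("unexpected number of fields in legacy record")
    return dict(zip(LEGACY_FIELDS, values))


def transform(record):
    """Step 2: express the record as a <customer> element."""
    customer = ET.Element("customer", {"cust_id": record["cust_id"]})
    ET.SubElement(customer, "cust_name").text = record["cust_name"]
    address = ET.SubElement(customer, "address")
    for field in ("block", "city", "zip", "phone"):
        ET.SubElement(address, field).text = record[field]
    return customer


def package(customer, source):
    """Step 3: wrap the element in an envelope and serialize it to bytes."""
    envelope = ET.Element("message", {"source": source})  # hypothetical envelope
    envelope.append(customer)
    return ET.tostring(envelope, encoding="utf-8")


payload = package(
    transform(extract("3790|Steve|7432 Silver|Columbia|89131|2345678")),
    "LegacyDB",
)
print(payload.decode("utf-8"))
```

Keeping the three steps as separate functions means that only `extract` has to change when a connector is written for a different legacy source.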
In addition to intelligently marking up the content extracted from backend systems, organizations can utilize XML messaging to transport content between an application and a portal server. In this case, the XML message will contain metadata along with the content. From the metadata, a portal server can index the content, and give users the ability to retrieve the native data or document for viewing through the portal.
XML messaging is also important for transporting data back to the originating applications. This article presents one proposal for leveraging XML to facilitate the exchange of information between applications or trading partners. The XML schema, which is created based on the BizTalk framework, can provide an intelligent transport mechanism for routing information between the portal server and all the connected applications.
The XML messages can contain BizTalk routing details, portal instructions, and the actual content. The BizTalk routing information identifies the applications that send and receive the message. The XML schema defines the structure of the message, which may contain instructions for the portal. This structure contains a set of actions that can be performed in the portal, metadata about the content, and the content itself.
The XML schema also provides definitions for saving, indexing, updating, and deleting documents, as well as functions for updating metadata related to a specific document. The metadata will be used later to search for this content inside of the portal.
The sample message given below shows a record from a database saved as an XML file with the XML structure stored in the <content> tag. With XML data, the portal can take advantage of the mark-up to enable structured searches using XQL.
The portal user can build a more detailed search by using tag and value pairs to find the data the user needs. An additional attribute of the <save> tag, called mode, allows different relationships to be defined between the portal and the external database.
The modes have been implemented to allow one-way or bi-directional channels to be established and maintained between the portal and the external data source. Thus, data can persist in the portal or an interactive relationship can be established, allowing the portal to update the external data source directly.
<?xml version="1.0"?>
<biztalk_1 xmlns="urn:biztalk-org:biztalk:biztalk_1">
  <header> BizTalk Routing Details </header>
  <body>
    <packet xmlns="http://schemas.biztalk.org/sequoiasoftware_com/myaxudtv.xml">
      <save id="20" datasource="LegacyDB" doctype="Customer_Data" filetype="txt" mode="1">
        <key name="cust_id">12300</key>
        <indexfield name="cust_name">Steve</indexfield>
        <content encoded="no">
          <customer cust_id="12300">
            <cust_name>Steve</cust_name>
            <address>
              <block>7432 Silver</block>
              <city>Columbia</city>
              <zip>21045</zip>
              <phone>2345678</phone>
              <mobile>4320659</mobile>
              <date>01/06/1999</date>
            </address>
          </customer>
        </content>
      </save>
    </packet>
  </body>
</biztalk_1>
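To show how a portal might act on such a message, here is a minimal Python sketch that reads the save instruction and then runs a tag and value search over the stored content. It works on a well-formed, shortened copy of the sample message. The prefixes `bt` and `p`, the function names, and the dictionary layout are assumptions for illustration, and the plain tag/value match stands in for XQL rather than implementing it.

```python
import xml.etree.ElementTree as ET

# Prefixes chosen for this example; the URIs come from the sample message.
NS = {
    "bt": "urn:biztalk-org:biztalk:biztalk_1",
    "p": "http://schemas.biztalk.org/sequoiasoftware_com/myaxudtv.xml",
}

MESSAGE = """<?xml version="1.0"?>
<biztalk_1 xmlns="urn:biztalk-org:biztalk:biztalk_1">
  <header>BizTalk Routing Details</header>
  <body>
    <packet xmlns="http://schemas.biztalk.org/sequoiasoftware_com/myaxudtv.xml">
      <save id="20" datasource="LegacyDB" doctype="Customer_Data" filetype="txt" mode="1">
        <key name="cust_id">12300</key>
        <indexfield name="cust_name">Steve</indexfield>
        <content encoded="no">
          <customer cust_id="12300">
            <cust_name>Steve</cust_name>
            <address><city>Columbia</city><zip>21045</zip></address>
          </customer>
        </content>
      </save>
    </packet>
  </body>
</biztalk_1>"""


def read_save_instruction(message_text):
    """Collect the attributes and metadata of the <save> action."""
    root = ET.fromstring(message_text)
    save = root.find("bt:body/p:packet/p:save", NS)
    key = save.find("p:key", NS)
    return {
        "datasource": save.get("datasource"),
        "doctype": save.get("doctype"),
        "mode": save.get("mode"),
        "key": {key.get("name"): key.text},
        "index": {f.get("name"): f.text for f in save.findall("p:indexfield", NS)},
        "content": save.find("p:content", NS),
    }


def search(content, tag, value):
    """Return the customers that contain <tag> with exactly this text value."""
    qualified = "{%s}%s" % (NS["p"], tag)
    return [
        customer
        for customer in content.iter("{%s}customer" % NS["p"])
        if any(element.text == value for element in customer.iter(qualified))
    ]


instruction = read_save_instruction(MESSAGE)
hits = search(instruction["content"], "city", "Columbia")
print(instruction["index"], instruction["mode"], len(hits))
# prints: {'cust_name': 'Steve'} 1 1
```

Note that every element inside `<packet>` inherits the packet's default namespace, which is why the search qualifies both `customer` and the searched tag with that URI.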
XML can be leveraged in many areas of enterprise application development. It can be used for data aggregation and management tasks such as indexing and metadata handling. It can also be used to exchange data among distributed applications. XML schema provides the elements needed to describe any kind of structured data. XML can be used, when desired, to establish and maintain dynamic relationships between the Web server and external data sources. XML messaging provides an effective method of enterprise integration for exchanging data in XML formats. It also gives businesses a way to exchange information in support of business-to-business e-commerce.