As I travel, whether instructing or consulting, the topic of .NET versus Java continues to fuel heated debates: Which framework is better? Which framework is optimal? Which framework will dominate the corporate enterprise?
Well, I'm sure we all have our opinions—and quite valid ones at that—but as I see it, the truth of the matter isn't that .NET is superior to Java, or that Java is superior to .NET. We actually have two fantastic platforms on which we can develop enterprise solutions that drive corporate revenue (or so we hope) and serve to promote the business-computing environment for which we develop software. Pick your platform. They're both awesome and should serve you well.
And in fact, many corporate environments today truly are heterogeneous enterprise-computing environments. Accounting may be running a Java-based business layer that talks to an Oracle database that feeds data to a thick-client application. Human Relations may be using an all-Microsoft solution where the user interface is ASP- or ASP.NET-based with .NET or COM mid-tier business components feeding data to and retrieving data from a SQL Server database. And the marketing department, well, they're quite the interesting bunch. They're using Java beans for the mid-tier, which feed business information to an ASP-based presentation layer.
As I see it, Java and .NET are tools—powerful tools, but tools nonetheless—that we use to build our computational infrastructure in support of our businesses' revenue streams. And as with any tool, some are optimized for certain tasks whereas others are optimized for other tasks. .NET and Java definitely work in the same task space, but you may choose one or the other for some specific reason or set of reasons that are germane to the problem or problems you're currently trying to solve.
So, instead of arguing which is better, this article instead explores some ways in which the two can work together. Let's face it, both Java and .NET are here to stay and both have valid claims for your enterprise computing software work.
When I say work together, I'm specifically talking about interoperability. At its lowest level, interoperability defines how we share data. The typical problem we face, though, is that "data" for one system doesn't map directly into "data" in another. For example, in the Human Relations system I mentioned previously, the .NET CLR does not store the data structures that represent employee data in memory (as binary information) the same way Java would. In fact, those same data structures could be stored differently even between two Windows Server systems, if the servers are running different processor architectures (like a Pentium versus a MIPS processor or an Alpha processor). So, you can't just take the binary information directly from the address space of one system and shove it into another system if the two systems differ significantly. (Even if they don't differ, you typically don't transfer data using literal memory copies.)
Instead, you typically convert the data as it leaves one system into a form that the other system can consume. This intermediate format might not be (and often isn't) the native format of the second system. However, it will be a form the second system can convert into its native format, such as XML or an agreed-upon binary protocol. If it weren't, you couldn't transfer the data in the first place.
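To make the point about differing binary representations concrete, here is a minimal Java sketch. It lays out the same 32-bit integer in big-endian and little-endian byte order, then shows what happens when the receiver guesses the wrong order. The class name, method names, and the sample value are hypothetical and chosen only for illustration; byte order is just one of several ways in-memory layouts can differ.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class ByteOrderSketch {
    // Lay out a 32-bit integer as raw bytes in the given byte order.
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    // Read a 32-bit integer back, interpreting the bytes in the given byte order.
    static int fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int employeeId = 0x01020304; // hypothetical sample value

        byte[] big = toBytes(employeeId, ByteOrder.BIG_ENDIAN);
        byte[] little = toBytes(employeeId, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.toString(big));    // [1, 2, 3, 4]
        System.out.println(Arrays.toString(little)); // [4, 3, 2, 1]

        // Copying raw memory between systems that disagree on byte order
        // silently yields a different number.
        int misread = fromBytes(little, ByteOrder.BIG_ENDIAN);
        System.out.println(Integer.toHexString(misread)); // 4030201

        // When both sides agree on one wire order, the value survives the trip.
        System.out.println(fromBytes(big, ByteOrder.BIG_ENDIAN) == employeeId); // true
    }
}
```

Agreeing on a single wire format, whether textual or binary, is what removes the guesswork on the receiving side.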
Now, just by reading the title of this article you might think I'm talking about .NET objects accessing Java objects on the same computer. While I certainly can't speak for every business solution out there, I do feel confident in saying that the vast majority of systems in play today probably aren't looking to interoperate at the binary level (akin to Microsoft's JDirect from years past, where Java would work directly with external dynamic link libraries). Instead, my travels and experience tell me that people are interested in having systems running Java "talk" with systems running .NET so that they can exploit the features inherent in each platform yet derive benefits where the sum is potentially greater than the single parts alone.
Because we're talking about different computers executing different runtime environments, we're really talking about interoperability between distributed systems. Typically, these are n-tier systems that involve several layers of processing, from database to presentation layer. Figure 1 shows a typical architecture.
Figure 1. Typical n-tier system
The question of interoperability comes into play when one or more of the tiers in the distributed application is built using differing technologies (.NET versus Java). How can you exploit the strengths of each platform while at the same time giving the consumers the illusion that they're dealing with a single system?
The trick is to decide where each platform is strong and not so strong, by your definition given the solution you're trying to piece together. In this case, I'll limit this discussion to the Web server shown in Figure 1. You can use SQL Server or Oracle for your database at any time, and the user can choose which browser will display the resulting information from your system. The Web server might use ASP.NET for the presentation subsystem, yet use J2EE for the business logic and data-access components. Or, you might use Java Struts to construct the user experience, yet write your transactional logic using .NET and C#. In both cases, you'd need to marry business logic components to the presentation layer.
This leads to the interoperability model, by which I mean the way you communicate data between layers. I see three primary models that should work well in today's enterprise environment:
- Web services
- .NET remoting
- Shared database
The assumption with these models is that the Microsoft-based components are running on a Windows platform, whereas the Java components are running on a Unix platform such as Linux. No matter what you do, in this situation you'll have to make off-system calls to integrate your subsystems. You could argue that CORBA or even DCOM could perform this integration, but both CORBA and DCOM suffer from the same troubling issue—both are partially or fully proprietary. Therefore, neither allows for seamless integration. (You could argue that .NET remoting also is proprietary, but I'll get to that in a moment.)
The most obvious and probably most common integration technique is to use the XML Web service as the data bridge between Java and .NET. This is, in fact, a primary goal of the XML Web service—to enable the integration of heterogeneous platforms. The data is transformed into a standard format (XML in SOAP form in this case), and each side both generates and consumes the SOAP XML packet.
XML Web services are exploding in popularity, and as a result, every major platform has support for them. .NET was designed and built with Web services in mind, and the Unix platform offers several choices of vendor, including IBM and BEA, as well as open-source solutions such as Apache and Tomcat.
Instead of rolling data into a proprietary format, like DCOM's object format, data is transformed from its binary representation in the source system into XML using XSD-compliant data types ("XSD" in this case refers to "XML Schema Part 2: Datatypes," found at http://www.w3.org/TR/xmlschema-2/). XSD is capable of describing a great variety of primitive datatypes, such as integers, strings, and floating-point numbers. But XML in general can contain more complex data structures, as described in the SOAP specification (http://www.w3.org/TR/2000/NOTE-SOAP-20000508/ for SOAP version 1.1, and for SOAP 1.2 collectively, http://www.w3.org/TR/soap/). You can format your structures according to the SOAP specifications, or you can use a more free-form, message-based approach if you "describe" the resulting XML structure using the Web Services Description Language (WSDL, version 1.1, found at http://www.w3.org/TR/wsdl). This allows your Web service consumers to interpret the incoming XML message and react accordingly. Figure 2 shows a slightly revised business logic diagram.
Figure 2. Business logic using Web Service integration
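As a rough illustration of the SOAP encoding just described, the following Java sketch builds a SOAP 1.1 envelope that carries a single integer tagged with an XSD type, then reads the value back using only the XML parser that ships with the JDK. The application namespace (urn:example:hr) and the element names GetEmployee and employeeId are assumptions made up for this example. A real service would normally rely on a Web service toolkit and a WSDL description rather than hand-built strings.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class SoapEnvelopeSketch {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"; // SOAP 1.1
    static final String APP_NS = "urn:example:hr"; // hypothetical application namespace

    // Sender side: turn an in-memory int into XML text with an XSD type hint.
    static String buildEnvelope(int employeeId) {
        return "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\""
             + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
             + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
             + "<soap:Body>"
             + "<hr:GetEmployee xmlns:hr=\"" + APP_NS + "\">"
             + "<hr:employeeId xsi:type=\"xsd:int\">" + employeeId + "</hr:employeeId>"
             + "</hr:GetEmployee>"
             + "</soap:Body>"
             + "</soap:Envelope>";
    }

    // Receiver side: parse the XML and convert the text back into a native int.
    static int readEmployeeId(String envelope) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(envelope.getBytes(StandardCharsets.UTF_8)));
            Element id = (Element) doc.getElementsByTagNameNS(APP_NS, "employeeId").item(0);
            return Integer.parseInt(id.getTextContent().trim());
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable envelope", e);
        }
    }

    public static void main(String[] args) {
        String envelope = buildEnvelope(42);
        System.out.println(envelope);
        System.out.println(readEmployeeId(envelope)); // 42
    }
}
```

Neither side needs to know how the other stores an integer in memory; both only need to agree on the XML text.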
A primary benefit of using the XML Web service for integration is that the Web service is readily available to any and all applications that might make a call to one of its Web methods. The Web service isn't tied to any particular application and it's ready for any application to use, allowing for solid reusability.
The primary drawback of the XML Web service is that converting data to a common format typically takes more time than an equivalent binary conversion, which adds some processing latency to the off-system call. Many applications will find this latency tolerable when weighed against the benefits of Internet-standard protocols and simplified administration. (Web services are simply Web applications themselves, easily administered via the Web server software in use.) Other applications may find this latency intolerable, such as systems that make many such interoperable calls when processing a single user request, or systems that must return information within a more narrowly constrained timeline.
When all you have is a hammer, all the world looks like a nail. And given the popularity of XML Web services, as well as how simple it is to generate and consume one, it's easy to overuse the technology. The truth is, though, that XML Web services are but one tool you can use. They have a well-defined place in enterprise computing.
One place where they won't necessarily fit is where you have large amounts of data to transmit or have hard timing requirements for data processing. Data typically undergoes a significant size increase when moving from a binary representation into XML. A 32-bit integer, for example, consumes 4 bytes in your computer's memory but could consume anywhere from one to many bytes when converted into XML. For example, "-219857209" would require 10 bytes if encoded as UTF-8 (one byte per character for this text) and a whopping 20 bytes if encoded as UTF-16 (two bytes per character). This doesn't even count the XML tags!
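The arithmetic is easy to check. The short Java sketch below measures the same integer as text in both encodings, and then again wrapped in a hypothetical `<value>` element to show what the tags add. The class name, method names, and tag name are illustrative only.

```java
import java.nio.charset.StandardCharsets;

final class XmlSizeSketch {
    // Size of the text in UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the text in UTF-16 (little-endian, no byte-order mark).
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        int value = -219857209;                 // 4 bytes in memory (Integer.BYTES)
        String text = Integer.toString(value);  // 10 characters, including the sign

        System.out.println(Integer.BYTES);      // 4
        System.out.println(utf8Bytes(text));    // 10
        System.out.println(utf16Bytes(text));   // 20

        // Wrapping the text in even a short, hypothetical element adds 15 more bytes.
        String element = "<value>" + text + "</value>";
        System.out.println(utf8Bytes(element)); // 25
    }
}
```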
If you combine this five-fold increase in size with another increase of roughly one-third for Base64 MIME encoding (every 3 bytes become 4 characters), necessary when transmitting binary data using XML, you can see that sending SOAP XML responses of any complexity can result in very large XML files, which the remote server must process. If the server processes the SOAP XML using the XML Document Object Model (DOM), the size of the data in the server's memory commonly increases by an additional five-fold to allow for quick tree-based data access as well as XPath queries. (XPath is a query language for retrieving XML data from a DOM-based XML infoset, http://www.w3.org/TR/xpath.)
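The Base64 overhead can be measured the same way. This sketch uses the JDK's basic Base64 encoder, which maps every 3 input bytes to 4 output characters and pads the final group, so the growth is about one-third, in line with the rough figure above. The MIME variant also inserts line breaks, which adds slightly more. The class and method names are hypothetical.

```java
import java.util.Base64;

final class Base64OverheadSketch {
    // Number of Base64 characters needed to carry rawLength bytes of binary data.
    static int encodedLength(int rawLength) {
        return Base64.getEncoder().encode(new byte[rawLength]).length;
    }

    public static void main(String[] args) {
        // A single 32-bit integer: 4 bytes become 8 characters once padded.
        System.out.println(encodedLength(4));    // 8

        // A larger payload shows the steady-state ratio of 4 characters per 3 bytes.
        System.out.println(encodedLength(3000)); // 4000
    }
}
```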
The bottom line is that data sent via an XML Web service can grow very quickly, which may be detrimental to your enterprise application's processing capability. It just depends upon the given situation.
Where the situation calls for large datasets to be issued to and from the Web service, or where speed is a concern, .NET remoting might be helpful. Why use .NET remoting when joining .NET and Java? Because the specification that governs .NET remoting was released to ECMA as part of the CLI specification (http://www.ecma-international.org/publications/standards/Ecma-335.htm), resulting in several Java remoting bridge products, including JNBridgePro (http://www.jnbridge.com/jnbpropg.htm) and JA.NET (http://ja.net.intrinsyc.com/). On the surface, a figure representing this situation might not appear that different from the Web service case (as shown in Figure 3).
Figure 3. Business logic using .NET remoting integration
However, the mechanics of the communication process differ significantly. For one thing, you don't have to process information through a Web server (although you can if you want). Instead, you can establish a direct socket connection using a binary formatting option for speed and efficiency. You won't be converting information into and out of XML, conserving processing time and memory, and you have a direct socket connection to and from the server into which to shove data very quickly.
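To give a feel for the difference, here is a generic Java sketch of sending an integer as four raw bytes over a direct TCP socket, with no XML involved. To be clear about the assumptions: this is not the .NET remoting wire format, and it does not use any bridge product's API. It only illustrates the general idea of a direct socket connection with binary formatting. The class and method names are hypothetical, and both ends run in one process over the loopback interface purely to keep the example self-contained.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class BinarySocketSketch {
    // Sends one int across a loopback TCP connection as 4 big-endian bytes
    // and returns the value the receiving end reads.
    static int roundTrip(int value) {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sender = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket receiver = server.accept();
             DataOutputStream out = new DataOutputStream(sender.getOutputStream());
             DataInputStream in = new DataInputStream(receiver.getInputStream())) {
            out.writeInt(value); // exactly 4 bytes on the wire, no tags, no text
            out.flush();
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(-219857209)); // -219857209
    }
}
```

The same value that needed 10 to 20 bytes as XML text travels here in 4, at the cost of both sides having to agree on the binary layout in advance.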
The database integration technique isn't really new, but it is effective for many applications. Simply put, .NET and Java can interoperate at the database level where they process shared data. To the user, the application appears seamless, but in reality, the user is redirected between pages generated by either JSP/Struts or ASP.NET, depending upon the processing required (see Figure 4).
Figure 4. Shared database connection
This architecture is commonly used when integrating existing applications where new business logic can be rolled into an existing application structure. Note that the database in Figure 4 could also be replaced with message-queuing technology, such as MQ Series or MSMQ.
Today, there are a greater number of options available for integrating Java/J2EE and .NET into a single enterprise application/environment. While the goal should be to share information between these platforms, a greater goal should be to analyze the business need and apply the proper tool to the situation, whether Java/J2EE or .NET. The good news is that current technologies offer options, and the possibilities for interesting and effective systems are endless.
Some other material you may find interesting can be found at these locations:
Keith Organ's article, "Java/.NET Interoperability with the Microsoft.com Web Service" at http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnsvcinter/html/javanetmscom.asp
Simon Guest's book Microsoft .NET and J2EE Interoperability Toolkit, ISBN 0735619220
Microsoft's "Application Interoperability, Microsoft .NET and Java J2EE" (.pdf form) downloadable from http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/jdni.asp
DevX.com's Special Report: "Java/.NET Interop: Bridging Muddled Waters" http://www.devx.com/interop/Door/18896
Kenn Scribner is with Wintellect (http://www.wintellect.com/).