How to Link Different Data Sources Together

Introduction

In today's enterprises, different types of data are sitting in various data stores. Relational data is most likely in a SQL Server or Oracle database. Many enterprise applications have their own database with its own data and data model. User and account information resides in a Directory like Active Directory or Active Directory Application Mode. Smaller and more fluent and mobile data sits in Excel spreadsheets or Microsoft Access databases. Each enterprise has also large amounts of files like Word Documents, Excel Spreadsheets, HTML documents, and so forth that are sitting on a file system. It's common practice to use Microsoft Indexing Server to index all files sitting on the file system or a Web server. Microsoft Indexing Server creates an indexing catalog and provides also a SQL query interface that enables you to simply search the index.

The challenge enterprises are facing now is how to provide a complete and easy view into all this data sitting in various stores. Moreover, how can you search all this data in various data sources without having to create many different interfaces or create complex search mechanisms? The good news is that most of these data stores provide a SQL query language. The SQL language various slightly from data source to data source. For example, directories such as Active Directory provide a query capability but do not allow you to update, insert, or delete data through the SQL language. The same holds true for the SQL language of Microsoft Indexing Server. But, most data sources provide the basic SQL query capability. Moreover, Microsoft SQL Server allows you to link to other data stores that provide a SQL query language. This then allows you to create a single SQL query spawning multiple data sources. This article explains how you can achieve this.

Linking Other Data Stores to Microsoft SQL Server

Microsoft SQL Server provides the capability to link to other data sources that also provide a SQL query language. The Enterprise Manager of SQL Server shows in the left side navigation pane the "SQL Server groups" and under each all the database servers. It shows by default the local database server as "(local)". You can link other data stores to individual database servers. Expand a database server with the plus sign and you will find an entry called "Security." Expand the "Security" entry with the plus sign and you see an entry called "Linked Servers." Expand the "Linked Servers" entry to see all the data stores that have been linked to this database server.

To link a new data store, right-click on the "Linked Servers" entry and select "New Linked Server" from the popup menu. In the "Linked Server Properties" dialog, you enter the name for this linked server and then select whether you link another SQL Server database or any other data source. If you choose "SQL Server," the linked server name must be the name of the SQL Server database you want to link to. For example, if you have another SQL Server database running on the machine Enterprise-Minds,you would enter "Enterprise-Minds" as the linked server name. If you choose "Other data source," you need to select from a list of existing data source providers, for example "Microsoft Jet 4.0 OLE Provider." Based on which provider you choose, you need to enter the product name, data source, provider string, location, and catalog. When done, click Ok to create the link to this new data store. This new data source will now show up under "Linked Servers" with whatever name you entered.

Link Microsoft SQL Server to a Directory

You can link SQL Server to a directory such as Active Directory or Active Directory Application Mode. Create a linked server in Enterprise Manager and give it a descriptive name like "Active Directory Link." Choose the "Other data source" option and select the data provider "OLE DB Provider for Microsoft Directory Services." As a product name, enter "Active Directory Services" and as the data source, enter "adsdatasource." When you expand the newly linked server, you find two available items: Tables and Views. SQL Server allows you to view any tables and views in this data source. But, the data provider "OLE DB Provider for Microsoft Directory Services" provides neither tables nor views. Therefore, you will get the "Error 7301: Could not obtain a required interface from OLEDB provider ADsDSOObject" error. You can safely ignore the error message. You will be able to query the data store through the SQL query language.

It is important that you run the services "MSSQLSERVER" under an account that has access to the directory. If you run the services under the local system account, you will get the following error when you try to query the data store:

Server: Msg 7321, Level 16, State 2, Line 1
An error occurred while preparing a query for execution against
   OLE DB provider 'ADsDSOObject'.
OLE DB error trace [OLE/DB Provider 'ADsDSOObject'
   CommandPrepare::Prepare returned 0x80040e14].

If the directory runs on your local machine and the local user "DirectoryUser" has access to it, you need to run the "MSSQLSERVER" service under this account. If the directory is running somewhere in your domain and the domain user "DomainUser" has access to it, you need to run the "MSSQLSERVER" service under this domain account.

Link Microsoft SQL Server to a Microsoft Indexing Server catalog

You also can link SQL Server to an Indexing Server catalog. Create a new linked server and give it again a descriptive name—for example, WEBCATALOG if you link to the Web Indexing Server catalog. Next, select "Other data source" and choose "Microsoft OLE DB Provider for Indexing Service" as a data provider. Enter "Index Server" as the product name and, as the data source, the name of the Indexing Server catalog; for example, Web. When you expand the newly linked server, you find two available items: Tables and Views. SQL Server allows you to view any tables and views in this data source. But, the data provider "Microsoft OLE DB Provider for Indexing Service" provides neither tables nor views. Therefore, you will get the "Error 7301: Could not obtain a required interface from OLEDB provider MSIDXS" error. You can safely ignore the error message. You will be able to query the data store through the SQL query language.

Link Microsoft SQL Server to a Microsoft Access Database

SQL server also can be linked to a Microsoft Access database. Create a new linked server and, again, give it a descriptive name; for example, "Booklist." Next, select "Other data source" and choose "Microsoft Jet 4.0 OLE DB Provider" as the data provider. Enter "Access" as the product name and, as the data source, the path to the Microsoft Access file; for example, "c:\My files\Books.mdb." This data provider is able to show the list of tables and views. When you expand the Tables or Views item under this linked server, you will see the list of tables or views in the Microsoft Access database.

Link Microsoft SQL Server to a Microsoft Excel Spreadsheet

SQL server also can be linked to a Microsoft Excel spreadsheet. Create a new linked server and give it a descriptive name; for example, "Booklist." Next, select "Other data source" and choose "Microsoft Jet 4.0 OLE DB Provider" as the data provider. Enter "Jet 4.0" as the product name, as the data source, the path to the Microsoft Excel spreadsheet—for example "c:\My files\Books.xls"—and finally as the provider string enter "Excel 5.0." This data provider shows the list of excel worksheets under Tables. It does not show any Views.

There are many other providers available that allow you to link up SQL Server with a variety of different data sources. This link lists some of the available data providers that can be used by SQL server.

How to Query Linked Servers

Now, you have learned how to link SQL Server with a variety of different data stores. To query data from a linked server, you use the OPENQUERY command in your FROM clause, meaning you query data from a linked server instead of a standard SQL Server table or view. The OPENQUERY command requires two parameters: the name of the linked server followed by the query you want to execute against this linked server. The following example assumes you have a linked server with the name Books which itself has a table called Books:

SELECT * FROM OPENQUERY(Books, 'SELECT * FROM Books') AS Books

The query you pass along in the OPENQUERY command needs to be supported by the linked data provider. Depending on the data provider, it might vary slightly. You can join the data from any linked data source or SQL Server table together. The next example assumes that you query a linked data source and join it together with a SQL Server table called BookValue.

SELECT * FROM OPENQUERY(Books, 'SELECT * FROM Books')
         AS Books INNER JOIN
BookValue ON Books.ID = BookValue.ID

How to Link Different Data Sources Together

How to Query a Directory

The "OLE DB Provider for Microsoft Directory Services" accepts two different syntaxes. One is called the LDAP dialect and the other the SQL dialect. As its name suggests, the SQL dialect follows the SQL language syntax. You specify for the SELECT keyword the directory attributes you want to query for. For the ORDER BY and WHERE keywords, you specify the attributes to sort on and to filter by. For the FROM keyword, you specify which directory and directory container to query. The following example queries the container "OU=Enterprise-Minds,CN=Vancouver" in the Enterprise-Minds directory. It returns the common name and returns only directory objects of the type group.

SELECT * FROM OPENQUERY(ADAM,'SELECT cn
                        FROM ''LDAP://Enterprise-Minds/
                                      OU=Enterprise-Minds,
                        CN=Vancouver''
                        WHERE objectClass=''group'' ')
Note: The FROM clause needs to be put under two single quotes (escape sequencing of single quotes within a string that is already under single quotes). The following example queries the common name and ADS path for any directory object residing in the container "OU=Enterprise-Minds,CN=Vancouver" in the Enterprise-Minds directory.
SELECT * FROM OPENQUERY(ADAM,'SELECT cn, ADsPath
                        FROM ''LDAP://Enterprise-Minds/
                                      OU=Enterprise-Minds,
                        CN=Vancouver'' ')

The LDAP dialect consists of four parts, each separated by a semicolon. The first part specifies the directory and directory container, the second part the filter, the third part the list of attributes to return, and the last part the scope of the search. The scope part can have three values: Base, OneLevel, and SubTree. Base searches only the directory path you specify. OneLevel searches the immediate children of the directory path you specify. SubTree searches all descendants of the directory path you specify. The following example returns the same information as the first SQL dialect example—the list of all group objects:

SELECT * FROM OPENQUERY(ADAM,'<LDAP://Enterprise-Minds/
                                      OU=Enterprise-Minds,
                        CN=Vancouver>;
(objectClass=group);cn;subtree')

The next example returns the same information as the second SQL dialect example—the common name and ADS path of all directory objects:

SELECT * FROM OPENQUERY(ADAM,'<LDAP://Enterprise-Minds/
                                      OU=Enterprise-Minds,
                        CN=Vancouver>;;
cn,ADsPath;subtree')

You can find more information about the possible filter syntaxes in this article. The filter syntax applies for both dialects with the difference that values need to be put under single quotes for the SQL dialect. For example, you can see that the value group in the SQL dialect example is under single quotes (double single quotes for escape sequencing) whereas it is not for the LDAP dialect example.

How to Query Indexing Server Catalogs

The following example queries the file name, path, and virtual path for all files in the linked Web Indexing Server catalog.

SELECT * FROM OPENQUERY(WEBCATALOG, 'SELECT FileName, Path,
                        VPath FROM SCOPE()')

The next example queries for all files that contain the value "default" and reside under the "/Info" virtual directory and all its sub directories. It returns for all matches the file name, path, and virtual path.

SELECT * FROM OPENQUERY(WEBCATALOG, 'SELECT FileName, Path,
                        VPath FROM SCOPE(''DEEP
                                           TRAVERSAL OF("/Info")'')
         WHERE CONTAINS(FileName,''default'') ')
Note: The value in the SCOPE function is under two single quotes for escape sequencing whereas the value of the DEEP TRAVERSAL OF function is under double quotes according to the Indexing Server SQL syntax. As shown, you can create very complex queries.

How to Query Excel Spreadsheets

All worksheets present in a linked Excel spreadsheet are shown as a table. Note that all table names show a $ sign at the end. Also, if the worksheet name has spaces in it, you see the table name surrounded by single quotes that you always drop in your query. The table name in your query string needs to be surrounded by square brackets. The following example queries for all entries in the worksheet called Summary.

SELECT * FROM OPENQUERY(Books, 'SELECT * FROM [Summary$]')

It is important that the Excel spreadsheet is not open in Excel; otherwise, you will get the following error:

Server: Msg 7399, Level 16, State 1, Line 1
OLE DB provider 'Microsoft.Jet.OLEDB.4.0' reported an error.
   The provider did not give any information about the error.
OLE DB error trace [OLE/DB Provider 'Microsoft.Jet.OLEDB.4.0'
   IDBInitialize::Initialize returned 0x80004005:
   The provider did not give any information about the error.].

Summary

This article describes how you can link SQL Server with many different data sources. It describes step by step how to link SQL Server to a directory such as Active Directory Application Mode, a Microsoft Indexing Server catalog, a Microsoft Access database, and a Microsoft Excel spreadsheet. It also explains how you can query linked servers using the OPENQUERY command of SQL. Finally, it explains the query syntax for directory queries, Indexing Server catalog queries, and Excel spreadsheet queries. Queries against a Microsoft Access database use the same SQL syntax as you are used to from SQL Server.

Microsoft SQL server provides a powerful way to link up many different data stores and provide one common query interface. This makes it very easy for applications to provide a common query capability against most data stores in the enterprise. This capability is not widely known in the developer community. Taking advantage of it can simplify your application tremendously if you need to query many different data sources and provide one common query interface and search result. If you have comments on this article or this topic, please contact me @ klaus_salchner@hotmail.com. I want to hear if you learned something new. Contact me if you have questions about this topic or article.

About the Author

Klaus Salchner has worked for 14 years in the industry, nine years in Europe and another five years in North America. As a Senior Enterprise Architect with solid experience in enterprise software development, Klaus spends considerable time on performance, scalability, availability, maintainability, globalization/localization,and security. The projects he has been involved in are used by more than a million users in 50 countries on three continents.

Klaus calls Vancouver, British Columbia his home at the moment. His next big goal is running the New York marathon in 2006. Klaus is interested in guest speaking opportunities or as an author for .NET magazines or Web sites. He can be contacted at klaus_salchner@hotmail.com or http://www.enterprise-minds.com.

Enterprise application architecture and design consulting services are available. If you want to hear more about it, contact me! Involve me in your projects and I will make a difference for you. Contact me if you have an idea for an article or research project. Also, contact me if you want to co-author an article or join future research projects!



Comments

  • There are no comments yet. Be the first to comment!

Leave a Comment
  • Your email address will not be published. All fields are required.

Top White Papers and Webcasts

  • On-demand Event Event Date: December 18, 2014 The Internet of Things (IoT) incorporates physical devices into business processes using predictive analytics. While it relies heavily on existing Internet technologies, it differs by including physical devices, specialized protocols, physical analytics, and a unique partner network. To capture the real business value of IoT, the industry must move beyond customized projects to general patterns and platforms. Check out this webcast and join industry experts as …

  • On-demand Event Event Date: October 29, 2014 It's well understood how critical version control is for code. However, its importance to DevOps isn't always recognized. The 2014 DevOps Survey of Practice shows that one of the key predictors of DevOps success is putting all production environment artifacts into version control. In this webcast, Gene Kim discusses these survey findings and shares woeful tales of artifact management gone wrong! Gene also shares examples of how high-performing DevOps …

Most Popular Programming Stories

More for Developers

RSS Feeds