How to Integrate Google Searches into Your Application

Introduction

The first thing that comes to mind when you hear "Google" is its search engine. Google has turned the search business upside down within the last five years. Its founders started with an idea in 1995, and the search engine became widely known and used in 1998/99. Today, Google is the number one search engine. Like other organizations, Google is trying to establish itself as a platform rather than a solution: it provides the tools and infrastructure that other people need to build their own solutions on top of it. To that end, Google provides a Web service interface that allows you to integrate Google searches right into your application. You can find out more about the Google Web service API at http://www.google.ca/apis.

Getting Started with the Google API

From the URL above, you can download the developer's kit, which comes with a number of sample applications for different languages, such as .NET or Java. You also need a valid license key, which you pass along with every Web service call. To obtain one, visit http://www.google.ca/apis and select "Create Account" on the left-side navigation bar. Create an account by entering your e-mail address and a password. Google then sends a message to that address to verify that it exists; the message contains a link that activates the account. When done, click the continue link, which brings you back to the account creation page. At the bottom of the page you see a link, "sign in here." Follow the link and sign in to your account with your e-mail address and password. A page then confirms that a license key has been generated and sent to your e-mail address. Should you lose your license key, sign in again and Google will resend it to your e-mail address. The license key is free but limits you to 1,000 calls per day, which is more than enough to get started. If you need to make more than 1,000 calls per day, contact Google.
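Because the license key allows only 1,000 calls per day, an application may want to count its own calls and stop before it reaches the limit. The following is a minimal sketch of such a counter. The DailyCallBudget class is hypothetical and not part of the Google developer's kit. It also assumes that the limit resets at midnight of the local clock, which may not match how Google counts a day, and it is not thread-safe.

```csharp
using System;

// Hypothetical helper, not part of the Google developer's kit: counts the
// calls made per calendar day so the application can stop before it
// exceeds the daily limit of the license key.
public class DailyCallBudget
{
    private readonly int dailyLimit;
    private DateTime currentDay = DateTime.MinValue;
    private int callsToday;

    public DailyCallBudget(int dailyLimit)
    {
        this.dailyLimit = dailyLimit;
    }

    // Returns true and records the call if one more call fits into the
    // budget of the given day; returns false once the limit is reached.
    // Assumption: a "day" is the calendar date of the timestamp passed in.
    public bool TryConsume(DateTime now)
    {
        if (now.Date != currentDay)
        {
            // a new day started, so reset the counter
            currentDay = now.Date;
            callsToday = 0;
        }

        if (callsToday >= dailyLimit)
            return false;

        callsToday++;
        return true;
    }
}
```

You would create one instance with new DailyCallBudget(1000) and call TryConsume(DateTime.Now) before each Web service call, skipping the call when it returns false.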

Referencing the Google Web Service API in Your Project

Create your project in Visual Studio .NET and right-click the project in the "Solution Explorer" pane. In the popup menu, select "Add Web Reference" and enter the following WSDL URL: http://api.google.com/GoogleSearch.wsdl. Visual Studio checks that the WSDL exists, downloads it, and shows the Web methods available from this Web service in the dialog. Enter a name for the Web service reference under "Web reference name"; for example, Google, which is the name the code snippets in this article assume. When done, click "Add Reference" and you are ready to use the Google Web service API. The reference is shown in the "Solution Explorer" under "Web References." You can right-click the Web service reference to update it through the "Update Web Reference" menu item or to view it in the Object Browser through the "View in Object Browser" menu item. The Object Browser shows that four different types are available. The GoogleSearchService type exposes the actual Web service calls you can make. It has three different Web methods (plus the usual Begin/End methods if you want to call a Web method asynchronously).

GoogleSearchService.doSpellingSuggestion()

When you open Google in your browser and search for a word or phrase, you sometimes see "Did you mean: [suggested search term]" at the top of the search results page. Google spell-checks the search term you entered and shows an alternative spelling, which the user can simply click to search for the corrected term. The Google Web service provides a Web method that checks for alternate spellings of a search term as well. Here is a code snippet:

public static string SpellingSuggestion(string Phrase)
{
   // create the Web service proxy; "using" disposes it even if the
   // call throws an exception
   using (Google.GoogleSearchService GoogleService =
             new Google.GoogleSearchService())
   {
      // get the spelling suggestion (Key holds your Google license key)
      string Suggestion =
         GoogleService.doSpellingSuggestion(Key, Phrase);

      // null means Google has no suggestion, so keep the original phrase
      if (Suggestion == null)
         Suggestion = Phrase;

      return Suggestion;
   }
}

First, you create an instance of the GoogleSearchService class and then call the doSpellingSuggestion() Web method. The first argument is your Google license key and the second is the search term. The Web method returns the alternate spelling of the search term, or null if there is none, so the snippet returns either the alternate spelling or the original term. The using block calls Dispose() on the Web service object when it exits, even after an exception, which frees the underlying unmanaged resource.

GoogleSearchService.doGetCachedPage()

Google constantly crawls the Internet to keep its search index and directory up to date. Its crawler also caches the content on Google's servers and lets you obtain the cached page, which is the content as of the crawler's last visit to that resource. URLs can point to many different kinds of resources: most typically HTML pages, but also Word documents, PDF files, PowerPoint slides, and so forth. The cached page is always in HTML format, so Google converts any resource that is not HTML to HTML. Here is a code snippet:

public static void GetCachedPageAndSaveToFile(string PageUrl,
                                              string FileName)
{
   // create the Web service proxy; "using" disposes it even if a
   // call throws an exception
   using (Google.GoogleSearchService GoogleService =
             new Google.GoogleSearchService())
   {
      // get the cached page content (Key holds your Google license key)
      byte[] CachedPage = GoogleService.doGetCachedPage(Key, PageUrl);

      // create (or overwrite) the file and write the page content to
      // it; leaving the using blocks closes the writer and the stream
      using (FileStream FileWriter =
                new FileStream(FileName, FileMode.Create))
      using (BinaryWriter Writer = new BinaryWriter(FileWriter))
      {
         Writer.Write(CachedPage);
      }
   }
}

First, you again create an instance of the GoogleSearchService class and then call the doGetCachedPage() Web method, passing the Google license key plus the URL of the page you are looking for. The page travels base64-encoded in the SOAP response; the generated proxy class decodes it and returns a byte array that contains the HTML content of the cached page. Next, you create a FileStream to write the page to a local file. FileMode.Create tells it to create the file, overwriting any existing one. You then create a BinaryWriter that uses the FileStream as its output and write the byte array to it; the BinaryWriter passes the bytes to the FileStream, which writes them to the local file. When the using blocks exit, the BinaryWriter, the FileStream, and finally the Web service object are disposed, which closes the file and frees the underlying unmanaged resources.

GoogleSearchService.doGoogleSearch()

The doGoogleSearch() Web method allows you to perform searches. You pass along the search term plus filter criteria that restrict the results, for example, to a specific country, language, or topic. Here are the arguments you pass along to the Web method:

  • Key: The Google license key.
  • QueryTerm: The actual search term. This can be a single word, a phrase, or a list of words. To search for a phrase, put it in double quotes; otherwise, Google searches for the occurrence of all individual words. In a list of words, you can use the AND or OR operator; when no operator is used between the words, AND is assumed. You can also exclude a word or phrase by putting a minus sign in front of it. The Google reference at http://www.google.ca/apis/reference.html explains all query term capabilities.
  • Start: A zero-based index of the first result to be returned. This allows you to page through the result set. A single call never returns more than MaxResults results; therefore, you need to make multiple calls and set Start appropriately to get the next results. If you provide a user interface that allows the user to page through the complete result set, you set Start to the first result of each page. For example, the first call would set it to 0, the next to 10, followed by 20, and so on (assuming that MaxResults is set to 10).
  • MaxResults: The maximum number of results to be returned by the query. This can be a value between 1 and 10.
  • Filter: When set to true, it filters out duplicate or near-duplicate search results. Near-duplicate results are results with the same title and snippet (the snippet is the summary text shown for each search result). This also limits the number of search results coming from the same host, which is called host crowding. So, if a Web site has ten records matching the search term, only the first two are returned.
  • Restricts: Allows you to restrict the search to results from one or more countries or one or more topics. For example, you can restrict the search to content within the US by setting this value to "countryUS", or to content centered around Linux by setting this value to "linux". The Google reference at http://www.google.ca/apis/reference.html lists all the possible values.
  • SafeSearch: Filters out adult content when set to true.
  • LanguageRestrict: Allows you to restrict the search to one or more languages. The Google reference at http://www.google.ca/apis/reference.html lists all the possible values.
  • InputEncoding: This value is ignored. All requests should be encoded using UTF-8.
  • OutputEncoding: This value is ignored. All returned results are encoded using UTF-8.
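The rules for QueryTerm and Start can be sketched in code. The following helpers are a minimal illustration only; the SearchArguments class and its method names are hypothetical and not part of the Google API or the sample application. StartForPage() computes the zero-based Start value for a page, and BuildQueryTerm() puts a phrase in double quotes and prefixes each excluded word with a minus sign.

```csharp
using System;
using System.Text;

// Hypothetical helpers that prepare arguments for doGoogleSearch().
public static class SearchArguments
{
    // Start value for a zero-based page number: page 0 starts at result 0,
    // page 1 at MaxResults, page 2 at 2 * MaxResults, and so on.
    public static int StartForPage(int PageIndex, int MaxResults)
    {
        if (PageIndex < 0)
            throw new ArgumentOutOfRangeException("PageIndex");
        if (MaxResults < 1 || MaxResults > 10)
            throw new ArgumentOutOfRangeException("MaxResults");

        return PageIndex * MaxResults;
    }

    // Builds a query term: the phrase in double quotes, followed by the
    // words to exclude, each prefixed with a minus sign.
    public static string BuildQueryTerm(string Phrase,
                                        params string[] ExcludedWords)
    {
        StringBuilder Query = new StringBuilder();
        Query.Append('"').Append(Phrase).Append('"');

        foreach (string Word in ExcludedWords)
            Query.Append(" -").Append(Word);

        return Query.ToString();
    }
}
```

With MaxResults set to 10, StartForPage(2, 10) returns 20, the Start value for the third page. BuildQueryTerm("web service", "java") returns the phrase web service in double quotes followed by -java.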

This Web method allows you to perform simple or complex search queries against Google. It also allows you to filter the search result as well as page through the search result. Here is a code snippet:

public static XmlNode Search(string QueryTerm, int Start,
                             int MaxResults, bool Filter,
                             string Restricts, bool SafeSearch,
                             string LanguageRestrict,
                             string InputEncoding,
                             string OutputEncoding)
{
   Google.GoogleSearchResult SearchResult;

   // create the Web service proxy; "using" disposes it even if the
   // call throws an exception
   using (Google.GoogleSearchService GoogleService =
             new Google.GoogleSearchService())
   {
      // perform the search (Key holds your Google license key)
      SearchResult =
         GoogleService.doGoogleSearch(Key, QueryTerm, Start, MaxResults,
                                      Filter, Restricts, SafeSearch,
                                      LanguageRestrict, InputEncoding,
                                      OutputEncoding);
   }

   // return the result back as an XML document; CreateXmlDocument(),
   // StringValueOfObject(), AddChildElement(), and SearchResultXmlNode
   // are defined in the attached sample application
   XmlDocument ResultXml = CreateXmlDocument(SearchResultXmlNode);

   // add the search result
   StringValueOfObject(ResultXml.DocumentElement, SearchResult);

   // add the result elements and directory categories parent nodes
   XmlElement ResultElementsParentNode =
      AddChildElement(ResultXml.DocumentElement, "ResultElements");
   XmlElement CategoriesParentNode =
      AddChildElement(ResultXml.DocumentElement, "DirectoryCategories");

   // now add all result elements
   foreach (Google.ResultElement ResultElement in
            SearchResult.resultElements)
      StringValueOfObject(ResultElementsParentNode, ResultElement);

   // now add all directory categories
   foreach (Google.DirectoryCategory DirectoryCategory in
            SearchResult.directoryCategories)
      StringValueOfObject(CategoriesParentNode, DirectoryCategory);

   return ResultXml;
}

First, you create an instance of the GoogleSearchService class and call the doGoogleSearch() Web method, passing all the arguments described above. This performs the search and returns its result as an instance of the GoogleSearchResult class. The using block disposes the Web service object as soon as the call completes, which frees the underlying unmanaged resources.

The code snippet then takes all values of the GoogleSearchResult object and puts them into an XML document. Please refer to the attached sample application for the complete code. First, it creates an XML document with the CreateXmlDocument() method. It then calls the StringValueOfObject() method, which creates an XML element for the object in the XML document, using the name of the object as the name of the XML element. The method then uses reflection to walk the returned GoogleSearchResult object and, for each field it finds, adds an attribute with the field's value to the created XML element.

The returned GoogleSearchResult object has two fields that hold arrays of ResultElement and DirectoryCategory objects. The StringValueOfObject() method is not able to walk the objects in those arrays. Therefore, you create two parent XML elements under the document element with the AddChildElement() method, loop through both arrays, and call StringValueOfObject() for each object to convert it to an XML element with all its fields as attributes. Finally, you return the XML document, which contains all the search information of the GoogleSearchResult object. This enables you to run XPath queries against the search result XML document to find the required search result information.
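The helper methods themselves are only available in the attached sample application. As an illustration of the technique, here is a minimal sketch of how reflection can turn the public fields of an object into XML attributes and how XPath can then read them back. It is not the author's implementation: the SampleResult class is a hypothetical stand-in for a generated proxy type, and the names AppendObject(), BuildSample(), and FirstUrl() are made up for this example.

```csharp
using System;
using System.Reflection;
using System.Xml;

// Hypothetical stand-in for a proxy type such as ResultElement; the real
// class is generated from the WSDL and has more fields.
public class SampleResult
{
    public string title;
    public string URL;
}

public static class ResultXmlSketch
{
    // Adds a child element named after the object's type to Parent (which
    // must be an element, not the document itself) and copies each public
    // instance field into an attribute of that element.
    public static XmlElement AppendObject(XmlNode Parent, object Source)
    {
        XmlDocument Document = Parent.OwnerDocument;
        XmlElement Element = Document.CreateElement(Source.GetType().Name);

        foreach (FieldInfo Field in Source.GetType().GetFields(
                    BindingFlags.Public | BindingFlags.Instance))
        {
            object Value = Field.GetValue(Source);

            // null values and arrays are skipped; the caller adds array
            // items one by one
            if (Value == null || Field.FieldType.IsArray)
                continue;

            Element.SetAttribute(Field.Name, Value.ToString());
        }

        Parent.AppendChild(Element);
        return Element;
    }

    // Builds a small document in the shape described above, using sample
    // data instead of a real search result.
    public static XmlDocument BuildSample()
    {
        XmlDocument Document = new XmlDocument();
        XmlElement Root = Document.CreateElement("SearchResult");
        Document.AppendChild(Root);

        XmlElement ResultElements = Document.CreateElement("ResultElements");
        Root.AppendChild(ResultElements);

        SampleResult First = new SampleResult();
        First.title = "Google Web APIs";
        First.URL = "http://www.google.ca/apis";
        AppendObject(ResultElements, First);

        return Document;
    }

    // Runs an XPath query that selects the URL attribute of the first
    // result element, or returns null if there is none.
    public static string FirstUrl(XmlDocument Document)
    {
        XmlNode Attribute = Document.SelectSingleNode(
            "/SearchResult/ResultElements/SampleResult/@URL");
        return Attribute == null ? null : Attribute.Value;
    }
}
```

Calling ResultXmlSketch.FirstUrl(ResultXmlSketch.BuildSample()) returns the URL stored in the sample object. The element and attribute names in a real result document depend on the generated proxy classes and on the helpers in the sample application, so the XPath expression has to be adjusted accordingly.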

The Attached Sample Application

The attached sample application provides a wrapper class for all Google Web methods. It also provides a simple user interface demonstrating the use of each Web method. You can enter a search term and get alternate spelling suggestions, you can download the cached HTML page of a URL and display it, and you can perform a search by entering all the search arguments. Please make sure to obtain your own Google license key and enter it in the app.config file.

Summary

The Google Web service API is very easy to use. It enables you to search the Internet from within your application, and its complex query terms and filtering capabilities help keep the search results relevant to your application's needs. The Google Web service is one of many emerging Web services, such as those from Amazon and eBay. By introducing a Web service interface, these companies have become platforms that enable third parties to build solutions on top of them, and an ever-increasing number of requests and business transactions come through these interfaces. If you have comments or questions about this article or this topic, please contact me at klaus_salchner@hotmail.com. I want to hear if you learned something new.

About the Author

Klaus Salchner has worked for 14 years in the industry, nine years in Europe and another five years in North America. As a Senior Enterprise Architect with solid experience in enterprise software development, Klaus spends considerable time on performance, scalability, availability, maintainability, globalization/localization, and security. The projects he has been involved in are used by more than a million users in 50 countries on three continents.

Klaus calls Vancouver, British Columbia his home at the moment. His next big goal is running the New York marathon in 2005. Klaus is interested in guest speaking opportunities and in writing for .NET magazines or Web sites. He can be contacted at klaus_salchner@hotmail.com or http://www.enterprise-minds.com.

Enterprise application architecture and design consulting services are available. If you want to hear more about it, contact me! Involve me in your projects and I will make a difference for you. Contact me if you have an idea for an article or research project. Also contact me if you want to co-author an article or join future research projects!


