Analyzing Image Content Programmatically: Using the Microsoft Cognitive Vision API

Introduction

The Microsoft Cognitive Services Computer Vision APIs are cloud-based REST services hosted on Azure. They give developers access to advanced machine learning algorithms that process an image and return the analysis as structured JSON. With them, developers can build intelligence into applications to enable natural, contextual interactions. The Computer Vision APIs can describe an image, recognize celebrities, read text from images, generate thumbnails, and tag images according to their content.

Using the Computer Vision APIs, a developer can perform the following tasks, among others:

  • Analyze information about the visual content found in an image
  • Categorize images
  • Generate thumbnails
  • Identify the type and quality of images
  • Recognize celebrities
  • Detect human faces in an image
  • Recognize text present in an image and read it
  • Flag adult content
  • Utilize optical character recognition to identify printed text found in images

How to Register and Get the API Key

To consume the Vision API, the first step is to obtain the API keys from Microsoft Cognitive Services, which is deployed in the Azure Cloud. Currently, a free plan that limits calls to 5000 transactions per month is available.

Open the Microsoft Cognitive Services page and create a new account by clicking the Create button.

Figure 1: Creating a new account

During the new Cognitive Account creation process, you have to accept the Cognitive Service terms and conditions.

Figure 2: Accepting the account terms and conditions

Sign in using your existing Microsoft account. You can also use your Facebook, LinkedIn, or GitHub account to create an account.

Figure 3: Logging in to your account

After completing the sign-up process, locate the Computer Vision section; two keys will be provided to you. Pass one of these keys with each service call to authorize the transaction.

Figure 4: Receiving the two keys

Computer Vision API Details

The Computer Vision API is currently available in the following Azure regions:

  • West US: westus.api.cognitive.microsoft.com
  • East US 2: eastus2.api.cognitive.microsoft.com
  • West Central US: westcentralus.api.cognitive.microsoft.com
  • West Europe: westeurope.api.cognitive.microsoft.com
  • Southeast Asia: southeastasia.api.cognitive.microsoft.com

A developer can either (1) upload an image or (2) specify an image URL in the HTTP POST request. An optional parameter, visualFeatures, lets the developer choose which features to return. Supported image formats are JPEG, PNG, GIF, and BMP. The maximum image size is 4 MB, and the image must be at least 50 x 50 pixels.
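
The limits above can be checked locally before a transaction is spent on a request that is bound to fail. The following is a minimal sketch, not part of the API: the ImageChecks class and FindProblem method are hypothetical names, the format test relies only on the file extension, and "4 MB" is assumed to mean 4 * 1024 * 1024 bytes.

```csharp
using System;
using System.IO;

public static class ImageChecks
{
    // Limits quoted in the article text.
    // Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 bytes.
    private const long MaxBytes = 4L * 1024 * 1024;
    private const int MinSide = 50;

    // Extensions for the JPEG, PNG, GIF, and BMP formats.
    private static readonly string[] SupportedExtensions =
        { ".jpg", ".jpeg", ".png", ".gif", ".bmp" };

    // Returns null when the image looks acceptable, otherwise a short reason.
    public static string FindProblem(string fileName, long sizeInBytes,
        int width, int height)
    {
        string extension = Path.GetExtension(fileName).ToLowerInvariant();
        if (Array.IndexOf(SupportedExtensions, extension) < 0)
            return "Unsupported format: " + extension;
        if (sizeInBytes > MaxBytes)
            return "File is larger than 4 MB";
        // Assumption: a side of exactly 50 pixels is treated as acceptable.
        if (width < MinSide || height < MinSide)
            return "Image is smaller than 50 x 50 pixels";
        return null;
    }
}
```

A caller would pass the file name, the file length, and the pixel dimensions, and skip the API call whenever the result is not null.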

A successful response is returned in JSON format with status code 200. If the request fails, an error code such as 400, 415, or 500 is returned; an invalid subscription key produces 401.

The Computer Vision API HTTP POST request format is as follows.

https://[location].api.cognitive.microsoft.com/vision/v1.0/analyze[?visualFeatures][&details][&language][&subscription-key=<Your subscription key>]
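
As an illustration of the format above, the request URI can be assembled from its parts. This is a sketch under stated assumptions: VisionUri and BuildAnalyzeUri are hypothetical names, and multiple visual features are assumed to be passed as a comma-separated list. The subscription key is left out of the URI on the assumption that it is sent in the Ocp-Apim-Subscription-Key header instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VisionUri
{
    // Builds the analyze endpoint for a region such as "westus".
    public static string BuildAnalyzeUri(string location,
        IEnumerable<string> visualFeatures, string language)
    {
        // Escape each feature name separately so the commas stay literal.
        string features = string.Join(",",
            visualFeatures.Select(f => Uri.EscapeDataString(f)));

        return "https://" + location
            + ".api.cognitive.microsoft.com/vision/v1.0/analyze"
            + "?visualFeatures=" + features
            + "&language=" + Uri.EscapeDataString(language);
    }
}
```

For example, BuildAnalyzeUri("westus", new[] { "Categories" }, "en") yields the westus analyze URI with visualFeatures=Categories and language=en.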

Following is an example of an HTTP POST request with a valid subscription key.

POST https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories&language=en HTTP/1.1
Content-Type: application/json
Host: westus.api.cognitive.microsoft.com
Ocp-Apim-Subscription-Key: ################################

{"url":"https://media.licdn.com/mpr/mpr/shrinknp_200_200/building.Jpeg"}
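
The same kind of request, with the image passed by URL in a JSON body, could be built in C# roughly as follows. This is only a sketch: VisionRequests and CreateAnalyzeUrlRequest are hypothetical names, and the JSON body is assembled by hand with minimal escaping where a JSON serializer would be more robust.

```csharp
using System;
using System.Net.Http;
using System.Text;

public static class VisionRequests
{
    // Creates a POST request whose JSON body points at a publicly reachable image.
    public static HttpRequestMessage CreateAnalyzeUrlRequest(string apiUri,
        string subscriptionKey, string imageUrl)
    {
        // Minimal JSON string escaping for backslashes and double quotes.
        string escapedUrl = imageUrl.Replace("\\", "\\\\").Replace("\"", "\\\"");
        string body = "{\"url\":\"" + escapedUrl + "\"}";

        var request = new HttpRequestMessage(HttpMethod.Post, apiUri);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        // Sets Content-Type to application/json with a UTF-8 charset.
        request.Content = new StringContent(body, Encoding.UTF8, "application/json");
        return request;
    }
}
```

The returned message can be sent with HttpClient.SendAsync. Only the headers and body differ from the upload variant, which sends the raw image bytes instead.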

Editor's Note: The "#" signs were used to replace "hard spaces" in the original text.

Here is a sample JSON response for a successful request (status code 200).

{
   "categories": [
      {
         "name": "building_",
         "score": 0.31640625,
         "detail": {
            "landmarks": [
               {
                  "name": "Colosseum",
                  "confidence": 0.944500566
               }
            ]
         }
      },
      {
         "name": "others_",
         "score": 0.00390625
      },
      {
         "name": "outdoor_",
         "score": 0.04296875
      }
   ],
   "tags": [
      {
         "name": "building",
         "confidence": 0.99887830018997192
      },
      {
         "name": "outdoor",
         "confidence": 0.97255456447601318
      }
   ],
   "description": {
      "tags": [
         "building",
         "outdoor",
         "front",
         "sitting",
         "large",
         "old",
         "standing",
         "table",
         "top",
         "train",
         "bridge",
         "city",
         "group",
         "white",
         "man",
         "clock",
         "walking",
         "people",
         "parked",
         "track",
         "castle",
         "sheep",
         "riding",
         "tower",
         "street",
         "tall"
      ],
      "captions": [
         {
            "text": "a group of people in front of a building",
            "confidence": 0.84632025454882787
         }
      ]
   },
   "requestId": "7e38d717-52b0-4947-ae9a-2210ee036dbd",
   "metadata": {
      "width": 600,
      "height": 399,
      "format": "Jpeg"
   },
   "faces": [],
   "color": {
      "dominantColorForeground": "Grey",
      "dominantColorBackground": "White",
      "dominantColors": [
         "Grey",
         "White"
      ],
      "accentColor": "486A83",
      "isBWImg": false
   },
   "imageType": {
      "clipArtType": 0,
      "lineDrawingType": 0
   }
}
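
A response like the one above can be read with any JSON parser. The sketch below pulls out the first caption and the tags above a confidence threshold. It assumes the System.Text.Json package is available (it is built into .NET Core 3.0 and later; a .NET Framework project would need the NuGet package or a different parser), and AnalysisReader and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class AnalysisReader
{
    // Returns the text of the first caption, or null if there is none.
    public static string FirstCaption(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement description, captions;
            if (doc.RootElement.TryGetProperty("description", out description) &&
                description.TryGetProperty("captions", out captions) &&
                captions.ValueKind == JsonValueKind.Array &&
                captions.GetArrayLength() > 0)
            {
                return captions[0].GetProperty("text").GetString();
            }
            return null;
        }
    }

    // Returns the names of top-level tags with at least the given confidence.
    public static List<string> TagsAbove(string json, double minConfidence)
    {
        var names = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement tags;
            if (!doc.RootElement.TryGetProperty("tags", out tags))
                return names;
            foreach (JsonElement tag in tags.EnumerateArray())
            {
                if (tag.GetProperty("confidence").GetDouble() >= minConfidence)
                    names.Add(tag.GetProperty("name").GetString());
            }
        }
        return names;
    }
}
```

Both methods tolerate a response in which the requested section is absent, which matters because the sections returned depend on the visualFeatures parameter.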

Following is an example of a failed response with status code 401, returned because an invalid subscription key was passed in the request.

apim-request-id: 1bed0251-5c8d-4bc0-8cc9-797ecabd14d2
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Sun, 21 May 2017 06:03:13 GMT
WWW-Authenticate: AzureApiManagementKey realm="https://westus.api.cognitive.microsoft.com/vision/v1.0",name="Ocp-Apim-Subscription-Key",type="header"
Content-Length: 143
Content-Type: application/json

{ "statusCode": 401, "message": "Access denied due to invalid subscription key. Make sure to provide a valid key for an active subscription." }
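
An application will usually want to show the message from an error body like the one above rather than the raw JSON. The following sketch assumes the error body has the statusCode and message fields shown in this sample and that System.Text.Json is available; ErrorReader and Describe are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class ErrorReader
{
    // Turns an error body into a one-line description such as "401: Access denied...".
    public static string Describe(string errorJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(errorJson))
        {
            JsonElement root = doc.RootElement;
            JsonElement code, message;

            // Fall back to placeholders if a field is missing from the body.
            string codeText = root.TryGetProperty("statusCode", out code)
                ? code.GetRawText() : "unknown";
            string messageText = root.TryGetProperty("message", out message)
                ? message.GetString() : "no message";

            return codeText + ": " + messageText;
        }
    }
}
```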

Programmatically Analyze an Image with the Vision API Using C#

The following C# console application demonstrates how to retrieve image features, such as image properties, tags, and description, in JSON format from a selected image using the Cognitive Services Computer Vision API.

The following tools/software are required to develop the console application.

  • Windows 8 or later
  • Visual Studio 2015 Community Edition (free)
  • Cognitive Service Computer Vision API key

Step 1

Open Visual Studio 2015 -> Start -> New Project -> Select Templates (under Visual C# -> Console Application) -> Blank Application -> Give a suitable name to your app (ComputerVisionAPI) -> OK. See Figure 5.

Figure 5: Starting a new console application

Step 2

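Add the following namespaces in the Program.cs file. Because the code reads its settings through ConfigurationManager, also add a project reference to the System.Configuration assembly.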

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Configuration;

Step 3

Rename Program.cs to ComputerVisionAPI.cs. Accordingly, change the name of the class to ComputerVisionAPI.

Step 4

Add the app.config file to the project and update the app settings with the following keys. Set ImagePath to a local image file, and set Subscription-Key to the key you obtained during registration. The request parameters and API URI values change depending on the image operation.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
   <startup>
      <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6.1"/>
   </startup>
   <appSettings>
      <add key="ImagePath" value="C:\Users\Tapas\Desktop\Docs\BB Backup\SampleImage.jpg"/>
      <!-- "&" must be written as "&amp;" inside an XML attribute value. -->
      <add key="RequestParameters" value="visualFeatures=Categories&amp;language=en"/>
      <add key="APIuri" value="https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?"/>
      <add key="Subscription-Key" value="13hc77781f8f6cc9b5fcdd72a8df7156"/>
      <!-- The example uploads the image bytes, so it uses the content type
           "application/octet-stream". The other content types you can use are
           "application/json" and "multipart/form-data". -->
      <add key="Contenttypes" value="application/octet-stream"/>
   </appSettings>
</configuration>

Step 5

The following static methods will return key values from the app.config file.

static string Subscriptionkey()
{
   return System.Configuration.ConfigurationManager.AppSettings["Subscription-Key"];
}

static string RequestParameters()
{
   return System.Configuration.ConfigurationManager.AppSettings["RequestParameters"];
}

static string ReadImagePath()
{
   return System.Configuration.ConfigurationManager.AppSettings["ImagePath"];
}

static string ReadURI()
{
   return System.Configuration.ConfigurationManager.AppSettings["APIuri"];
}

static string Contenttypes()
{
   return System.Configuration.ConfigurationManager.AppSettings["Contenttypes"];
}

Step 6

For image processing and calling the API, write the following static functions in the ComputerVisionAPI class.

/// <summary>
/// Reads the image file into a byte array
/// </summary>
/// <param name="ImagePath"></param>
static byte[] GetImageAsByteArray(string ImagePath)
{
   // The using blocks close the file once the bytes are read.
   using (FileStream ImagefileStream = new FileStream(ImagePath,
      FileMode.Open, FileAccess.Read))
   using (BinaryReader ImagebinaryReader = new
      BinaryReader(ImagefileStream))
   {
      return ImagebinaryReader.ReadBytes
         ((int)ImagefileStream.Length);
   }
}

/// <summary>
/// Use the following function to fetch all image-related details
/// </summary>
/// <param name="ImagePath"></param>
static void GetImgeDetails(string ImagePath)
{
   var ComputerVisionAPIclient = new HttpClient();

   // Request headers - the subscription key is read from App.config;
   // replace the example key there with your valid subscription key.
   ComputerVisionAPIclient.DefaultRequestHeaders.Add
      ("Ocp-Apim-Subscription-Key", Subscriptionkey());

   // Request parameters.
   string requestParameters = RequestParameters();
   string APIuri = ReadURI() + requestParameters;

   // Request body.
   byte[] ImagebyteData = GetImageAsByteArray(ImagePath);

   // Block until the call completes; otherwise Main() would return
   // and the process would exit before the response arrives.
   ImgeAnalysis(ImagebyteData, APIuri,
      ComputerVisionAPIclient).GetAwaiter().GetResult();
}

/// <summary>
/// The following function calls the computer vision API and
/// displays the response in the Console
/// </summary>
/// <param name="ImagebyteData"></param>
/// <param name="APIuri"></param>
/// <param name="ComputerVisionAPIclient"></param>
static async System.Threading.Tasks.Task ImgeAnalysis(byte[] ImagebyteData,
   string APIuri, HttpClient ComputerVisionAPIclient)
{
   HttpResponseMessage APIresponse;
   var Imagecontent = new ByteArrayContent(ImagebyteData);
   Imagecontent.Headers.ContentType = new
      MediaTypeHeaderValue(Contenttypes());
   APIresponse = await ComputerVisionAPIclient.PostAsync
      (APIuri, Imagecontent);

   // Print the status line and headers, then the JSON body.
   Console.WriteLine(APIresponse);
   Console.WriteLine(await APIresponse.Content.ReadAsStringAsync());
   Console.Read();
}

Step 7

Finally, call the GetImgeDetails function to get image details from Main().

static void Main(string[] args)
{
   GetImgeDetails(ReadImagePath());

}

Figure 6 shows the C# console application.

Figure 6: The application running as a C# console application

Successful execution of the program prints JSON output like the sample shown in the Computer Vision API Details section. A developer can then write code to parse the JSON response and build an intelligent app.

Conclusion

The Computer Vision API addresses the problem of recognizing objects inside an image. Currently, the API recognizes about 2000 distinct objects and groups them into 87 categories. In this article of the Cognitive API series, you have learned what the Computer Vision API is and what it offers you as a developer. You will get a closer look at the other APIs, with code walkthroughs, in my upcoming posts.



About the Author

Tapas Pal

Tapas Pal is a Microsoft Platform technical professional with Tata Consultancy Services, India. He has seven years of experience and holds Microsoft certifications in .NET 1.1 and .NET 2.0.
