Obtaining Reliability and Scalability in Azure Service Fabric

In my previous article, “Understanding Azure Service Fabric: Microsoft’s Next Generation PaaS Platform,” I explained the fundamental concepts of the Azure Service Fabric platform and the PaaS problems it tries to address. Now, I will cover some of the advanced concepts: achieving a high level of reliability and scalability for any enterprise application using the Service Fabric platform.

Reliability in Azure Service Fabric (ASF)

The reliability of an enterprise application is measured by its ability to perform its intended functions in its environment without interruption from failures. These failures can be application issues or system outages.

Service Fabric improves reliability by adding redundancy: the application is deployed across multiple nodes (virtual machines). A Service Fabric cluster is a group of five or more VMs that provides a guarantee against node-level failures. This reliability helps in achieving higher availability for our business applications.

Service Fabric Provides the Following Two Ways to Create Reliable Services:

For Stateless Services: A service that doesn’t hold any state between its requests and responses is known as stateless. Such a service needs either a cache or external storage to hold any state. In Service Fabric, configure two or more instances for a stateless service. These instances are automatically load balanced. Each instance, which is nothing but application logic, is deployed to a different node in the cluster. If an instance failure is detected, the runtime automatically creates a new instance on another node in the same cluster.

Creating a stateless service with multiple instances:

Using the .NET API:

// Requires: using System; using System.Fabric;
//           using System.Fabric.Description;
// appName and svcName are assumed to be set by the caller.
var serviceNSFormat = "fabric:/{0}/{1}";
var appNSFormat = "fabric:/{0}";
var serviceType = "NumberCountingServiceType";

var svcNS = string.Format(serviceNSFormat,
   appName, svcName);
var serviceDescription = new StatelessServiceDescription();
serviceDescription.ApplicationName =
   new Uri(string.Format(appNSFormat, appName));
serviceDescription.InstanceCount = 3;
serviceDescription.PartitionSchemeDescription =
   new SingletonPartitionSchemeDescription();
serviceDescription.ServiceName = new Uri(svcNS);
serviceDescription.ServiceTypeName = serviceType;

// Create the service instance (call from an async method)
var fabricClient = new FabricClient("localhost:19000");
await fabricClient.ServiceManager
   .CreateServiceAsync(serviceDescription);

Using a PowerShell script:

New-ServiceFabricService `
   -ApplicationName fabric:/NumberCounterApp `
   -ServiceName fabric:/NumberCounterApp/NumberCountingService `
   -ServiceTypeName NumberCountingServiceType `
   -Stateless `
   -PartitionSchemeSingleton `
   -InstanceCount 3

For Stateful Services: In Service Fabric, a stateful service is modeled as a set of replicas: one primary and several active secondaries. Each replica consists of two things: an instance of the application code and the state, both on the same VM. This co-location of code and state makes the model powerful because it results in low latency. All read and write operations are performed at the primary replica, and writes are replicated to the active secondaries. So, to achieve high reliability with stateful services, configure two or more replicas for the service. If any replica goes down, the Service Fabric runtime automatically replaces it with a new replica on another node. If the primary fails, a secondary replica takes over as the primary and a brand-new replica is added as a secondary.

Creating a stateful service with multiple replicas:

Using the .NET API:

// Requires: using System; using System.Fabric;
//           using System.Fabric.Description;
// appName and svcName are assumed to be set by the caller.
var serviceNSFormat = "fabric:/{0}/{1}";
var appNSFormat = "fabric:/{0}";
var serviceType = "NumberCountingServiceType";
var svcNS = string.Format(serviceNSFormat,
   appName, svcName);

var serviceDescription = new StatefulServiceDescription();
serviceDescription.ApplicationName =
   new Uri(string.Format(appNSFormat, appName));
serviceDescription.PartitionSchemeDescription =
   new SingletonPartitionSchemeDescription();
serviceDescription.ServiceName = new Uri(svcNS);
serviceDescription.ServiceTypeName = serviceType;
serviceDescription.HasPersistedState = true;
serviceDescription.MinReplicaSetSize = 2;
serviceDescription.TargetReplicaSetSize = 3;

// Create the service instance (call from an async method)
var fabricClient = new FabricClient("localhost:19000");
await fabricClient.ServiceManager
   .CreateServiceAsync(serviceDescription);

Using a PowerShell script:

New-ServiceFabricService `
   -ApplicationName fabric:/NumberCounterApp `
   -ServiceName fabric:/NumberCounterApp/NumberCountingService `
   -ServiceTypeName NumberCountingServiceType `
   -Stateful `
   -HasPersistedState `
   -PartitionSchemeSingleton `
   -TargetReplicaSetSize 3 `
   -MinReplicaSetSize 2

Scalability in Azure Service Fabric

Scalability is one of the most important attributes of any business application. A truly scalable application can not only perform its intended functions optimally at varying loads but also utilize the full potential of available computing resources. The varying loads can be larger data-sets, higher user traffic, or a combination of data size and velocity. A system’s scalability can be achieved in two ways:

  • “Scale up”: Vertical scaling by using stronger or larger hardware
  • “Scale out”: Horizontal scaling by adding more, smaller machines

Scalability Options for Both of the ASF Service Types:

For Stateless Services: These kinds of services can be scaled out by specifying a higher instance count (two or more). Each of these load-balanced instances is deployed to a different node in the cluster. Stateless services can also be partitioned, but this is rare.

For Stateful Services: Stateful services can divide the load among their partitions or named service instances. These are simply separate service instances running with their own replicas on various nodes in the cluster. Each partition works on a subset of the total state managed by the stateful service. The client implements the logic required to identify the correct partition when calling the service. Partitioning brings parallelism to the service, so that many requests can be handled by different partitions at the same time. Partitioning can be achieved in two ways:

By named service instances: A service instance is a specific named instance of a service type deployed to ASF. The first level of scaling is achieved through service names. To handle a higher load on the application, create more named instances of a service, each with its own level of partitioning.

Creating named service instances:

Services can be created using the .NET or PowerShell methods explained earlier, with a different service URI for each named instance of a given service type:

Connect-ServiceFabricCluster -ConnectionEndpoint "localhost:19000"

New-ServiceFabricService `
   -ApplicationName fabric:/NumberCounterApp `
   -ServiceName fabric:/NumberCounterApp/NumberCountingService/1 `
   -ServiceTypeName NumberCountingServiceType `
   -Stateful `
   -HasPersistedState `
   -PartitionSchemeSingleton `
   -TargetReplicaSetSize 1 `
   -MinReplicaSetSize 1

New-ServiceFabricService `
   -ApplicationName fabric:/NumberCounterApp `
   -ServiceName fabric:/NumberCounterApp/NumberCountingService/2 `
   -ServiceTypeName NumberCountingServiceType `
   -Stateful `
   -HasPersistedState `
   -PartitionSchemeSingleton `
   -TargetReplicaSetSize 1 `
   -MinReplicaSetSize 1

New-ServiceFabricService `
   -ApplicationName fabric:/NumberCounterApp `
   -ServiceName fabric:/NumberCounterApp/NumberCountingService/3 `
   -ServiceTypeName NumberCountingServiceType `
   -Stateful `
   -HasPersistedState `
   -PartitionSchemeSingleton `
   -TargetReplicaSetSize 1 `
   -MinReplicaSetSize 1
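
With several named instances deployed, the client still has to decide which one to call. The following is a minimal C# sketch of one possible rule, assuming numeric user IDs are spread over the three instances created above by a simple modulo. NamedInstanceRouter and ResolveInstance are hypothetical names used for illustration; they are not part of the Service Fabric API.

```csharp
using System;

public static class NamedInstanceRouter
{
    // Base URI taken from the PowerShell example above.
    private const string BaseName =
        "fabric:/NumberCounterApp/NumberCountingService/";

    // Hypothetical helper: picks the named instance for a user ID.
    // The modulo rule is an assumption made for this sketch only.
    public static Uri ResolveInstance(long userId, int instanceCount)
    {
        if (instanceCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(instanceCount));

        // Normalise the remainder so that negative IDs also land in
        // 0..instanceCount-1, then shift to the 1-based instance names.
        long slot = ((userId % instanceCount) + instanceCount)
                    % instanceCount + 1;
        return new Uri(BaseName + slot);
    }

    public static void Main()
    {
        // 42 % 3 == 0, so this resolves to instance 1.
        Console.WriteLine(ResolveInstance(42, 3).OriginalString);
    }
}
```

A fixed modulo keeps the routing rule trivial, but it also means that changing the number of named instances remaps most IDs, so the instance count should be chosen with future load in mind.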

By implementing a partitioning scheme for the service: There are three kinds of partitioning schemes that are available in stateful services:

Singleton: This indicates that the service doesn’t need partitioning.

<StatefulService ServiceTypeName="NumberCountingServiceType"
      TargetReplicaSetSize="3"
      MinReplicaSetSize="2">
   <SingletonPartition />
</StatefulService>

Named: The service load can be grouped into subsets identified by a predefined name. These names can be used to partition the service. The client can look up the individual partitions by their name.

<StatefulService
      ServiceTypeName="SimulationServiceType"
      TargetReplicaSetSize="3"
      MinReplicaSetSize="2">
   <NamedPartition>
      <Partition Name="Tenant_A" />
      <Partition Name="Tenant_B" />
   </NamedPartition>
</StatefulService>

Client code to retrieve a specific partition:

Create a new constructor for NumCountingSvcClient : ServicePartitionClient<…>:

public NumCountingSvcClient(
      WcfCommunicationClientFactory<INumberCounter>
         clientFactory,
      Uri serviceName,
      string partitionKey)
   : base(clientFactory, serviceName, partitionKey)
{
}

// Create a client bound to the "Tenant_A" partition
var client = new NumCountingSvcClient(
   new WcfCommunicationClientFactory<INumberCounter>(
      serviceResolver, binding, null, null),
   ServiceName,
   "Tenant_A");

Ranged Partitions: The service load is divided into partitions identified by integer ranges. The scheme is defined by a low key, a high key, and a number of partitions (n). It creates n partitions, each responsible for a non-overlapping subrange. Example: a ranged partitioning scheme (for a service with three replicas) with a low key of 0, a high key of 99, and a count of 4 would create four partitions, as shown below:

Figure 1: The service is divided into four partitions. Each partition is identified by an integer range (0-24, 25-49, 50-74, and 75-99) and has two secondary replicas. The overall key range can be very large, up to the maximum integer value, so the key can be a unique user ID that identifies the correct partition.

<StatefulService ServiceTypeName="Stateful1Type"
      TargetReplicaSetSize="3"
      MinReplicaSetSize="2">
   <UniformInt64Partition PartitionCount="4"
      LowKey="0" HighKey="99" />
</StatefulService>
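
The range arithmetic in this example can be sketched in a few lines of C#. This is only an illustration of how an even split of 0 to 99 into four partitions yields the ranges in Figure 1. UniformRangeSketch, RangeOf, and PartitionOf are hypothetical helpers, not Service Fabric APIs, and the sketch assumes a modest key range whose size divides evenly by the partition count.

```csharp
using System;

public static class UniformRangeSketch
{
    // Number of keys per partition. Assumes an even split and a range
    // small enough that (highKey - lowKey + 1) does not overflow.
    private static long Width(long lowKey, long highKey, int count)
    {
        long keys = highKey - lowKey + 1;
        if (count <= 0 || keys % count != 0)
            throw new ArgumentException("Range must divide evenly.");
        return keys / count;
    }

    // Inclusive key range owned by the zero-based partition `index`.
    public static (long Low, long High) RangeOf(
        long lowKey, long highKey, int count, int index)
    {
        if (index < 0 || index >= count)
            throw new ArgumentOutOfRangeException(nameof(index));
        long width = Width(lowKey, highKey, count);
        long low = lowKey + index * width;
        return (low, low + width - 1);
    }

    // Zero-based index of the partition that owns `key`.
    public static int PartitionOf(
        long lowKey, long highKey, int count, long key)
    {
        if (key < lowKey || key > highKey)
            throw new ArgumentOutOfRangeException(nameof(key));
        return (int)((key - lowKey) / Width(lowKey, highKey, count));
    }

    public static void Main()
    {
        // Prints 0-24, 25-49, 50-74, 75-99 for partitions 0..3.
        for (int i = 0; i < 4; i++)
        {
            var range = RangeOf(0, 99, 4, i);
            Console.WriteLine($"Partition {i}: {range.Low}-{range.High}");
        }

        // Key 60 falls in 50-74, so this prints 2.
        Console.WriteLine(PartitionOf(0, 99, 4, 60));
    }
}
```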

Partitions would look like this in Service Fabric Explorer (identified by GUID):

Figure 2: Two partitions with three replicas each for NumberCountingService
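
Figure 1 notes that the key can represent a unique user ID. When that ID is a string rather than a number, a client could first reduce it to a stable integer key inside the configured range. The sketch below assumes an FNV-1a hash for this purpose. That choice, and the names PartitionKeySketch, Fnv1a64, and ToPartitionKey, are assumptions for illustration rather than anything the sample project or Service Fabric prescribes.

```csharp
using System;
using System.Text;

public static class PartitionKeySketch
{
    // 64-bit FNV-1a hash. It gives the same result on every machine
    // and run, which string.GetHashCode() does not guarantee.
    public static ulong Fnv1a64(string text)
    {
        const ulong offsetBasis = 14695981039346656037UL;
        const ulong prime = 1099511628211UL;

        ulong hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            hash ^= b;
            unchecked { hash *= prime; } // wrap-around is intended
        }
        return hash;
    }

    // Maps an identifier into the inclusive range [lowKey, highKey].
    // Assumes highKey >= lowKey and a range narrower than the full
    // Int64 span, so the span calculation cannot overflow.
    public static long ToPartitionKey(
        string id, long lowKey, long highKey)
    {
        if (highKey < lowKey)
            throw new ArgumentException("highKey must be >= lowKey.");
        ulong span = (ulong)(highKey - lowKey) + 1;
        return lowKey + (long)(Fnv1a64(id) % span);
    }

    public static void Main()
    {
        // The same ID always yields the same key between 0 and 99.
        Console.WriteLine(ToPartitionKey("user-1001", 0, 99));
    }
}
```

The resulting key can then be passed to whatever partition lookup the client uses, so that requests for the same user consistently reach the same partition.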

Source Code

The sample code demonstrating the stateful service scalability and reliability scenarios has been uploaded to GitHub at https://github.com/manoj-kumar1/ServiceFabric-ScalabilityRel. The solution was developed using Visual Studio 2015. It can be opened using Visual Studio 2013, but the Service Fabric application project won’t open and extra steps are required, as explained in my previous article.

Summary

In Azure Service Fabric, the strategy of service partitioning and a high replica count helps in achieving higher scalability and reliability for business applications. With the increase in the number of nodes in the cluster, these features also improve resource utilization and performance of the application.
