Logician: A Table-based Rules Engine Suite In C++/.NET/JavaScript using XML

Introduction

Application logic, particularly business rules, can be messy and time-consuming to maintain in code. If all your application logic is hard-coded, it can eventually lead to massive if-then-else or select-case code segments that grow into huge nightmares. Developers have more important problems to solve than maintaining a mountain of string compares, boolean tests, or stored procedures. There have been numerous attempts at rules engines (aka "inference engines"), but many of them require writing even more cryptic-looking code that is hard, if not impossible, for non-developers to maintain. De-coupling business logic from an application makes for more robust and maintainable code, and lets non-developer subject matter experts maintain the data and rules model, provided it is logical and easy to understand. Of course, you can always link parts of your application data to a database, but that still requires a lot of developer work to define the data model and queries needed for every unique "rule-driven" event in your application, not to mention the possibility of DB performance bottlenecks in a networked or limited-resource environment. A table-based rules engine can be a very powerful and flexible solution for your application logic and automation needs. In a web environment, the Logician JavaScript libraries can also offload a lot of server CPU work onto the user's browser and eliminate laggy server callbacks.

Decision Tables

A Decision Table, or "Truth Table" as a mathematician or electrical/computer engineer might call it, is simply a spreadsheet defining the possible solutions to a problem given a set of input conditions. For example, suppose we are mixing paint, and I show you the following spreadsheet:

PaintColor1  PaintColor2  ResultColor
Red          Blue         Purple
Red          Yellow       Orange
Blue         Yellow       Green

Without writing a single line of code, or even offering any more background information about the problem, the meaning of the data is clear. Programmatically, we can think of this as a series of if-then-else statements reading left to right, top to bottom:

	

if (PaintColor1 == "Red" && PaintColor2 == "Blue")
{
  ResultColor = "Purple";
}
else if (PaintColor1 == "Red" && PaintColor2 == "Yellow")
{
  ResultColor = "Orange";
}
//....etc

Or in SQL:


SELECT ResultColor FROM ColorMixingTable WHERE PaintColor1 = @PaintColor1 AND PaintColor2 = @PaintColor2

Any one of these 3 solutions gets the job done, but the first (the decision table) is certainly easier for non-developers to comprehend, and it is the way a lot of practical engineering and business data is maintained in the real world. Using a DB to drive the rule de-couples the logic from your source code to some extent, but in a web environment you will have to use server callbacks or web services to retrieve the data, slowing the application down. What if the logic needs to change, and we start mixing 3 colors of paint? For the DB, you might have to go back and change the DB table schema and your select statement/stored procedures. If you had a few hundred combinations of colors and then added a 3rd input parameter to the hard-coded method, you would be in for a whole lot of monotonous, error-prone coding. If you had gone with the decision table, you would likely have very little work to do, other than to copy the new rules XML file (that a non-developer/subject matter expert likely edited for you) with the added 3rd input to your website or application. In this tutorial, I'll show you how to use the open source Logician Suite to accomplish this task. With the Logician package you get 3 basic components: a decision table evaluator library, a decision table editor, and a dynamic data/class modeler and rules engine library.

Decision Table Engine Background

As stated before, the table rules are read sequentially: for a given series of inputs, the output(s) are determined. In the example above they work top to bottom, left to right. The decision table evaluator (EDSEngine library) creates a truth table from the information supplied to it by your code and the stored XML rule data. By itself it is stateless, but it is easy to determine and supply the necessary variables. The basic steps that occur are:

1. Code determines it needs to evaluate a table, asks EDSEngine what input values from the current application "state" it needs.
2. EDSEngine loads the table and returns the list of inputs in the table.
3. Code provides the corresponding list of current values for those inputs.
4. EDSEngine evaluates the decision table and returns results.

You should be able to automate steps 1-3 in your code, depending on how you design your data model. Something as simple as this might work:

//C++
#include <map>
#include <string>
#include <vector>
//also include the EDSEngine header that declares CKnowledgeBase
using namespace std;

map<string, string> mAppData; //application state as attribute-value pairs
CKnowledgeBase m_TableEvaluator;

//...application stuff, you loaded the rules file, etc

//could reuse this function for all similar application events
string GetSingleSolution(string tableToEvaluate, string nameOfOutput)
{
  //ask the engine which inputs this table depends on
  vector<string> inputsNeeded = m_TableEvaluator.GetInputDependencies(tableToEvaluate);
  //from our application data, obtain the values
  for (size_t i = 0; i < inputsNeeded.size(); i++)
    m_TableEvaluator.SetInputValue(inputsNeeded[i], mAppData[inputsNeeded[i]]);

  vector<string> results = m_TableEvaluator.EvaluateTable(tableToEvaluate, nameOfOutput);
  //EDSEngine supports returning multiple true results on a single line,
  //but in this case we expect just a single result (the first one it finds)
  if (results.size() > 0)
    return results[0];
  else
    return "";
}

string GetResultingColor()
{
  return GetSingleSolution("ColorMixingTable", "ResultColor");
}

See the source code for this example in the ColorMixConsole application. Rule tables are stored as XML and, when "compiled" by the DecisionLogic table editor utility, are linked together in a single XML file. All the values stored within the rules engine are natively strings, since they get serialized to XML. To optimize performance, string compares are avoided when possible by numerically tokenizing all of the stored values in the rules table and any input values passed in. That way, it is just comparing numbers most of the time. So in memory, the previous paint color table looks more like:

PaintColor1   PaintColor2   ResultColor
0             1             3
0             2             4
1             2             5

Suppose we pass "Blue" and "Yellow" to the previous paint table. The values for PaintColor1 and PaintColor2 that we are testing are likewise assigned the tokens 1 (Blue) and 2 (Yellow). You can also perform the following boolean operations on an input value; for these, de-tokenization will occur:
> : greater-than, alpha or numerical
< : less-than, alpha or numerical
!= or <> : not-equal to
[x,y] : range of values, inclusive ends
(x,y) : range of values, exclusive ends. You can mix [] and ()
= : not used explicitly, this is the default behavior for a rule cell and does not require the string to be de-tokenized
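To make the operator list concrete, here is a minimal sketch of how a single rule cell could be tested against an input value. It is not EDSEngine's actual code: the names ToNumber, Compare, and CellMatches are hypothetical, and the choice to compare numerically only when both sides parse as numbers (alphabetically otherwise) is an assumption about what "alpha or numerical" means.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: true when the whole string reads as a number.
static bool ToNumber(const std::string& s, double& out)
{
    if (s.empty()) return false;
    char* end = nullptr;
    out = std::strtod(s.c_str(), &end);
    return end == s.c_str() + s.size();
}

// Three-way compare: numeric when both sides are numbers, alphabetical otherwise.
static int Compare(const std::string& a, const std::string& b)
{
    double x = 0.0, y = 0.0;
    if (ToNumber(a, x) && ToNumber(b, y))
        return (x < y) ? -1 : (x > y) ? 1 : 0;
    const int c = a.compare(b);
    return (c < 0) ? -1 : (c > 0) ? 1 : 0;
}

// Returns true when `input` satisfies the rule written in `cell`.
bool CellMatches(const std::string& cell, const std::string& input)
{
    if (cell.empty()) return true; // a blank cell is always true

    // "<>" must be checked before the single "<" operator
    if (cell.compare(0, 2, "!=") == 0 || cell.compare(0, 2, "<>") == 0)
        return Compare(input, cell.substr(2)) != 0;
    if (cell[0] == '>') return Compare(input, cell.substr(1)) > 0;
    if (cell[0] == '<') return Compare(input, cell.substr(1)) < 0;

    // ranges: [x,y], (x,y), and the mixed forms [x,y) and (x,y]
    const char open = cell.front();
    const char close = cell.back();
    const std::size_t comma = cell.find(',');
    if ((open == '[' || open == '(') && (close == ']' || close == ')') &&
        comma != std::string::npos)
    {
        const std::string lo = cell.substr(1, comma - 1);
        const std::string hi = cell.substr(comma + 1, cell.size() - comma - 2);
        const int vsLo = Compare(input, lo);
        const int vsHi = Compare(input, hi);
        const bool aboveLo = (open == '[') ? vsLo >= 0 : vsLo > 0;
        const bool belowHi = (close == ']') ? vsHi <= 0 : vsHi < 0;
        return aboveLo && belowHi;
    }

    return input == cell; // default behavior: plain equality
}
```

With this sketch, CellMatches(">55", "60") is true, CellMatches("[10,20)", "20") is false because the upper end is exclusive, and CellMatches("<9", "10") is false because both sides are compared as numbers rather than as text.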

At run-time, once you pass in the input values for the table, it is broken down sequentially into a series of boolean cells, where the value of each cell is either true or false. Any input cell that you leave blank is always considered true. So if we passed the values PaintColor1 = "Blue" and PaintColor2 = "Yellow" our previous decision table looks a lot like a logical AND gate:

PaintColor1   PaintColor2   ResultColor
F             F             F
F             T             F
T             T             T <==This is our solution, corresponding to the tokenized memory value of 5, whose string value is "Green"
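The tokenizing and row-by-row evaluation described above can be sketched in a few lines. This is an illustration only, not EDSEngine's internals: Tokenizer, Row, FirstTrueRow, MixPaint, and the kBlank marker are all hypothetical names, and the table is hard-coded rather than loaded from XML.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: hands out one small integer per distinct string.
class Tokenizer
{
public:
    int TokenFor(const std::string& s)
    {
        const auto it = m_ids.find(s);
        if (it != m_ids.end()) return it->second;
        const int id = static_cast<int>(m_strings.size());
        m_ids[s] = id;
        m_strings.push_back(s);
        return id;
    }
    const std::string& StringFor(int token) const { return m_strings.at(token); }

private:
    std::map<std::string, int> m_ids;
    std::vector<std::string> m_strings;
};

const int kBlank = -1; // marks a blank input cell, which is always true

struct Row
{
    std::vector<int> inputs; // one token per input column
    int output;              // token of the result value
};

// Returns the output token of the first row whose cells are all true, or kBlank.
int FirstTrueRow(const std::vector<Row>& table, const std::vector<int>& inputTokens)
{
    for (const Row& row : table)
    {
        assert(row.inputs.size() == inputTokens.size());
        bool allTrue = true;
        for (std::size_t i = 0; i < row.inputs.size(); ++i)
            if (row.inputs[i] != kBlank && row.inputs[i] != inputTokens[i])
                allTrue = false; // integer compare, no string compare needed
        if (allTrue) return row.output;
    }
    return kBlank;
}

// Builds the paint table from this article and evaluates it for two colors.
std::string MixPaint(const std::string& color1, const std::string& color2)
{
    Tokenizer tok;
    // tokenizing in this order reproduces the numbers 0..5 used in the article
    const int red = tok.TokenFor("Red");
    const int blue = tok.TokenFor("Blue");
    const int yellow = tok.TokenFor("Yellow");
    const int purple = tok.TokenFor("Purple");
    const int orange = tok.TokenFor("Orange");
    const int green = tok.TokenFor("Green");

    const std::vector<Row> table = {
        {{red, blue}, purple},
        {{red, yellow}, orange},
        {{blue, yellow}, green},
    };
    const int result = FirstTrueRow(table, {tok.TokenFor(color1), tok.TokenFor(color2)});
    return result == kBlank ? std::string() : tok.StringFor(result);
}
```

Calling MixPaint("Blue", "Yellow") walks the three rows exactly as in the truth table above and returns "Green"; an input pair that matches no row returns an empty string.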

Other EDSEngine Features of Note 

You can specify more than one value in an input cell. This is called an "OR", and the test will check them against the input value just like an "or" in code: if (value1 == test || value2 == test || value3 == test) then do something... is abbreviated as value1|value2|value3 in a cell. There is also a notion of "Global ORs" if you design the rules XML using the table designer tool, DecisionLogic. Listing many values out can be a lot of extra typing, so you can define a single list of values as a variable and reuse that variable in all of your project tables. In an output cell, the "|" delimiter acts like an "and" (&&). In this way your solution can return multiple values. The results of a table evaluation are always returned as an array (vector in C++). There is also the notion of a table being "Get One" or "Get All", which means the table designer intended for you to return either just the result of the first true row, or the combined unique results of all true rows. This is selectable in the DecisionLogic designer for every table. You of course always have the option to override it in code.
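As a rough sketch of these semantics, the code below evaluates a one-input table where input cells may hold "|"-separated ORs, output cells may hold several "|"-separated values, and a flag selects "Get One" or "Get All". The names SplitOnPipe, InputCellIsTrue, RuleRow, and Evaluate are hypothetical and only illustrate the idea; EDSEngine's own implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits "a|b|c" into {"a", "b", "c"}.
std::vector<std::string> SplitOnPipe(const std::string& cell)
{
    std::vector<std::string> parts;
    std::stringstream ss(cell);
    std::string part;
    while (std::getline(ss, part, '|'))
        parts.push_back(part);
    return parts;
}

// An input cell is true when it is blank or any of its OR'ed values equals the input.
bool InputCellIsTrue(const std::string& cell, const std::string& input)
{
    if (cell.empty()) return true;
    const std::vector<std::string> options = SplitOnPipe(cell);
    return std::find(options.begin(), options.end(), input) != options.end();
}

struct RuleRow
{
    std::string inputCell;
    std::string outputCell;
};

// getAll == false: values of the first true row ("Get One").
// getAll == true:  unique values of every true row, in table order ("Get All").
std::vector<std::string> Evaluate(const std::vector<RuleRow>& rows,
                                  const std::string& input, bool getAll)
{
    std::vector<std::string> results;
    for (const RuleRow& row : rows)
    {
        if (!InputCellIsTrue(row.inputCell, input)) continue;
        for (const std::string& value : SplitOnPipe(row.outputCell))
            if (std::find(results.begin(), results.end(), value) == results.end())
                results.push_back(value);
        if (!getAll) break;
    }
    return results;
}
```

Given the rows {"Red|Blue" -> "A|B"}, {"Blue" -> "B|C"}, and {blank -> "D"}, the input "Blue" yields A and B in "Get One" mode and A, B, C, D in "Get All" mode.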

You can dynamically concatenate values into cells at run-time using the get() keyword. For instance, suppose we need an output of text for a price list display, and want to drive the text by rule. We might want it to read: "You have purchased X items of price P". In the table we could create an output: "You have purchased get(QtyOfItems) items of price get(ItemPrice)", where QtyOfItems and ItemPrice are values in the application state that would have been supplied. You can also use a get() in an input to create a more dynamic test. Instead of an input cell of ">55", it could be ">get(SomeValue)".
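The substitution itself is simple text replacement. The sketch below shows one way it could work, assuming the application state is a map of attribute names to string values; ExpandGets is a hypothetical name, and replacing an unknown name with an empty string is an assumption of this sketch rather than documented EDSEngine behavior.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replaces each get(Name) in `cell` with the value stored under Name in `state`.
// Names missing from `state` expand to an empty string (an assumption of this sketch).
std::string ExpandGets(const std::string& cell,
                       const std::map<std::string, std::string>& state)
{
    std::string out;
    std::size_t pos = 0;
    while (true)
    {
        const std::size_t start = cell.find("get(", pos);
        if (start == std::string::npos) break;
        const std::size_t end = cell.find(')', start);
        if (end == std::string::npos) break; // unterminated get(: leave the rest as-is

        out += cell.substr(pos, start - pos);              // text before get(
        const std::string name = cell.substr(start + 4, end - start - 4);
        const auto it = state.find(name);
        if (it != state.end()) out += it->second;          // the substituted value
        pos = end + 1;
    }
    out += cell.substr(pos); // whatever follows the last get()
    return out;
}
```

With QtyOfItems = "3" and ItemPrice = "9.99" in the state, the cell "You have purchased get(QtyOfItems) items of price get(ItemPrice)" expands to "You have purchased 3 items of price 9.99", and the input cell ">get(SomeValue)" with SomeValue = "55" expands to ">55".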

Run-time scripting with Python (C++/C#) and JavaScript (all ports) is supported in output cells so you can perform mathematical calculations and implement more advanced rules. Your output cell will just contain the Python or JavaScript code snippet within the proper keyword, js() or py(). For a single line of code it might look like:

js(return (56 * 3).toString())

//Note: you can actually omit the "return" and ".toString()" for a single line of code:
js(56 * 3)
//Combine equations and variables
js(56 * get(MultValueFromCode))

If your code has multiple lines/functions make sure it explicitly returns a string at the end or you will get a type-casting error. This becomes more useful when combined with the get() keyword like: js(get(value1) * get(value2)). Also note that Python is only supported in the C++/C# implementation of EDSEngine. JavaScript-based scripting is a bit more portable for the web being the native run-time scripting language of web browsers.

A rather advanced but flexible feature is callback parameters. There are special table evaluation functions with overloads to support passing additional data to EDSEngine, that is also passed to the JavaScript or Python code (EvaluateTableWithParameter). The basic idea is maybe you want to send some text or XML data from your application to a rule, modify it in the script, and pass the modifications back along with the usual result. You can find more details in the developer's documentation if the feature might be useful to you.

Relational Object Model and Implementing a Rules Engine

The use of the Relational Object Model library will demonstrate how you can extend EDSEngine with your own features for a full-blown rules engine. Instead of writing explicit classes to model physical products in an eCommerce setting, it may be useful to model the product using a tree-like object structure, similar to XML. For instance, suppose we were modeling a car. We might write C++ classes like:

class CPriceableItem
{
public:
  CPriceableItem();
  string CatalogNumber;
  double Price;
  double Cost; 
};


class CEngine : public CPriceableItem
{
public:
  CEngine();
  string EngineType;
};

class CTires : public CPriceableItem
{
public:
  CTires();
  string TireType;
};
//etc, keep inheriting and adding special attributes to each class

If we model the whole Car as XML and work with it directly, the final state could instead be formed like:

<Object name='Car'>
  <Object name='Engine'>
    <Attribute EngineType='V6' Price='9000' Cost='4000' CatalogNumber='V6-OCTC-GM'></Attribute>
  </Object>
  <Object name='Tires'>
    <Attribute TireType='17inch' Price='500' Cost='175' CatalogNumber='GY17'></Attribute>
  </Object>
  <!--And so forth.....-->
</Object>

In the remainder of the tutorial we will take advantage of ROM's built-in automation with EDSEngine. You may also find it convenient not to have to write code for many of the algorithms you need, such as summing up the total price of the Car object. ROM lets you treat the data like XML and supports XPath queries. You could just use an XPath query to get the total price of the Car object, and could even do it in an output cell of a table rule using the eval() keyword: "Total price is eval(sum(//Attribute/@Price))", yielding the final text result of "Total price is 9500".

When using the ROM component, decision tables are evaluated against a particular "Object" node context, and can walk up through the parent nodes when an input dependency value is not found in the current context. You can also use XPath queries in your input column headers instead of dealing with input values in code, or to specify a relation between multiple "Object" nodes. See the project documentation for more information. It should be noted that the internal data storage mechanism is not XML, since that would create a performance bottleneck. However, the current state can be serialized at any time, and is updated whenever a query is made.
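To picture the two ideas above (a tree of "Object" nodes, and looking in parent nodes when a value is missing from the current context), here is a small stand-alone sketch. It does not use the real ROM library: Node, FindValue, and SumAttribute are hypothetical names, and the recursive sum stands in for what the XPath sum() query computes.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a ROM "Object" node; the real ROMNode API differs.
struct Node
{
    std::string name;
    std::map<std::string, std::string> attributes;
    Node* parent = nullptr;
    std::vector<std::unique_ptr<Node>> children;

    Node* AddChild(const std::string& childName)
    {
        children.push_back(std::unique_ptr<Node>(new Node()));
        Node* child = children.back().get();
        child->name = childName;
        child->parent = this;
        return child;
    }
};

// Looks for an attribute in the given context, then in each ancestor in turn.
std::string FindValue(const Node* context, const std::string& attribute)
{
    for (const Node* n = context; n != nullptr; n = n->parent)
    {
        const auto it = n->attributes.find(attribute);
        if (it != n->attributes.end()) return it->second;
    }
    return ""; // not found anywhere up the tree
}

// Sums a numeric attribute over a node and all of its descendants.
double SumAttribute(const Node& node, const std::string& attribute)
{
    double total = 0.0;
    const auto it = node.attributes.find(attribute);
    if (it != node.attributes.end())
        total += std::strtod(it->second.c_str(), nullptr);
    for (const auto& child : node.children)
        total += SumAttribute(*child, attribute);
    return total;
}
```

If a Car node has Engine and Tires children priced at 9000 and 500, SumAttribute on the Car node gives 9500. A value stored only on the Car node (say a made-up Currency attribute) is still found by FindValue when a table is evaluated in the Tires context.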

Tutorial: How to Model a Real World Product Configurator

In this tutorial we will demonstrate how to model a real world product configurator in both a JavaScript-enabled webpage and a C# WinForms application using the Logician tools (C++, Silverlight, Flash, and Android examples are also available). For this demonstration, we will attempt to model the properties and catalog number generation for a commercially available "Heavy-Duty Imperial Mill Type" Hydraulic Cylinder (see technical specification PDF with source code). Given the formula for generating a product catalog number, a simple graphical layout, or group of selections, can be defined. The names of each control will match the attribute names we use in the decision tables.

Before any rule evaluation can take place, you have to instantiate the Relational Object Modeler. In the second step, you load the rules XML file we will create by passing in its path, and finally you set up the built-in rules engine implementation and apply the rules:

//C# port, JavaScript is exactly the same without the namespaces
ROMNET.ROMNode m_rootNode = new ROMNET.ROMNode("HydraulicCylinder");
m_rootNode.LoadRules("HydraulicCylinderRules.xml");
ROMNET.LinearEngine m_engine = new ROMNET.LinearEngine(m_rootNode, "HydraulicCylinderDictionary");
m_engine.EvaluateAll();

With a minimal GUI of various drop-down and edit boxes built according to the properties in the product's documentation, let's start defining some simple rules based on the product literature so we can test out the application event/evaluation code we will build shortly. Open the DecisionLogic table editor and start a new project. For each control in the interface, we will define a separate decision table with the same name that specifies the available values given any necessary input conditions. Also, ROM requires that we create at least one "Dictionary" table that contains a list of all the attribute names we are using, the captions/descriptions for each, etc. Create a new table named "HydraulicCylinderDictionary", and fill it out with the attributes we will be using to model the product. They should match the control names created in the configurator GUI. The "Name" and "Description" columns are self-explanatory. The "DefaultValue" column will set the value of the attribute on application start-up if no rule table is defined for it in the "RuleTable" column. The "AttributeType" column must be one of the following types:

SINGLESELECT - for controls such as combo-boxes/drop-downs, radio buttons, etc
MULTISELECT - multi-selection list boxes
BOOLEAN - checkboxes
EDIT - text editing fields
STATIC - read-only attributes. Values can be set by the "DefaultValue" column, or evaluated by rule.

Figure 1
[Figure1.png]

The first input is "CylinderSeries". Being the first parameter of the cylinder configuration, it has no input parameters so all of the "input" columns can be removed from the table. From the product documentation we can see that there are 3 possibilities: 3000 psi Hydraulic, 2000 psi Hydraulic, and 250 psi Pneumatic. We can list them out in 3 separate output cells in a "get all" table, or put all 3 in a single output cell separated by a "|" as described earlier. In this case, it's just a matter of your personal style. Here we will do the latter:

Figure 2
[Figure2.png]

We continue filling out a rules table for each attribute we have defined in the "dictionary". An example of an attribute that will have input conditions would be "RodDiameter". In the documentation the available rod sizes appear to be limited by the chosen BoreDiameter. Since there are multiple possibilities for each BoreDiameter, it might save the user time if we automatically set a RodDiameter default value as well when they pick a bore. This can be done by prefixing a value with the "@" symbol. It then follows that we could create the following rules table:

Figure 3
[Figure3.png]

Any decent rules engine should be able to identify invalid conditions and appropriate "triggers" to force a re-validation of the current selections. When you test the sample application, you will notice that if you change the BoreDiameter to a different value, and the currently selected RodDiameter would be out of the valid range, the value of that on-screen selection will be cleared or changed to a correct value depending on the current conditions. Any changes to the dictionary attributes are routed through event handling and the "EvaluateForAttribute" function of the rules engine, which will apply any necessary changes to the current attribute values based on the rule set. The object modeler has methods available to us in order to get or set the value of an attribute, or to dump the entire application state to XML as may be needed.
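The "@" default marker and the re-validation step can be sketched together. The code below is only an illustration of the behavior described here, not the rules engine's implementation: Choices, ParseChoices, and Revalidate are hypothetical names, and falling back to the default (or clearing the selection when no default exists) is an assumption about how an invalid selection is resolved.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Choices
{
    std::vector<std::string> available; // values the user may pick
    std::string defaultValue;           // the value that carried the "@" prefix, if any
};

// Strips the "@" prefix from rule results and remembers which value it marked.
Choices ParseChoices(const std::vector<std::string>& ruleResults)
{
    Choices c;
    for (const std::string& value : ruleResults)
    {
        if (!value.empty() && value[0] == '@')
        {
            c.defaultValue = value.substr(1);
            c.available.push_back(c.defaultValue);
        }
        else
        {
            c.available.push_back(value);
        }
    }
    return c;
}

// Keeps the current selection while it is still valid; otherwise falls back to
// the default, or to an empty (cleared) selection when the table defines none.
std::string Revalidate(const std::string& current, const Choices& c)
{
    const bool stillValid =
        std::find(c.available.begin(), c.available.end(), current) != c.available.end();
    return stillValid ? current : c.defaultValue;
}
```

For example, if a table evaluation for the current bore returns the made-up values "1", "@1.375", and "1.75", a previously selected rod of "1.75" is kept, while a selection of "2.5" that is no longer offered is replaced by the default "1.375".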

In order to generate the appropriate catalog number for a given set of attribute selections, or "state", we will define another set of rules tables and call for a table evaluation directly:

//C# port
private void UpdateCatalog()
{
  //catalog number is the concat of all the chars returned from the CatalogNumber table evaluation
  string[] allChars = m_rootNode.EvaluateTable("CatalogNumber", "Code", true);
  string Catnum = "";
  foreach (string subStr in allChars)
    Catnum += subStr;

  if (Catalog != null)
    Catalog.Text = Catnum;
}

Since putting all of the rules for the catalog number string in one table would be a mess, we can create a "get all" style table that evaluates a separate table for each character in the catalog number. The eval(TableName, OutputColumnName) table function branches from one table to another, as shown here:

Figure 4
[Figure4.png]


Figure 5
[Figure5.png]


Figure 6
[Figure6.png]

One of the great advantages of using these packages on the web is that you can effectively eliminate server callbacks to run business logic and perform page updates. You have the ability to offload a lot of the application logic and CPU cycles onto the user's browser. With all of these components working in concert, you have Logician, a powerful, flexible, and open source rules engine application framework.  

Download the sample application code in C#, C++, JavaScript, Flash, Silverlight, and the Android SDK (via PhoneGap JavaScript wrapping) along with the rule table samples to get a better understanding of the application logic. Visit our project webpage at http://logician.sourceforge.net


