# Shannon Entropy and .NET


## Introduction

Hello, and welcome to my article. Sometimes, I wonder where all my ideas for my articles come from; at least I now know why I cannot sleep most evenings. My brain doesn’t switch off. It is a blessing and a curse.

Today, you will learn how to use the Shannon Entropy equation to work out symbol probabilities, and the information content they imply, in your .NET applications.

## Entropy

Entropy can be defined in the context of a probabilistic model. For example, a fair coin flip has an entropy of 1 bit per flip. A source that always generates a long sequence of As has an entropy of 0, because the next character will always be an ‘A’.
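To make both examples concrete: for a source that emits symbol $i$ with probability $p_i$, entropy in bits is commonly written as follows (the symbols $H$ and $p_i$ are notation introduced here for illustration, not names used in the code later on):

$$
H = -\sum_{i} p_i \log_2 p_i
$$

For a fair coin, $p_{\text{heads}} = p_{\text{tails}} = \tfrac{1}{2}$, so

$$
H = -\left(\tfrac{1}{2}\log_2 \tfrac{1}{2} + \tfrac{1}{2}\log_2 \tfrac{1}{2}\right) = 1 \text{ bit}
$$

For a source that only ever emits ‘A’, $p_A = 1$ and $H = -1 \cdot \log_2 1 = 0$.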

## Shannon Entropy

Claude Shannon’s entropy measures the information contained in a message. It reflects, for example, redundancy in language structure and the occurrence frequencies of letters or word pairs. Shannon entropy provides a way to determine the average minimum number of bits per symbol needed to encode a string, based on the frequency of the symbols inside the string.
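As a rough illustration of the idea, the sketch below counts how often each character occurs in a string and sums p * log2(1 / p) over the observed characters. It is a minimal standalone example, not the class built in this article; the names `EntropySketch` and `BitsPerSymbol` are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EntropySketch
{
    // Returns the Shannon entropy of the text in bits per character.
    public static double BitsPerSymbol(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0.0;
        }

        // Count how often each character occurs.
        var counts = new Dictionary<char, int>();
        foreach (char c in text)
        {
            counts.TryGetValue(c, out int n); // n is 0 if c has not been seen yet
            counts[c] = n + 1;
        }

        // Sum p * log2(1 / p) over the observed characters.
        double entropy = 0.0;
        foreach (int count in counts.Values)
        {
            double p = (double)count / text.Length;
            entropy += p * Math.Log(1.0 / p, 2);
        }

        return entropy;
    }

    public static void Main()
    {
        Console.WriteLine(BitsPerSymbol("AAAAAAAA")); // only one symbol: 0
        Console.WriteLine(BitsPerSymbol("ABABABAB")); // two equally frequent symbols: 1
    }
}
```

A string of identical characters needs no information per character, while two equally frequent characters need one bit each, which matches the coin-flip intuition.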

## Our Project

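Create a new C# or Visual Basic.NET Windows Forms project. Once the default form has loaded, add one Button and one ListBox to it. Then add a new class named ShannonEntropy to the project; the fields, properties, and methods that follow go inside it.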

## Code

C#

```
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
```

VB.NET

```
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
```

C#

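```
// Number of times each symbol (byte value) appears
SortedList<byte, int> slTimeSymbolAppears;
// Probability of each symbol
SortedList<byte, double> slEntropy;
// Most recently calculated entropy
double dblEntropy;
// True when data was added since the last calculation
bool blnUsed;
// Total number of bytes read
int iSize;
```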

VB.NET

```
Private slTimeSymbolAppears As SortedList(Of Byte, Integer)
Private slEntropy As SortedList(Of Byte, Double)
Private dblEntropy As Double
Private blnUsed As Boolean
Private iSize As Integer
```

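slTimeSymbolAppears maps each symbol (byte value) to the number of times it appears, and slEntropy maps each symbol to its probability. dblEntropy holds the calculated entropy, iSize holds the total number of bytes read, and blnUsed is True when new data has been added since the entropy was last calculated. Add the Properties.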

C#

```
public int Size
{
    get
    {
        return iSize;
    }

    private set
    {
        iSize = value;
    }
}

public int Unique
{
    get
    {
        return slTimeSymbolAppears.Count;
    }
}

public double Entropy
{
    get
    {
        return GetEntropy();
    }
}

public Dictionary<byte, int> Distribution
{
    get
    {
        return SortedDistribution();
    }
}

public Dictionary<byte, double> Probability
{
    get
    {
        return SortedProbability();
    }
}
```

VB.NET

```
Public Property Size As Integer
    Get
        Return iSize
    End Get

    Private Set(ByVal value As Integer)
        iSize = value
    End Set
End Property

Public ReadOnly Property Unique As Integer
    Get
        Return slTimeSymbolAppears.Count
    End Get
End Property

Public ReadOnly Property Entropy As Double
    Get
        Return GetEntropy()
    End Get
End Property

Public ReadOnly Property Distribution As Dictionary(Of Byte, Integer)
    Get
        Return SortedDistribution()
    End Get
End Property

Public ReadOnly Property Probability As Dictionary(Of Byte, Double)
    Get
        Return SortedProbability()
    End Get
End Property
```

Add the rest of the Functions and the Constructors.

C#

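```
// Symbol that occurs most often (Keys[0] would only return the smallest byte value)
public byte GreatestDistribution()
{
    return slTimeSymbolAppears.OrderByDescending(p => p.Value).First().Key;
}

// Symbol with the highest probability; call GetEntropy() first so slEntropy is filled
public byte GreatestProbability()
{
    return slEntropy.OrderByDescending(p => p.Value).First().Key;
}

// Number of times the given symbol appears
public double SymbolDistribution(byte bSymbol)
{
    return slTimeSymbolAppears[bSymbol];
}

// Probability of the given symbol
public double SymbolEntropy(byte bSymbol)
{
    return slEntropy[bSymbol];
}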

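public Dictionary<byte, int> SortedDistribution()
{
    List<Tuple<int, byte>> lstEntries = new List<Tuple<int, byte>>();

    // Store (count, symbol) pairs so that sorting orders by count first
    foreach (KeyValuePair<byte, int> e in slTimeSymbolAppears)
    {
        lstEntries.Add(new Tuple<int, byte>(e.Value, e.Key));
    }

    // Ascending sort followed by a reverse gives most frequent first
    lstEntries.Sort();
    lstEntries.Reverse();

    Dictionary<byte, int> dicResult = new Dictionary<byte, int>();

    foreach (Tuple<int, byte> e in lstEntries)
    {
        dicResult.Add(e.Item2, e.Item1);
    }

    return dicResult;
}

public Dictionary<byte, double> SortedProbability()
{
    List<Tuple<double, byte>> lstEntries = new List<Tuple<double, byte>>();

    // Store (probability, symbol) pairs so that sorting orders by probability first
    foreach (KeyValuePair<byte, double> e in slEntropy)
    {
        lstEntries.Add(new Tuple<double, byte>(e.Value, e.Key));
    }

    lstEntries.Sort();
    lstEntries.Reverse();

    Dictionary<byte, double> dicResult = new Dictionary<byte, double>();

    foreach (Tuple<double, byte> e in lstEntries)
    {
        dicResult.Add(e.Item2, e.Item1);
    }

    return dicResult;
}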

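public double GetEntropy()
{
    // Reuse the cached value if no new data has arrived
    if (!blnUsed)
    {
        return dblEntropy;
    }

    dblEntropy = 0;
    slEntropy = new SortedList<byte, double>();

    // Probability of each symbol = its count / total number of bytes
    foreach (KeyValuePair<byte, int> e in slTimeSymbolAppears)
    {
        slEntropy.Add(e.Key, (double)e.Value / (double)iSize);
    }

    // Entropy = sum of p * log2(1 / p) over all symbols
    foreach (KeyValuePair<byte, double> e in slEntropy)
    {
        dblEntropy += e.Value * Math.Log((1 / e.Value), 2);
    }

    blnUsed = false;

    return dblEntropy;
}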

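public void GetBytes(byte[] bBytes)
{
    // Check for null first; reading Length on a null array would throw
    if (bBytes == null || bBytes.Length < 1)
    {
        return;
    }

    blnUsed = true;

    iSize += bBytes.Length;

    foreach (byte bt in bBytes)
    {
        // First occurrence: start the count at 1
        if (!slTimeSymbolAppears.ContainsKey(bt))
        {
            slTimeSymbolAppears.Add(bt, 1);
            continue;
        }

        slTimeSymbolAppears[bt]++;
    }
}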

public void GetBytes(string strBytes)
{
    GetBytes(StringToByteArray(strBytes));
}

byte[] StringToByteArray(string strInput)
{
    // Cast<byte>() on a char[] throws InvalidCastException at run time,
    // so encode the string instead (UTF-8 is an assumed choice here)
    return System.Text.Encoding.UTF8.GetBytes(strInput);
}

void Clear()
{
    blnUsed = true;

    dblEntropy = 0;
    iSize = 0;

    slTimeSymbolAppears = new SortedList<byte, int>();
    slEntropy = new SortedList<byte, double>();
}

public ShannonEntropy(string fileName)
{
    Clear();

    if (File.Exists(fileName))
    {
        // Read the file's contents before calculating anything
        GetBytes(File.ReadAllBytes(fileName));
        GetEntropy();
        SortedDistribution();
    }
}

public ShannonEntropy()
{
    Clear();
}
```

VB.NET

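```
' Symbol that occurs most often (Keys(0) would only return the smallest byte value)
Public Function GreatestDistribution() As Byte
    Return slTimeSymbolAppears.OrderByDescending(Function(p) p.Value).First().Key
End Function

' Symbol with the highest probability; call GetEntropy() first so slEntropy is filled
Public Function GreatestProbability() As Byte
    Return slEntropy.OrderByDescending(Function(p) p.Value).First().Key
End Function

' Number of times the given symbol appears
Public Function SymbolDistribution(ByVal bSymbol As Byte) As Double
    Return slTimeSymbolAppears(bSymbol)
End Function

' Probability of the given symbol
Public Function SymbolEntropy(ByVal bSymbol As Byte) As Double
    Return slEntropy(bSymbol)
End Function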

Public Function SortedDistribution() As Dictionary(Of Byte, Integer)

    Dim lstEntries As List(Of Tuple(Of Integer, Byte)) = New _
        List(Of Tuple(Of Integer, Byte))()

    ' Store (count, symbol) pairs so that sorting orders by count first
    For Each e As KeyValuePair(Of Byte, Integer) In slTimeSymbolAppears
        lstEntries.Add(New Tuple(Of Integer, Byte)(e.Value, e.Key))
    Next

    ' Ascending sort followed by a reverse gives most frequent first
    lstEntries.Sort()
    lstEntries.Reverse()

    Dim dicResult As Dictionary(Of Byte, Integer) = New _
        Dictionary(Of Byte, Integer)()

    For Each e As Tuple(Of Integer, Byte) In lstEntries
        dicResult.Add(e.Item2, e.Item1)
    Next

    Return dicResult

End Function

Public Function SortedProbability() As Dictionary(Of Byte, Double)

    Dim lstEntries As List(Of Tuple(Of Double, Byte)) = New _
        List(Of Tuple(Of Double, Byte))()

    ' Store (probability, symbol) pairs so that sorting orders by probability first
    For Each e As KeyValuePair(Of Byte, Double) In slEntropy
        lstEntries.Add(New Tuple(Of Double, Byte)(e.Value, e.Key))
    Next

    lstEntries.Sort()
    lstEntries.Reverse()

    Dim dicResult As Dictionary(Of Byte, Double) = New _
        Dictionary(Of Byte, Double)()

    For Each e As Tuple(Of Double, Byte) In lstEntries
        dicResult.Add(e.Item2, e.Item1)
    Next

    Return dicResult

End Function

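Public Function GetEntropy() As Double

    ' Reuse the cached value if no new data has arrived
    If Not blnUsed Then
        Return dblEntropy
    End If

    dblEntropy = 0
    slEntropy = New SortedList(Of Byte, Double)()

    ' Probability of each symbol = its count / total number of bytes
    For Each e As KeyValuePair(Of Byte, Integer) In slTimeSymbolAppears
        slEntropy.Add(e.Key, CDbl(e.Value) / CDbl(iSize))
    Next

    ' Entropy = sum of p * log2(1 / p) over all symbols
    For Each e As KeyValuePair(Of Byte, Double) In slEntropy
        dblEntropy += e.Value * Math.Log((1 / e.Value), 2)
    Next

    blnUsed = False

    Return dblEntropy

End Function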

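Public Sub GetBytes(ByVal bBytes As Byte())

    ' Check for Nothing first; reading Length on a missing array would throw
    If bBytes Is Nothing OrElse bBytes.Length < 1 Then
        Return
    End If

    blnUsed = True
    iSize += bBytes.Length

    For Each bt As Byte In bBytes

        ' First occurrence: start the count at 1
        If Not slTimeSymbolAppears.ContainsKey(bt) Then
            slTimeSymbolAppears.Add(bt, 1)
            Continue For
        End If

        slTimeSymbolAppears(bt) += 1

    Next

End Sub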

Public Sub GetBytes(ByVal strBytes As String)
    GetBytes(StringToByteArray(strBytes))
End Sub

Private Function StringToByteArray(ByVal strInput As String) As Byte()
    ' Cast(Of Byte)() on a Char array throws InvalidCastException at run time,
    ' so encode the string instead (UTF-8 is an assumed choice here)
    Return System.Text.Encoding.UTF8.GetBytes(strInput)
End Function

Private Sub Clear()

    blnUsed = True
    dblEntropy = 0
    iSize = 0

    slTimeSymbolAppears = New SortedList(Of Byte, Integer)()
    slEntropy = New SortedList(Of Byte, Double)()

End Sub

Public Sub New(ByVal fileName As String)

    Clear()

    If File.Exists(fileName) Then
        ' Read the file's contents before calculating anything
        GetBytes(File.ReadAllBytes(fileName))
        GetEntropy()
        SortedDistribution()
    End If

End Sub

Public Sub New()
    Clear()
End Sub
```
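The sort-then-reverse pattern in SortedDistribution and SortedProbability can also be expressed with LINQ. The sketch below is a standalone illustration rather than part of the ShannonEntropy class; `FrequencySketch`, `CountBytes`, and `MostFrequentFirst` are hypothetical names. It returns a list instead of a Dictionary, because a list keeps its order by definition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrequencySketch
{
    // Counts how many times each byte value occurs in the input.
    public static SortedList<byte, int> CountBytes(byte[] data)
    {
        var counts = new SortedList<byte, int>();
        foreach (byte b in data)
        {
            if (counts.ContainsKey(b))
            {
                counts[b]++;
            }
            else
            {
                counts.Add(b, 1);
            }
        }
        return counts;
    }

    // Orders the (symbol, count) pairs from most to least frequent.
    // Ties go to the larger byte value, like Sort + Reverse on (count, symbol) tuples.
    public static List<KeyValuePair<byte, int>> MostFrequentFirst(SortedList<byte, int> counts)
    {
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenByDescending(pair => pair.Key)
            .ToList();
    }

    public static void Main()
    {
        byte[] data = { 3, 1, 3, 2, 3, 1 };

        // Byte 3 occurs three times, byte 1 twice, byte 2 once.
        foreach (KeyValuePair<byte, int> pair in MostFrequentFirst(CountBytes(data)))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```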

C#

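```
using System;
using System.Windows.Forms;

namespace ShannonEntropy_C
{
    public partial class Form1 : Form
    {
        // A verbatim string (@"...") needs single backslashes
        ShannonEntropy se = new ShannonEntropy(@"C:\Temp\TestFile.txt");

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            double ge = se.GetEntropy();

            // Display the result; listBox1 is assumed to be the ListBox's default name
            listBox1.Items.Add(ge.ToString());
        }
    }
}
```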

VB.NET

```
Public Class Form1

    Private se As ShannonEntropy = New _
        ShannonEntropy("C:\Temp\TestFile.txt")

    Private Sub button1_Click(sender As Object, e As EventArgs) _
        Handles button1.Click

        Dim ge As Double = se.GetEntropy()

        ' Display the result; ListBox1 is assumed to be the ListBox's default name
        ListBox1.Items.Add(ge.ToString())

    End Sub

End Class
```

When you click the button, the application calculates and displays the entropy. I have included the text file, but keep in mind that the path in the code must point to it, and you might not have a Temp folder on your disk.

Figure 1 shows the result.

Figure 1: Running
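If a hard-coded Temp folder is a concern, one option is to let .NET supply the temporary directory and to handle a missing file explicitly. The following is a minimal standalone sketch under that assumption; `FileEntropySketch` and `TryFileEntropy` are hypothetical names and not part of the article's class.

```csharp
using System;
using System.IO;

public static class FileEntropySketch
{
    // Returns the entropy of the file in bits per byte, or null if the file is missing.
    public static double? TryFileEntropy(string path)
    {
        if (!File.Exists(path))
        {
            return null;
        }

        byte[] data = File.ReadAllBytes(path);
        if (data.Length == 0)
        {
            return 0.0;
        }

        // Tally each of the 256 possible byte values.
        int[] counts = new int[256];
        foreach (byte b in data)
        {
            counts[b]++;
        }

        double entropy = 0.0;
        foreach (int count in counts)
        {
            if (count == 0)
            {
                continue; // byte values that never occur contribute nothing
            }

            double p = (double)count / data.Length;
            entropy += p * Math.Log(1.0 / p, 2);
        }

        return entropy;
    }

    public static void Main()
    {
        // Path.GetTempPath() avoids assuming that C:\Temp exists.
        string path = Path.Combine(Path.GetTempPath(), "TestFile.txt");
        File.WriteAllText(path, "ABABABAB");

        double? bits = TryFileEntropy(path);
        Console.WriteLine(bits.HasValue ? bits.Value.ToString() : "File not found");
    }
}
```

Returning a nullable value lets the caller tell "file not found" apart from a genuine entropy of 0, which the constructor in this article cannot signal.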

## Conclusion

In this article, you have learned how useful entropy can be in determining repetitive values. Until next time, happy coding!

Hannes DuPreez
Ockert J. du Preez is a passionate coder and always willing to learn. He has written hundreds of developer articles over the years detailing his programming quests and adventures. He has written the following books: Visual Studio 2019 In-Depth (BpB Publications) and JavaScript for Gurus (BpB Publications). He was the Technical Editor for Professional C++, 5th Edition (Wiley). He was a Microsoft Most Valuable Professional for .NET (2008–2017).