Understanding the .NET Task Parallel Library TaskScheduler

The most talented orchestra will sound uninspired without an
equally talented conductor. The same is true of a parallel programming
workload. In the .NET Task Parallel Library (TPL), the conductor is the
TaskScheduler. TaskScheduler is arguably one of the more enigmatic TPL
classes. It’s hardly noticeable, yet it’s critical to a smoothly executing
parallel workload.

Compounding its mystery is the existence of multiple TPL
TaskSchedulers: the default TaskScheduler and specialized TaskSchedulers that,
for example, execute Tasks on the Windows Presentation Foundation (WPF) user
interface thread. The best way to understand TaskScheduler is to implement a
custom TaskScheduler, run a parallel workload on it, and observe what happens.
That’s exactly what this article will do, beginning with an overview of the
sample workload.

Overview

A demonstration Parallel workload appears below.

static void Main(string[] args)
{
    var tasks = new Task[4];
    var scheduler = new SimpleScheduler();

    using (scheduler) // Dispose is invoked automatically when the using block exits.
    {

        // Each lambda returns a string, so the Tasks must be Task<string>.
        Task<string> taskS1 = new Task<string>(() =>
        { Write("Running 1 seconds"); Thread.Sleep(1000); return "String value 1.."; });
        tasks[0] = taskS1;

        Task<string> taskS2 = new Task<string>(() =>
        { Write("Running 2 seconds"); Thread.Sleep(2000); return "String value 2.."; });
        tasks[1] = taskS2;

        Task<string> taskS3 = new Task<string>(() =>
        { Write("Running 3 seconds"); Thread.Sleep(3000); return "String value 3.."; });
        tasks[2] = taskS3;

        Task<string> taskS4 = new Task<string>(() =>
        { Write("Running 4 seconds"); Thread.Sleep(4000); return "String value 4.."; });
        tasks[3] = taskS4;

        foreach (var t in tasks)
        {
            t.Start(scheduler);
        }


        Write("Press any key to quit..");
        Console.ReadKey();

    }
}
static void Write(string msg)
{
    Console.WriteLine(DateTime.Now.ToString() + " on Thread " + Thread.CurrentThread.ManagedThreadId.ToString() + " -- " + msg);

}

The code executes four Tasks on the TPL. A complete introduction to
Tasks is beyond the scope of this article, but "Understanding Tasks in .NET
Framework 4.0 Task Parallel Library" is a good introduction.

Each Task simulates a workload by calling Thread.Sleep inside a
lambda expression. The running code produces output resembling what you see
below.

Starting Thread 10
5/3/2011 8:27:45 PM on Thread 9 -- Press any key to quit..
5/3/2011 8:27:45 PM on Thread 10 -- Running 1 seconds
5/3/2011 8:27:46 PM on Thread 10 -- Running 2 seconds
5/3/2011 8:27:48 PM on Thread 10 -- Running 3 seconds
5/3/2011 8:27:51 PM on Thread 10 -- Running 4 seconds

Part of the output indicates the Thread Id of the code
executing the Write statement. Thread Id 9 is the application’s main Thread.
Thread 10 is a Thread created by the SimpleScheduler TaskScheduler class.
Because every Task runs on that one Thread, the Tasks execute one after
another, which is why the "Running" timestamps are 1, 2, and 3 seconds apart.
Later in the article I’ll explain where this Thread is created.

As stated earlier, the article will walk through a custom
TaskScheduler. SimpleScheduler is that custom TaskScheduler implementation.

SimpleScheduler Architecture

TaskScheduler is an abstract class, so building a custom
scheduler requires some overrides. A list of the overridable members appears
below.

public virtual int MaximumConcurrencyLevel { get; }
protected abstract IEnumerable<Task> GetScheduledTasks();
protected internal abstract void QueueTask(Task task);
protected internal virtual bool TryDequeue(Task task);
protected abstract bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued);

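As you may have noticed, all the overridable methods are protected;
only the MaximumConcurrencyLevel property is public. Like other .NET
components, TPL is as much a runtime environment as it is a collection of
classes. When Start is invoked on a Task, the runtime funnels the Task to the
QueueTask method on the selected TaskScheduler. When the runtime wants to run
a Task on the current Thread instead, for example when RunSynchronously is
invoked, it calls TryExecuteTaskInline.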
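To see which override the runtime reaches for a given call, a small tracing scheduler can help. The sketch below is illustrative only: TracingScheduler and TracingDemo are hypothetical names that do not appear in the sample, and the scheduler hands work to the ThreadPool purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler that records which overrides the runtime calls.
public sealed class TracingScheduler : TaskScheduler
{
    private readonly List<string> _calls = new List<string>();

    public List<string> Calls
    {
        get { lock (_calls) { return new List<string>(_calls); } }
    }

    protected override void QueueTask(Task task)
    {
        lock (_calls) { _calls.Add("QueueTask"); }
        // TryExecuteTask is the protected helper that actually runs the Task.
        ThreadPool.QueueUserWorkItem(_ => TryExecuteTask(task));
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        lock (_calls) { _calls.Add("TryExecuteTaskInline"); }
        return TryExecuteTask(task); // run on the calling thread
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>(); // this sketch keeps no queue of its own
    }
}

public static class TracingDemo
{
    public static List<string> Run()
    {
        var scheduler = new TracingScheduler();

        // Start goes through QueueTask. An event is used instead of Wait,
        // because Wait may itself ask the scheduler to inline the Task.
        var done = new ManualResetEventSlim();
        var started = new Task(() => done.Set());
        started.Start(scheduler);
        done.Wait();

        // RunSynchronously asks the scheduler to run the Task inline.
        var inline = new Task(() => { });
        inline.RunSynchronously(scheduler);

        return scheduler.Calls;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "QueueTask, TryExecuteTaskInline"
    }
}
```

SimpleScheduler returns false from TryExecuteTaskInline, so with it the runtime falls back to QueueTask and the Task still runs on the scheduler's own Thread.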

Overriding MaximumConcurrencyLevel and TryDequeue is
optional. Every derived class must override QueueTask, TryExecuteTaskInline,
and GetScheduledTasks.

Overriding GetScheduledTasks is required for debugger
support.

For later reference, the full source code for
SimpleScheduler appears below.

public sealed class SimpleScheduler : TaskScheduler, IDisposable
{
    private BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private Thread _main = null;

    public SimpleScheduler()
    {
        _main = new Thread(new ThreadStart(this.Main));
    }

    private void Main()
    {
        Console.WriteLine("Starting Thread " + Thread.CurrentThread.ManagedThreadId.ToString());

        foreach (var t in _tasks.GetConsumingEnumerable())
        {
            TryExecuteTask(t);
        }
    }

    /// <summary>
    /// Used by the Debugger
    /// </summary>
    /// <returns></returns>
    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return _tasks.ToArray(); // Snapshot of the queued Tasks; no System.Linq needed
    }


    protected override void QueueTask(Task task)
    {
        _tasks.Add(task);

        if (!_main.IsAlive) { _main.Start(); }//Start thread if not done so already
    }


    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;
    }


    #region IDisposable Members

    public void Dispose()
    {
        _tasks.CompleteAdding(); //Drops you out of the thread
    }

    #endregion
}

QueueTask is the heart of the SimpleScheduler.

QueueTask

The QueueTask implementation appears below.

    protected override void QueueTask(Task task)
    {
        _tasks.Add(task);

        if (!_main.IsAlive) { _main.Start(); }//Start thread if not done so already
    }

As stated earlier, the TPL runtime funnels Tasks to the
QueueTask method. QueueTask does two things. First, it adds the incoming Task
to a BlockingCollection. A complete introduction to BlockingCollection is
beyond the scope of this article, but "Introducing the .NET Framework 4.0 Task
Parallel Library BlockingCollection" is a helpful introduction.

Second, if the worker Thread is not already running, QueueTask starts
it. That Thread removes Tasks from the BlockingCollection and executes them.
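The IsAlive test followed by Start is a check-then-act sequence. That is fine for this sample, where one Thread starts all the Tasks, but QueueTask can be reached from several Threads at once, and Thread.Start throws a ThreadStateException when the Thread has already been started or has finished. One possible guard, sketched here with the hypothetical names StartOnceWorker and StartOnceDemo, is an Interlocked flag that lets exactly one caller start the Thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: starts its thread at most once, even when several
// threads ask for it at the same time.
public sealed class StartOnceWorker
{
    private readonly Thread _thread;
    private int _started; // 0 = not started, 1 = started

    public StartOnceWorker(ThreadStart body)
    {
        _thread = new Thread(body);
    }

    // Returns true only for the single caller that actually started the thread.
    public bool EnsureStarted()
    {
        if (Interlocked.CompareExchange(ref _started, 1, 0) != 0)
        {
            return false; // another caller already won the race
        }
        _thread.Start();
        return true;
    }

    public void Join()
    {
        _thread.Join();
    }
}

public static class StartOnceDemo
{
    public static int Run()
    {
        var worker = new StartOnceWorker(() => { /* consume the queue here */ });
        int winners = 0;

        // Eight concurrent callers; only one should start the thread.
        Parallel.For(0, 8, i =>
        {
            if (worker.EnsureStarted()) { Interlocked.Increment(ref winners); }
        });

        worker.Join();
        return winners;
    }

    public static void Main()
    {
        Console.WriteLine("Callers that started the worker: " + Run()); // 1
    }
}
```

A simpler alternative is to start the Thread in the scheduler's constructor, at the cost of creating it even if no Task is ever queued.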

Executing a Task

The Main method runs on the Thread that QueueTask starts. As you
may recall from the output displayed earlier in the article, this is Thread Id
10. Code for the Main method appears below.

    private void Main()
    {
        Console.WriteLine("Starting Thread " + Thread.CurrentThread.ManagedThreadId.ToString());

        foreach (var t in _tasks.GetConsumingEnumerable())
        {
            TryExecuteTask(t);
        }
    }

GetConsumingEnumerable blocks until a Task is available and yields
each Task as it is added to the underlying BlockingCollection. The foreach
loop ends once CompleteAdding has been invoked on the BlockingCollection and
the remaining Tasks have been consumed. In the sample, CompleteAdding is
invoked inside the Dispose method. Without that call the loop never ends, and
the Thread stays alive waiting for more Tasks.
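The consuming loop can be tried in isolation, without any Tasks. The following is a minimal sketch using integers; ConsumingDemo is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class ConsumingDemo
{
    public static List<int> Run()
    {
        var queue = new BlockingCollection<int>();
        var seen = new List<int>();

        var consumer = new Thread(() =>
        {
            // Blocks while the collection is empty. The loop ends only after
            // CompleteAdding has been called and every queued item is consumed.
            foreach (var item in queue.GetConsumingEnumerable())
            {
                seen.Add(item);
            }
        });
        consumer.Start();

        queue.Add(1);
        queue.Add(2);
        queue.Add(3);
        queue.CompleteAdding(); // no more items; lets the consumer loop finish

        consumer.Join();        // returns because the loop has ended
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "1, 2, 3"
    }
}
```

Removing the CompleteAdding call makes Join block forever, which is the same situation as a SimpleScheduler that is never disposed.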

After TryExecuteTask runs a Task, the Task populates its
Result value. Had there been Wait calls or continuations attached to the Task,
they would have completed or run just as they would under any other
TaskScheduler.
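As a rough illustration of Result and a continuation on a custom scheduler, the sketch below uses a condensed stand-in for SimpleScheduler. OneThreadScheduler and ResultDemo are hypothetical names; unlike the article's class, this stand-in starts its Thread in the constructor.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Condensed stand-in for SimpleScheduler: one dedicated thread drains a queue.
public sealed class OneThreadScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _tasks = new BlockingCollection<Task>();
    private readonly Thread _worker;

    public OneThreadScheduler()
    {
        _worker = new Thread(() =>
        {
            foreach (var t in _tasks.GetConsumingEnumerable()) { TryExecuteTask(t); }
        });
        _worker.IsBackground = true;
        _worker.Start();
    }

    protected override void QueueTask(Task task) { _tasks.Add(task); }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false; // always run on the worker thread
    }

    protected override IEnumerable<Task> GetScheduledTasks() { return _tasks.ToArray(); }

    public void Dispose() { _tasks.CompleteAdding(); }
}

public static class ResultDemo
{
    public static string Run()
    {
        using (var scheduler = new OneThreadScheduler())
        {
            var first = new Task<string>(() => "String value 1..");

            // The continuation is queued to the same scheduler once 'first' finishes.
            Task<string> second = first.ContinueWith(
                t => t.Result + " (continued)", scheduler);

            first.Start(scheduler);
            return second.Result; // blocks until the continuation has run
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "String value 1.. (continued)"
    }
}
```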

Recommendations

Aside from the samples, there is not a lot of guidance for
building TaskSchedulers. Much of what is written here comes from tinkering with
the samples. In fact, much of the Microsoft documentation recommends using the
default TaskScheduler unless a developer has a truly unusual scenario.
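For comparison, running work on the default TaskScheduler needs no custom class at all. A minimal sketch follows; DefaultSchedulerDemo is a hypothetical name, and the scheduler argument is passed explicitly only to make the choice visible.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DefaultSchedulerDemo
{
    public static int Run()
    {
        // TaskScheduler.Default queues the work to the .NET ThreadPool.
        Task<int> sum = Task.Factory.StartNew(
            () => 1 + 2 + 3,
            CancellationToken.None,
            TaskCreationOptions.None,
            TaskScheduler.Default);

        return sum.Result; // blocks until the Task completes
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 6
    }
}
```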

Conclusion

TaskSchedulers are an important component of the .NET Framework
Task Parallel Library. However, few developers will ever need to implement
one. A simple understanding of the TaskScheduler role is adequate for
leveraging TPL.

Resources

Task Schedulers

Task Schedulers and Synchronization Context
