
Threads

The Crash Course

Introduction

Modern computers have multiple cores, which allow the computer to run many statements simultaneously. This can be used to run multiple applications at the same time, or to let multiple parts of a single program run at the same time (or both).

Games can often take advantage of this ability to get more done in the same amount of time. For example, if you can run your physics subsystem across eight cores instead of just one, you may be able to add roughly eight times as many objects to your game and still have it run smoothly.

Running code across multiple cores is not a simple programming task. It greatly increases the complexity of your code, and can lead to lots of problems when data is accessed on two different cores at the same time. When you decide you need to do this, plan on spending a fair bit of time sorting through issues caused by it as well.

Still, running a program on many cores simultaneously has become an important capability, and it is worth briefly touching on it in this tutorial. We’ll cover a handful of the basics, but keep in mind that we’re only scratching the surface of this topic.

Threads

Most modern operating systems, including Windows, Linux, and macOS, have a concept of a process, which is a single running application or service. The programs you make are compiled into a .dll or .exe file, and when the operating system is asked to start one, it loads your program into memory and runs it as a process.

Operating systems also have a concept called a thread. A thread is, in essence, a single flow of execution through your program. Each thread knows which instruction it is currently on and keeps track of its state, such as which method it is running in, that method's local variables, the method that called the current method, that method's local variables, and so on, back up the call chain.

Each process contains at least one thread, often referred to as the main thread. By creating additional threads, you can be executing at multiple places in your program simultaneously.

The operating system is juggling many processes and their collections of threads. There are usually far more threads and processes running than there are cores, so a component of the operating system called the scheduler chooses which threads should be running at any given moment, cycling through them to ensure each thread gets a chance to make progress on its work.

Creating a Thread

Creating a thread in C# is actually quite simple. It is done by creating a new System.Threading.Thread object. You may want to put a using System.Threading; directive in your file to make it easy to work with the Thread class (otherwise, you’ll have to use its full name, System.Threading.Thread).

note

In C# 10 and .NET 6 (released in November 2021), this namespace is included implicitly in most project types, so in most cases you won't need to add using System.Threading; to your files yourself.

Creating and starting the thread takes just two lines:

Thread thread = new Thread(MethodToRun);
thread.Start();

You create a new Thread object and give it the method to run. This is a delegate, and it must be a method with a void return type and no parameters.

note

We’ll see how to pass parameters to it in a moment.

With this code in place, the original thread will continue on to additional instructions after the thread.Start(); call, while the new thread will begin running inside of the method you gave it (MethodToRun).

A more complete example that uses top-level statements might look like this:

using System;
using System.Threading;

Thread countingThread1 = new Thread(CountTo1000);
Thread countingThread2 = new Thread(CountTo1000);

countingThread1.Start();
countingThread2.Start();

Console.ReadKey(); // Make the main thread wait for keyboard input instead of just ending the program.



void CountTo1000()
{
    for (int number = 1; number <= 1000; number++)
        Console.WriteLine(number);
}

Waiting for a Thread to Finish

Very often, we want to wait for a thread to complete before we continue on. One thread can stop and wait for another to finish by using the Join method.

We can modify our program above by stripping out the Console.ReadKey(); line and replacing it with this:

countingThread1.Join();
countingThread2.Join();

In this case, the main thread will stop running other instructions until each of the two counting threads finishes its work.
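Putting that together, the whole program now looks something like the sketch below. It is the same as the earlier example except that Console.ReadKey(); has been replaced with the two Join calls; the final Console.WriteLine is an extra line added here only to make it obvious that the main thread really did wait:

using System;
using System.Threading;

Thread countingThread1 = new Thread(CountTo1000);
Thread countingThread2 = new Thread(CountTo1000);

countingThread1.Start();
countingThread2.Start();

countingThread1.Join(); // Wait for the first counting thread to finish...
countingThread2.Join(); // ...and then for the second.

Console.WriteLine("Both threads are done counting."); // Added for illustration.

void CountTo1000()
{
    for (int number = 1; number <= 1000; number++)
        Console.WriteLine(number);
}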

Passing Data Between Threads

One limitation of what we did above is that the method the thread runs must have a void return type and no parameters. That prevents us from sharing any sort of data between multiple threads.

Instead of supplying a method with no parameters, we also have the option to give it a method with a single object-typed parameter. Remember from our discussion on inheritance that object is the base class of everything, so we can technically pass in any object we want. That can be an object with many properties, which is a (somewhat) convenient way to both pass multiple values to the method as well as return data back. The following is overkill for using a thread, but shows how you could use it:

using System;
using System.Threading;

AdditionProblem problem = new AdditionProblem(2, 3);

Thread addThread = new Thread(Add);

addThread.Start(problem);
addThread.Join();

Console.WriteLine(problem.Result);


void Add(object obj)
{
    AdditionProblem additionProblem = (AdditionProblem)obj;

    additionProblem.Result = additionProblem.First + additionProblem.Second;
}

class AdditionProblem
{
    public float First  { get; }
    public float Second { get; }
    public float Result { get; set; } // Settable so the Add method can store its answer here.

    public AdditionProblem(float first, float second)
    {
        First = first;
        Second = second;
    }
}

Note that when we call the thread's Start method, we pass in an argument: the AdditionProblem object we created with new AdditionProblem(2, 3). This is what allows us to use a method with a single object-typed parameter, though we have to cast it back to an AdditionProblem once we're inside the Add method. Because the main thread holds a reference to that same object, it can read the Result property after the Join call.

Again, a problem this simple would have been far more efficient as just float result = 2 + 3; with no threads at all, but it shows the key concepts for when you have a more challenging problem that genuinely calls for threads.

Thread Safety

Before we leave the topic of threads, there is one more important concept that we need to touch on: thread safety. When two threads both have a chance of accessing the same data at the same time, we open the door to scary, hard-to-find bugs.

Consider this simple code, which increments a variable:

number++;

This is written as a single line, but behind the scenes, there are three steps:

  1. Read the value out of number.
  2. Add one to that value.
  3. Store the result back into number.
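Spelled out as code, those three steps look something like this (the temp variable here is invented purely for illustration; the compiler and CPU do the equivalent behind the scenes):

int temp = number;  // 1. Read the value out of number.
temp = temp + 1;    // 2. Add one to that value.
number = temp;      // 3. Store the result back into number.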

Now suppose we have two threads that are trying to do a similar thing. This is definitely more complicated, but consider this code:

using System;
using System.Threading;

NumberWrapper numberWrapper = new NumberWrapper();

Thread t1 = new Thread(Increment);
t1.Start(numberWrapper);
Thread t2 = new Thread(Increment);
t2.Start(numberWrapper);

t1.Join();
t2.Join();

Console.WriteLine(numberWrapper.Number); // Display the final count.

void Increment(object obj)
{
    NumberWrapper wrapper = (NumberWrapper)obj;
    wrapper.Increment();
}

class NumberWrapper
{
    private float _number;
    public float Number
    {
        get
        {
            return _number;
        }
        set
        {
            _number = value;
        }
    }

    public void Increment()
    {
        _number++;
    }
}

That’s a lot of code, so take a moment to understand it. It’s not wildly different from our previous example, but two threads will have access to the same NumberWrapper instance, and both will try to modify it at about the same time.

That word “about” is the key word there. The operating system's scheduler will pick and choose when to let threads run, as well as when to suspend them to let another thread go. There's a chance that both of these threads are running at the same time, and also a chance that one will be partway through that three-step increment process when it gets suspended.

In that case, a thread may have done Step 1 (reading the value out of _number) but not yet stored the updated value back in Step 3. That leaves a window where the second thread can get in and read the value too. Both threads read the current value (which starts at 0), add one to it, and store 1 back into _number. So even though two increment operations ran to completion, after both threads finish there's a chance that the value in _number is actually 1!

I've run this code enough times to see that this outcome is quite rare. And when you inspect the code, the problem isn't obvious. That makes it especially insidious. You'll go for four months with no problems, then get a report from somebody that there's a bug. You'll try to recreate it on your computer and not be able to, because the chance is so small. Then you'll go inspect the code and come to the (false) conclusion that “there's no way it could be failing like that!” Alas, there is; it is just hard to recreate.
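If you would rather see the problem for yourself than wait four months, a sketch like the one below makes the race far easier to observe by having each thread call Increment a million times in a loop (the IncrementManyTimes name and the loop count are invented for this illustration). Two threads doing 1,000,000 increments each "should" end at 2,000,000, but lost updates usually leave the final value smaller:

using System;
using System.Threading;

NumberWrapper numberWrapper = new NumberWrapper();

Thread t1 = new Thread(IncrementManyTimes);
Thread t2 = new Thread(IncrementManyTimes);
t1.Start(numberWrapper);
t2.Start(numberWrapper);

t1.Join();
t2.Join();

Console.WriteLine(numberWrapper.Number); // Frequently prints something less than 2000000.

void IncrementManyTimes(object obj)
{
    NumberWrapper wrapper = (NumberWrapper)obj;
    for (int i = 0; i < 1_000_000; i++)
        wrapper.Increment();
}

// A trimmed-down NumberWrapper with just the pieces this sketch needs.
class NumberWrapper
{
    private float _number;
    public float Number => _number;

    public void Increment()
    {
        _number++;
    }
}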

warning

That scenario is exactly why you should only start using multiple threads when you have a definite need for them. Not all programs need or benefit from multiple threads, and the pain of troubleshooting multi-threaded applications is quite high, so plan on paying for it with your time. Still, there are plenty of scenarios where the benefits are well worth those costs, and you simply plan on absorbing them to get the benefits.

Bottom line: don’t do it unless you need to, but it’s a useful skill to have, and you’ll put it to use in due time.

There's a fix. The problem is that two threads can simultaneously be in the middle of a modification that takes multiple steps to complete, so we need a way to say, "Only let one thread into this section of code at a time." This is done with the lock keyword. The following is an adjustment to the NumberWrapper class from above that uses a lock statement to ensure that access to the _number variable is always limited to a single thread at a time:

class NumberWrapper
{
    private float _number;
    private object _numberLock = new object(); // Did I mention you can assign values to fields this way, outside of a constructor?

    public float Number
    {
        get
        {
            lock (_numberLock)
            {
                return _number;
            }
        }
        set
        {
            lock (_numberLock)
            {
                _number = value;
            }
        }
    }

    public void Increment()
    {
        lock (_numberLock)
        {
            _number++;
        }
    }
}

Every access to the field _number now appears inside of a lock statement. A lock statement requires an object of some sort inside the parentheses. Only one thread at a time can grab the lock for a given object, and the thread won't release the lock until it reaches the end of the lock statement's block.

This adds a layer of protection and totally eliminates the problem we had before.

Should you slap a lock on everything, then? No. Acquiring and releasing locks takes time, and a thread that has to wait for a lock cannot make progress until the lock becomes available. Over-applying the lock keyword will slow your program down significantly.

You want to apply lock statements around things where multiple threads may be simultaneously accessing the data. But if you can change your design so that data does not need to be accessed from multiple threads, you’re in a much better situation.
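As a rough sketch of that idea (the Counter class and AddUpNumbers method here are invented for illustration), each thread below works on its own object, so no locks are needed. The main thread only reads the results after both Join calls return:

using System;
using System.Threading;

Counter counter1 = new Counter();
Counter counter2 = new Counter();

Thread t1 = new Thread(AddUpNumbers);
Thread t2 = new Thread(AddUpNumbers);

t1.Start(counter1); // Each thread gets its own Counter...
t2.Start(counter2); // ...so they never touch the same data while running.

t1.Join();
t2.Join();

// Only the main thread reads the totals, and only after both threads are done.
Console.WriteLine(counter1.Total + counter2.Total);

void AddUpNumbers(object obj)
{
    Counter counter = (Counter)obj;
    for (int number = 1; number <= 1000; number++)
        counter.Total += number;
}

class Counter
{
    public int Total { get; set; }
}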

tip

You do not need locks around data that is read-only. Locks only matter if the data could be changing while multiple threads are running. Therefore, anything you can make read-only is automatically safe to use in a multi-threaded environment.
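As a small, hypothetical example, a class like this one (invented for illustration) can be handed to any number of threads without locks, because nothing about it can change after it is constructed:

class LevelSettings
{
    // Get-only properties: set once in the constructor and never changed afterward,
    // so any thread can read them safely without a lock.
    public int EnemyCount { get; }
    public float Gravity  { get; }

    public LevelSettings(int enemyCount, float gravity)
    {
        EnemyCount = enemyCount;
        Gravity = gravity;
    }
}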