Synchronization of Game Threads




Written by Robert Dunlop
Microsoft DirectX MVP


In this article, we will discuss the creation of multithreaded applications, and how they can benefit the performance of an entertainment title.  We will explore the role that threads play in performing parallel tasks within a single process, and examine the techniques that must be used to keep them in sync with the application.

This is the first installment of a short series on multithreading.  We will cover the basics of multithreading in this issue, learning how to implement and control worker threads.  Later issues will cover complex synchronization and performance issues, as well as providing practical samples with source code.

What is Multithreading?

When we create an application, we create a "process".  A process creates and maintains its own windows, has its own set of resources and memory, and can receive and process messages from Windows.

A process also maintains one or more "threads".  Threads are essentially tasks that are able to be run independently of each other, while still functioning as part of the process.  They share resources with the process, but may also have their own private data that is specific to the thread.

The benefit of having multiple threads per process is that they can run "concurrently" - that is, seemingly at the same time.  On a single processor this is not literally true, because the processor can only execute one thread at any given instant.  It gives that appearance, though, because the scheduler quickly swaps between threads, allowing tasks to progress in parallel.

There are two varieties of threads - User Interface Threads and Worker Threads.  User interface threads own one or more windows, and maintain their own message loop to receive and process messages that are posted to the thread's windows.  These threads are often used for creating control interfaces that are independent of the main application thread, such as a dialog box or dynamic control.

In the full screen game environment, these generally won't be of much use to us.  Thus, this article will focus on the creation and use of Worker Threads.  A worker thread is an asynchronous thread that has no message queue of its own.  When creating a worker thread, we provide a pointer to a function.  When the thread is started, this function is called, and when the function returns the thread is released.  In that regard, the thread function acts much like the WinMain() function under Win32.

How do we Benefit? (And when do we Lose?)

The usefulness of multithreading may not be immediately apparent, and indeed there are many cases where additional threads would only add to the overhead of a program.  A typical example of this would be an application with a serial flow of operations, where each operation must complete before the next can begin.

Such a program has nothing to gain from multithreading, because there is no "overlap" in the tasks being performed.  Even if we place each task in its own thread, each will still have to wait for the preceding task to complete before starting, and cannot start again until the next time the preceding task completes.

So what kinds of tasks do benefit from a multithreaded implementation?  There are quite a few cases, and we will try to cover a few of them in this article.  Typically, these tasks have one or more of the following properties:


  1. Background tasks that do not need to run on a per-frame basis.

  2. Tasks that represent a variable processor load, and can take advantage of dynamic allocation of processor time.

  3. Tasks that could be implemented at several possible positions within the overall process.

  4. Tasks that could be started before the current frame is completed, allowing execution while other tasks are waiting on hardware.

Creating a Thread

In a bit, we will examine typical applications that can benefit from multithreading.  In the meantime, let's take a look at how a worker thread is created and utilized within an application.

The worker thread is implemented as a function that will be called when we create and execute the thread.  The thread will stay in existence until the function returns, running concurrent with other threads on the system:

void worker(void *data) {

   // do thread stuff here

   // return when ready to release the thread
}

The function takes a single void pointer as a parameter.  This pointer contains application-specific data that we pass to the thread function when we create the thread.  The use of this parameter is optional.  Often it is used to pass a pointer to a data structure that the worker thread will operate upon.

To start the thread, we make a call to _beginthread(), passing a pointer to the function, the required stack size (or 0 to use the application default), and a pointer to be passed to the thread function.  The function returns a thread handle if successful, or -1 on failure.

handle = _beginthread(worker, 0, (void *)&dat);

Multiple Thread Instances

A worker thread exists in the same address space as the application, and thus has shared access to global variables.  Any variables declared locally within the thread function, however, exist on the thread's own stack - that is, they are private variables owned by the thread.  In fact, multiple threads can be created from the same thread function, and each will have its own storage space for local variables, untouched by the other threads (note that variables declared static are shared among threads).

The ability to pass a parameter to the thread on creation is key to utilizing this capability.  By passing a pointer to a class instance or other data to be operated upon, a unique instance of the thread function can be created to handle that object.  Once the object is no longer needed, the thread can shut down, removing the processor and memory cost of handling the object.  This allows a very dynamic handling method for a variable number of objects.

Destroying a Thread

Once we are done with a worker thread, it needs to be destroyed properly.  If this is not done, memory may not be released properly, and worse yet, orphaned threads can result which may attempt to communicate with interfaces that are no longer valid.

There are two basic ways that a worker thread is terminated:

  1. A thread may terminate itself by calling _endthread() within the thread procedure.  Note that this is called automatically when returning from the thread procedure, so it is not necessary to call it explicitly on exit from the function.
  2. The thread may be terminated externally.  This method should not normally be used in an application, as it can result in allocated resources not properly being released.

Sharing Global Data and Objects

Next, we will take a look at the biggest challenge in dealing with threads - making them work in harmony with each other, while dealing in a safe manner with shared objects.  

As we have previously discussed, global variables are shared between the application and threads.  However, making threads "play nice" together takes more than just declaring global variables and using them in your threads.

There are many ways that this can be achieved.  For the scope of this article, we will explore three common scenarios:

  1. Variables can be declared as volatile to inform the compiler that they may be modified by other threads.
  2. Access control objects, such as Mutex and Semaphore objects, can be used to control access to shared resources.
  3. Events can be used to synchronize the operation of threads, ensuring that operations that are interdependent occur in the proper order.

Using Volatile to Ensure Data Synchronization

When working with global variables in a multi-threaded environment, one thing that we will often fight is the compiler's attempts to optimize memory access.  For example, when a variable is frequently used in a function, it may be stored in a register for local access.  However, if another thread modifies this variable, the value in the register will not reflect this change.

We could prevent this by turning off the appropriate compiler switches, but this will prevent such optimizations throughout your application.  Ideally, there should be a way to handle this on a per-case basis, rather than turning off optimizations across the board.

And there is - the volatile keyword notifies the compiler that a variable is subject to modification by outside forces.  Some of you may remember using this in declarations back in the DOS days, when dealing with variables that were modified by interrupt driven routines.  In the multi-threaded environment, this keyword is still in use, and fits the bill in some cases.

To use the volatile keyword, we simply insert it in front of a variable declaration:



volatile LONGLONG timer;

While the volatile declaration will ensure proper update of variables, the amount of protection it affords is limited.  It does not, for example, provide a safe means of accessing member functions of a globally shared class instance.  Typically it is used in cases where a variable will be written by a single thread, such as a score counter, and read by one or more other threads.

Synchronization Objects

Executing multiple threads in an application can be quite a challenge, as access to data and scheduling of time critical tasks must be properly managed.  To assist in this, there are a variety of kernel objects available under Win32 known as "synchronization objects".  In the remainder of this article, we will look at two such objects:

Mutex - Prevents access to data by more than one thread or process at a time.
Event - Provides a means for threads to be notified when an event occurs.

Using Mutex for Data Protection and Synchronization

The Mutex object is used to protect data from access by more than one thread or process at a time.  Whenever a protected piece of data needs to be accessed, you test whether it is available by attempting to "take ownership" of the Mutex.  Normally, this is done using one of the Wait functions, which will return when the object is available.  Until then, execution of the waiting thread is blocked, preventing it from proceeding until it has received ownership of the Mutex object.

Once ownership has been attained, your thread is free to access the protected data.  Any other threads attempting to take ownership of the Mutex object will themselves be blocked, until such time as ownership of the Mutex is released.  Once access to the data is complete, the Mutex should be released immediately to allow access by other threads.

To create a Mutex object, a call is made to CreateMutex(), which takes three parameters:

LPSECURITY_ATTRIBUTES lpMutexAttributes - Determines whether the handle can be inherited by child processes.  Set to NULL for the default security descriptor.
BOOL bInitialOwner - If TRUE, the object will be initially owned by the creating thread.  Otherwise, the object is initially unowned.
LPCTSTR lpName - Name of the object.  This can be used to allow multiple processes to share a Mutex object, by referring to it by name.  May be NULL.

To take ownership of a Mutex object, and thus control of the data you wish to associate with it, a call is made to WaitForSingleObject(), which takes the object handle and a timeout period in milliseconds (or INFINITE for no timeout).  This function will return once it has ownership of the object, or when the timeout period expires - returning WAIT_OBJECT_0 in the first case, or WAIT_TIMEOUT in the second.

Once ownership is attained, you are free to modify the protected data.  When finished, call ReleaseMutex() with the handle of the object to allow other threads to take ownership.  Below is an example of usage of a mutex object for protection of data:

Global Variables

// handle for mutex object

HANDLE hMutex;

// data to be protected

int data[500];

Initialization Code

// create the mutex object

hMutex = CreateMutex(NULL, FALSE, NULL);

Code to Access Protected Data

// attempt ownership of mutex

if (WaitForSingleObject(hMutex,INFINITE)==WAIT_OBJECT_0) {

   // access data[] here

   // release ownership of the mutex object
   ReleaseMutex(hMutex);
}

Cleanup Code

// destroy the mutex object

CloseHandle(hMutex);

Caution: Be careful when using Mutex and other synchronization objects, as deadlocks can occur if interdependencies arise between threads, causing each to wait on the processing of another blocked thread.

Using Events for Synchronization

In the last section, we discussed how to keep threads from attempting to access data simultaneously.  In addition to ensuring that a thread will not operate at certain times, it is often equally important to ensure when a thread will operate.

To do this, we use a mechanism known as "events".  Events allow us to trigger thread operations, and can be useful for such tasks as pacing a worker thread against foreground tasks and synchronizing it with the frame rate of a rendering engine.

The use of events is similar to that of Mutex objects.  We begin by creating an event object, and receive a handle to the event.  The WaitForSingleObject() function can then be used to block a thread until the event is signaled.

Creating the Event

Just as there is a specialized function to create Mutex objects, there is a similar function for the creation of events:

HANDLE hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

The CreateEvent() function takes four parameters:

lpEventAttributes - Pointer to a SECURITY_ATTRIBUTES structure that determines whether child processes can inherit the event.  Normally, if the event does not need to be shared with child processes, set this to NULL and the default security descriptor will be used.
bManualReset - Determines whether the event uses manual or automatic reset.  If TRUE, after setting the event with SetEvent() the event will remain signaled until ResetEvent() is called.  If FALSE, the event will be reset automatically after the first waiting thread is released.  Use manual reset events to trigger multiple threads from a single event.
bInitialState Determines if event is initially signaled.
lpName Pointer to a string containing a name for the event.  If not shared among processes, simply set to NULL to use without an event name.

Waiting for an Event

For a thread to wait for an event, the WaitForSingleObject() function is used, similar to the operation with Mutex:

WaitForSingleObject(hEvent, INFINITE);

However, unlike Mutex, the thread does not take "ownership" of the event.  Instead, the thread is triggered when another thread signals the event, and if automatic reset was set the event will automatically be cleared.

Triggering an Event

To trigger a thread that is waiting for an event to be signaled, use the SetEvent() function:

SetEvent(hEvent);

Event triggering can be used, for example, to trigger an event once per frame.  Note that this does not immediately call the thread, but instead enables the thread to run on the next timeslice available to it.  Its execution is also not guaranteed within a given time frame, so this would not guarantee that it runs before the next frame completes.  It simply provides a means to ensure that the thread runs only once per frame.

If guaranteed operation is required, a pair of events can be used to form an interlock:

HANDLE eventStart;
HANDLE eventDone;


// in initialization code

eventStart = CreateEvent(NULL, FALSE, FALSE, NULL);

// eventDone starts signaled so the first frame does not block
eventDone = CreateEvent(NULL, FALSE, TRUE, NULL);


// in worker thread

while (1) {

   // wait for the frame loop to release us
   WaitForSingleObject(eventStart, INFINITE);

   // perform thread operation

   // signal completion
   SetEvent(eventDone);
}


// in frame rendering code

// ensure operation completed
WaitForSingleObject(eventDone, INFINITE);

// perform frame functions...

// thread function can run anytime after this, signal the event
SetEvent(eventStart);


This site, created by DirectX MVP Robert Dunlop and aided by the work of other volunteers, provides a free on-line resource for DirectX programmers.

Special thanks to WWW.MVPS.ORG, for providing a permanent home for this site.

Last updated: 07/26/05.