Out of Phase: Race Conditions and Shared Mutexes


This is part of a series for the Out of Phase game project that reflects on various stages, covering pros and cons of the creative process and implementation of the game.

Last year I started porting over the backend of my game from Python to C++. The reason for this move ties into my long-term goals as a game developer. Programming a multiplayer server has been something that has intrigued me for a while. The idea of creating a virtual environment that is continuously running and allows multiple players to interact with that environment in real time is fascinating to me.

At this time, my game will only support two players, but I would like to play around with adding more.  I’m doubtful this will be a massively multiplayer game, like World of Warcraft or Elder Scrolls Online, since that would be a huge amount of effort. So maybe up to four players.

Real-time games that support multiple players typically require some special handling of synchronizing the game state as it is updated from the player’s clients. Without synchronizing, race conditions will occur which will result in erroneous and unpredictable ways.

So in this post, I’ll be covering how to avoid race conditions in C++ threads by using locks and mutexes.

Race Conditions

Let’s take a code snippet as an example (this is make believe pseudo code):

void attackGoblin(Monster* goblin) {
    int health = goblin->getHealth();
    health -= 10;
    goblin->setHealth(health);
}

Race Condition 1

Ok. So the problem here is what happens when two players are attacking this goblin at the same time. Just because this code is wrapped in a function, doesn’t mean each block of code gets executed sequentially. It’s possible that the lines of code being run between each player may be executed in a mixed order.

Let’s assume that goblin->getHealth  and goblin->setHealth  read and write the current health value from or to memory. (But they don’t use synchronization)

Two players are attacking a goblin with 500 health. Both players inflict 30 damage at the same time. We expect the goblin’s health to drop down to 440, but instead, it only drops down to 470. What happened?

(thread 1) int health = goblin->getHealth(); // getHealth returns 500
(thread 1) health -= 30; // local to thread 1
(thread 2) int health = goblin->getHealth(); // getHealth() returns 500
(thread 2) health -= 30; // local to thread 2
(thread 2) goblin->setHealth(health); // Goblin health is now 470
(thread 1) goblin->setHealth(health); // Goblin health is still 470

Where did the damage go? Well, it got overwritten because the instructions weren’t synchronized. Each thread keeps a separate copy of health and when the goblin’s health is changed in one of the threads, it never updates in the other.

Race Condition 2

One improvement we could make to the above code example is to not store the health in a local variable, and instead, just use getHealth.

(thread 1) goblin->setHealth(goblin->getHealth() - 30);
(thread 2) goblin->setHealth(goblin->getHealth() - 30);

The problem here is that reading a memory block is not automatically synchronized with writes from other threads, so it’s possible your program could read a partially written value, resulting in garbage being returned.

To understand this, we need to jump down to how this looks in memory.

First, getHealth fetches the current health in memory:

0000 0001 1111 0100 // 500 in binary

Then we subtract 30 from the returned value. The result in binary would be:
0000 0001 1101 0110 // 470 in binary

Since these two expressions are running at the same time, we could get the same value from getHealth and run into the problem described in the first race condition example. However, this could be worse. If getHealth is not synchronized, one of the threads could read the memory while the other thread is writing, resulting in an erroneous value being returns.

So thread 2 could see this:

0000 0001 1101 0100 // 460 in binary

No good, we get the wrong value.

Mutexes and Locks

There are a couple different ways to address the problem at hand. Both involve synchronizing the reads and writes between the threads so that the value in memory remains intact and accurate. I will be covering mutex and locks here, but there is also atomic types.

What are Mutexes?

Mutex is short for Mutually Exclusion Object. Mutexes are lockable objects that are synchronized between threads. If you want to imagine them as a physical key lock to a box, that works. But keep in mind mutexes are very simple in their structure, and their purpose is to keep a value indicating whether or not they have been locked.

There is more than one type of mutex. The basic std::mutex , included in the standard C++ library only allows one threads to lock it at a time. There are variations that address different problems, such as timing out after a given amount of time (timed_mutex), and allowing multiple reads to occur at the same time (shared_mutex).

Locking and Releasing a Mutex

Mutexes are used with lock functions, such as lock or try_lock. These functions can be used with the different variations of mutexes.

In general, the locking function locks the mutex. For std:mutex, this is simple, ownership and exclusion is given to the calling thread until the mutex is released (unlocked).(which will happen automatically after the lock is destroyed.

One important thing to understand is std::lock  and std::try_lock  have the sole purpose of locking the mutex. An additional object needs to be created to automatically release the mutex.

For automatic release at the end of the scoped block, std::lock_guard, std::unique_lock, or in my scenario, boost::shared_lock  can be used.

Mutex vs Shared_Mutex

A basic mutex is exclusive, meaning only one thread can lock the mutex at one time. Any other threads that encounter the lock will block until the owning thread releases the lock.

With shared mutexes, things are a little more flexible. Writes are still limited to one per thread for a mutex, but multiple reads are supported as long as there isn’t an active write. Boost provides an implementation of a shared mutex via std::shared_mutex.

If the mutex is locked for reads (one or more), then a lock for exclusive write access will block until the read locks have released. The same goes for the inverse scenario if a shared mutex has been upgraded to unique, threads trying to obtain a shared access will be locked until the unique lock is released.

Deadlocks

A lock can become a bottleneck and hang up the program, such is the case with deadlocks. Deadlocks are something you encounter when using multiple mutexes that intersect, and that is a bit out of scope here.

There are ways around deadlocks, like using std::try_lock  or a std::timed_mutex  with std::try_lock_for  or std::try_lock_until .

Putting it all Together

For my game server, I use boost::shared_mutex  and std::boost_shared_lock . Here’s a walkthrough on how the reads and writes are implemented.

First, the mutex needs to be defined. This line is in my GameState class header:

boost::shared_mutex positionMutex;

Read Lock

I need lock positionMutex whenever I’m going to read from memory that this mutex should cover. I do this by using boost::shared_lock() , which allows multiple reads without blocking as long as the mutex hasn’t been upgraded to a unique lock.

boost::shared_lock<boost::shared_mutex> lock(this->positionMutex);

Write Lock

Whenever I write to my position properties, I need to upgrade the mutex to unique access so that other threads don’t read while the new value is being written. This is done by using boost::upgrade_lock  and  boost::upgrade_to_unique_lock as shown below .

boost::upgrade_lock<boost::shared_mutex> lock(this->positionMutex);
boost::upgrade_to_unique_lock<boost::shared_mutex> unique_lock(lock);

Full Example

GameUnit.h

class GameUnit {
    ...
    float positionX;
    float positionY;
    ...
public:
    ...
    boost::shared_mutex positionMutex; // probably shouldn't be public
    ...
}

GameUnit.cpp
// Used to read unit's state to client
UnitState GameUnit::getFullState() {
	boost::shared_lock<boost::shared_mutex> lock(this->positionMutex);

	UnitState state;
	state.positionX = this->positionX;
	state.positionY = this->positionY;
	state.velocityX = this->velocityX;
	state.velocityY = this->velocityY;
	state.facing = this->facing;
	return state;
}

// Primary method to change unit's position in state
void GameUnit::updatePosition(float x, float y, FacingDirection facing) {
	boost::upgrade_lock<boost::shared_mutex> lock(this->positionMutex);
	boost::upgrade_to_unique_lock<boost::shared_mutex> unique_lock(lock);

	if (fabsf(this->positionX - x) > this->maxSpeed) {
		return;
	}


	if (fabsf(this->positionY - y) > this->maxSpeed) {
		return;
	}

	this->positionX = x;
	this->positionY = y;
	this->facing = (FacingDirection)facing;
	std::cout << "Position: " << this->positionX << ", " << this->positionY << "\n";
}

// Repositions unit on death. Circumvents movement constraints/checks
void GameUnit::moveToSpawn(float x, float y, FacingDirection facing) {
	boost::upgrade_lock<boost::shared_mutex> lock(this->positionMutex);
	boost::upgrade_to_unique_lock<boost::shared_mutex> unique_lock(lock);

	this->positionX = x;
	this->positionY = y;
	this->facing = (FacingDirection)facing;
}

Conclusion

Concurrency is an effective means of separating units of work so that large chunks of data can be processed faster or crucial tasks are not hindered by other tasks blocking. There are some challenges that come with using multiple threads that can be managed through synchronizing memory operations between the threads.

Porting your codebase to C++ and using threads isn’t something you want to do on a whim. There is some over overhead and complexity that come with the power and flexibility. Games that can run with a little delay from the backend without impacting experience, such as a turn based game, could get away with using an interpreted language such as Python or PHP. Out of Phase was initially written in Python and used interpolation to help smooth out latency.