Difference between binary semaphore and mutex

I have seen this question on stackoverflow and provided an answer to it:

There synchronization semantics are very different:

  • mutexes allows serialization of access to a given resource i.e. multiple threads wait for a lock, one at a time and as previously said, the thread owns the lock until it is done, most importantly only this particular thread can unlocks it.
  • a binary semaphore is a counter with value 0 and 1, a task blocking on it until any task does a sem_post. The semaphore advertise that a resource is available, and it provides the mechanism to wait until it is signaled as being available.

As such one can see a mutex as a token passed from task to tasks and a semaphore as traffic red-light (it signals someone that it can proceed).

There is a lot of confusion around these two concepts. They permit the same kind of feature i.e. synchronization but as I explained with different semantics.

Semaphores are usually more expensive to use than mutex, but they provide you sometimes more reliability. For example, let’s say you have two threads, thread A and thread B trying to serialize access to resource R.

The mutex way would be;

step 00 – create a lock object
step 01 – A request the lock
step 02 – B request the lock
step 03 – A grab the lock
step 04 – A uses R
step 05 – A release the lock
step 06 – B grab the lock
step 07 – B uses R
step 08 – A request the lock
step 09 – B release the lock
step 10 – A grab the lock

Q: What happens if thread B returns between step 7 and step 9 ?
A: step 10 may never occurs …

In this scenario there is no way to recover, the mutex is an opaque object and you are stuck in a deadlock situation with starvation(thanks george). If you are using C++, you may get over it using RAII, but if you use plain C, well that’s harder.

With semaphores things are a little bit different, you can make step 10 happen even if your thread leaves early. You need an external actor of course (a monitoring thread for example) that upon thread completion may resume the work flow.

Since people are confused by semaphores or misuse them more frequently, here are a few examples on how to use them.

Serializing access using a semaphore

Note that when using a binary semaphore to serialize access to a resource, you need to initialize it with the value 1, meaning resource available.

Requesting the lock is done by calling sem_wait() on it, and releasing the lock is done by calling sem_post():

// first get a semaphore and initialize it with the value 1
sem_init( … )

// lock the resource
sem_wait( …)

… do some stuff …

// release the resource
sem_post( … )

The consumer / producer model

if you initialize the semaphore with value 0 and call sem_wait on it, the semantic is different: you are waiting for some one to signal you that there is some job to do.
In this scenario, the semaphore is not used to serialize access to a particular resource, it is used as signaling mechanism.
A consumer is waiting for a producer to signal him that some data is available or that there is work to do.

very simple and effective, that’s why semaphore are used so frequently for producer/consumer work flows.

// first get a semaphore and initialize it with the value 0
sem_init( … )

while( true )
// wait for some job to do
sem_wait( …)

… work hard on it …

// yep there is no sem_post here: the producer, that gives this task
// work to do is supposed to be the one calling sem_post()

check your platform semaphore implementation, sem_post(), sem_wait() and sem_init() are the API to use POSIX semaphores.

Your platform API may be different, but the concepts are the same.

Enjoy and share.

7 oct 2009 update: as george in a comment below said: there is no dealock for thread A, no progress can be made: this is starvation situation not a deadlock. It cannot be a dealock since thread B is not there anymore …


About this entry