Non Kernel Semaphore

11 Jun 2012
A Semaphore that does not use the kernel by default

Introduction

I've been using threads in .NET for a long time, and the topic interests me very much. As I got deeper into multithreading I started to use threading constructs beyond Monitor (lock) and the Interlocked class. Interlocked and volatile operations are user-mode constructs, meaning that they do not transition into the kernel. Kernel-level objects like Mutex and Semaphore are created and managed by the operating system and can be shared system wide. If we use a Semaphore in our application, we must leave user mode and transition into kernel mode just to create it. We incur a performance hit on each such transition since the kernel does not trust us. While this hit is small for a single call, it can add up when semaphores are created or used very frequently.

Background

I like the functionality that the semaphore provides; however, in some cases I would like to use it within a single process without paying for the transition into kernel mode. So I decided to write a very simple non-kernel semaphore.

What it has:

  1. Fast creation time, since it is just a regular .NET object rather than a kernel object
  2. Only transitions to kernel mode when the maximum capacity has been reached and more threads want to enter.

What it lacks:

  1. It's not system wide which makes sense because it 'mostly' stays in user mode.
  2. It does use Monitor which can transition into kernel mode, however this only occurs when all the slots have been filled.

The code

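    using System.Threading;

    public class NonKernelSemaphore
    {
        private readonly object padlock = new object();
        private readonly int maxSlots; // maximum number of threads allowed in at once
        private int usedSlots;         // slots currently taken; guarded by padlock

        public NonKernelSemaphore(int maxSlots)
        {
            this.maxSlots = maxSlots;
        }

        // Takes a slot, blocking while all slots are in use.
        public void Enter()
        {
            lock (padlock)
            {
                // Re-check after every wake-up: another thread may have taken the freed slot.
                while (usedSlots == maxSlots)
                {
                    Monitor.Wait(padlock);
                }

                usedSlots++;
            }
        }

        // Frees a slot and wakes one waiting thread. A call with no slot in use is ignored.
        public void Release()
        {
            lock (padlock)
            {
                if (usedSlots > 0)
                {
                    usedSlots--;
                    Monitor.Pulse(padlock);
                }
            }
        }
    }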

I wanted to keep this class as simple as possible without adding fluff so that I could cater for the basic usage scenario.

NonKernelSemaphore takes the maximum number of threads allowed in at one time as the constructor parameter maxSlots. It keeps count of how many threads are currently using the semaphore in the usedSlots field.
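To make the basic usage scenario concrete, here is a minimal sketch of how the class might be used to cap concurrency. The UsageDemo class, the MeasurePeak helper, and the slot and thread counts are my own illustrative assumptions and not part of the original code. The semaphore itself is repeated so that the sketch compiles on its own.

```csharp
using System;
using System.Threading;

// Copy of the article's semaphore, repeated so this sketch is self-contained.
public class NonKernelSemaphore
{
    private readonly object padlock = new object();
    private readonly int maxSlots;
    private int usedSlots;

    public NonKernelSemaphore(int maxSlots) { this.maxSlots = maxSlots; }

    public void Enter()
    {
        lock (padlock)
        {
            while (usedSlots == maxSlots) { Monitor.Wait(padlock); }
            usedSlots++;
        }
    }

    public void Release()
    {
        lock (padlock)
        {
            if (usedSlots > 0) { usedSlots--; Monitor.Pulse(padlock); }
        }
    }
}

public static class UsageDemo
{
    // Hypothetical helper: pushes threadCount workers through a semaphore with
    // maxSlots slots and returns the highest number of workers seen inside at once.
    public static int MeasurePeak(int maxSlots, int threadCount)
    {
        var semaphore = new NonKernelSemaphore(maxSlots);
        int inside = 0;
        int peak = 0;
        var peakLock = new object();
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                semaphore.Enter();
                try
                {
                    int now = Interlocked.Increment(ref inside);
                    lock (peakLock) { if (now > peak) peak = now; }
                    Thread.Sleep(10); // simulated work while holding a slot
                    Interlocked.Decrement(ref inside);
                }
                finally
                {
                    semaphore.Release(); // always give the slot back
                }
            });
            threads[i].Start();
        }

        foreach (var t in threads) { t.Join(); }
        return peak;
    }

    public static void Main()
    {
        int peak = MeasurePeak(3, 12);
        Console.WriteLine("Peak concurrency: " + peak + " (limit 3)");
    }
}
```

Wrapping the work in try/finally is a design choice of the sketch: it makes sure a slot is returned even if the work throws, since the class has no way to reclaim a slot on its own.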

Enter Method Explanation

When a consumer calls the Enter() method, the lock on padlock is first acquired. We then check whether any slots remain in the semaphore. If usedSlots is equal to maxSlots, all the slots are occupied. In this case we wait until another thread frees a slot by calling the Release() method. Calling Monitor.Wait() releases padlock and blocks the thread, which is where we may transition into kernel mode and let the thread scheduler take care of things. Our scope at this point is still only process wide.
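The blocking behaviour described above can be observed with a small experiment. This is only a sketch: the BlockingDemo class, its helper method, and the 200 ms pause are assumptions I made for illustration. It takes the only slot of a one-slot semaphore, lets a second thread call Enter(), and checks that the second thread gets in only after Release() is called.

```csharp
using System;
using System.Threading;

// Copy of the article's semaphore, repeated so this sketch is self-contained.
public class NonKernelSemaphore
{
    private readonly object padlock = new object();
    private readonly int maxSlots;
    private int usedSlots;

    public NonKernelSemaphore(int maxSlots) { this.maxSlots = maxSlots; }

    public void Enter()
    {
        lock (padlock)
        {
            while (usedSlots == maxSlots) { Monitor.Wait(padlock); }
            usedSlots++;
        }
    }

    public void Release()
    {
        lock (padlock)
        {
            if (usedSlots > 0) { usedSlots--; Monitor.Pulse(padlock); }
        }
    }
}

public static class BlockingDemo
{
    // Hypothetical helper: returns true if a second Enter() on a one-slot
    // semaphore stayed blocked until the first holder called Release().
    public static bool SecondEnterWaitsForRelease()
    {
        var semaphore = new NonKernelSemaphore(1);
        int entered = 0; // set to 1 by the worker once its Enter() returns

        semaphore.Enter(); // this thread takes the only slot

        var worker = new Thread(() =>
        {
            semaphore.Enter(); // waits in Monitor.Wait while the slot is taken
            Interlocked.Exchange(ref entered, 1);
            semaphore.Release();
        });
        worker.Start();

        Thread.Sleep(200); // give the worker time to reach Enter()
        bool blockedWhileFull = Interlocked.CompareExchange(ref entered, 0, 0) == 0;

        semaphore.Release(); // frees the slot and pulses the waiting worker
        worker.Join();
        bool enteredAfterRelease = Interlocked.CompareExchange(ref entered, 0, 0) == 1;

        return blockedWhileFull && enteredAfterRelease;
    }

    public static void Main()
    {
        Console.WriteLine("Second Enter waited for Release: " + SecondEnterWaitsForRelease());
    }
}
```

Because Monitor.Wait() releases padlock while the worker waits, the call to Release() on the first thread can take the lock and is not stuck behind the waiter.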

Release Method Explanation

Again, in this method we must first lock the shared padlock object. We then decrease the usedSlots count, which frees a slot in the semaphore, and wake one of the threads waiting for a slot by calling Monitor.Pulse(). When a thread wakes up it must first check whether all the slots are in use again, hence the while condition (usedSlots == maxSlots).

Optimization

One optimization I considered is calling Monitor.Pulse() only if there are threads waiting for an available slot. This is better than blindly calling the method. We just need to keep track of how many threads are waiting by introducing a new variable, waitingCount of type int, incrementing it when a thread starts waiting and decrementing it once the thread has stopped waiting. The Release() method then checks whether any threads are waiting. If there are, it calls Monitor.Pulse(); otherwise it just decreases usedSlots as usual. Here is the code:

 

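    using System.Threading;

    public class NonKernelSemaphore
    {
        private readonly object padlock = new object();
        private readonly int maxSlots; // maximum number of threads allowed in at once
        private int usedSlots;         // slots currently taken; guarded by padlock
        private int waitingCount;      // threads blocked in Enter(); guarded by padlock

        public NonKernelSemaphore(int maxSlots)
        {
            this.maxSlots = maxSlots;
        }

        // Takes a slot, blocking while all slots are in use.
        public void Enter()
        {
            lock (padlock)
            {
                if (usedSlots == maxSlots)
                {
                    waitingCount++;

                    // Re-check after every wake-up: another thread may have taken the freed slot.
                    do
                    {
                        Monitor.Wait(padlock);

                    } while (usedSlots == maxSlots);

                    waitingCount--;
                }

                usedSlots++;
            }
        }

        // Frees a slot and pulses only when a thread is actually waiting.
        public void Release()
        {
            lock (padlock)
            {
                if (usedSlots > 0)
                {
                    usedSlots--;

                    if (waitingCount > 0)
                    {
                        Monitor.Pulse(padlock);
                    }
                }
            }
        }
    }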





Updates

Originally I used Monitor.PulseAll(); however, a CodeProject reader ran some tests and found that Pulse() was a bit faster.

Tests

Machine: Windows 7 Professional (64-bit), Intel Xeon(R) 2.4 GHz, 12 GB RAM, .NET 4.0, Release mode

Test 1

Creation Time (Milliseconds) after 1 million iterations:

NonKernelSemaphore: 76

Semaphore: 2263

SemaphoreSlim: 623
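The code behind Test 1 is not shown, so the following is only a sketch of how a creation-time comparison could be measured with Stopwatch. The CreationBenchmark class, the TimeMs helper, and the slot count of 8 are assumptions. Unlike what the original test may have done, the sketch disposes each Semaphore and SemaphoreSlim so that a million iterations do not pile up handles, which means the disposal cost is included in its numbers.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Copy of the article's semaphore, repeated so this sketch is self-contained.
public class NonKernelSemaphore
{
    private readonly object padlock = new object();
    private readonly int maxSlots;
    private int usedSlots;

    public NonKernelSemaphore(int maxSlots) { this.maxSlots = maxSlots; }

    public void Enter()
    {
        lock (padlock)
        {
            while (usedSlots == maxSlots) { Monitor.Wait(padlock); }
            usedSlots++;
        }
    }

    public void Release()
    {
        lock (padlock)
        {
            if (usedSlots > 0) { usedSlots--; Monitor.Pulse(padlock); }
        }
    }
}

public static class CreationBenchmark
{
    // Hypothetical helper: runs `create` `iterations` times and returns the
    // elapsed wall-clock time in milliseconds.
    public static long TimeMs(int iterations, Action create)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { create(); }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        const int slots = 8; // assumed; the article does not give the value used in Test 1

        long nonKernel = TimeMs(iterations, () => GC.KeepAlive(new NonKernelSemaphore(slots)));
        long kernel = TimeMs(iterations, () => { using (new Semaphore(slots, slots)) { } });
        long slim = TimeMs(iterations, () => { using (new SemaphoreSlim(slots)) { } });

        Console.WriteLine("NonKernelSemaphore: " + nonKernel);
        Console.WriteLine("Semaphore: " + kernel);
        Console.WriteLine("SemaphoreSlim: " + slim);
    }
}
```

The absolute numbers will differ from machine to machine and from one runtime version to another, so they should only be compared with each other within a single run.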

Test 2

Enter/Wait/Release Time

Max concurrent slots: 8

Number of threads: 64

My test procedure included something like:

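        // nks, s and sl are the NonKernelSemaphore, Semaphore and SemaphoreSlim under test,
        // each created with 8 slots (their declarations are not shown here).
        // nksc, sc and slc are plain int counters. With 8 slots open they are not protected
        // against concurrent increments; they only give each thread some work to do.

        public static void DoNonKernel()
        {
            nks.Enter();

            nksc++;
            nks.Release();
        }

        public static void DoSemaphore()
        {
            s.WaitOne();

            sc++;
            s.Release(1);
        }

        public static void DoSemaphoreSlim()
        {
            sl.Wait();

            slc++;
            sl.Release();
        }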
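The harness that drove these methods is not shown either. As a rough sketch, something like the following could start 64 threads and time them. The ContentionHarness class, its Run method, and the number of calls per thread are assumptions of mine, and the example body uses SemaphoreSlim only; the other two semaphores would plug into the same Action parameter.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ContentionHarness
{
    // Hypothetical harness: starts threadCount threads that each call `body`
    // callsPerThread times, and returns the elapsed wall-clock milliseconds.
    public static double Run(int threadCount, int callsPerThread, Action body)
    {
        var threads = new Thread[threadCount];

        using (var start = new ManualResetEvent(false))
        {
            for (int i = 0; i < threadCount; i++)
            {
                threads[i] = new Thread(() =>
                {
                    start.WaitOne(); // hold every thread until the clock starts
                    for (int n = 0; n < callsPerThread; n++) { body(); }
                });
                threads[i].Start();
            }

            var watch = Stopwatch.StartNew();
            start.Set(); // release all threads at once
            foreach (var t in threads) { t.Join(); }
            watch.Stop();

            return watch.Elapsed.TotalMilliseconds;
        }
    }

    public static void Main()
    {
        var slim = new SemaphoreSlim(8); // 8 concurrent slots, as in Test 2
        int counter = 0;

        double ms = Run(64, 1000, () =>
        {
            slim.Wait();
            try { Interlocked.Increment(ref counter); }
            finally { slim.Release(); }
        });

        Console.WriteLine("SemaphoreSlim: " + ms + " ms, counter = " + counter);
    }
}
```

The sketch increments its counter with Interlocked.Increment so that the final count can be checked against threadCount * callsPerThread. That differs from the plain ++ used in the test methods above and adds a small cost of its own.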

I ran this a few times. Here are the results in milliseconds:

Run   SemaphoreSlim   Semaphore   NonKernelSemaphore
1     453.0453        241.0241    249.0249
2     288.0288        249.0249    249.0249
3     278.0278        264.0264    285.0285
4     235.0235        273.0273    246.0246
5     226.0226        237.0237    217.0217
6     228.0228        224.0224    381.0381

Points Of Interest

Feedback is welcome!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.