
How to Share Resources Between Threads?

2 Aug 2018 · CPOL · 3 min read
Synchronization is slow: why, and how can we do better?

Introduction

Multithreading can improve application performance, especially when I/O reads and writes are involved. Unfortunately, shared resources (shared variables) can have a different version in each CPU cache, so the application's behavior becomes unpredictable. Java provides the synchronized keyword to keep shared resources consistent across CPU caches. Unfortunately again, synchronized slows the application down.
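The classic example is a shared counter (the class below is illustrative, not part of the benchmark code): without synchronization, two threads can read the same value, both add one, and an increment is lost; with synchronized, the increments are applied one at a time.

Java
public class Counter {

    private int count = 0;

    // Without synchronized, two threads could read the same value of count,
    // both add 1, and one increment would be lost.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}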

I use JMH for micro-benchmarking in AverageTime mode, which means the result of each benchmark is the average run time of the test case; lower is better. You can find more information about micro-benchmarking at this link.
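For reference, a benchmark class like the ones below can be launched with the standard JMH runner; a minimal sketch (the fork count is an illustrative choice, not necessarily the setting used for the published numbers):

Java
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(TestHashMap.class.getSimpleName()) // run the TestHashMap benchmarks
                .forks(1)                                   // illustrative fork count
                .build();
        new Runner(opt).run();
    }
}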

Why Does synchronized Slow Down an Application?

When a thread acquires the lock and starts to execute the instructions in a synchronized block, all other threads that need the lock are blocked and become idle. The execution context (CPU cache, registers, instruction pointer, stack pointer, ...) of those threads is stored, and the execution context of other runnable threads is restored so they can resume computing. This is called a context switch, and it requires significant effort from the system. The task scheduler also has to run to pick which thread will be loaded next.

volatile keyword

The volatile keyword does just a few things: it tells the CPU to read the value of the variable from main memory rather than from its cache, and it guarantees that a write to a volatile field happens-before every subsequent read of that field. volatile can never have a higher overhead than synchronized, and it has roughly the same overhead as synchronized when the synchronized block contains only a single operation.

The volatile keyword works well if there is only one writing thread. If there are two or more writing threads, a race condition can happen: each writing thread reads the latest version of the variable, modifies the value on its own CPU, then writes it back to main memory. The consequence is that the value in memory is just the output of one thread; the other threads' modifications are overwritten.
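A minimal sketch of the single-writer pattern (the class and field names are illustrative, not taken from the benchmark code): one thread flips a volatile flag, and any number of reader threads see the change without locking.

Java
public class StatusFlag {

    // Exactly one writer thread calls stop(); any number of readers poll isRunning().
    // volatile guarantees the readers see the write without any locking.
    private volatile boolean running = true;

    public void stop() {           // the single writer
        running = false;
    }

    public boolean isRunning() {   // readers
        return running;
    }
}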

Package java.util.concurrent

Doug Lea did great work when he created and improved this package. It has a lot of tools for managing threads and also contains several thread-safe data structures. Those data structures use synchronized and volatile under the hood, but in a sophisticated way, so you get much better performance than you would by writing your own locking code.

ConcurrentHashMap "obeys the same functional specification as Hashtable" and gives you the advantage of thread-safe.

Java
import java.util.HashMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

public class TestHashMap {

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Threads(10)
    public Integer concurrentHashMap(BenchMarkState state) {
        // ConcurrentHashMap handles synchronization internally.
        Integer temp = null;
        for (int i = 0; i < 100000; i++) {
            temp = Integer.valueOf(i);
            state.chm.put(temp, temp);
        }
        return temp;
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    @Threads(10)
    public Integer hashMap(BenchMarkState state) {
        // A plain HashMap needs an external lock around every write.
        Integer temp = null;
        for (int i = 0; i < 100000; i++) {
            temp = Integer.valueOf(i);
            synchronized (state.hm) {
                state.hm.put(temp, temp);
            }
        }
        return temp;
    }

    @State(Scope.Benchmark)
    public static class BenchMarkState {

        @TearDown(Level.Invocation)
        public void doTearDown() {
            hm.clear();
            chm.clear();
        }

        public HashMap<Integer, Integer> hm = new HashMap<>(100000);
        public ConcurrentHashMap<Integer, Integer> chm = new ConcurrentHashMap<>(100000);
    }
}
Benchmark                      Mode  Cnt   Score   Error  Units
TestHashMap.concurrentHashMap  avgt  200   6.424 ± 0.052  ms/op
TestHashMap.hashMap            avgt  200  41.387 ± 0.880  ms/op

AtomicInteger and similar classes use volatile and Unsafe.compareAndSwapInt under the hood. Using AtomicInteger in a loop like this is a form of busy-waiting: a thread keeps checking a condition until it can proceed. The thread does no useful work while checking, but the task scheduler cannot detect this and considers the thread busy, so it cannot hand the CPU to another thread that is ready to run. Busy-waiting works well if the condition can be satisfied within a few CPU cycles. We can also see in the benchmark results that the synchronized version is less stable (its error term is much larger).

Java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.openjdk.jmh.annotations.*;

public class TestCAS {

  @Benchmark
  @BenchmarkMode(Mode.AverageTime)
  @OutputTimeUnit(TimeUnit.MILLISECONDS)
  @Threads(10)
  public int atomic(BenchMarkState state) {
    // Busy-wait: retry compare-and-set until the increment succeeds
    // or another thread has already reached the limit.
    while (state.atomic.get() < 100000) {
      while (true) {
        int temp = state.atomic.get();
        if (temp >= 100000 || state.atomic.compareAndSet(temp, temp + 1))
          break;
      }
    }
    return state.atomic.get();
  }

  @Benchmark
  @BenchmarkMode(Mode.AverageTime)
  @OutputTimeUnit(TimeUnit.MILLISECONDS)
  @Threads(10)
  public int integer(BenchMarkState state) {
    // Locks on the boxed Integer itself; note that state.integer is reassigned
    // inside the block, so the monitor object changes between iterations.
    while (state.integer < 100000) {
      synchronized (state.integer) {
        if (state.integer < 100000)
          state.integer += 1;
      }
    }
    return state.integer;
  }

  @State(Scope.Benchmark)
  public static class BenchMarkState {

    @TearDown(Level.Invocation)
    public void doTearDown() {
      atomic.set(0);
      integer = 0;
    }

    public AtomicInteger atomic = new AtomicInteger(0);
    public Integer integer = new Integer(0);
  }
}
Benchmark        Mode  Cnt   Score   Error  Units
TestCAS.atomic   avgt  200  12.484 ± 0.130  ms/op
TestCAS.integer  avgt  200  15.896 ± 2.012  ms/op

Lock

Lock has more flexible features than synchronized: you can use tryLock() with a timeout to limit how long a thread waits, or use the fairness option to make sure the longest-waiting thread gets the lock. On the other hand, the synchronized keyword guarantees both execution ordering and data freshness, and the source code with synchronized is simpler. Lock can become a nightmare if a junior developer forgets to call unlock() or does not put unlock() in a finally block.
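A minimal sketch of those two patterns, tryLock() with a timeout and unlock() in a finally block (the lock, timeout, and counter below are illustrative, not taken from the benchmark code):

Java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class SafeCounter {

    // true enables the fairness policy: the longest-waiting thread acquires the lock first.
    private final ReentrantLock lock = new ReentrantLock(true);
    private int count = 0;

    public boolean tryIncrement() throws InterruptedException {
        // Wait at most 100 ms for the lock instead of blocking forever.
        if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                count++;
                return true;
            } finally {
                lock.unlock();    // always release in a finally block
            }
        }
        return false;             // could not acquire the lock in time
    }
}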

Java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

import org.openjdk.jmh.annotations.*;

public class TestLock {

  // The _8, _32 and _64 variants in the results below follow the same
  // pattern with a different @Threads value.
  @Benchmark
  @BenchmarkMode(Mode.AverageTime)
  @OutputTimeUnit(TimeUnit.MILLISECONDS)
  @Threads(2)
  public Integer lock_2(BenchMarkState state) {
    while (true) {
      state.lock.lock();
      if (state.intLock >= 100000) {
        state.lock.unlock();
        break;
      }
      state.intLock += 1;
      state.lock.unlock();
    }
    return state.intLock;
  }

  @Benchmark
  @BenchmarkMode(Mode.AverageTime)
  @OutputTimeUnit(TimeUnit.MILLISECONDS)
  @Threads(2)
  public Integer synchonized_2(BenchMarkState state) {
    while (true) {
      synchronized (state.LOCK) {
        if (state.intSync < 100000)
          state.intSync += 1;
        else
          break;
      }
    }
    return state.intSync;
  }

  @State(Scope.Benchmark)
  public static class BenchMarkState {

    @TearDown(Level.Invocation)
    public void doTearDown() {
      intSync = new Integer(0);
      intLock = new Integer(0);
    }

    public ReentrantLock lock = new ReentrantLock();
    public Object LOCK = new Object();
    public Integer intSync = new Integer(0);
    public Integer intLock = new Integer(0);
  }
}
Benchmark                Mode  Cnt    Score   Error  Units
TestLock.lock_2          avgt  200   11.069 ± 0.300  ms/op
TestLock.lock_32         avgt  200   65.025 ± 1.114  ms/op
TestLock.lock_64         avgt  200  136.796 ± 3.589  ms/op
TestLock.lock_8          avgt  200   15.996 ± 0.317  ms/op
TestLock.synchonized_2   avgt  200    3.986 ± 0.059  ms/op
TestLock.synchonized_32  avgt  200   69.922 ± 1.645  ms/op
TestLock.synchonized_64  avgt  200  156.641 ± 3.948  ms/op
TestLock.synchonized_8   avgt  200   14.754 ± 0.248  ms/op

I ran the benchmark with different numbers of threads, from 2 to 64. We can see that synchronized works much better than ReentrantLock with 2 threads, but in the 64-thread scenario, ReentrantLock has better performance.

Immutable Object

The idea is simple: if an object never changes its values, it is thread-safe. But there is a catch: you have to create a new object each time you want to change a value, which adds GC overhead. Some libraries, like https://immutables.github.io, make immutable objects easier to work with.
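A minimal sketch of a hand-written immutable class (illustrative, not part of the benchmark code): all fields are final, there are no setters, and "changing" a value returns a new instance.

Java
public final class Point {

    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }

    // No field is ever mutated, so instances can be shared freely between
    // threads without any synchronization.
    public Point withX(int newX) {
        return new Point(newX, y);
    }
}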

Conclusion

Sharing resources between threads is easy with the synchronized keyword, but it can make every other thread wait and slow down your application. Other simple techniques can also achieve thread safety and are often faster than synchronized.

You can check out the full source code at https://github.com/vudangngoc/java-benchmark.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)