Doing things with multiple threads is hard. Here are ways to make it easier, clearer and, most importantly, safe and performant.
Check the rest of the series here.
Item 78: Synchronize access to shared mutable data
The synchronized keyword does more than provide mutual exclusion, which keeps threads from seeing an object in an inconsistent state. It also ensures that a thread entering a synchronized method or block sees the effects of all earlier modifications made by other threads under the same lock.
“Synchronization is required for reliable communication between threads as well as for mutual exclusion.” The reasons for this are specified in the Java Memory Model, which defines when a write to a variable becomes visible to other threads. When synchronizing shared mutable state, it is important to synchronize reads as well as writes. If you only need to ensure that a thread always sees the most recent value of a variable, without mutual exclusion, you can declare that variable with the volatile keyword.
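As a minimal sketch of the visibility guarantee, the example below uses a volatile flag to stop a background thread. The class and method names (StoppableWorker, stopAndJoin) are made up for illustration; without volatile on the flag, the Java Memory Model would not guarantee that the worker ever sees the write.

```java
import java.util.concurrent.TimeUnit;

class StoppableWorker {
    // volatile: a write by one thread is guaranteed to become visible to the spinning worker
    private volatile boolean stopRequested;

    private final Thread thread = new Thread(() -> {
        long spins = 0;
        while (!stopRequested) {
            spins++; // busy work, only here to keep the loop alive
        }
    });

    void start() {
        thread.start();
    }

    // Requests a stop and reports whether the worker terminated within the timeout.
    boolean stopAndJoin(long timeoutMillis) throws InterruptedException {
        stopRequested = true;
        thread.join(timeoutMillis);
        return !thread.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableWorker worker = new StoppableWorker();
        worker.start();
        TimeUnit.MILLISECONDS.sleep(100);
        System.out.println("stopped: " + worker.stopAndJoin(2000));
    }
}
```

Note that volatile only covers visibility here; it would not make a compound operation such as an increment atomic.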
When inter-thread communication requires atomic operations on primitives, you can use the types from the java.util.concurrent.atomic package (AtomicLong, AtomicInteger, …).
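A small sketch of what that can look like, assuming a hypothetical HitCounter class shared by several threads. AtomicLong.incrementAndGet performs the read-modify-write as one atomic step, so no explicit lock is needed.

```java
import java.util.concurrent.atomic.AtomicLong;

class HitCounter {
    private final AtomicLong hits = new AtomicLong();

    // Atomically increments the counter and returns the new value.
    long record() {
        return hits.incrementAndGet();
    }

    long total() {
        return hits.get();
    }

    // Illustrative helper: hammers one counter from several threads and returns the total.
    static long countConcurrently(int threads, int hitsPerThread) throws InterruptedException {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.record();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // join also makes the workers' updates visible to this thread
        }
        return counter.total();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countConcurrently(4, 10_000)); // prints 40000
    }
}
```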
The trouble with missing synchronization is that the resulting bugs are difficult to debug: they depend on timing, the VM implementation and the hardware, none of which you can control or predict, so they are often almost impossible to reproduce. To avoid all of this, the best thing is to share only immutable data, or not share at all, or “confine mutable data to a single thread”.
Item 79: Avoid excessive synchronization
To avoid deadlocks, data corruption or exceptions, avoid calling alien methods from within synchronized blocks. Alien methods are those that can be overridden or that the client provides as a function object (lambdas). They are alien to the class because it does not know what they do and has no control over them. In addition, if the alien method performs a long-running task, other threads may wait a long time for the lock that is held during that call.
The best thing to do in these cases is to move the alien method call outside of the synchronized block, that is, to call it without holding the lock. This is known as an open call.
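One way to make an open call is to copy the shared state while holding the lock and invoke the alien method on the copy afterwards. The sketch below assumes a hypothetical EventSource class whose listeners are client-supplied Consumer objects; it is an illustration of the idea, not the only way to do it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class EventSource {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void addListener(Consumer<String> listener) {
        synchronized (listeners) {
            listeners.add(listener);
        }
    }

    void removeListener(Consumer<String> listener) {
        synchronized (listeners) {
            listeners.remove(listener);
        }
    }

    void publish(String event) {
        List<Consumer<String>> snapshot;
        synchronized (listeners) {
            // Only the copy happens under the lock.
            snapshot = new ArrayList<>(listeners);
        }
        // Open call: the alien code runs without the lock being held,
        // so it may safely call back into addListener or removeListener.
        for (Consumer<String> listener : snapshot) {
            listener.accept(event);
        }
    }

    public static void main(String[] args) {
        EventSource source = new EventSource();
        source.addListener(new Consumer<String>() {
            @Override
            public void accept(String event) {
                System.out.println("got " + event);
                source.removeListener(this); // unsubscribes during the callback
            }
        });
        source.publish("first");  // prints "got first"
        source.publish("second"); // prints nothing, the listener is gone
    }
}
```

If the listener were invoked while iterating over the live list, the removal inside the callback would throw a ConcurrentModificationException. A CopyOnWriteArrayList would achieve a similar effect without the explicit snapshot.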
“As a rule, you should do as little work as possible inside synchronized regions.” When you get the lock, examine the shared data, do the things you need with it and release the lock as soon as possible.
Oversynchronization hurts performance. The real cost is not the CPU time spent acquiring locks but contention: the delays imposed by ensuring that every CPU core has a consistent view of memory (cache coherence). Oversynchronization can also limit the JVM’s ability to optimize the code.
Generally, when writing a mutable class, there are two options: let the client synchronize externally if that is needed, or make the class itself thread-safe. Go with the second option only if there is a good reason for it and you can achieve higher concurrency. Either way, document your decision clearly.
Item 80: Prefer executors, tasks, and streams to threads
To put it simply, you should love the java.util.concurrent package and its Executor Framework if you deal with threads in any way. Executors are powerful and cover a lot of use cases; the main thing you have to do is choose the right ExecutorService for your application. If none of the factory methods on java.util.concurrent.Executors provides what you need, use the ThreadPoolExecutor class directly for maximum control.
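As a rough illustration, the sketch below submits a handful of tasks to a fixed-size pool and collects their results through Future objects. The class name SquareSum and the pool size of 4 are arbitrary choices for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SquareSum {
    // Computes 1^2 + 2^2 + ... + n^2, one task per term.
    static long sumOfSquares(int n) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long x = i;
                // The task (a Callable) is separate from the threads that run it.
                results.add(pool.submit(() -> x * x));
            }
            long sum = 0;
            for (Future<Long> result : results) {
                sum += result.get(); // blocks until that task has finished
            }
            return sum;
        } finally {
            pool.shutdown(); // without this, the pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```

Swapping in a different ExecutorService, for example Executors.newCachedThreadPool(), changes the execution policy without touching the tasks.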
Apart from staying away from home-made work queues, you should generally avoid working with threads directly. A Thread is both the unit of work and the mechanism for executing it. The key benefit of the Executor Framework is that it separates the unit of work (a task) from the execution mechanism, which gives you great flexibility in selecting the appropriate setup and policies for handling that work.
The thread pool that will run Java fibers (Project Loom) will itself be some kind of Executor.
Item 81: Prefer concurrency utilities to wait and notify
“Given the difficulty of using wait and notify correctly, you should use the higher-level concurrency utilities instead.” As with the previous item, you should learn and love the java.util.concurrent package, which provides three kinds of utilities: the Executor Framework, concurrent collections, and synchronizers.
Concurrent collections are concurrent implementations of standard Java collection interfaces such as List, Queue and Map. They are synchronized internally, so clients do not need to do any synchronization. Concurrent collections also offer state-dependent modify operations, which combine several operations into a single atomic one, like Map’s putIfAbsent(key, value). They use much smarter locking techniques: for example, ConcurrentHashMap locks at the level of individual hash buckets rather than locking the whole map, as the wrapper returned by Collections.synchronizedMap does. Synchronized collections are now largely obsolete; prefer concurrent collections for any new code.
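The sketch below shows two such atomic, state-dependent operations on a ConcurrentHashMap: putIfAbsent for a simple string-interning helper, and merge for a concurrent counter. The class and method names are illustrative only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ConcurrentMapExamples {
    private static final ConcurrentMap<String, String> INTERNED = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Returns one canonical instance per distinct string value.
    static String intern(String s) {
        // Check-then-insert happens as one atomic step.
        String previous = INTERNED.putIfAbsent(s, s);
        return previous == null ? s : previous;
    }

    // Atomically adds 1 to the count for the given word.
    void record(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMapExamples examples = new ConcurrentMapExamples();
        Runnable task = () -> {
            for (int i = 0; i < 1_000; i++) {
                examples.record("hello");
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(examples.count("hello")); // prints 2000
    }
}
```

Doing the same with a separate containsKey check followed by put would leave a window in which another thread could slip in between the two calls.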
When your threads need to wait for one another, stay away from the wait and notify methods and use synchronizers instead (such as CountDownLatch, Semaphore, CyclicBarrier and Phaser from the java.util.concurrent package), which provide powerful tools for thread coordination. With synchronizers at our disposal, using wait and notify directly is like programming in “concurrency assembly language”.
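For example, waiting until a group of workers has finished can be sketched with a CountDownLatch, with no wait or notify in sight. The names LatchDemo and runWorkers are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    // Starts the given number of workers and returns how many completed.
    static int runWorkers(int workers) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < workers; i++) {
                pool.execute(() -> {
                    try {
                        completed.incrementAndGet(); // stand-in for real work
                    } finally {
                        done.countDown(); // always signal, even if the work fails
                    }
                });
            }
            done.await(); // blocks until every worker has counted down
            return completed.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(5)); // prints 5
    }
}
```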
Item 82: Document thread safety
Failing to document how a class behaves under concurrent use forces clients to make assumptions, and when those assumptions are wrong, the program can behave non-deterministically. The synchronized keyword is an implementation detail, and you should not rely on it to serve as documentation of thread safety. It is also not included in the Javadoc output.
“To enable safe concurrent use, a class must clearly document what level of thread safety it supports.” Thread safety levels are:
- Immutable – Instances cannot be changed once initialized (String, BigInteger).
- Unconditionally thread-safe – The class is mutable but sufficiently synchronized internally (ConcurrentHashMap, AtomicInteger).
- Conditionally thread-safe – Like unconditionally thread-safe, except that some methods require external synchronization (the Collections.synchronized wrappers).
- Not thread-safe – The client of the class has to do the synchronization.
- Thread-hostile – Even with external synchronization, the class is unsafe for concurrent use. These classes are typically written with insufficient consideration for concurrency.
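To illustrate the conditionally thread-safe level, the sketch below iterates over a map returned by Collections.synchronizedMap. Individual calls such as put are synchronized internally, but traversing a collection view requires the client to hold the map's own lock, as the JDK documentation for the wrapper states. The class and method names are made up for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedIteration {
    // Sums all values of a map obtained from Collections.synchronizedMap.
    static int sumValues(Map<String, Integer> syncMap) {
        int sum = 0;
        // Iteration is a multi-step sequence, so the client must lock the map itself.
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = Collections.synchronizedMap(new HashMap<>());
        scores.put("alice", 3); // single calls need no external locking
        scores.put("bob", 4);
        System.out.println(sumValues(scores)); // prints 7
    }
}
```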
Use the thread-safety annotations that correspond to these levels (Immutable, ThreadSafe, NotThreadSafe). Take particular care when documenting a conditionally thread-safe class, and indicate which invocation sequences require external synchronization. You can also use the GuardedBy annotation to document which variables are guarded by which locks.
Item 83: Use lazy initialization judiciously
Lazy initialization is primarily an optimization technique and, as with any optimization, you should not do it unless you really have to. “Under most circumstances, normal initialization is preferable to lazy initialization.”
In a multithreaded environment, lazy initialization is tricky. If you cannot avoid it, use the appropriate technique: for instance fields, use the double-check idiom, or the single-check idiom if you can tolerate repeated initialization; for static fields, use the lazy initialization holder class idiom.
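The two main idioms can be sketched as follows. Expensive is a hypothetical stand-in for a value that is costly to create, and the other names are illustrative as well.

```java
class LazyExamples {
    // Hypothetical stand-in for an object that is costly to construct.
    static class Expensive {
        final String label;

        Expensive(String label) {
            this.label = label;
        }
    }

    // Lazy initialization holder class idiom (static fields):
    // the JVM initializes FieldHolder only on the first call to getStaticField().
    private static class FieldHolder {
        static final Expensive FIELD = new Expensive("static");
    }

    static Expensive getStaticField() {
        return FieldHolder.FIELD;
    }

    // Double-check idiom (instance fields): the field must be volatile.
    private volatile Expensive field;

    Expensive getField() {
        Expensive result = field; // first check, no locking
        if (result == null) {
            synchronized (this) {
                result = field; // second check, with the lock held
                if (result == null) {
                    field = result = new Expensive("instance");
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LazyExamples examples = new LazyExamples();
        System.out.println(examples.getField() == examples.getField());   // prints true
        System.out.println(getStaticField() == getStaticField());         // prints true
    }
}
```

The local variable result keeps the common, already-initialized path down to a single volatile read.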
Item 84: Don’t depend on the thread scheduler
The thread scheduler decides which runnable thread to run and for how long, and in Java land we cannot do anything about it except declare our wishes with a method like Thread.yield. Any program that depends on the thread scheduler for correctness or performance is likely to be unpredictable and non-portable.
When you need to free up some CPU time for other threads, “resist the temptation to ‘fix’ the program by putting in calls to Thread.yield”. Even worse, don’t tweak thread priorities. Thread.yield and thread priorities are only hints to the thread scheduler, which is free to honor them or not. If you find yourself in this situation, the better course is to restructure your application so that you can avoid it.