Monday, April 18, 2011

Thread safety
Definition: A class is thread safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, with no additional synchronization or other coordination on the part of the calling code.
An object, in general, represents the state of something (setting behavior aside for now). A thread working on that object may change its state, and it is desirable that the object always moves from one consistent state to another. When two or more threads operate on the same object, however, it can end up in an inconsistent state. Informally, any program that never lets an object reach such an inconsistent state is considered thread-safe. In the simplest case, a stateless object is always thread-safe. Stateless objects are not hard to find: design patterns like Strategy, Singleton, Adapter, Flyweight, etc. often use stateless objects, and such stateless implementations are almost always thread-safe.
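As a minimal sketch of that simplest case (the class name StatelessAdder is hypothetical, chosen only for illustration), a stateless class is thread-safe with no synchronization at all:

    // A stateless class: it has no fields and holds no shared state,
    // so concurrent calls cannot interfere with each other.
    public class StatelessAdder {
        // The result depends only on the arguments, so any number of threads
        // may call add() at the same time without synchronization.
        public int add(int a, int b) {
            return a + b;
        }
    }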

Atomicity
Any operation that happens as a single, indivisible unit is said to be atomic. There is no intermediate state during an atomic operation; it either happens fully or does not happen at all. In reality, many operations that look atomic are not. For example, ++j looks like a single atomic operation, but it is not: the processor executes a series of instructions (read, add, write) to produce the result, and a thread switch can happen at any intermediate stage, which makes the statement non-atomic. A race condition occurs when the correctness of a computation depends on the relative timing or interleaving of multiple threads at runtime. In general, race conditions can happen any time two threads access a shared variable simultaneously. Compound actions are actions performed in multiple steps, such as check-then-act (e.g. lazy initialization) and read-modify-write (e.g. incrementing a counter), and they are the problematic areas when it comes to multithreading.
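To make this concrete, here is a small sketch (the class name Counters is hypothetical) contrasting the non-atomic increment with an atomic one using java.util.concurrent.atomic.AtomicInteger:

    import java.util.concurrent.atomic.AtomicInteger;

    public class Counters {
        private int unsafeCount = 0;                          // shared mutable state
        private final AtomicInteger safeCount = new AtomicInteger(0);

        // Not atomic: ++ is a read-modify-write compound action (read, add, write).
        // A thread switch between those steps can lose an update -- a race condition.
        public void unsafeIncrement() {
            unsafeCount++;
        }

        // Atomic: incrementAndGet() performs the whole read-modify-write as one unit.
        public void safeIncrement() {
            safeCount.incrementAndGet();
        }
    }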

Lock
To preserve state consistency, update related state variables in a single atomic operation. Use a synchronized block to achieve atomicity. Every Java object can act as a lock for purposes of synchronization; these built-in locks are called intrinsic locks or monitor locks. In Java, locks are acquired on a per-thread basis rather than a per-invocation basis; a thread that already holds a lock can acquire the same lock again any number of times, and each such acquisition succeeds immediately. This property is called reentrancy, and it is necessary: if, for example, a synchronized method of an object calls another synchronized method on the same object (so the lock to be obtained is the one already held), the call would deadlock without reentrancy.
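A small sketch of why reentrancy matters (Widget and its methods are hypothetical names used only for illustration):

    public class Widget {
        public synchronized void doSomething() {
            // The calling thread already holds this object's intrinsic lock.
            // Reentrancy lets it enter the second synchronized method immediately;
            // without reentrancy this call would deadlock against itself.
            doSomethingElse();
        }

        public synchronized void doSomethingElse() {
            // ... work on shared state guarded by this object's lock ...
        }
    }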

Guarding state with locks
It is a common mistake to assume that synchronization is needed only when writing to shared variables. For each mutable state variable that may be accessed by more than one thread, all accesses to that variable must be performed with the same lock held; in this case, we say that the variable is guarded by that lock. Every shared, mutable variable should be guarded by exactly one lock.
A common locking convention is to encapsulate all mutable state within an object and to protect it from concurrent access by synchronizing every code path that accesses that state using the object's intrinsic lock. For every invariant that involves more than one variable, all the variables in that invariant must be guarded by the same lock. Use the @GuardedBy annotation to document which lock guards which variable; it makes debugging easier, as the sketch below illustrates.
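Here is a sketch of these conventions (the Range class is a hypothetical example, and it assumes the net.jcip.annotations jar that accompanies the book is on the classpath):

    import net.jcip.annotations.GuardedBy;

    public class Range {
        // Invariant: lower <= upper. Both variables participate in the same
        // invariant, so both are guarded by the same lock (this object's intrinsic lock).
        @GuardedBy("this") private int lower = 0;
        @GuardedBy("this") private int upper = 10;

        // Related state variables are updated in a single atomic operation so that
        // no thread can ever observe the invariant broken.
        public synchronized void setRange(int newLower, int newUpper) {
            if (newLower > newUpper) {
                throw new IllegalArgumentException("lower must not exceed upper");
            }
            lower = newLower;
            upper = newUpper;
        }

        public synchronized int getLower() { return lower; }
        public synchronized int getUpper() { return upper; }
    }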

Performance
Make the synchronized block as small as possible without breaking the required atomicity; otherwise it becomes a performance bottleneck. Also note that acquiring and releasing a lock is costly. Avoid holding locks during lengthy computations or during operations that may not complete quickly, such as network or console I/O.
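For example, a minimal sketch (NameCache is a hypothetical class) of keeping the synchronized block narrow and moving slow work outside it:

    import java.util.HashMap;
    import java.util.Map;

    public class NameCache {
        private final Map<String, String> cache = new HashMap<String, String>();

        public String lookup(String key) {
            String value;
            // Hold the lock only for the shared-state access...
            synchronized (this) {
                value = cache.get(key);
            }
            // ...and do slow work (logging, I/O, long computation) outside the lock.
            System.out.println("looked up " + key);
            return value;
        }

        public void store(String key, String value) {
            synchronized (this) {
                cache.put(key, value);
            }
        }
    }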

I have taken these points from the book "Java Concurrency in Practice", which has a very high user rating on Amazon.
