Memory Model
JMM, happens-before relationship, visibility and atomicity
Java Memory Model
The JMM (JSR-133) defines the rules by which threads communicate through memory, specifying exactly when a write by one thread becomes visible to reads by another.
1. Definition
What is it?
The Java Memory Model (JMM), standardised by JSR-133 and introduced in Java 5, is the formal specification that governs how threads interact through shared memory. It answers one key question:
"When is a value written by Thread A guaranteed to be visible to Thread B?"
Without a well-defined memory model, the JVM, the compiler (JIT), and the CPU are all free to reorder operations and cache values in ways that improve single-threaded performance but break multi-threaded correctness.
Why does it exist?
Modern hardware and compilers apply many optimisations that are invisible at the source level:
- CPU caches: each core has its own L1/L2 cache; writes may not immediately reach main memory.
- Store buffers: a write issued by a CPU may sit in a store buffer before becoming globally visible.
- Instruction reordering: both compilers (JIT) and out-of-order CPUs may execute instructions in a different order than written, as long as the single-threaded result is unchanged.
These optimisations are safe in a single-threaded world. In a multi-threaded world, they can cause one thread to see a stale or partial view of another thread's work, leading to data races and visibility bugs.
Where does it fit?
The JMM sits between the Java language/libraries and the underlying hardware. It is the contract that:
- Library authors (e.g. of `java.util.concurrent`) rely on to build safe abstractions.
- Application developers invoke implicitly whenever they use `volatile`, `synchronized`, or `java.util.concurrent` primitives.
- JVM implementors must uphold when compiling to native code on any hardware platform.
2. Core Concepts
2.1 The Problem Without JMM
Consider two threads sharing variables x and flag:
Example timeline without synchronization:
- Thread 1 writes `x = 1`.
- Thread 1 writes `flag = true`.
- Thread 2 observes `flag == true`.
- Thread 2 may still observe `x == 0`, because visibility is not guaranteed.
Without synchronisation:
- The compiler may reorder the two writes in Thread 1 (`flag` before `x`).
- Thread 2 may read from its CPU cache and see `flag == true` but still `x == 0`.
- The CPU's store buffer may flush writes in a different order.
Any of these can happen on real hardware (especially ARM and POWER).
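The scenario above can be sketched in code. A minimal illustration (the class and method names here are ours, not from any library):

```java
// Illustrative sketch of the data race described above: no volatile, no locks.
class ReorderingDemo {
    static int x = 0;
    static boolean flag = false;

    static void writer() {        // runs on Thread 1
        x = 1;
        flag = true;              // may be reordered or become visible before x = 1
    }

    static Integer reader() {     // runs on Thread 2
        if (flag) {
            return x;             // without an HB edge, 0 is a legal result here!
        }
        return null;              // flag not (yet) observed
    }
}
```

Executed single-threaded, `reader()` after `writer()` of course returns 1; the point is that the JMM permits a concurrent reader to observe `flag == true` together with `x == 0`.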
2.2 Happens-Before (HB) Relationship
The happens-before relationship is the core of the JMM. If action A happens-before action B, then:
- All side-effects of A (and everything that A happens-before) are visible to B.
- The JVM/CPU must not reorder them in a way that violates this guarantee.
HB is not wall-clock time ordering. A can happen-before B even if B executes on a different CPU nanoseconds earlier; it is a logical ordering guarantee.
Built-in HB Rules
| Rule | Description |
|---|---|
| Program order | Each action in a thread happens-before every subsequent action in that same thread. |
| Monitor unlock | An unlock of a monitor happens-before every subsequent lock of that same monitor. |
| Volatile write | A write to a volatile field happens-before every subsequent read of that field. |
| Thread start | thread.start() happens-before any action in the started thread. |
| Thread termination | All actions in a thread happen-before any other thread detects its termination (via `join()` returning or `isAlive()` returning false). |
| Interruption | A call to interrupt() happens-before the interrupted thread detects the interrupt. |
| Finalizer | Completion of a constructor happens-before the start of the finalizer for that object. |
| Transitivity | If A HB B and B HB C, then A HB C. |
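Two of these rules combine in everyday code: `start()` makes the parent's earlier writes visible to the child, and `join()` makes the child's writes visible to the parent. A minimal sketch (class name is ours):

```java
// start()/join() happens-before edges make this deterministic with no
// volatile and no locks.
public class StartJoinHB {
    static int data = 0;

    static int run() {
        data = 41;                            // written before start()
        Thread t = new Thread(() -> data++);  // start() HB: worker is guaranteed to see 41
        t.start();
        try {
            t.join();                         // join() HB: main sees the worker's write
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return data;                          // guaranteed 42
    }

    public static void main(String[] args) {
        System.out.println(run());            // prints 42
    }
}
```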
Visualising HB
HB via the monitor rule:
- Thread 1 writes `x = 1`.
- Thread 1 unlocks the monitor.
- Thread 2 locks the same monitor.
- Thread 2 must observe `x == 1` when reading it.
2.3 Memory Architecture and the Visibility Problem
Without synchronization, visibility usually depends on two layers:
- Thread-local cache or registers: a thread may keep reading stale values from here.
- Main memory: the updated value may already exist here, but other threads are not guaranteed to observe it immediately.
This is why Thread 2 can keep seeing an outdated value even after Thread 1 has already written the new one.
Without a happens-before edge between the write in Thread 1 and the read in Thread 2, the JMM makes no guarantee that Thread 2 will ever see the updated value. This is the visibility problem.
2.4 Atomicity
Atomicity means an operation is performed as a single, indivisible unit: no other thread can observe a partially-completed state.
| Operation | Atomic? | Notes |
|---|---|---|
| `int` / `boolean` / `byte` / `short` / `char` / `float` read/write | ✅ | Guaranteed by the JLS |
| `long` / `double` read/write | ⚠️ | NOT atomic on 32-bit JVMs (two 32-bit operations) |
| `long` / `double` declared `volatile` | ✅ | `volatile` forces atomic 64-bit access |
| `i++` (any type) | ❌ | Read-modify-write: three separate operations |
| `AtomicInteger.incrementAndGet()` | ✅ | Uses a CAS (compare-and-swap) hardware instruction |
Key insight: Even if a field is `volatile`, the compound operation `i++` is not atomic. It reads `i`, increments, then writes back; another thread can interleave between the read and the write.
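The lost update hidden inside `i++` can be simulated deterministically by interleaving its three steps by hand (a contrived sketch, not real threading; names are ours):

```java
// Hand-simulated interleaving of two i++ operations on the same field.
class LostUpdateDemo {
    static volatile int count = 0;   // volatile: visibility, but no atomicity

    static int run() {
        int a = count;   // "thread A" reads 0
        int b = count;   // "thread B" reads 0 before A has written back
        count = a + 1;   // A writes 1
        count = b + 1;   // B writes 1, overwriting A's increment
        return count;    // 1, not 2: one increment is lost
    }
}
```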
2.5 `volatile`
Declaring a field volatile provides two guarantees:
- Visibility: a write to a `volatile` field happens-before any subsequent read of that field (volatile write → HB → volatile read).
- Reordering prevention: the JVM inserts memory barriers around volatile accesses, preventing the compiler and CPU from reordering ordinary reads/writes across the volatile access.
What volatile does NOT guarantee:
- Atomicity of compound operations (`flag++`; likewise, `lazyInit = new Heavy()` in DCL is broken without `volatile`; etc.)
- Mutual exclusion
When to use volatile:
- Simple status flags read by multiple threads (e.g., `volatile boolean running`)
- Publishing an immutable object reference (one writer, many readers)
- The double-checked locking (DCL) pattern: the reference field must be volatile
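The one-writer/many-readers publication pattern can be sketched like this (class and field names are illustrative, not from any library):

```java
// Single writer publishes an effectively-immutable snapshot via a volatile field.
class SettingsHolder {
    static final class Settings {
        final int timeoutMs;                     // final: frozen after construction
        Settings(int timeoutMs) { this.timeoutMs = timeoutMs; }
    }

    private volatile Settings current;           // volatile write HB volatile read

    void publish(Settings s) { current = s; }    // writer thread
    Settings snapshot() { return current; }      // readers: non-null means fully built
}
```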
2.6 `synchronized`
synchronized provides both:
- Mutual exclusion (atomicity of blocks): Only one thread holds the monitor at a time.
- Visibility (happens-before): all writes before the unlock are visible to any thread that subsequently acquires the same lock.
```
Thread 1                      Thread 2
synchronized(lock) {          synchronized(lock) {
    x = 1;  ---- HB ---->         read x → 1 ✅
    y = 2;                        read y → 2 ✅
}                             }
```
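Both guarantees together make the classic guarded counter correct. A minimal sketch:

```java
// synchronized gives atomicity (one thread in the block at a time) and
// visibility (monitor unlock HB subsequent lock of the same monitor).
class SyncCounter {
    private final Object lock = new Object();
    private int count = 0;                     // plain field: safe because always guarded

    void increment() {
        synchronized (lock) { count++; }       // read-modify-write is now atomic
    }

    int get() {
        synchronized (lock) { return count; }  // the lock also guarantees a fresh read
    }
}
```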
2.7 Memory Barriers / Fences
volatile and synchronized compile down to memory barrier instructions that prevent the CPU from reordering loads and stores across the barrier.
| Barrier type | Effect |
|---|---|
| LoadLoad | No load may be reordered before a preceding load |
| StoreStore | No store may be reordered before a preceding store |
| LoadStore | No store may be reordered before a preceding load |
| StoreLoad | No load may be reordered before a preceding store; the most expensive |
A volatile write inserts a StoreStore barrier before it and a StoreLoad barrier after it.
A volatile read inserts LoadLoad + LoadStore barriers after it. `synchronized` effectively inserts a full fence on lock acquisition and release.
3. Practical Usage
When to Use `volatile`
- Simple boolean flags or status indicators (`volatile boolean shutdown`)
- Publishing a single immutable object reference safely
- The reference field in double-checked locking (DCL)
- Counters where you only need visibility (one thread writes, others only read)
When to Use `synchronized`
- Any situation requiring mutual exclusion (check-then-act, read-modify-write)
- When multiple fields must be updated atomically as a group
- When `volatile` alone is insufficient (compound operations)
When to Use `AtomicXxx`
- High-contention single-variable counters or accumulators
- CAS-based non-blocking algorithms
- Prefer `AtomicInteger`, `AtomicLong`, `AtomicReference` over `volatile` + manual CAS
When to Use Immutable Objects
If an object is immutable (all fields `final`, set in the constructor, no `this` escape during construction), it is safely published to all threads via any mechanism, including plain assignment. The JMM special-cases `final` fields: their values are guaranteed visible after the constructor completes.
Safe Publication
An object is safely published when the reference to it is made visible to other threads through a properly synchronised mechanism:
| Mechanism | Why it's safe |
|---|---|
| `static` initialiser | Class loading is synchronised by the JVM |
| `final` field | Frozen after construction by JMM guarantee |
| `volatile` field | Volatile write HB volatile read |
| Properly locked field | Monitor rule |
| `java.util.concurrent` collections | Internal use of volatile/locking |
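The `static` initialiser row is the basis of the initialization-on-demand holder idiom: a lazy singleton with no `volatile` and no explicit locking (class names are illustrative):

```java
// Lazy, thread-safe singleton via the JVM's class-initialisation guarantee.
class ResourceManager {
    private ResourceManager() {}

    private static class Holder {
        // Initialised exactly once, when Holder is first loaded.
        static final ResourceManager INSTANCE = new ResourceManager();
    }

    static ResourceManager getInstance() {
        return Holder.INSTANCE;   // class loading is synchronised by the JVM
    }
}
```

`Holder` is not loaded until `getInstance()` first touches it, so initialisation is both lazy and safely published.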
Double-Checked Locking (DCL)
DCL is a common singleton pattern. Without `volatile` on the reference field it is broken (and before Java 5 it was broken even with `volatile`):
```java
// ❌ BROKEN: the reference may be seen partially initialised
class Singleton {
    private static Singleton instance;
    public static Singleton getInstance() {
        if (instance == null) {                     // check 1 (no lock)
            synchronized (Singleton.class) {
                if (instance == null) {             // check 2 (with lock)
                    instance = new Singleton();     // reordering possible!
                }
            }
        }
        return instance;
    }
}
```
`new Singleton()` is three operations: allocate memory, initialise fields, assign the reference. The JIT may reorder the assignment before the initialisation, so another thread can see a non-null but incompletely initialised object.
```java
// ✅ CORRECT: volatile prevents reordering of the assignment
class Singleton {
    private static volatile Singleton instance;
    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```
4. Code Examples
Example 1: Visibility Bug (Infinite Loop)
```java
// Without volatile, this loop may never terminate!
public class VisibilityBug {
    private static boolean running = true; // ❌ not volatile

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) { /* spin */ }
            System.out.println("Stopped.");
        });
        worker.start();
        Thread.sleep(100);
        running = false; // the main thread writes, but the worker may never see it!
        System.out.println("Set running = false");
    }
}
```
Fix: declare `private static volatile boolean running = true;`
Example 2: Broken vs Correct Double-Checked Locking
```java
// ❌ Broken DCL: missing volatile
class BrokenSingleton {
    private static BrokenSingleton instance;
    public static BrokenSingleton get() {
        if (instance == null) {
            synchronized (BrokenSingleton.class) {
                if (instance == null) instance = new BrokenSingleton();
            }
        }
        return instance; // May return a partially initialised object!
    }
}
```
```java
// ✅ Correct DCL: volatile on instance
class CorrectSingleton {
    private static volatile CorrectSingleton instance;
    public static CorrectSingleton get() {
        if (instance == null) {
            synchronized (CorrectSingleton.class) {
                if (instance == null) instance = new CorrectSingleton();
            }
        }
        return instance;
    }
}
```
Example 3: `volatile` Counter Pitfall (Not Atomic!)
```java
public class VolatileCounter {
    private volatile int count = 0; // ❌ volatile does NOT make ++ atomic!

    public void increment() {
        count++; // read → increment → write (3 ops, can race)
    }

    public int get() { return count; }
}
// With 1000 threads each calling increment() once, a final count < 1000 is possible!
```
Example 4: Correct Atomic Counter
```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    public void increment() {
        count.incrementAndGet(); // ✅ CAS-based, lock-free, atomic
    }

    public int get() { return count.get(); }
}
```
Example 5: Safe Publication via Final Fields
```java
// Immutable object: safe to publish via any reference
public final class ImmutablePoint {
    private final int x;
    private final int y;

    public ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
        // After the constructor returns, all readers are guaranteed to see x and y
    }

    public int getX() { return x; }
    public int getY() { return y; }
}

// Even a plain (non-volatile) assignment is safe for immutable objects
// published via a static initializer:
public class Config {
    public static final ImmutablePoint ORIGIN = new ImmutablePoint(0, 0); // ✅ safe
}
```
Common Pitfall: Check-Then-Act Without Synchronisation
```java
// ❌ Race condition: check-then-act without atomicity
if (!map.containsKey(key)) {
    map.put(key, computeValue()); // another thread may have inserted between check and put
}

// ✅ Use ConcurrentHashMap.computeIfAbsent for an atomic check-then-put
map.computeIfAbsent(key, k -> computeValue());
```
5. Trade-offs
| Aspect | `volatile` | `synchronized` | `AtomicXxx` |
|---|---|---|---|
| Performance | Low overhead, memory barriers only | Higher: lock acquisition/release may involve the OS or spinning | Low to medium: CAS may retry under contention |
| Mutual exclusion | ❌ None | ✅ Yes | ⚠️ Per-variable (CAS) |
| Visibility | ✅ Yes | ✅ Yes | ✅ Yes |
| Compound ops | ❌ Not atomic | ✅ If in the same block | ✅ Per method (e.g. `compareAndSet`) |
| Memory | Minimal | Monitor object overhead | One object per variable |
| Maintainability | Simple for flags | Clear intent, familiar | Good for counters/references |
| Scalability | High: no blocking | Contention reduces throughput | High: non-blocking algorithms |
False sharing is a hidden performance problem: two volatile fields in the same CPU cache line cause the entire cache line to be invalidated on every write, even when different threads access different variables. Use padding or `@jdk.internal.vm.annotation.Contended` to separate hot fields.
6. Common Mistakes
❌ Mistake 1: Assuming `volatile` makes compound operations atomic
```java
// ❌ volatile does NOT make ++ atomic!
private volatile int counter = 0;
public void increment() { counter++; } // DATA RACE

// ✅ Use AtomicInteger
private final AtomicInteger counter = new AtomicInteger(0);
public void increment() { counter.incrementAndGet(); }
```
❌ Mistake 2: Double-checked locking without `volatile`
```java
// ❌ Broken before Java 5, and still incorrect without volatile: the JIT may reorder
private static Resource instance;
// ...
if (instance == null) {
    synchronized (Resource.class) {
        if (instance == null) instance = new Resource();
    }
}

// ✅ Must declare the field volatile
private static volatile Resource instance;
```
❌ Mistake 3: Sharing mutable state without any synchronisation
```java
// ❌ Both threads access 'list' with no sync: ConcurrentModificationException / data loss
List<String> list = new ArrayList<>();
// Thread 1: list.add("a");
// Thread 2: list.add("b");

// ✅ Use a thread-safe collection
List<String> list = Collections.synchronizedList(new ArrayList<>());
// or
List<String> list = new CopyOnWriteArrayList<>();
```
❌ Mistake 4: Over-synchronizing
```java
// ❌ Locking on every read of an immutable value: unnecessary contention
public synchronized String getImmutableConfig() { return config; }

// ✅ Immutable / final fields need no locking
private final String config = "value";
public String getConfig() { return config; }
```
❌ Mistake 5: Confusing happens-before with wall-clock ordering
Wrong mental model: "Thread 1 writes before Thread 2 reads, so Thread 2 sees it." Happens-before is a LOGICAL guarantee, not a time guarantee. Without an HB edge (volatile, synchronized, etc.), there is NO visibility guarantee, regardless of how much earlier Thread 1 ran in wall-clock time.
7. Senior-level Insights
JSR-133 and the Java 5 Rewrite
The original Java Memory Model (JDK 1.0–1.4) was widely recognised as broken: it couldn't even guarantee correct behaviour for double-checked locking. JSR-133 rewrote the JMM for Java 5, introducing the happens-before formalism, the strengthened volatile semantics, and the final field guarantees that underpin modern concurrent Java.
CPU Memory Models: x86 TSO vs ARM
The JMM is hardware-agnostic, but its implementation cost varies by CPU:
- x86 (TSO, Total Store Order): x86 already has a relatively strong memory model. `volatile` reads are essentially free (just a load); only `volatile` writes need a `LOCK XCHG` or `MFENCE`. This is why many JMM bugs only manifest on ARM or POWER.
- ARM/POWER (weak memory models): require explicit `dmb` / `sync` barrier instructions for both reads and writes, making `volatile` more expensive.
False Sharing and `@Contended`
When two hot volatile fields share a CPU cache line (typically 64 bytes), writes to either field invalidate the entire cache line for all other CPUs, causing false sharing: a silent performance killer in high-throughput concurrent code.
```java
// ❌ False sharing: counter and flag likely share a cache line
class Shared {
    volatile long counter = 0;
    volatile boolean flag = false;
}

// ✅ Use @Contended (JDK-internal; requires a JVM flag such as -XX:-RestrictContended)
// or manual padding
class Padded {
    volatile long counter = 0;
    long p1, p2, p3, p4, p5, p6, p7; // 56 bytes of padding
    volatile boolean flag = false;
}
```
VarHandle (Java 9+)
java.lang.invoke.VarHandle provides fine-grained control over memory ordering semantics without the overhead of full volatile:
| Access mode | Ordering guarantee |
|---|---|
| `get` / `set` (plain) | No ordering (like a non-volatile field) |
| `getOpaque` / `setOpaque` | Coherent per-variable ordering |
| `getAcquire` / `setRelease` | Acquire/release semantics (cheaper than full volatile) |
| `getVolatile` / `setVolatile` | Full volatile semantics |
| `compareAndSet` | CAS with full volatile semantics |
Acquire/release (used extensively in java.util.concurrent) is cheaper than full volatile on weak-memory architectures because it only requires one-directional barriers.
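A minimal acquire/release publish-read pair with `VarHandle` might look like this (the class and field names are ours, for illustration):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// setRelease/getAcquire give release/acquire ordering on a plain int field.
class VarHandleDemo {
    int value;
    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(VarHandleDemo.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(int v) { VALUE.setRelease(this, v); }    // release store
    int read() { return (int) VALUE.getAcquire(this); }   // acquire load
}
```

A reader that observes a value published via `setRelease` through a matching `getAcquire` also sees all writes the publisher made before the release store.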
`final` Fields and Safe Publication
The JMM provides a special guarantee for final fields: once a constructor completes and the reference does not escape during construction, all threads will see the correctly initialised values of all final fields without any additional synchronisation. This is the foundation of immutability-based safe publication.
Lock-Free Algorithms and CAS
AtomicInteger, AtomicReference, etc. use compare-and-swap (CAS): a single atomic CPU instruction (CMPXCHG on x86) that reads, compares, and conditionally writes in one indivisible step. This enables non-blocking algorithms with higher throughput than lock-based alternatives under moderate contention. Under very high contention, CAS retry loops can degrade to worse performance than a well-tuned lock (and naive CAS designs must also watch for the ABA problem).
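The CAS retry loop at the heart of these classes can be sketched as follows (the capped-add operation is our own illustrative example, not a JDK method):

```java
import java.util.concurrent.atomic.AtomicInteger;

// A lock-free read-modify-write: retry until no other thread intervened
// between our read and our compareAndSet.
class CasLoop {
    private final AtomicInteger value = new AtomicInteger(0);

    int addCapped(int delta, int cap) {
        while (true) {
            int cur = value.get();
            int next = Math.min(cur + delta, cap);
            if (value.compareAndSet(cur, next)) {
                return next;      // CAS succeeded: our update is published
            }
            // CAS failed: another thread updated value; loop and recompute
        }
    }
}
```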
8. Glossary
| Term | Definition |
|---|---|
| JMM | Java Memory Model â the formal specification (JSR-133) defining how threads share memory. |
| Happens-Before | A logical ordering guarantee: if A HB B, all of A's effects are visible to B. |
| Visibility | Whether a write by one thread can be observed by a read in another thread. |
| Atomicity | A property of an operation: it executes as one indivisible unit, with no observable intermediate state. |
| volatile | Java keyword that enforces visibility and prevents reordering, but not mutual exclusion. |
| synchronized | Java keyword providing mutual exclusion and visibility via monitor locks. |
| Memory Barrier | A CPU/compiler instruction that prevents reordering of reads/writes across the barrier. |
| Race Condition | A flaw where the outcome depends on the relative timing of thread execution. |
| Data Race | Two threads access the same memory location concurrently, at least one writes, with no synchronisation. |
| Monitor | The per-object lock mechanism used by synchronized in Java. |
| Safe Publication | Making an object reference visible to other threads in a way that guarantees visibility of its state. |
| Reordering | Compiler or CPU changing the order of memory operations (safe for single-threaded, dangerous for multi-threaded). |
| Store Buffer | CPU hardware buffer that holds pending writes before they reach the cache/main memory. |
| Cache Coherence | Hardware protocol (e.g., MESI) ensuring all CPUs eventually agree on the value of a shared memory location. |
| False Sharing | Two threads unintentionally contending on the same CPU cache line due to proximity of unrelated fields. |
| CAS | Compare-And-Swap â an atomic CPU instruction used to implement lock-free data structures. |
| Acquire/Release | Weaker memory ordering semantics: acquire (after a read) prevents subsequent reads/writes from moving before it; release (before a write) prevents preceding reads/writes from moving after it. |
9. Cheatsheet
- HB is the JMM's core rule: establish a happens-before edge or you have no visibility guarantee.
- `volatile` = visibility + reordering prevention; NOT mutual exclusion or compound-operation atomicity.
- `synchronized` = mutual exclusion + visibility; use when multiple fields or compound operations must be atomic.
- `AtomicInteger` / `AtomicReference` = lock-free, CAS-based atomic operations on single variables.
- `i++` is NEVER atomic, even on a `volatile` field: it is read + modify + write.
- The DCL pattern requires `volatile` on the reference field to prevent partially-initialised objects.
- `final` fields are safely visible after construction: use immutable objects for the simplest thread safety.
- False sharing silently kills performance; pad hot volatile fields or use `@Contended`.
- `VarHandle` (Java 9+) offers acquire/release semantics, cheaper than full volatile on weak-memory CPUs.
- Happens-before ≠ wall-clock time: logical ordering, not chronological ordering.