Persistence Context
entity lifecycle, L1/L2 cache, dirty checking, flush strategies, optimistic locking
The Persistence Context is the central mechanism of JPA: the EntityManager's identity map that tracks entity states, performs dirty checking, and manages a multi-level cache system.
1. Definition
- What is it? — The Persistence Context (PC) is an in-memory store bound to the EntityManager that keeps track of managed entities. It functions as an identity map: it stores one reference per entity and guarantees repeatable reads.
- Why does it exist? — The PC reduces DB queries (L1 cache), automatically detects changes (dirty checking), and provides consistent entity references within a transaction.
- Where does it fit? — The PC lives inside the EntityManager. In Spring, it typically binds to a transaction (@Transactional). At the end of the transaction, the PC flushes and closes.
EntityManager
 └── Persistence Context (L1 Cache)
      ├── User#1  → managed instance
      ├── User#2  → managed instance
      └── Order#5 → managed instance
2. Core Concepts
Entity states (lifecycle)
| State | Description | PC aware? | Has ID? |
|---|---|---|---|
| New (Transient) | New object, new User() | ❌ | ❌ (or has one, but the PC doesn't know it) |
| Managed | After persist() or find() | ✅ | ✅ |
| Detached | After session/transaction close | ❌ | ✅ |
| Removed | After remove(); DELETE on flush | ✅ (marked for deletion) | ✅ |
New ──persist()──→ Managed ──flush()──→ DB INSERT
                      │
                      ├── remove() ──→ Removed ──flush()──→ DB DELETE
                      │
                      └── session close ──→ Detached ──merge()──→ Managed (copy)
State transitions in detail
// New → Managed
User user = new User("Alice", "alice@mail.com"); // New
entityManager.persist(user); // Managed (ID generated)
// Managed → DB synchronization
user.setName("Bob"); // dirty checking monitors this
// on flush: UPDATE users SET name='Bob' WHERE id=1
// Managed → Detached
entityManager.detach(user); // or: transaction ends
user.setName("Charlie"); // NOT detected! Not managed.
// Detached → Managed
User mergedUser = entityManager.merge(user); // Managed copy
// Note: user ≠ mergedUser (merge returns a copy)
// Managed → Removed
entityManager.remove(user); // Removed
// on flush: DELETE FROM users WHERE id=1
First-Level Cache (L1)
The L1 cache is part of the Persistence Context, always active and cannot be disabled.
User u1 = em.find(User.class, 1L); // SQL SELECT → DB hit
User u2 = em.find(User.class, 1L); // No SQL! → cache hit
assert u1 == u2; // true — same reference
// Repeatable read guarantee:
// The same entity in the same PC is always represented
// by the same Java object.
The L1 cache is cleared at transaction end. Its size is unlimited — manual clear() is needed for large batch operations.
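The identity-map contract can be modeled in a few lines of plain Java. This is a toy sketch, not Hibernate's implementation; the class and method names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Toy identity map mirroring the L1 cache contract: one instance per ID;
// a repeated lookup returns the same reference instead of hitting the DB.
public class IdentityMapSketch {
    static class User { final long id; User(long id) { this.id = id; } }

    private final Map<Long, User> firstLevelCache = new HashMap<>();
    int dbHits = 0;

    User find(long id) {
        return firstLevelCache.computeIfAbsent(id, k -> {
            dbHits++;           // simulated SELECT
            return new User(k); // "load" from the database
        });
    }

    public static void main(String[] args) {
        IdentityMapSketch em = new IdentityMapSketch();
        User u1 = em.find(1L); // "DB hit"
        User u2 = em.find(1L); // cache hit
        System.out.println(u1 == u2);  // true (same reference)
        System.out.println(em.dbHits); // 1
    }
}
```

A clear() in the real API would empty the map, which is exactly why the next find() after clear() costs a fresh SELECT.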
Dirty checking
Dirty checking is automatic change detection at flush time:
- When an entity is loaded, Hibernate stores a snapshot of the loaded field values
- At flush time, it compares current field values with the snapshot
- If there's a difference → UPDATE SQL generation
- Without @DynamicUpdate (the default), all columns are included in the UPDATE
- With @DynamicUpdate, only the changed columns
@Transactional
public void updateUserName(Long id, String newName) {
User user = userRepository.findById(id).orElseThrow();
user.setName(newName); // dirty
// No save() needed — dirty checking auto-UPDATEs on flush
}
Flush strategies
| FlushMode | When it flushes | Usage |
|---|---|---|
| AUTO (default) | Before queries + at transaction commit | ✅ Most cases |
| COMMIT | Only at transaction commit | Performance optimization |
| MANUAL | Only on explicit em.flush() | Special batch operations |
// AUTO: automatically flushes before queries for consistency
user.setName("Updated");
List<User> users = em.createQuery("SELECT u FROM User u", User.class)
.getResultList();
// ↑ Flush happens before the query so "Updated" name appears in results
3. Practical Usage
When conscious PC management matters
- Batch INSERT/UPDATE (>1000 items): regular flush() + clear() needed, otherwise OutOfMemoryError
- Read-only queries: @Transactional(readOnly = true) → Hibernate skips the dirty-checking snapshot
- Detached entity handling: DTO pattern or merge() usage
- Long conversations: Extended Persistence Context or explicit merge()
Batch processing pattern
@Transactional
public void batchInsert(List<UserDto> dtos) {
int batchSize = 50;
for (int i = 0; i < dtos.size(); i++) {
User user = new User(dtos.get(i).getName(), dtos.get(i).getEmail());
entityManager.persist(user);
if ((i + 1) % batchSize == 0) { // after every full batch of 50
entityManager.flush(); // write SQL statements
entityManager.clear(); // clear L1 cache → free memory
}
}
}
Read-only optimization
@Service
public class ReportService {
@Transactional(readOnly = true)
public List<UserSummary> getReport() {
// readOnly = true benefits:
// 1. No dirty checking snapshot → less memory
// 2. Hibernate switches to FLUSH_MODE_MANUAL → no auto-flush
// 3. DB-level read-only transaction hint
return userRepository.findAllProjectedBy();
}
}
Detached entity and the merge() pattern
// Controller → Service → DB flow:
@RestController
public class UserController {
@PutMapping("/users/{id}")
public UserDto updateUser(@PathVariable Long id, @RequestBody UserDto dto) {
return userService.update(id, dto);
}
}
@Service
public class UserService {
@Transactional
public UserDto update(Long id, UserDto dto) {
User user = userRepository.findById(id).orElseThrow();
// ✅ Modifying the managed entity → dirty checking handles it
user.setName(dto.getName());
user.setEmail(dto.getEmail());
// No save() needed — auto-flush at transaction end
return UserDto.from(user);
}
}
4. Code Examples
Second-Level Cache (L2)
The L2 cache is application-level, bound to the SessionFactory. It persists across transactions.
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Country {
@Id
private String code;
private String name;
}
# application.yml
spring:
  jpa:
    properties:
      hibernate:
        cache:
          use_second_level_cache: true
          region.factory_class: org.hibernate.cache.jcache.JCacheRegionFactory
        javax.cache.provider: org.ehcache.jsr107.EhcacheCachingProvider
Cache levels summary
| Level | Scope | Default | Content |
|---|---|---|---|
| L1 (PC) | EntityManager / transaction | Always active | Entity references |
| L2 | SessionFactory / application | Manual configuration | Entity snapshots |
| Query Cache | SessionFactory | Manual configuration | Query result IDs |
// L2 cache behavior:
// Transaction A:
Country hu = em.find(Country.class, "HU"); // L1 miss → L2 miss → DB SELECT
// Transaction A ends → L1 cleared, but saved to L2
// Transaction B:
Country hu = em.find(Country.class, "HU"); // L1 miss → L2 HIT! → no DB
Cache concurrency strategies
| Strategy | Consistency | Performance | When |
|---|---|---|---|
| READ_ONLY | ✅ Strong | ✅ Best | Immutable data (Country, Currency) |
| NONSTRICT_READ_WRITE | ⚠️ Eventual | ✅ Good | Rarely changed, not critical |
| READ_WRITE | ✅ Strong | ⚠️ Medium | Frequently read, occasionally written |
| TRANSACTIONAL | ✅ ACID | ❌ Slow | JTA transaction required |
Optimistic Locking (@Version)
@Entity
public class Product {
@Id @GeneratedValue(strategy = GenerationType.SEQUENCE)
private Long id;
@Version
private Integer version;
private String name;
private BigDecimal price;
}
How it works:
- When an entity is loaded, the version value is loaded with it (e.g., version=3)
- UPDATE SQL: UPDATE product SET name=?, price=?, version=4 WHERE id=? AND version=3
- If the WHERE clause matches 0 rows → OptimisticLockException
- The client can retry or merge
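The version check above can be modeled in a few lines of plain Java. This is a toy sketch of the mechanism, not the real SQL path; the class and method names are illustrative:

```java
// Minimal in-memory model of the version check: the UPDATE only
// matches when the stored version still equals the loaded one.
public class VersionCheckSketch {
    private int storedVersion = 3; // version loaded by both transactions

    // returns the affected row count, like UPDATE ... WHERE version=? would
    int update(int expectedVersion) {
        if (storedVersion == expectedVersion) {
            storedVersion++; // version=4 written together with the new data
            return 1;
        }
        return 0; // 0 rows: Hibernate throws OptimisticLockException
    }

    public static void main(String[] args) {
        VersionCheckSketch row = new VersionCheckSketch();
        System.out.println(row.update(3)); // 1: first writer wins
        System.out.println(row.update(3)); // 0: second writer gets a conflict
    }
}
```

The second caller loses because the first writer already bumped the version, which is exactly why the WHERE clause matches zero rows.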
@Service
public class ProductService {
@Transactional
public void updatePrice(Long id, BigDecimal newPrice) {
Product product = productRepository.findById(id).orElseThrow();
product.setPrice(newPrice);
// on flush: UPDATE ... WHERE version=? → on conflict: OptimisticLockException
}
}
// Retry pattern for conflict handling (Spring Retry).
// Note: Spring's exception translation wraps OptimisticLockException
// in ObjectOptimisticLockingFailureException, so retry on both.
@Retryable(value = {OptimisticLockException.class,
        ObjectOptimisticLockingFailureException.class}, maxAttempts = 3)
@Transactional
public void updatePriceWithRetry(Long id, BigDecimal newPrice) {
Product product = productRepository.findById(id).orElseThrow();
product.setPrice(newPrice);
}
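When Spring Retry is not available, the same retry policy can be hand-rolled. A sketch under assumptions: VersionConflictException and withRetry are illustrative names standing in for the real optimistic-lock exception and your transaction boundary:

```java
import java.util.function.Supplier;

// Hand-rolled retry loop for optimistic-lock conflicts (illustrative;
// VersionConflictException stands in for the real exception type).
public class RetrySketch {
    static class VersionConflictException extends RuntimeException {}

    static <T> T withRetry(int maxAttempts, Supplier<T> tx) {
        for (int attempt = 1; ; attempt++) {
            try {
                return tx.get(); // each attempt must run in a NEW transaction
            } catch (VersionConflictException e) {
                if (attempt >= maxAttempts) throw e; // give up after N tries
                // otherwise loop again, reloading the entity (fresh version)
            }
        }
    }

    public static void main(String[] args) {
        int[] conflicts = {2}; // simulate two conflicts, then success
        String result = withRetry(3, () -> {
            if (conflicts[0]-- > 0) throw new VersionConflictException();
            return "updated";
        });
        System.out.println(result); // updated
    }
}
```

The important detail is that each attempt must open a fresh transaction and reload the entity, otherwise the stale version is resubmitted and the conflict repeats forever.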
Pessimistic Locking
// SELECT ... FOR UPDATE
@Lock(LockModeType.PESSIMISTIC_WRITE)
@Query("SELECT p FROM Product p WHERE p.id = :id")
Optional<Product> findByIdWithLock(@Param("id") Long id);
| Locking | Mechanism | When |
|---|---|---|
| Optimistic | @Version + application-level check | Low contention (most CRUD) |
| Pessimistic | SELECT ... FOR UPDATE DB lock | High contention (inventory, booking) |
5. Trade-offs
| Aspect | Advantage | Disadvantage |
|---|---|---|
| L1 Cache | Automatic, no configuration, repeatable read | Memory issues with large batches |
| Dirty checking | No explicit save/update needed | Hidden UPDATEs, performance overhead |
| L2 Cache | Cross-transaction cache, reduces DB load | Complex cache invalidation, consistency risk |
| Optimistic locking | No DB lock, free reads | Retry needed on conflict |
| Pessimistic locking | Guaranteed exclusive access | Deadlock risk, slow |
| readOnly=true | No snapshot, less memory | Ignores changes on write |
6. Common Mistakes
❌ Unnecessary save() on managed entity
// BAD — unnecessary DB roundtrip
@Transactional
public void updateUser(Long id, String name) {
User user = userRepository.findById(id).orElseThrow();
user.setName(name);
userRepository.save(user); // ← unnecessary! Dirty checking handles it
}
// GOOD — dirty checking auto-UPDATEs
@Transactional
public void updateUser(Long id, String name) {
User user = userRepository.findById(id).orElseThrow();
user.setName(name);
// auto-UPDATE on flush
}
❌ Modifying detached entity without flush
// BAD — changes are lost
public void updateUser(Long id, String name) { // no @Transactional!
User user = userRepository.findById(id).orElseThrow();
user.setName(name);
// Session/PC already closed → no flush → changes lost
}
❌ Batch processing without clear()
// BAD — OutOfMemoryError on large datasets
@Transactional
public void importAll(List<Data> items) {
for (Data d : items) {
entityManager.persist(new Record(d));
// L1 cache grows unbounded → OOM
}
}
❌ L2 cache on frequently changing data
If an entity changes frequently, L2 cache invalidation is more expensive than the database query itself. Use L2 cache for rarely-changing, frequently-read data (e.g., Country, Category, Configuration).
❌ Ignoring @Version in update DTOs
// BAD — frontend doesn't send version → optimistic lock lost
@PutMapping("/products/{id}")
public void update(@PathVariable Long id, @RequestBody ProductDto dto) {
    Product p = productRepository.findById(id).orElseThrow();
    p.setName(dto.getName());
    // version field not synchronized with the client's copy!
}
// GOOD — DTO includes version
public class ProductDto {
private Long id;
private String name;
private Integer version; // ← frontend sends it back
}
❌ Misunderstanding merge() vs persist()
// persist(): New → Managed (manages the original object)
User user = new User("Alice");
em.persist(user); // user is now managed
// merge(): Detached → Managed COPY (returns a new managed instance!)
User detached = /* ... */;
User managed = em.merge(detached); // managed ≠ detached!
detached.setName("X"); // Does NOT affect DB — still detached
managed.setName("Y"); // This will UPDATE
7. Deep Dive
Hibernate internal: snapshot array
Hibernate uses an Object[] array as a snapshot for dirty checking, not a deep clone. Every managed entity has two arrays: the current state and the loaded state. At flush time, these are compared field by field.
This means:
- For primitives and Strings: a simple equals() comparison
- For mutable objects (Date, Collection): a deeper comparison
- For large entities (30+ fields): the dirty-checking overhead is noticeable
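The field-by-field comparison can be sketched in plain Java. This is a toy model of the idea, not Hibernate's internals; the method name is illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy model of the flush-time comparison: loaded state and current state
// are plain Object[] arrays, compared field by field.
public class DirtyCheckSketch {
    static List<Integer> dirtyFieldIndexes(Object[] loadedState, Object[] currentState) {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < loadedState.length; i++) {
            if (!Objects.equals(loadedState[i], currentState[i])) {
                dirty.add(i); // with @DynamicUpdate, only these columns are updated
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        Object[] loaded  = {"Alice", "alice@mail.com", 30};
        Object[] current = {"Bob",   "alice@mail.com", 30};
        System.out.println(dirtyFieldIndexes(loaded, current)); // [0]: only "name" changed
    }
}
```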
@DynamicUpdate optimization
@Entity
@DynamicUpdate // Only UPDATEs changed columns
public class Product {
// 20+ fields...
}
// WITHOUT @DynamicUpdate (default):
// UPDATE product SET name=?, price=?, description=?, ... WHERE id=?
// All columns included even if only name changed
// WITH @DynamicUpdate:
// UPDATE product SET name=? WHERE id=?
// Only changed columns
When to use: entities with many columns where typically 1-2 fields change. Downside: Hibernate can't cache the prepared statement (each UPDATE has different SQL).
Extended Persistence Context
By default, the PC closes at transaction end. Extended PC lives beyond the transaction — typically in @Stateful EJBs or Spring @Scope("session") beans.
// Rarely used in Spring — prefer DTO pattern + merge()
@PersistenceContext(type = PersistenceContextType.EXTENDED)
private EntityManager em;
⚠️ Extended PC problems: memory leaks, stale data, complex lifecycle management. Avoid in most Spring applications.
Query Cache in detail
The Query Cache caches entity IDs, not the entities themselves:
Query: "SELECT u FROM User u WHERE u.status = 'ACTIVE'"
Query Cache: [1, 5, 12, 34] ← entity IDs
After query cache hit → loads entities from L2 cache by ID
When to use:
- Fixed-parameter, frequently-run queries
- Only if the affected entities are also in the L2 cache
- Requires hibernate.cache.use_query_cache=true
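The ID-then-hydrate flow can be modeled in plain Java. A toy sketch, not Hibernate's implementation: a Map stands in for the L2 cache and the class/method names are illustrative:

```java
import java.util.List;
import java.util.Map;

// Toy model of the query cache: only entity IDs are cached; entities are
// then hydrated by ID from the L2 cache (a Map stands in for it here).
public class QueryCacheSketch {
    record User(long id, String status) {}

    static List<User> loadFromCaches(List<Long> cachedIds, Map<Long, User> l2Cache) {
        // query-cache hit: resolve each ID against L2, no SQL needed
        return cachedIds.stream().map(l2Cache::get).toList();
    }

    public static void main(String[] args) {
        Map<Long, User> l2 = Map.of(
                1L, new User(1L, "ACTIVE"),
                5L, new User(5L, "ACTIVE"));
        List<Long> cachedResult = List.of(1L, 5L); // the cached query result: IDs only
        System.out.println(loadFromCaches(cachedResult, l2).size()); // 2
    }
}
```

This also shows why the query cache is useless without the L2 cache: an ID that misses L2 still costs a database round trip.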
Flush and auto-flush behavior
AUTO flush mode ensures pending changes are written before queries:
user.setName("Updated"); // dirty, but no SQL yet
// This JPQL query triggers a flush (because the User table is affected):
List<User> result = em.createQuery("SELECT u FROM User u").getResultList();
// ↑ Before: flush() → UPDATE users SET name='Updated'
// ↑ After: SELECT * FROM users → "Updated" name is visible
// This query does NOT trigger a flush (different table):
List<Order> orders = em.createQuery("SELECT o FROM Order o").getResultList();
// ↑ Does not affect User table → no flush
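The decision AUTO mode makes can be sketched as a simple predicate. A toy model under assumptions, not Hibernate's actual flush logic; the method name is illustrative:

```java
import java.util.Set;

// Toy model of the AUTO flush decision: flush before a query only when
// the query reads a table that has pending dirty changes.
public class AutoFlushSketch {
    static boolean needsFlushBefore(Set<String> dirtyTables, Set<String> queriedTables) {
        for (String table : queriedTables) {
            if (dirtyTables.contains(table)) return true; // pending UPDATE must be visible
        }
        return false; // unrelated query: skip the flush
    }

    public static void main(String[] args) {
        Set<String> dirty = Set.of("users"); // user.setName("Updated") is pending
        System.out.println(needsFlushBefore(dirty, Set.of("users")));  // true
        System.out.println(needsFlushBefore(dirty, Set.of("orders"))); // false
    }
}
```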
8. Interview Questions
Q: What is the Persistence Context and how does the L1 cache work? A: The PC is the EntityManager's identity map. The same entity within a single transaction is always represented by the same Java reference. The L1 cache is always active and cleared at transaction end.
Q: When is explicit flush() and clear() needed? A: In batch processing (>1000 items) to prevent unbounded L1 cache growth. Regular flush() writes pending SQL, clear() frees memory.
Q: What's the difference between persist() and merge()? A: persist() transitions New → Managed and operates on the original object. merge() creates a Detached → Managed COPY — the original stays detached and a new managed instance is returned.
Q: How does dirty checking work? A: When an entity is loaded, a snapshot is taken. At flush time, Hibernate compares the current state with the snapshot. If there's a difference, UPDATE SQL is generated. With @DynamicUpdate, only changed columns are included.
Q: What is @Version and how does it handle concurrent modifications? A: @Version is an optimistic locking field. It's included in the UPDATE SQL's WHERE clause. If two transactions modify the same entity, the second gets OptimisticLockException because the version doesn't match.
Q: What's the difference between L1 and L2 cache? A: L1: EntityManager/transaction scope, always active, holds entity references. L2: SessionFactory scope, manually configured, holds entity snapshots, persists across transactions.
Q: Why is @Transactional(readOnly = true) beneficial? A: Hibernate skips the dirty checking snapshot → less memory, no auto-flush, and the DB can optimize for reads (read-only hint).
9. Glossary
| Term | Meaning |
|---|---|
| Persistence Context | EntityManager's identity map, entity references |
| L1 Cache | Same as PC — transaction scope, always active |
| L2 Cache | SessionFactory-scoped, manually configured cache |
| Query Cache | Caching of query result IDs |
| Managed | Entity state: tracked by PC, monitored by dirty checking |
| Detached | Entity state: after session close, has ID, no tracking |
| Dirty checking | Automatic change detection at flush time |
| Flush | Writing pending changes to SQL |
| Snapshot | Copy of the entity's loaded field values (an Object[], not a deep clone) |
| @Version | Optimistic locking version field |
| @DynamicUpdate | Only UPDATEs changed columns |
| Extended PC | Persistence Context that lives beyond the transaction |
10. Cheatsheet
ENTITY LIFECYCLE:
New → persist() → Managed
Managed → remove() → Removed → flush() → DELETE
Managed → close() → Detached
Detached → merge() → Managed (COPY!)
persist() ≠ merge() → persist manages original, merge returns copy
L1 CACHE (Persistence Context):
Always active, transaction scope
find() twice → 1 SQL
u1 == u2 → true (identity guaranteed)
Batch: flush() + clear() regularly
DIRTY CHECKING:
Snapshot on load → comparison on flush
@DynamicUpdate → only changed columns
readOnly=true → no snapshot → less memory
FLUSH MODE:
AUTO before queries + at commit (default)
COMMIT only at commit
MANUAL only on explicit em.flush()
L2 CACHE:
@Cacheable + @Cache(usage=...)
READ_ONLY immutable data
READ_WRITE frequently read
SessionFactory scope, requires configuration
LOCKING:
@Version optimistic (no DB lock)
@Lock(PESSIMISTIC) SELECT FOR UPDATE
Optimistic → low contention (CRUD)
Pessimistic → high contention (inventory)
TIPS:
Managed entity → no save() needed
Batch 1000+ → flush/clear in cycles
readOnly=true → for report queries
merge() returns copy, not original!