JPA Performance
N+1 problem, fetch strategies, OSIV, batch processing, query optimization
JPA Performance
JPA/Hibernate performance optimization is about fetch strategies, N+1 problem resolution, batch processing, and understanding the OSIV anti-pattern.
1. Definition
- What is it? â JPA performance tuning is the set of techniques and patterns that maximize the efficiency of SQL generated by Hibernate. This includes query optimization, proper fetch strategy configuration, batch processing, and connection pool management.
- Why does it exist? â Hibernate works "magically," but this magic can cause hidden performance problems: too many queries (N+1), oversized result sets (Cartesian product), unnecessary data loading (EAGER). Without conscious optimization, the application will be slow.
- Where does it fit? â Performance tuning sits between development and production monitoring. During development:
spring.jpa.show-sql=true+hibernate.format_sql=true. In production: Hibernate Statistics and slow query monitoring.
2. Core Concepts
LAZY vs EAGER fetch
| Strategy | When loaded | Default for | Risk |
|---|---|---|---|
| EAGER | Immediately with main entity | @ManyToOne, @OneToOne | Unnecessary data loading |
| LAZY | On first access (proxy) | @OneToMany, @ManyToMany | LazyInitializationException |
â ïž Golden rule: Set everything to LAZY, then use JOIN FETCH or @EntityGraph to load what you actually need.
// â EAGER everywhere â Cartesian product, unnecessary data
@ManyToOne(fetch = FetchType.EAGER)
private Customer customer;
// â
LAZY + explicit fetch â you control what loads
@ManyToOne(fetch = FetchType.LAZY)
private Customer customer;
The N+1 problem
N+1 is the most common JPA performance problem:
// 1 query: SELECT * FROM orders
List<Order> orders = orderRepository.findAll();
// N queries: separate SELECT for each order's customer
for (Order o : orders) {
System.out.println(o.getCustomer().getName());
// â SELECT * FROM customers WHERE id = ? (N times!)
}
// Total: 1 + N queries (100 orders â 101 queries!)
Hibernate proxy mechanism
LAZY associations get a proxy (ByteBuddy/CGLIB):
order.getCustomer() â returns a proxy (NOT null, but a proxy!)
â
customer.getName() â proxy intercept
â
session.load(Customer, id) â SQL SELECT
â
proxy fields initialized
If no open session â LazyInitializationException
3. Practical Usage
N+1 solution 1: JOIN FETCH
// â
Loads customer in one query
@Query("SELECT o FROM Order o JOIN FETCH o.customer")
List<Order> findAllWithCustomer();
// â
Multiple associations
@Query("SELECT o FROM Order o JOIN FETCH o.customer JOIN FETCH o.items WHERE o.status = :status")
List<Order> findByStatusWithDetails(@Param("status") OrderStatus status);
â ïž JOIN FETCH limitations:
- Two
@OneToManyJOIN FETCHes in the same query â Cartesian product âMultipleBagFetchException - Solution: change one to
Set, or use two separate queries
N+1 solution 2: @EntityGraph
// â
Declarative, works without @Query
@EntityGraph(attributePaths = {"customer", "items"})
List<Order> findByStatus(OrderStatus status);
// â
Named EntityGraph
@Entity
@NamedEntityGraph(
name = "Order.withCustomerAndItems",
attributeNodes = {
@NamedAttributeNode("customer"),
@NamedAttributeNode("items")
}
)
public class Order { ... }
// Usage:
@EntityGraph("Order.withCustomerAndItems")
List<Order> findByStatus(OrderStatus status);
EntityGraph vs JOIN FETCH:
@EntityGraphâ LEFT JOIN (returns nulls too)JOIN FETCHâ INNER JOIN (only where relationship exists)
N+1 solution 3: @BatchSize
@Entity
public class Order {
@BatchSize(size = 25) // Hibernate-specific
@OneToMany(mappedBy = "order")
private List<OrderItem> items;
}
BatchSize: doesn't load in one query, but uses batched IN queries:
-- Without @BatchSize (N+1):
SELECT * FROM order_items WHERE order_id = 1
SELECT * FROM order_items WHERE order_id = 2
...
-- With @BatchSize(size=25):
SELECT * FROM order_items WHERE order_id IN (1, 2, 3, ..., 25)
SELECT * FROM order_items WHERE order_id IN (26, 27, ..., 50)
N+1 solution 4: DTO Projection
// â
Only needed columns â no entity, no proxy, no N+1
public interface OrderSummary {
Long getId();
String getCustomerName();
BigDecimal getTotalAmount();
}
@Query("SELECT o.id AS id, c.name AS customerName, o.totalAmount AS totalAmount " +
"FROM Order o JOIN o.customer c WHERE o.status = :status")
List<OrderSummary> findSummaryByStatus(@Param("status") OrderStatus status);
DTO projection benefits: less data, no dirty checking overhead, no LAZY problems.
4. Code Examples
JDBC batch insert configuration
# application.yml
spring:
jpa:
properties:
hibernate:
jdbc:
batch_size: 50
order_inserts: true
order_updates: true
generate_statistics: true # for development
datasource:
hikari:
maximum-pool-size: 10
@Transactional
public void batchInsert(List<ProductDto> dtos) {
int batchSize = 50;
for (int i = 0; i < dtos.size(); i++) {
Product p = new Product(dtos.get(i));
entityManager.persist(p);
if (i > 0 && i % batchSize == 0) {
entityManager.flush();
entityManager.clear();
}
}
}
â ïž IDENTITY ID strategy â batch insert DISABLED! Use SEQUENCE with allocationSize.
Correct pagination implementation
// â BAD â fetchAll + filter in memory
List<Order> orders = orderRepository.findAll();
orders.subList(0, 20);
// â
GOOD â DB-level pagination
Page<Order> page = orderRepository.findAll(PageRequest.of(0, 20, Sort.by("createdAt").descending()));
// â
Efficient separate count + data query
@Query(value = "SELECT o FROM Order o JOIN FETCH o.customer",
countQuery = "SELECT COUNT(o) FROM Order o")
Page<Order> findAllWithCustomer(Pageable pageable);
â ïž JOIN FETCH + Pagination â Hibernate paginates in memory! A separate count query is needed.
Hibernate Statistics
@Configuration
public class HibernateStatsConfig {
@Bean
public StatisticsService statisticsService(EntityManagerFactory emf) {
SessionFactory sf = emf.unwrap(SessionFactory.class);
Statistics stats = sf.getStatistics();
stats.setStatisticsEnabled(true);
return new StatisticsService(stats);
}
}
// Logging:
// Queries: stats.getQueryExecutionCount()
// Cache hits: stats.getSecondLevelCacheHitCount()
// Slowest query: stats.getQueryExecutionMaxTimeQueryString()
Native query for large datasets
// When JPA overhead is too much, native SQL + DTO:
@Query(value = """
SELECT o.id, c.name as customer_name, o.total_amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.created_at > :since
""", nativeQuery = true)
List<Object[]> findRecentOrdersNative(@Param("since") LocalDateTime since);
// Or: Spring JDBC Template for maximum speed
@Autowired
JdbcTemplate jdbcTemplate;
public List<OrderSummary> findFast() {
return jdbcTemplate.query("SELECT ...", (rs, i) ->
new OrderSummary(rs.getLong("id"), rs.getString("name"))
);
}
5. Trade-offs
| Solution | Advantage | Disadvantage |
|---|---|---|
| JOIN FETCH | One query, efficient | Cartesian product with 2+ collections |
| @EntityGraph | Declarative, clean | LEFT JOIN, not always optimal |
| @BatchSize | Simple, no custom query | Not 1 query, but batched N |
| DTO Projection | Best performance, no dirty checking | More code, no entity features |
| Native SQL | Maximum control | Not portable, no entity management |
| L2 Cache | Dramatic speedup | Complex invalidation |
| OSIV off | Clean architecture | LazyInitializationException risk |
When to use what
| Case | Recommended solution |
|---|---|
| N+1 on a @ManyToOne | JOIN FETCH |
| N+1 on a @OneToMany | @EntityGraph or @BatchSize |
| Read-only list/report | DTO Projection |
| Rarely changing reference data | L2 Cache |
| Bulk data import | JDBC batch + flush/clear |
| Complex analytical query | Native SQL / JdbcTemplate |
6. Common Mistakes
â EAGER fetch everywhere
// BAD â loading Order ALWAYS loads customer, items, tags
@ManyToOne(fetch = FetchType.EAGER)
private Customer customer;
@OneToMany(fetch = FetchType.EAGER)
private List<OrderItem> items;
@ManyToMany(fetch = FetchType.EAGER)
private Set<Tag> tags;
// GOOD â everything LAZY, fetch only where needed
@ManyToOne(fetch = FetchType.LAZY)
private Customer customer;
â Ignoring N+1 queries
Always check Hibernate SQL logs during development:
spring:
jpa:
show-sql: true
properties:
hibernate:
format_sql: true
generate_statistics: true # query count verification
logging:
level:
org.hibernate.SQL: DEBUG
org.hibernate.type.descriptor.sql.BasicBinder: TRACE # parameters
â JOIN FETCH + Pagination
// BAD â Hibernate WARN: HHH90003004: firstResult/maxResults with collection fetch
@Query("SELECT o FROM Order o JOIN FETCH o.items")
Page<Order> findAllWithItems(Pageable pageable);
// â Hibernate paginates IN MEMORY (loads all rows)!
// GOOD â Two-query strategy:
// 1. Query IDs with pagination
// 2. Load entities with JOIN FETCH by IDs
@Query("SELECT o.id FROM Order o")
Page<Long> findOrderIds(Pageable pageable);
@Query("SELECT o FROM Order o JOIN FETCH o.items WHERE o.id IN :ids")
List<Order> findByIdsWithItems(@Param("ids") List<Long> ids);
â Accepting Open Session in View (OSIV)
# Spring Boot default: true (!)
spring:
jpa:
open-in-view: true # â Problematic in production
# GOOD:
spring:
jpa:
open-in-view: false # â
Explicit fetch, clean architecture
OSIV problems:
- Controller layer can trigger LAZY-loading queries â N+1
- DB connection held for the entire HTTP request â connection pool exhaustion
- Hard to debug because queries run silently
â Cartesian product from multiple collection fetches
// BAD â Cartesian product: items Ă tags both loaded
@Query("SELECT o FROM Order o JOIN FETCH o.items JOIN FETCH o.tags")
List<Order> findWithItemsAndTags();
// 10 items and 5 tags â 50 result rows â MultipleBagFetchException (with List)
// GOOD â two separate queries:
@EntityGraph(attributePaths = {"items"})
List<Order> findWithItems();
default List<Order> findWithItemsAndTags() {
List<Order> orders = findWithItems();
// tags LAZY â handled by @BatchSize
return orders;
}
7. Deep Dive
Open Session in View (OSIV) in detail
HTTP Request lifecycle OSIV=true:
ââ Request arrives âââââââââââââââââââââââââââââââââââââââââââ
â OpenEntityManagerInViewInterceptor opens EntityManager â
â ââ @Transactional service method âââââââââââ â
â â DB queries, entity modifications â â
â â Transaction commits â â
â ââââââââââââââââââââââââââââââââââââââââââââââ â
â Controller returns â view rendering â
â LAZY collections still accessible (EM still open!) â
â â N+1 queries from view/controller layer â
â â DB connection still held â
ââ Response sent â EntityManager closes ââââââââââââââââââââââ
With OSIV=false, LazyInitializationException in the controller/view layer signals that explicit fetch is needed. This enforces better design.
Hibernate Proxy and bytecode enhancement
Hibernate 6+ uses ByteBuddy for proxy generation:
Order order = orderRepository.findById(1L).orElseThrow();
Customer customer = order.getCustomer();
// customer.getClass() â Customer$HibernateProxy$xyz
customer.getName(); // â actual SQL runs HERE
// â SELECT * FROM customers WHERE id = ?
Bytecode enhancement alternative:
<!-- pom.xml Hibernate bytecode enhancement plugin -->
<plugin>
<groupId>org.hibernate.orm.tooling</groupId>
<artifactId>hibernate-enhance-maven-plugin</artifactId>
<configuration>
<enableLazyInitialization>true</enableLazyInitialization>
<enableDirtyTracking>true</enableDirtyTracking>
</configuration>
</plugin>
This modifies bytecode at compile-time â no runtime proxy overhead, dirty checking is faster too.
Connection pool and performance
spring:
datasource:
hikari:
maximum-pool-size: 10 # CPU core count * 2 + disk spindle count
minimum-idle: 5
connection-timeout: 30000 # 30s
idle-timeout: 600000 # 10min
max-lifetime: 1800000 # 30min
leak-detection-threshold: 60000 # 60s â can detect OSIV with this!
OSIV=true + slow view rendering â connection held for entire request. 10 connection pool + 100 concurrent requests = bottleneck.
Subselect fetch strategy
@Entity
public class Order {
@Fetch(FetchMode.SUBSELECT) // Hibernate-specific
@OneToMany(mappedBy = "order")
private List<OrderItem> items;
}
-- Uses the original query as a subselect:
SELECT * FROM order_items WHERE order_id IN (
SELECT id FROM orders WHERE status = 'ACTIVE'
)
Subselect vs BatchSize: subselect always resolves in one query, BatchSize resolves in N/batchSize queries.
Second-level cache and performance measurement
// Cache hit rate monitoring
Statistics stats = sessionFactory.getStatistics();
double hitRatio = (double) stats.getSecondLevelCacheHitCount() /
(stats.getSecondLevelCacheHitCount() + stats.getSecondLevelCacheMissCount());
// If hitRatio < 0.8 â cache is not effective, review needed
8. Interview Questions
Q: How do you solve the N+1 problem? A: JOIN FETCH with custom query (best for single associations), @EntityGraph annotation (declarative, works well with Spring Data), @BatchSize (automatic, Hibernate-specific), or DTO projection (best performance).
Q: What is Open Session in View and why is it problematic? A: OSIV keeps the session open for the entire HTTP request. Prevents LazyInitializationException but queries can run from the controller/view layer (N+1) and the DB connection is held for the entire request (connection pool exhaustion).
Q: Why shouldn't you use JOIN FETCH with pagination? A: Hibernate paginates IN MEMORY: loads all rows, then slices the list in Java. Solution: two-query strategy â first IDs with pagination, then entities with JOIN FETCH.
Q: When would you use DTO Projection instead of loading entities? A: For read-only lists, reports, where dirty checking isn't needed and you won't modify entities. Loads less data (only needed columns), no proxy overhead.
Q: What's the difference between JOIN FETCH and @EntityGraph? A: JOIN FETCH uses INNER JOIN (no results where relationship doesn't exist). @EntityGraph uses LEFT JOIN (returns nulls). @EntityGraph is declarative and works without @Query.
Q: Why is IDENTITY ID strategy not batchable and why does it matter? A: IDENTITY uses database auto-increment which only returns the ID after INSERT. Hibernate can't batch â each INSERT is a separate roundtrip. SEQUENCE with allocationSize is batchable.
Q: How would you configure Hibernate batch?
A: hibernate.jdbc.batch_size=50, hibernate.order_inserts=true, hibernate.order_updates=true. In code: flush/clear cycle per batch size. ID strategy: SEQUENCE.
9. Glossary
| Term | Meaning |
|---|---|
| N+1 problem | 1 + N queries due to LAZY associations |
| JOIN FETCH | Explicit JOIN in JPQL to load LAZY associations |
| @EntityGraph | Declarative fetch plan annotation |
| @BatchSize | Batched IN query for LAZY collections |
| DTO Projection | Querying only needed columns, no entity |
| OSIV | Open Session in View â session open during request |
| Cartesian product | Multiple collection fetch â nĂm result rows |
| Proxy | ByteBuddy/CGLIB generated wrapper for LAZY loading |
| LazyInitializationException | LAZY proxy access with closed session |
| Batch insert | Multiple INSERTs in one DB roundtrip |
| Connection pool | Reusable pool of DB connections |
| Hibernate Statistics | Runtime metrics (query count, cache hit) |
10. Cheatsheet
N+1 SOLUTIONS:
JOIN FETCH one query, INNER JOIN
@EntityGraph declarative, LEFT JOIN
@BatchSize(size=25) batched IN query
DTO Projection best performance, no entity
@Fetch(SUBSELECT) original query as subselect
FETCH STRATEGY GOLDEN RULE:
Everything LAZY â explicit fetch what you need
@ManyToOne default EAGER â set to LAZY
@OneToOne default EAGER â set to LAZY
OSIV:
spring.jpa.open-in-view=false â
production
spring.jpa.open-in-view=true â problematic
JOIN FETCH LIMITATIONS:
2+ Collection fetch â Cartesian product
+ Pagination â paginates in memory
Solution: 2-query strategy (IDs + fetch)
BATCH INSERT:
hibernate.jdbc.batch_size=50
hibernate.order_inserts=true
GenerationType.SEQUENCE (not IDENTITY!)
flush() + clear() in cycles
SQL DEBUG:
spring.jpa.show-sql=true
hibernate.format_sql=true
hibernate.generate_statistics=true
org.hibernate.SQL=DEBUG
CONNECTION POOL:
hikari.maximum-pool-size = CPU*2 + disk
hikari.leak-detection-threshold = 60000
OSIV + slow view â connection exhaustion
đź Games
10 questions