Spring Data
repository pattern, CrudRepository, JpaRepository, query derivation, custom queries
Spring Data elevates the repository pattern to framework level: it generates implementations from interface declarations and derives queries from method names.
1. Definition
Spring Data is an umbrella project that provides a unified, repository-based programming model for various data stores (JPA, MongoDB, Redis, Elasticsearch, JDBC, R2DBC). The developer declares an interface, and Spring Data implements CRUD operations and queries at runtime through a proxy.
The central idea: eliminate persistence layer boilerplate by having the framework generate queries from method names, annotations, or Specifications.
Interface declaration → Spring Data proxy → JPA/JDBC/Mongo query
UserRepository → SimpleJpaRepository → SELECT ...
Spring Boot auto-configures the DataSource, EntityManagerFactory, and transaction manager with the spring-boot-starter-data-jpa starter.
2. Core Concepts
Repository hierarchy
| Interface | Role |
|---|---|
| Repository<T,ID> | Marker interface, no methods |
| CrudRepository<T,ID> | save, findById, findAll, deleteById, count |
| ListCrudRepository<T,ID> | Like CrudRepository but with List<T> return types |
| PagingAndSortingRepository<T,ID> | findAll(Pageable), findAll(Sort) |
| JpaRepository<T,ID> | Flush, batch delete, getReferenceById, JPA-specific |
JpaRepository combines ListCrudRepository + ListPagingAndSortingRepository + JPA-specific methods. Most projects extend this interface.
Query derivation — SQL from method names
Spring Data parses the keywords in a method name and builds the query from them. The comments show the equivalent JPQL:
public interface UserRepository extends JpaRepository<User, Long> {
// SELECT u FROM User u WHERE u.email = ?1
Optional<User> findByEmail(String email);
// SELECT u FROM User u WHERE u.lastName = ?1 AND u.active = ?2
List<User> findByLastNameAndActive(String lastName, boolean active);
// SELECT u FROM User u WHERE u.age > ?1 ORDER BY u.lastName ASC
List<User> findByAgeGreaterThanOrderByLastNameAsc(int age);
// SELECT COUNT(u) FROM User u WHERE u.active = ?1
long countByActive(boolean active);
// Loads the matching users and removes them one by one (not a single bulk DELETE)
void deleteByActiveFalse();
}
Keywords: And, Or, Between, LessThan, GreaterThan, Like, In, IsNull, IsNotNull, OrderBy, Not, True, False, Top, First, Distinct.
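To make the derivation idea concrete, here is a minimal plain-JDK sketch of how a method name could be split into conditions. It is an illustration under stated assumptions, not Spring Data's actual parser: `DerivedQuerySketch` and `toJpql` are hypothetical names, and only `And`, `Or`, `GreaterThan`, `LessThan` and the `count` prefix are handled (no `OrderBy`, `Between`, etc.).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DerivedQuerySketch {

    // A connector only counts when an uppercase letter follows,
    // so a property such as "OrderId" is not split at "Or".
    private static final Pattern CONNECTOR = Pattern.compile("(And|Or)(?=\\p{Lu})");

    // Translates a small subset of derived-query names into JPQL-like text.
    static String toJpql(String methodName, String entity) {
        int by = methodName.indexOf("By");
        if (by < 0) {
            throw new IllegalArgumentException("No 'By' in " + methodName);
        }
        String select = methodName.startsWith("count") ? "SELECT COUNT(e)" : "SELECT e";
        String predicate = methodName.substring(by + 2);

        StringBuilder where = new StringBuilder();
        Matcher m = CONNECTOR.matcher(predicate);
        int start = 0;
        int param = 1;
        while (m.find()) {
            where.append(condition(predicate.substring(start, m.start()), param++));
            where.append(' ').append(m.group(1).toUpperCase()).append(' ');
            start = m.end();
        }
        where.append(condition(predicate.substring(start), param));
        return select + " FROM " + entity + " e WHERE " + where;
    }

    // Maps one name part such as "AgeGreaterThan" to "e.age > ?1".
    private static String condition(String part, int index) {
        String operator = "=";
        if (part.endsWith("GreaterThan")) {
            operator = ">";
            part = part.substring(0, part.length() - "GreaterThan".length());
        } else if (part.endsWith("LessThan")) {
            operator = "<";
            part = part.substring(0, part.length() - "LessThan".length());
        }
        String property = Character.toLowerCase(part.charAt(0)) + part.substring(1);
        return "e." + property + " " + operator + " ?" + index;
    }
}
```

Calling `DerivedQuerySketch.toJpql("findByLastNameAndActive", "User")` yields `SELECT e FROM User e WHERE e.lastName = ?1 AND e.active = ?2`. The uppercase lookahead is the reason `findByOrderIdOrEmail` splits into `orderId` and `email` rather than breaking inside `OrderId`.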
@Query — manual JPQL/SQL
@Query("SELECT u FROM User u WHERE u.email LIKE %:domain")
List<User> findByEmailDomain(@Param("domain") String domain);
@Query(value = "SELECT * FROM users WHERE status = :status", nativeQuery = true)
List<User> findByStatusNative(@Param("status") String status);
@Modifying
@Query("UPDATE User u SET u.active = false WHERE u.lastLogin < :date")
int deactivateInactiveUsers(@Param("date") LocalDate date);
Projection — selective field retrieval
// Interface-based (closed) projection
public interface UserSummary {
String getName();
String getEmail();
}
List<UserSummary> findByActive(boolean active);
// DTO-based (class) projection
@Query("SELECT new com.example.dto.UserDto(u.name, u.email) FROM User u")
List<UserDto> findAllAsDto();
3. Practical Usage
Defining a repository
@Repository
public interface OrderRepository extends JpaRepository<Order, UUID> {
List<Order> findByCustomerIdAndStatus(Long customerId, OrderStatus status);
@Query("SELECT o FROM Order o JOIN FETCH o.items WHERE o.id = :id")
Optional<Order> findByIdWithItems(@Param("id") UUID id);
Page<Order> findByStatus(OrderStatus status, Pageable pageable);
}
@Repository is optional on a Spring Data repository interface: Spring Boot registers the interface as a bean automatically, and the repository proxy already applies persistence exception translation. The explicit annotation only signals intent.
Pageable and Sort
@GetMapping("/orders")
public Page<OrderDto> findAll(
@RequestParam(defaultValue = "0") int page,
@RequestParam(defaultValue = "20") int size,
@RequestParam(defaultValue = "createdAt") String sortBy) {
Pageable pageable = PageRequest.of(page, size, Sort.by(sortBy).descending());
return orderRepository.findByStatus(OrderStatus.ACTIVE, pageable)
.map(OrderDto::from);
}
Page<T> contains the data, the total element count, and pagination metadata.
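The pagination metadata follows from three numbers: the zero-based page index, the page size, and the total element count. The arithmetic can be sketched as below; `PageMathSketch` and its method names are hypothetical, and the class only mirrors the calculation, not the `Page` API itself.

```java
final class PageMathSketch {

    // Number of pages needed to hold totalElements rows (rounded up).
    static int totalPages(long totalElements, int size) {
        return (int) ((totalElements + size - 1) / size);
    }

    // Row offset of the first element on the given zero-based page.
    static long offset(int page, int size) {
        return (long) page * size;
    }

    // True when at least one more page follows the given zero-based page.
    static boolean hasNext(int page, int size, long totalElements) {
        return page + 1 < totalPages(totalElements, size);
    }
}
```

With 45 rows and a page size of 20 there are 3 pages; page index 2 starts at offset 40 and is the last one. The total count is the only input that needs its own query.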
Custom repository implementation
public interface OrderRepositoryCustom {
List<Order> findByComplexCriteria(OrderSearchCriteria criteria);
}
@Repository
public class OrderRepositoryCustomImpl implements OrderRepositoryCustom {
private final EntityManager em;
public OrderRepositoryCustomImpl(EntityManager em) {
this.em = em;
}
@Override
public List<Order> findByComplexCriteria(OrderSearchCriteria criteria) {
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Order> cq = cb.createQuery(Order.class);
Root<Order> root = cq.from(Order.class);
List<Predicate> predicates = new ArrayList<>();
if (criteria.getStatus() != null) {
predicates.add(cb.equal(root.get("status"), criteria.getStatus()));
}
if (criteria.getMinAmount() != null) {
predicates.add(cb.ge(root.get("totalAmount"), criteria.getMinAmount()));
}
cq.where(predicates.toArray(new Predicate[0]));
return em.createQuery(cq).getResultList();
}
}
// OrderRepository extends both:
public interface OrderRepository
extends JpaRepository<Order, UUID>, OrderRepositoryCustom {}
Specification — dynamic queries
public class OrderSpecs {
public static Specification<Order> hasStatus(OrderStatus status) {
return (root, query, cb) -> cb.equal(root.get("status"), status);
}
public static Specification<Order> createdAfter(LocalDate date) {
return (root, query, cb) -> cb.greaterThan(root.get("createdAt"), date);
}
}
// Usage:
public interface OrderRepository
extends JpaRepository<Order, UUID>, JpaSpecificationExecutor<Order> {}
List<Order> orders = orderRepository.findAll(
OrderSpecs.hasStatus(ACTIVE).and(OrderSpecs.createdAfter(cutoff))
);
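The value of this style is that conditions are small objects that can be combined at runtime. The following plain-JDK sketch models that composition in memory. `Spec`, `OrderRow` and `SpecSketch` are hypothetical stand-ins; a real `Specification` produces a JPA `Predicate` for the database instead of filtering a list.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for Specification<T>, evaluated in memory rather than translated to SQL.
interface Spec<T> {
    boolean matches(T candidate);

    // Combines this condition with another one; both must hold.
    default Spec<T> and(Spec<T> other) {
        return candidate -> matches(candidate) && other.matches(candidate);
    }

    // Neutral starting point: matches everything.
    static <T> Spec<T> all() {
        return candidate -> true;
    }
}

final class OrderRow {
    final String status;
    final int amount;

    OrderRow(String status, int amount) {
        this.status = status;
        this.amount = amount;
    }
}

final class SpecSketch {

    static Spec<OrderRow> hasStatus(String status) {
        return row -> row.status.equals(status);
    }

    static Spec<OrderRow> amountAtLeast(int min) {
        return row -> row.amount >= min;
    }

    // Adds a condition only for the criteria that were supplied (non-null).
    static Spec<OrderRow> fromCriteria(String status, Integer minAmount) {
        Spec<OrderRow> spec = Spec.all();
        if (status != null) {
            spec = spec.and(hasStatus(status));
        }
        if (minAmount != null) {
            spec = spec.and(amountAtLeast(minAmount));
        }
        return spec;
    }

    static List<OrderRow> filter(List<OrderRow> rows, Spec<OrderRow> spec) {
        return rows.stream().filter(spec::matches).collect(Collectors.toList());
    }
}
```

A search form maps naturally onto `fromCriteria`: empty fields arrive as null and simply contribute no condition, so one method covers every combination of filters.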
4. Code Examples
Auditing — automatic created/modified timestamps
@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class BaseEntity {
@CreatedDate
@Column(updatable = false)
private LocalDateTime createdAt;
@LastModifiedDate
private LocalDateTime updatedAt;
@CreatedBy
@Column(updatable = false)
private String createdBy;
@LastModifiedBy
private String updatedBy;
}
@Configuration
@EnableJpaAuditing
public class AuditConfig {
@Bean
public AuditorAware<String> auditorProvider() {
return () -> Optional.ofNullable(
SecurityContextHolder.getContext().getAuthentication())
.map(Authentication::getName);
}
}
QueryByExample — simple dynamic search
User probe = new User();
probe.setActive(true);
probe.setRole("ADMIN");
ExampleMatcher matcher = ExampleMatcher.matching()
.withIgnoreCase()
.withStringMatcher(StringMatcher.CONTAINING);
List<User> admins = userRepository.findAll(Example.of(probe, matcher));
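The matching rule behind a probe can be modelled in a few lines: null properties are ignored, strings are compared with a case-insensitive "contains", and everything else must be equal. The sketch below is a plain-JDK approximation of that rule with hypothetical names (`Person`, `ExampleMatchSketch`); it compares objects in memory via reflection and is not how Spring Data builds the query.

```java
import java.lang.reflect.Field;
import java.util.Locale;

// Hypothetical entity; wrapper types let "not set" be represented as null.
final class Person {
    final String name;
    final String role;
    final Boolean active;

    Person(String name, String role, Boolean active) {
        this.name = name;
        this.role = role;
        this.active = active;
    }
}

final class ExampleMatchSketch {

    // True when every non-null probe field matches the candidate; null probe fields are ignored.
    // Probe and candidate are assumed to be instances of the same class.
    static boolean matches(Object probe, Object candidate) {
        for (Field field : probe.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue;
            }
            try {
                Object expected = field.get(probe);
                if (expected == null) {
                    continue;
                }
                Object actual = field.get(candidate);
                if (expected instanceof String) {
                    // Mirrors CONTAINING plus ignore-case matching for strings.
                    if (!(actual instanceof String)
                            || !lower(actual).contains(lower(expected))) {
                        return false;
                    }
                } else if (!expected.equals(actual)) {
                    return false;
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return true;
    }

    private static String lower(Object value) {
        return ((String) value).toLowerCase(Locale.ROOT);
    }
}
```

One practical consequence of the null rule: a primitive `boolean` or `int` property can never be "unset", so its default value always takes part in the match. Wrapper types on the entity, or ignoring that property in the matcher, avoid accidental filtering.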
Stream for large dataset processing
@Transactional(readOnly = true)
public void exportAllUsers(Writer writer) {
try (Stream<User> stream = userRepository.streamAllBy()) {
stream.map(UserDto::from)
.forEach(dto -> writeCsv(writer, dto));
}
}
⚠️ Stream<T> requires an open transaction and a database cursor. Always close it with try-with-resources.
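The reason for try-with-resources is that a terminal operation such as `forEach` does not close a `java.util.stream.Stream`; only `close()` runs the registered close handlers. The JDK-only sketch below shows this with an `onClose` hook standing in for releasing the database cursor. `StreamCloseSketch` and `consume` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

final class StreamCloseSketch {

    // Returns whether the release hook ran; the hook stands in for closing a cursor.
    static boolean consume(boolean useTryWithResources) {
        AtomicBoolean released = new AtomicBoolean(false);
        Stream<String> rows = Stream.of("a", "b", "c").onClose(() -> released.set(true));
        if (useTryWithResources) {
            try (Stream<String> s = rows) {
                s.forEach(row -> { });
            }
        } else {
            // The terminal operation alone does not call close().
            rows.forEach(row -> { });
        }
        return released.get();
    }
}
```

`consume(true)` returns true and `consume(false)` returns false: without the try block the hook never runs, which in the repository case would leave the underlying resources open until the transaction ends.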
Soft delete with Specification
public class SoftDeleteSpec {
public static <T> Specification<T> notDeleted() {
return (root, query, cb) -> cb.isFalse(root.get("deleted"));
}
}
// Not automatic: the spec must be added explicitly to every query that should filter:
List<Order> activeOrders = orderRepository.findAll(
OrderSpecs.hasStatus(ACTIVE).and(SoftDeleteSpec.notDeleted())
);
5. Trade-offs
| Advantage | Disadvantage |
|---|---|
| No boilerplate CRUD code | Generated queries are not always optimal |
| Query derivation for fast prototyping | Long method names become unreadable |
| Unified API (JPA, Mongo, Redis) | Store-specific optimizations are lost |
| Built-in Pageable/Sort | COUNT queries are expensive on large tables |
| Audit, Specification, Projection | Learning curve for advanced features |
When to use Spring Data
- Standard CRUD + queries on a relational database
- Rapid prototyping with a simple domain model
- Unified access across multiple data stores
When NOT to use Spring Data (repository abstraction)
- Complex analytical queries (JOOQ or native SQL is better)
- Extreme performance requirements (JdbcTemplate, raw SQL)
- Non-relational models where the repository pattern does not fit
6. Common Mistakes
❌ Excessively long derived query names
// BAD: unreadable
List<User> findByActiveAndRoleAndCreatedAtAfterAndEmailContaining(
boolean active, String role, LocalDate date, String email);
// GOOD: use @Query or Specification
@Query("SELECT u FROM User u WHERE u.active = :active AND u.role = :role " +
"AND u.createdAt > :date AND u.email LIKE %:email%")
List<User> searchUsers(@Param("active") boolean active,
@Param("role") String role,
@Param("date") LocalDate date,
@Param("email") String email);
❌ Using Page when you don't need total count
// BAD: Page<T> normally triggers an additional COUNT query
Page<Order> page = orderRepository.findAll(pageable);
// GOOD: declare the query method with a Slice return type if the total count is not needed
Slice<Order> slice = orderRepository.findByStatus(status, pageable);
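A slice can answer "is there a next page?" without counting by reading one row more than the page size. The sketch below models that lookahead technique over an in-memory list. `SliceSketch`, `SliceResult` and `fetch` are hypothetical names, and the list stands in for a query with a limit of `size + 1`.

```java
import java.util.List;

final class SliceSketch {

    // Result holder: the page content plus a hasNext flag, with no total count.
    static final class SliceResult<T> {
        final List<T> content;
        final boolean hasNext;

        SliceResult(List<T> content, boolean hasNext) {
            this.content = content;
            this.hasNext = hasNext;
        }
    }

    // 'rows' stands in for a table; page is zero-based.
    static <T> SliceResult<T> fetch(List<T> rows, int page, int size) {
        int from = Math.min(page * size, rows.size());
        int to = Math.min(from + size + 1, rows.size());
        List<T> window = rows.subList(from, to);   // at most size + 1 rows
        boolean hasNext = window.size() > size;    // the extra row proves a next page exists
        List<T> content = hasNext ? window.subList(0, size) : window;
        return new SliceResult<>(content, hasNext);
    }
}
```

For five rows and a page size of two, page 0 returns two rows with `hasNext` true, and page 2 returns the single remaining row with `hasNext` false. The cost stays proportional to the page size, regardless of table size.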
❌ Missing @Modifying on UPDATE/DELETE @Query
// BAD: throws exception — not a SELECT query
@Query("DELETE FROM User u WHERE u.active = false")
void deleteInactive();
// GOOD: @Modifying signals a write operation
@Modifying
@Query("DELETE FROM User u WHERE u.active = false")
void deleteInactive();
❌ N+1 queries in findAll
// BAD: if Order has a LAZY items collection
List<Order> orders = orderRepository.findAll();
orders.forEach(o -> o.getItems().size()); // N extra queries!
// GOOD: JOIN FETCH with custom query
@Query("SELECT o FROM Order o JOIN FETCH o.items")
List<Order> findAllWithItems();
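The cost difference is easy to quantify with a toy model that only counts queries. In the sketch below, `CountingOrderStore` and `NPlusOneSketch` are hypothetical classes; each method call stands in for one SQL statement, and no real database or JPA provider is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory "database" that counts how many queries it receives.
final class CountingOrderStore {
    private final Map<Integer, List<String>> itemsByOrder = new LinkedHashMap<>();
    int queries = 0;

    CountingOrderStore(int orders) {
        for (int id = 1; id <= orders; id++) {
            itemsByOrder.put(id, List.of("item-" + id));
        }
    }

    // Stands in for "SELECT o FROM Order o".
    List<Integer> findOrderIds() {
        queries++;
        return new ArrayList<>(itemsByOrder.keySet());
    }

    // Stands in for the lazy load of one order's items.
    List<String> loadItems(int orderId) {
        queries++;
        return itemsByOrder.get(orderId);
    }

    // Stands in for "SELECT o FROM Order o JOIN FETCH o.items".
    Map<Integer, List<String>> findOrdersWithItems() {
        queries++;
        return itemsByOrder;
    }
}

final class NPlusOneSketch {

    // One query for the orders plus one per order for its items.
    static int lazyQueryCount(int orders) {
        CountingOrderStore store = new CountingOrderStore(orders);
        for (int id : store.findOrderIds()) {
            store.loadItems(id).size();
        }
        return store.queries;
    }

    // A single query that returns orders together with their items.
    static int fetchJoinQueryCount(int orders) {
        CountingOrderStore store = new CountingOrderStore(orders);
        store.findOrdersWithItems().values().forEach(List::size);
        return store.queries;
    }
}
```

For 10 orders the lazy traversal issues 11 queries and the fetch-join variant issues 1. The gap grows linearly with the number of parent rows.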
❌ Unnecessary flush() and saveAndFlush()
JPA automatically flushes at transaction commit. Explicit flush() is only needed when you must see a generated ID or a DB constraint violation immediately.
7. Deep Dive
The proxy mechanism under the hood
- Spring Boot classpath scanning discovers JpaRepository subinterfaces
- JpaRepositoryFactoryBean creates a SimpleJpaRepository proxy
- Query derivation tokenizes the method name and converts it to a CriteriaQuery
- @Query annotations are registered as NamedQuery or NativeQuery
- Every call goes through the SharedEntityManagerCreator session
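The core trick in the steps above, an interface that gets its implementation at runtime, can be pictured with a plain JDK dynamic proxy. This is a minimal sketch and not Spring Data's actual implementation: `NoteRepository` and `RepositoryProxySketch` are hypothetical names, and an in-memory map stands in for the EntityManager.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical repository interface; only the declaration is written by hand.
interface NoteRepository {
    String save(Long id, String text);
    Optional<String> findById(Long id);
    long count();
}

final class RepositoryProxySketch {

    // Builds a runtime implementation of NoteRepository backed by an in-memory map.
    static NoteRepository create() {
        Map<Long, String> store = new HashMap<>();
        // Every interface call lands here and is dispatched by method name.
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "save":
                    store.put((Long) args[0], (String) args[1]);
                    return args[1];
                case "findById":
                    return Optional.ofNullable(store.get((Long) args[0]));
                case "count":
                    return (long) store.size();
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (NoteRepository) Proxy.newProxyInstance(
                NoteRepository.class.getClassLoader(),
                new Class<?>[] {NoteRepository.class},
                handler);
    }
}
```

The caller only ever sees `NoteRepository`. In the same spirit, a Spring Data repository proxy routes standard CRUD calls to a shared base implementation and query methods to the query machinery, which is why no hand-written class is needed.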
Specification vs QueryDSL vs @Query
| Approach | Type safety | Dynamic | Complexity |
|---|---|---|---|
| Query derivation | Startup-time (method name is validated) | No | Low |
| @Query JPQL | None (string) | No | Medium |
| Specification | Predicate-level | Yes | Medium |
| QueryDSL | Q-class level | Yes | Medium |
| Criteria API | Metamodel level | Yes | High |
Spring Data JDBC vs JPA
| Aspect | Spring Data JPA | Spring Data JDBC |
|---|---|---|
| ORM | Hibernate (full ORM) | No ORM, aggregate root |
| Lazy loading | Yes | No |
| Cache | 1st + 2nd level | None |
| Dirty checking | Automatic | None |
| Complexity | High | Low |
Spring Data JDBC is a good choice when you don't need Hibernate complexity but the repository pattern is still useful.
Derived delete gotcha
deleteBy... methods load the entities first, then delete them one by one (cascade, lifecycle hooks). For many rows, @Modifying @Query is more efficient because it runs a single SQL DELETE.
8. Interview Questions
What is the difference between CrudRepository and JpaRepository? CrudRepository: basic CRUD (save, findById, delete, count). JpaRepository: CrudRepository + flush, batch delete, getReferenceById, Pageable/Sort. Most projects use JpaRepository.
How does query derivation work? Spring Data tokenizes the method name (findBy, And, OrderBy, etc.) and generates a JPQL/Criteria query. It validates entity properties at startup.
When do you use @Query instead of derivation? When the method name is too long (3+ conditions), when JOIN FETCH is needed, when native SQL is needed, or when JPQL aggregation is needed (SUM, AVG).
What is Specification and when do you use it? A type-safe, reusable predicate builder for dynamic queries. Best for search forms where conditions are combined at runtime.
What is the difference between Page and Slice? Page: includes totalCount (extra COUNT query). Slice: only indicates if there is a next page. Slice is faster on large tables.
How do you solve the N+1 problem with Spring Data? JOIN FETCH with a custom @Query, @EntityGraph, or @BatchSize on the collection.
When is Spring Data JDBC better than JPA? When you don't need lazy loading, dirty checking, L2 cache, and prefer a simpler aggregate root model.
9. Glossary
| Term | Meaning |
|---|---|
| Repository | Data access interface implemented by Spring Data proxy |
| Query derivation | Automatic query generation from method names |
| @Query | Manual JPQL or native SQL annotation |
| Projection | Selective field retrieval via interface or DTO |
| Specification | Type-safe, composable predicate builder |
| Pageable | Pagination request (page, size, sort) |
| Page<T> | Pagination response with totalCount |
| Slice<T> | Pagination response without totalCount |
| @Modifying | Marks a @Query as a write operation (UPDATE/DELETE) |
| Auditing | @CreatedDate, @LastModifiedDate automatic timestamps |
| SimpleJpaRepository | Default implementation class that the JpaRepository proxy delegates to |
| EntityGraph | Declarative fetch strategy specification |
10. Cheatsheet
REPOSITORY HIERARCHY:
Repository Marker interface
CrudRepository save, findById, findAll, deleteById, count
JpaRepository + flush, batch, getReferenceById, Pageable
QUERY METHODS:
findByX() Query derivation (from method name)
@Query("JPQL") Manual JPQL
@Query(nativeQuery) Native SQL
Specification Dynamic, composable predicates
QueryByExample Search from probe object
PAGINATION:
Pageable PageRequest.of(page, size, Sort)
Page<T> Data + totalCount (COUNT query)
Slice<T> Data + hasNext (no COUNT)
WRITE QUERIES:
@Modifying UPDATE/DELETE @Query marker
@Modifying(clear) flushAutomatically, clearAutomatically
AUDITING:
@EnableJpaAuditing Configuration activation
@CreatedDate Auto creation timestamp
@LastModifiedDate Auto modification timestamp
AuditorAware<T> User name for audit
CUSTOM REPOSITORY:
XxxRepositoryCustom Interface
XxxRepositoryCustomImpl Implementation (EntityManager)
JpaSpecificationExecutor Specification support