Spring Data JPA is one of those frameworks that is very easy to misuse because the simple path works fine until it does not. You can get a basic CRUD application running in minutes, but the defaults that make that quick start easy are not the right choices for production.
The things that bite teams most often: loading full entities when they only need a couple of fields, not understanding how lazy loading interacts with the persistence context lifetime, letting N+1 queries slip through because they are invisible in development, and writing dynamic query logic as a pile of concatenated strings.
Here is what I have found actually works.
Repository Hierarchy: Pick the Right Interface
Spring Data JPA gives you three repository interfaces:
- CrudRepository<T, ID>: save, findById, findAll, delete, count
- PagingAndSortingRepository<T, ID>: adds findAll(Pageable) and findAll(Sort)
- JpaRepository<T, ID>: adds flush(), saveAndFlush(), deleteAllInBatch() (the replacement for the deprecated deleteInBatch()), and findAll variants that return List instead of Iterable
In application code, extend JpaRepository. The extra methods are worth it, and there is no meaningful downside.
@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {

    // Spring Data derives the query from the method name
    List<Order> findByCustomerIdAndStatus(Long customerId, OrderStatus status);

    // Exists check without loading the entity
    boolean existsByEmailAndStatus(String email, OrderStatus status);

    // Count without loading entities
    long countByStatusAndCreatedAtAfter(OrderStatus status, LocalDateTime since);
}
Method name derivation works well for simple conditions. Once you need joins, OR conditions, or anything involving aggregation, stop—write an explicit query instead. Method names that require reading the documentation to understand are not worth the cleverness.
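To get a feel for why long method names become hard to read, it helps to see roughly what derivation does with them. The sketch below is a deliberately simplified stand-in and not Spring Data's real parser: the class and method names are hypothetical, it handles only equality conditions joined by And, and it would mis-split a property whose name itself contains "And".

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified illustration of query derivation; NOT Spring Data's actual parser. */
final class DerivedQuerySketch {

    /** Takes the part after "By", splits it on "And", and lower-cases each first letter. */
    static List<String> properties(String methodName) {
        int by = methodName.indexOf("By");
        if (by < 0) {
            throw new IllegalArgumentException("No 'By' in " + methodName);
        }
        List<String> result = new ArrayList<>();
        for (String part : methodName.substring(by + 2).split("And")) {
            result.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
        }
        return result;
    }

    /** Builds a JPQL-like string with one positional equality condition per property. */
    static String toJpql(String entity, String methodName) {
        List<String> conditions = new ArrayList<>();
        int index = 1;
        for (String property : properties(methodName)) {
            conditions.add("e." + property + " = ?" + index++);
        }
        return "SELECT e FROM " + entity + " e WHERE " + String.join(" AND ", conditions);
    }
}
```

Calling DerivedQuerySketch.toJpql("Order", "findByCustomerIdAndStatus") yields a two-condition WHERE clause. Every extra keyword (After, In, OrderBy, and so on) adds another rule the reader has to decode from the name, which is the point at which an explicit query is easier to maintain.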
Custom Queries with @Query
For anything beyond simple field equality, write the JPQL explicitly:
@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("""
        SELECT o FROM Order o
        JOIN FETCH o.customer c
        WHERE o.status = :status
          AND o.createdAt >= :since
        ORDER BY o.createdAt DESC
        """)
    List<Order> findRecentByStatus(
        @Param("status") OrderStatus status,
        @Param("since") LocalDateTime since
    );

    // Native SQL when JPQL cannot express what you need
    @Query(
        value = "SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = :year",
        nativeQuery = true
    )
    List<Order> findByYear(@Param("year") int year);

    // Modifying queries require @Modifying and a transaction
    @Modifying
    @Transactional
    @Query("UPDATE Order o SET o.status = :status WHERE o.id IN :ids")
    int updateStatusBatch(@Param("status") OrderStatus status, @Param("ids") List<Long> ids);
}
A few things to note:
- JOIN FETCH in the first @Query method above is intentional: it prevents the N+1 problem for customer associations.
- Native queries bypass JPQL’s portability but let you use database-specific syntax, window functions, or CTEs.
- @Modifying is required for any DML statement. Without @Transactional, the update will fail if there is no active transaction.
Projections: Only Fetch What You Need
Loading a full Order entity to display a summary table is wasteful. Every field gets loaded from the database, the entity sits in the persistence context, and Hibernate tracks it for changes you will never make.
Use interface projections for read-only data:
// Spring Data generates the implementation at runtime
public interface OrderSummary {
    Long getId();
    String getStatus();
    BigDecimal getTotal();
    String getCustomerName(); // This can map to a nested property
}

@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {

    // Spring Data automatically generates a SELECT for the projected fields only
    List<OrderSummary> findByCustomerId(Long customerId);

    // Works with @Query too: just select the matching columns/aliases
    @Query("""
        SELECT o.id AS id, o.status AS status, o.total AS total,
               c.name AS customerName
        FROM Order o JOIN o.customer c
        WHERE o.customerId = :customerId
        """)
    List<OrderSummary> findSummariesByCustomerId(@Param("customerId") Long customerId);
}
For cases where you want a concrete class instead of an interface:
// Component types must match the selected expressions; status is the OrderStatus enum
public record OrderSummaryDto(Long id, OrderStatus status, BigDecimal total) {}

@Repository
public interface OrderRepository extends JpaRepository<Order, Long> {

    @Query("""
        SELECT new com.example.dto.OrderSummaryDto(o.id, o.status, o.total)
        FROM Order o WHERE o.customerId = :customerId
        """)
    List<OrderSummaryDto> findDtosByCustomerId(@Param("customerId") Long customerId);
}
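The new expression in JPQL calls the record or class constructor directly, so the class name must be fully qualified and the constructor arguments must match the selected expressions in order and type. This avoids the proxy overhead of interface projections and works naturally with records and other immutable classes.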
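To make the "proxy overhead" of interface projections concrete, here is a minimal plain-Java sketch of the general idea: a dynamic proxy whose getters read from a row of aliased values. It is an illustration under assumptions, not Spring Data's implementation; OrderSummaryView and ProjectionProxySketch are hypothetical names, and the row is modeled as a simple Map.

```java
import java.lang.reflect.Proxy;
import java.util.Map;

/** Hypothetical projection interface, mirroring a subset of OrderSummary. */
interface OrderSummaryView {
    Long getId();
    String getStatus();
}

final class ProjectionProxySketch {

    /** Creates a proxy whose getters return the row value stored under the matching alias. */
    @SuppressWarnings("unchecked")
    static <T> T project(Class<T> type, Map<String, Object> row) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                (proxy, method, args) -> {
                    String name = method.getName();
                    if (name.startsWith("get") && name.length() > 3) {
                        // getStatus -> "status"
                        String alias = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                        return row.get(alias);
                    }
                    // Sketch only: toString, equals, and hashCode are not handled
                    throw new UnsupportedOperationException(name);
                });
    }
}
```

Every getter call goes through reflection and a map lookup, whereas a record built by a constructor expression is an ordinary object with plain fields. For most read paths the difference is small, so convenience is usually the deciding factor.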
Pagination: Never Return Unbounded Collections
Any endpoint that lists records needs pagination. A collection that fits in memory today will not when the dataset grows.
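@Service
@Transactional(readOnly = true)
public class OrderService {
    private final OrderRepository orderRepository;

    // Constructor injection; without it the final field is never assigned
    public OrderService(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    public Page<OrderSummary> getOrdersByCustomer(Long customerId, int page, int size) {
        // Page index is zero-based
        Pageable pageable = PageRequest.of(page, size, Sort.by("createdAt").descending());
        return orderRepository.findByCustomerId(customerId, pageable);
    }
}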
The repository method signature:
Page<OrderSummary> findByCustomerId(Long customerId, Pageable pageable);
Spring Data automatically adds LIMIT, OFFSET, and a count query. The Page<T> response includes:
- getContent(): the records for this page
- getTotalElements(): total record count
- getTotalPages(): total page count
- hasNext() / hasPrevious(): navigation helpers
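The arithmetic behind those helpers is simple enough to sketch in plain Java. The record below is a hypothetical illustration (PageMath is not a Spring type); it assumes a zero-based page index, which is what PageRequest.of uses.

```java
/** Plain-Java illustration of the arithmetic behind a page; not a Spring class. */
record PageMath(int page, int size, long totalElements) {

    /** Rows to skip: zero-based page index times page size. */
    long offset() {
        return (long) page * size;
    }

    /** Ceiling division of the total by the page size. */
    int totalPages() {
        return size == 0 ? 1 : (int) Math.ceil((double) totalElements / size);
    }

    /** True when at least one more page follows this one. */
    boolean hasNext() {
        return page + 1 < totalPages();
    }

    /** True for every page except the first. */
    boolean hasPrevious() {
        return page > 0;
    }
}
```

For example, new PageMath(2, 20, 95) describes the third page of 95 records: it skips 40 rows and is followed by two more pages. Note that totalPages() depends on totalElements, which is the value the extra count query exists to provide.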
When Page is too expensive: The count query on large tables is slow. If you only need infinite scroll or “load more” behavior, use Slice<T> instead. It skips the count query and only tells you if a next page exists.
Slice<OrderSummary> findByStatus(OrderStatus status, Pageable pageable);
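A common way to answer "is there a next page?" without counting is to ask the database for one row more than the page size. The sketch below shows that trick in plain Java; SliceSketch is a hypothetical type, and the assumption that the caller fetched with a limit of size + 1 is mine, stated here rather than taken from the text above.

```java
import java.util.List;

/** Hypothetical stand-in for a slice: the page content plus a "has next" flag. */
record SliceSketch<T>(List<T> content, boolean hasNext) {

    /**
     * Builds a slice from rows that were fetched with a limit of size + 1.
     * A surplus row proves another page exists, so no COUNT query is needed.
     */
    static <T> SliceSketch<T> of(List<T> fetchedRows, int size) {
        boolean hasNext = fetchedRows.size() > size;
        List<T> content = hasNext ? fetchedRows.subList(0, size) : fetchedRows;
        return new SliceSketch<>(List.copyOf(content), hasNext);
    }
}
```

The trade-off is visible in the type: there is no total, so you cannot render "page 3 of 12", only a "load more" control.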
The N+1 Problem
This is the most common Spring Data JPA performance issue, and it is invisible without query logging.
Here is the problem: you load a list of orders, then access order.getCustomer() on each one. If customer is LAZY (which it should be), Hibernate fires one query per order to fetch the customer. 100 orders = 101 queries.
Enable query logging to see this happening:
# application.yml
spring:
  jpa:
    properties:
      hibernate:
        generate_statistics: true
logging:
  level:
    org.hibernate.SQL: DEBUG
    org.hibernate.orm.jdbc.bind: TRACE
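The "100 orders = 101 queries" arithmetic can be reproduced without a database. The following is a plain-Java simulation with hypothetical names (CountingDb, NPlusOneSketch); it only counts calls and assumes every order references a different customer, so nothing is served from a cache.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a database that counts the queries it receives. */
final class CountingDb {
    int queries = 0;

    /** One query: returns the customer id referenced by each order. */
    List<Long> findOrderCustomerIds(int orderCount) {
        queries++;
        List<Long> customerIds = new ArrayList<>();
        for (long id = 1; id <= orderCount; id++) {
            customerIds.add(id); // every order has a distinct customer
        }
        return customerIds;
    }

    /** One query per call: what touching a lazy association triggers. */
    String findCustomerName(long customerId) {
        queries++;
        return "customer-" + customerId;
    }

    /** One query for orders and customers together, analogous to JOIN FETCH. */
    List<String> findOrdersWithCustomerNames(int orderCount) {
        queries++;
        List<String> names = new ArrayList<>();
        for (long id = 1; id <= orderCount; id++) {
            names.add("customer-" + id);
        }
        return names;
    }
}

final class NPlusOneSketch {

    /** Loads the orders, then touches each customer one by one: 1 + N queries. */
    static int queriesWithLazyAccess(int orderCount) {
        CountingDb db = new CountingDb();
        for (long customerId : db.findOrderCustomerIds(orderCount)) {
            db.findCustomerName(customerId);
        }
        return db.queries;
    }

    /** Loads orders and customers in a single joined query. */
    static int queriesWithJoinFetch(int orderCount) {
        CountingDb db = new CountingDb();
        db.findOrdersWithCustomerNames(orderCount);
        return db.queries;
    }
}
```

With 100 orders the lazy path issues 101 queries and the joined path issues one. The loop looks harmless in application code, which is why the problem tends to stay hidden until query logging is on.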
Fix 1: JOIN FETCH in the query
@Query("""
SELECT o FROM Order o
JOIN FETCH o.customer
WHERE o.status = :status
""")
List<Order> findByStatusWithCustomer(@Param("status") OrderStatus status);
Fix 2: @EntityGraph on the repository method
@EntityGraph(attributePaths = {"customer", "items"})
List<Order> findByStatus(OrderStatus status);
@EntityGraph is cleaner than JOIN FETCH because it leaves the base query readable and you can apply it selectively. Use it when you have one or two associations to fetch. For more complex graphs, write the JOIN FETCH explicitly.
Fix 3: Projection queries
If you only need fields from the root entity plus a couple from the association, a projection query fetches everything in one SQL SELECT—no lazy loading involved:
@Query("""
SELECT o.id AS id, o.status AS status, c.name AS customerName
FROM Order o JOIN o.customer c WHERE o.status = :status
""")
List<OrderWithCustomerView> findByStatusProjected(@Param("status") OrderStatus status);
Entity Graphs for Controlled Eager Loading
@EntityGraph lets you define which associations to fetch eagerly for a specific query without changing the entity’s default fetch type.
You can define named entity graphs on the entity:
@Entity
@NamedEntityGraph(
    name = "Order.withCustomerAndItems",
    attributeNodes = {
        @NamedAttributeNode("customer"),
        @NamedAttributeNode(value = "items", subgraph = "items.product")
    },
    subgraphs = {
        @NamedSubgraph(
            name = "items.product",
            attributeNodes = @NamedAttributeNode("product")
        )
    }
)
public class Order {
    // ...
}
Then reference by name in the repository:
@EntityGraph("Order.withCustomerAndItems")
Optional<Order> findById(Long id);
Or use the inline syntax from the earlier example:
@EntityGraph(attributePaths = {"customer", "items", "items.product"})
Optional<Order> findWithAllAssociationsById(Long id);
Named entity graphs are useful when you need the same graph in multiple places. Inline graphs are easier for one-off fetch strategies on a single repository method.
Auditing: Timestamps Without Boilerplate
Spring Data JPA auditing automatically populates created/modified timestamps. Add this to a base entity:
@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class AuditableEntity {

    @CreatedDate
    @Column(nullable = false, updatable = false)
    private LocalDateTime createdAt;

    @LastModifiedDate
    @Column(nullable = false)
    private LocalDateTime updatedAt;

    @CreatedBy
    @Column(updatable = false)
    private String createdBy;

    @LastModifiedBy
    private String lastModifiedBy;
}
Enable it in your configuration:
@Configuration
@EnableJpaAuditing(auditorAwareRef = "auditorProvider")
public class JpaConfig {

    @Bean
    public AuditorAware<String> auditorProvider() {
        // Return the current user's identifier from Spring Security context
        return () -> Optional.ofNullable(SecurityContextHolder.getContext())
            .map(SecurityContext::getAuthentication)
            .filter(Authentication::isAuthenticated)
            .map(Authentication::getName);
    }
}
Extend AuditableEntity from any entity that needs auditing:
@Entity
@Table(name = "orders")
public class Order extends AuditableEntity {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // ... other fields
}
@CreatedDate and @LastModifiedDate are populated automatically on save. @CreatedBy and @LastModifiedBy require an AuditorAware implementation—that is where you pull the current user from Spring Security or any other source.
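If it helps to see the behavior without the framework, here is a plain-Java sketch of what an auditing listener effectively does on first save and on later saves. AuditedThing and AuditingSketch are hypothetical names, and the Supplier stands in for AuditorAware; treat it as an illustration of the rules, not the library's code.

```java
import java.time.Clock;
import java.time.LocalDateTime;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical audited object; the fields mirror AuditableEntity. */
final class AuditedThing {
    LocalDateTime createdAt;
    LocalDateTime updatedAt;
    String createdBy;
    String lastModifiedBy;
}

/** Simplified stand-in for an auditing listener; not Spring's implementation. */
final class AuditingSketch {
    private final Clock clock;
    private final Supplier<Optional<String>> currentAuditor;

    AuditingSketch(Clock clock, Supplier<Optional<String>> currentAuditor) {
        this.clock = clock;
        this.currentAuditor = currentAuditor;
    }

    /** First save: creation and modification fields are both set. */
    void onCreate(AuditedThing target) {
        LocalDateTime now = LocalDateTime.now(clock);
        target.createdAt = now;
        target.updatedAt = now;
        currentAuditor.get().ifPresent(user -> {
            target.createdBy = user;
            target.lastModifiedBy = user;
        });
    }

    /** Later saves: only the modification fields change. */
    void onUpdate(AuditedThing target) {
        target.updatedAt = LocalDateTime.now(clock);
        currentAuditor.get().ifPresent(user -> target.lastModifiedBy = user);
    }
}
```

Passing the Clock in makes the timestamps deterministic in tests. When the auditor supplier returns an empty Optional, the user fields are simply left untouched, which is worth remembering for background jobs that run without a logged-in user.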
Specifications for Dynamic Queries
When query conditions depend on user input at runtime—a search form with optional filters—Specification keeps the logic composable and testable.
Your repository extends JpaSpecificationExecutor:
@Repository
public interface OrderRepository extends JpaRepository<Order, Long>,
JpaSpecificationExecutor<Order> {
}
Define specifications as static factory methods:
public class OrderSpecifications {

    public static Specification<Order> hasStatus(OrderStatus status) {
        return (root, query, cb) -> status == null
            ? cb.conjunction()
            : cb.equal(root.get("status"), status);
    }

    public static Specification<Order> createdAfter(LocalDateTime date) {
        return (root, query, cb) -> date == null
            ? cb.conjunction()
            : cb.greaterThanOrEqualTo(root.get("createdAt"), date);
    }

    public static Specification<Order> customerNameContains(String name) {
        return (root, query, cb) -> {
            if (name == null || name.isBlank()) return cb.conjunction();
            Join<Order, Customer> customer = root.join("customer", JoinType.LEFT);
            return cb.like(cb.lower(customer.get("name")), "%" + name.toLowerCase() + "%");
        };
    }
}
Compose them at the call site:
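@Service
@Transactional(readOnly = true)
public class OrderService {
    private final OrderRepository orderRepository;

    public OrderService(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    // Assumes static imports of the OrderSpecifications factory methods
    public Page<Order> searchOrders(OrderSearchRequest request, Pageable pageable) {
        Specification<Order> spec = Specification
            .where(hasStatus(request.status()))
            .and(createdAfter(request.since()))
            .and(customerNameContains(request.customerName()));
        return orderRepository.findAll(spec, pageable);
    }
}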
Each specification handles the null case itself (cb.conjunction() returns always-true). This means you can mix and match without null checks at the composition site.
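The same composition idea can be tried out in plain Java with java.util.function.Predicate standing in for Specification. This is only an analogy run against an in-memory list: OrderRow and FilterSketch are hypothetical names, and a predicate that always returns true plays the role of cb.conjunction().

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical row type standing in for the Order entity. */
record OrderRow(String status, String customerName) {}

/** Null-tolerant filters composed with Predicate.and; an analogy, not the Criteria API. */
final class FilterSketch {

    static Predicate<OrderRow> hasStatus(String status) {
        // A missing filter matches everything, like cb.conjunction()
        return status == null ? row -> true : row -> row.status().equals(status);
    }

    static Predicate<OrderRow> customerNameContains(String name) {
        if (name == null || name.isBlank()) {
            return row -> true;
        }
        String needle = name.toLowerCase();
        return row -> row.customerName().toLowerCase().contains(needle);
    }

    /** The call site composes filters without any null checks of its own. */
    static List<OrderRow> search(List<OrderRow> rows, String status, String name) {
        return rows.stream()
                .filter(hasStatus(status).and(customerNameContains(name)))
                .toList();
    }
}
```

Because each factory decides for itself whether it applies, adding a new optional filter means adding one method and one .and(...) call, with no change to the existing ones.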
@Transactional on Service Methods, Not Repositories
A common mistake is annotating repository methods with @Transactional and forgetting the service layer. The service is where the transaction boundary belongs.
@Service
@Transactional // Default: propagation=REQUIRED, readOnly=false
public class OrderService {
    private final OrderRepository orderRepository;

    public OrderService(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    @Transactional(readOnly = true) // For read operations: hints optimizer, no dirty check
    public Order findById(Long id) {
        return orderRepository.findById(id)
            .orElseThrow(() -> new EntityNotFoundException("Order not found: " + id));
    }

    // Write methods use the class-level @Transactional (readOnly=false)
    public Order createOrder(CreateOrderRequest request) {
        Order order = new Order();
        order.setCustomerId(request.customerId());
        order.setItems(request.items());
        return orderRepository.save(order);
    }
}
@Transactional(readOnly = true) matters for two reasons: it tells Hibernate to skip dirty checking at flush time (slightly faster), and it allows the JDBC driver and database to apply read-only optimizations. Use it on any service method that only reads data.
A Note on Lazy Loading and the Open Session in View Pattern
Spring Boot enables spring.jpa.open-in-view=true by default. This keeps the Hibernate session open for the duration of an HTTP request, which means lazy associations can be loaded in templates or serializers after the service method returns.
This is almost always wrong in production. You get N+1 queries you cannot see, the database connection is held open for the full request lifecycle, and the behavior changes in async contexts where the session is not available.
Turn it off:
spring:
  jpa:
    open-in-view: false
When you do, your app will throw LazyInitializationException wherever associations were silently being loaded by OSIV. That is not a problem with turning off OSIV—those are latent bugs you want to know about. Fix them by using projections, JOIN FETCH, or @EntityGraph at the repository layer instead.
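The failure mode is easy to model in plain Java. In the sketch below, SessionSketch and LazyRef are hypothetical stand-ins for a Hibernate session and a lazy association proxy, and IllegalStateException takes the place of LazyInitializationException.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a persistence session that can be closed. */
final class SessionSketch {
    private boolean open = true;

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }

    /** Wraps a loader so that it can only run while this session is open. */
    <T> LazyRef<T> lazy(Supplier<T> loader) {
        return new LazyRef<>(this, loader);
    }
}

/** Loads its value on first access, the way a lazy association proxy does. */
final class LazyRef<T> {
    private final SessionSketch session;
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(SessionSketch session, Supplier<T> loader) {
        this.session = session;
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            if (!session.isOpen()) {
                // Hibernate would throw LazyInitializationException at this point
                throw new IllegalStateException("could not initialize proxy - no Session");
            }
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}
```

A reference that was read while the session was open keeps working after close(); one that was never touched fails on first access. That is the argument for fetching what a response needs inside the service method, where the session is guaranteed to be open.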
The pattern that covers most production requirements: projections for reads, entities for writes, specifications for dynamic queries, and entity graphs when you need to fetch multiple associations for a specific operation. Measure with query logging before optimizing—N+1 problems are real but not universal, and over-fetching with eager loading can be just as damaging.
Frequently Asked Questions
What is the difference between JpaRepository and CrudRepository in Spring Data JPA?
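In Spring Data 2.x, JpaRepository extends PagingAndSortingRepository, which extends CrudRepository. Since Spring Data 3.0, PagingAndSortingRepository no longer extends CrudRepository, and JpaRepository instead combines the List-returning variants of both (ListCrudRepository and ListPagingAndSortingRepository). CrudRepository gives you basic CRUD operations (save, findById, findAll, delete). PagingAndSortingRepository adds pagination and sorting. JpaRepository adds JPA-specific methods like flush(), saveAndFlush(), and deleteAllInBatch() (deleteInBatch() is deprecated). In practice, most applications extend JpaRepository. There is rarely a reason to use a more restricted interface unless you are designing a library API where you want to hide JPA-specific behavior from callers.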
How do I fix the N+1 query problem in Spring Data JPA?
The N+1 problem occurs when loading a collection causes one query per entity to fetch a related association. Fix it by using JOIN FETCH in JPQL queries, @EntityGraph on repository methods, or DTO projections that join everything in a single query. Use Hibernate statistics or datasource-proxy to confirm N+1 problems before assuming they exist—don’t optimize prematurely.
When should I use projections instead of entities in Spring Data JPA?
Use projections when you need to read data but not modify it. Returning full entities for read operations loads all columns, keeps entities in the persistence context consuming memory, and risks accidental flushes. Interface-based projections are the most convenient—Spring Data generates the implementation from getter method names. DTO projections (class-based) are slightly more efficient with constructor expressions in JPQL. Use entities only when you need to modify and persist changes.
How do I implement dynamic queries in Spring Data JPA?
Use the Specification API (JPA Criteria API wrapped by Spring Data) for complex dynamic queries. Specifications are composable—combine them with and(), or(), and not() at the call site. Your repository extends JpaSpecificationExecutor<T>. For simpler cases, @Query with conditional JPQL works but becomes unreadable quickly. Avoid Querydsl unless your team is already using it.
How does pagination work in Spring Data JPA?
Pass a Pageable argument to any repository method and Spring Data handles offset and limit. You get back a Page<T> including content, total element count, and navigation helpers. Use PageRequest.of(page, size, Sort.by(...)) to construct the request. For large datasets, prefer Slice<T> over Page<T>—Slice doesn’t run a count query so it’s faster, but it only tells you if a next page exists. Never use findAll() without pagination for collections that can grow.