Deploying Spring Boot to Kubernetes: A Production-Ready Guide

Getting a Spring Boot app running in Kubernetes isn’t hard. Getting it running well — with proper health checks, resource limits, graceful shutdown, and zero-downtime deployments — takes more thought.

This guide covers the full path from Dockerfile to production-ready Kubernetes deployment. No toy examples. Everything here has been tested in real environments.

The Dockerfile

Start with a multi-stage build that produces a small, secure image:

# Build stage
FROM eclipse-temurin:21-jdk-alpine AS build
WORKDIR /app
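# The Maven wrapper script and its .mvn directory must be in the build context
COPY mvnw pom.xml ./
COPY .mvn ./.mvn
COPY src ./src
RUN ./mvnw package -DskipTests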

# Runtime stage
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app

# Run as non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

COPY --from=build /app/target/*.jar app.jar

EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

Key decisions here:

Alpine-based image — Smaller attack surface, ~150MB vs ~400MB for the full image
JRE, not JDK — Runtime doesn’t need the compiler. Saves ~200MB
Non-root user — Kubernetes security policies often require this. Set it up from the start
Multi-stage build — Build artifacts don’t end up in the final image

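For faster rebuilds and image pushes, consider Spring Boot’s layered jar support. The runtime stage below assumes the build stage has already extracted the jar into target/extracted, for example with java -Djarmode=layertools -jar target/*.jar extract --destination target/extracted: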

FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

COPY --from=build /app/target/extracted/dependencies/ ./
COPY --from=build /app/target/extracted/spring-boot-loader/ ./
COPY --from=build /app/target/extracted/snapshot-dependencies/ ./
COPY --from=build /app/target/extracted/application/ ./

ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]

Layered jars improve Docker cache efficiency. When only your application code changes (not dependencies), Docker rebuilds only the last layer.

Health Checks with Spring Boot Actuator

Kubernetes needs to know if your app is alive and ready to serve traffic. Spring Boot Actuator provides the endpoints.

Add the dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Configure the health endpoints in application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true

This gives you three critical endpoints:

/actuator/health/liveness — Is the app running? If this fails, Kubernetes restarts the pod
/actuator/health/readiness — Can the app handle requests? If this fails, Kubernetes stops sending traffic
/actuator/health — Overall health including database, disk space, etc.
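
To see how a probe interprets these endpoints, the sketch below mimics a Kubernetes httpGet check using only the JDK's HTTP client. The class name ProbeCheck and the default URL on localhost:8080 are assumptions for illustration. The success rule (any status from 200 up to, but not including, 400) follows the Kubernetes probe documentation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: mimics how a Kubernetes httpGet probe judges a response.
public class ProbeCheck {

    // Kubernetes treats any status code >= 200 and < 400 as probe success.
    public static boolean probeSucceeds(int statusCode) {
        return statusCode >= 200 && statusCode < 400;
    }

    // Sends one GET request and returns the HTTP status code.
    public static int fetchStatus(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(1))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(1)) // probes default to a 1-second timeout
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) {
        // Assumed default: an app running locally on port 8080.
        URI uri = URI.create(args.length > 0 ? args[0]
                : "http://localhost:8080/actuator/health/readiness");
        try {
            int status = fetchStatus(uri);
            System.out.println(uri + " -> " + status
                    + (probeSucceeds(status) ? " (probe passes)" : " (probe fails)"));
        } catch (IOException e) {
            // A refused or timed-out connection also counts as a failed probe.
            System.out.println(uri + " unreachable: " + e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

By default Actuator answers 200 when the state is UP and 503 when it is DOWN or OUT_OF_SERVICE, so the status code alone is enough for the probe to decide.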

The Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: your-registry/order-service:1.0.0
          ports:
            - containerPort: 8080

          # Health checks
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3

          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 20
            periodSeconds: 5
            failureThreshold: 3

          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30

          # Resource limits
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "1000m"

          # Environment variables
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "kubernetes"
            # The JVM reads JAVA_TOOL_OPTIONS itself; a plain "java -jar" entrypoint ignores JAVA_OPTS
            - name: JAVA_TOOL_OPTIONS
              value: "-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"

          envFrom:
            - configMapRef:
                name: order-service-config
            - secretRef:
                name: order-service-secrets

Why Each Section Matters

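Startup probe: Spring Boot apps can take anywhere from a few seconds to more than a minute to start, depending on application size and CPU allocation. Without a startup probe, the liveness probe might kill the pod before it finishes starting. Kubernetes holds back the liveness and readiness probes until the startup probe succeeds, and this one allows up to 150 seconds of retries (30 failures × 5 seconds) after its 10-second initial delay.

Resource requests vs limits: Requests are what the scheduler reserves for the pod on a node. Limits are the ceiling: a container that exceeds its memory limit is OOM-killed, and one that exceeds its CPU limit is throttled. Set memory limits to prevent one pod from consuming all node memory. Use MaxRAMPercentage instead of -Xmx so the JVM sizes its heap from the container’s memory limit.

Rolling update strategy: maxUnavailable: 1 means at most one pod can be down during a deployment. Combined with 3 replicas, at least 2 pods keep serving traffic during updates, and maxSurge: 1 lets Kubernetes run at most 4 pods while old and new versions overlap.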
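
The arithmetic behind the probe and rollout settings can be checked in a few lines of Java. This is only a sketch: DeploymentMath and its method names are made up for illustration, the rollout helpers assume absolute pod counts rather than percentages, and the startup figure ignores probe timeouts and scheduling jitter.

```java
// Hypothetical helper for sanity-checking the numbers in the Deployment manifest.
public class DeploymentMath {

    // Seconds the startup probe keeps retrying: periodSeconds x failureThreshold.
    public static int startupProbeWindowSeconds(int periodSeconds, int failureThreshold) {
        return periodSeconds * failureThreshold;
    }

    // Fewest pods that remain available while a rolling update is in progress.
    public static int minServingPods(int replicas, int maxUnavailable) {
        return replicas - maxUnavailable;
    }

    // Most pods (old plus new) that can exist at once during a rolling update.
    public static int maxTotalPods(int replicas, int maxSurge) {
        return replicas + maxSurge;
    }

    public static void main(String[] args) {
        int window = startupProbeWindowSeconds(5, 30);
        System.out.println("Startup probe window: " + window + "s");           // 150s
        System.out.println("Including initial delay: " + (10 + window) + "s"); // 160s
        System.out.println("Min serving pods: " + minServingPods(3, 1));        // 2
        System.out.println("Max total pods: " + maxTotalPods(3, 1));            // 4
    }
}
```

Running the same calculation with your own values is a quick way to confirm that the node pool has room for the surge pods and that the startup window covers your slowest observed start.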

ConfigMaps and Secrets

ConfigMap for Non-Sensitive Configuration

apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
data:
  SERVER_PORT: "8080"
  SPRING_DATASOURCE_URL: "jdbc:postgresql://postgres:5432/orders"
  ORDER_MAX_ITEMS: "500"
  LOGGING_LEVEL_ROOT: "INFO"

Secrets for Sensitive Data

apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
stringData:
  SPRING_DATASOURCE_USERNAME: "order_app"
  SPRING_DATASOURCE_PASSWORD: "change-me-in-production"

Spring Boot automatically maps environment variables to properties. SPRING_DATASOURCE_URL becomes spring.datasource.url. No additional configuration needed.
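
The naming rule can be sketched in plain Java. Spring Boot's documentation describes the conversion from a canonical property name to an environment variable as: replace dots with underscores, remove dashes, and convert to uppercase. The class EnvNames below is a hypothetical illustration of that rule, not Spring's own binder, and it does not handle list indices.

```java
import java.util.Locale;

// Hypothetical helper: derives the environment variable name for a Spring property.
public class EnvNames {

    // Applies the documented rule: "." becomes "_", "-" is removed, result is uppercased.
    public static String toEnvVarName(String propertyName) {
        return propertyName
                .replace(".", "_")
                .replace("-", "")
                .toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // SPRING_DATASOURCE_URL
        System.out.println(toEnvVarName("spring.datasource.url"));
        // SPRING_LIFECYCLE_TIMEOUTPERSHUTDOWNPHASE
        System.out.println(toEnvVarName("spring.lifecycle.timeout-per-shutdown-phase"));
    }
}
```

The second example is worth noticing: dashes are dropped rather than turned into underscores, so a kebab-case property such as timeout-per-shutdown-phase maps to a single run of letters in the variable name.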

The Service

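apiVersion: v1
kind: Service
metadata:
  name: order-service
  labels:
    app: order-service   # matched by the ServiceMonitor selector
spec:
  selector:
    app: order-service
  ports:
    - name: http         # the ServiceMonitor refers to this port by name
      port: 80
      targetPort: 8080
  type: ClusterIP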

Internal services use ClusterIP. For external access, put an Ingress or LoadBalancer in front.

Graceful Shutdown

When Kubernetes terminates a pod (during scaling or deployment), you want the app to finish processing in-flight requests before shutting down.

In application.yml:

server:
  shutdown: graceful

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

In the deployment, set a matching termination grace period:

spec:
  template:
    spec:
      # Pod-level setting: it sits next to "containers", not on the Deployment spec
      terminationGracePeriodSeconds: 45

Set the Kubernetes grace period slightly higher than the Spring shutdown timeout. Spring then has its full 30 seconds to finish in-flight requests, and Kubernetes sends SIGKILL only if the process is still running when the 45 seconds expire.
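
The two-phase pattern (stop taking new work, wait for in-flight work, then force termination) can be illustrated with a plain JDK thread pool. This is an analogy under stated assumptions, not Spring Boot's implementation: GracefulDrain and drain are hypothetical names, shutdown() stands in for the reaction to SIGTERM, and shutdownNow() stands in for the hard stop after the grace period.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a graceful drain with a bounded wait.
public class GracefulDrain {

    // Returns true if all in-flight tasks finished within the timeout.
    public static boolean drain(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown(); // reject new tasks, let running ones continue
        if (pool.awaitTermination(timeout, unit)) {
            return true;
        }
        pool.shutdownNow(); // grace period over: interrupt whatever is left
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(200); // simulated in-flight request
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        boolean clean = drain(pool, 2, TimeUnit.SECONDS);
        System.out.println("drained cleanly: " + clean + ", completed: " + completed.get());
    }
}
```

The same relationship holds for the settings above: the inner timeout (Spring's shutdown phase) should be shorter than the outer one (the pod's termination grace period), so the hard stop is a fallback rather than the normal path.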

JVM Tuning for Containers

The JVM needs to be container-aware. Container support has been on by default since Java 10 (and was backported to 8u191), so any current JVM reads the cgroup limits correctly, but a few settings help:

env:
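  # JAVA_TOOL_OPTIONS is read by the JVM directly; JAVA_OPTS only works if the entrypoint expands it
  - name: JAVA_TOOL_OPTIONS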
    value: >-
      -XX:MaxRAMPercentage=75.0
      -XX:+UseG1GC
      -XX:+UseContainerSupport
      -Djava.security.egd=file:/dev/./urandom

MaxRAMPercentage=75.0 — Use 75% of the container’s memory limit for heap. Leave room for metaspace, thread stacks, and native memory.
UseContainerSupport — Default in modern JVMs, but explicit doesn’t hurt. Makes the JVM aware of cgroup limits.
urandom — Speeds up startup by using a non-blocking entropy source for cryptographic operations.
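
As a worked example of the heap budget, the sketch below computes what 75% of a 512Mi limit comes to and prints the maximum heap the running JVM actually reports. HeapBudget is a hypothetical helper, the 512Mi figure is taken from the Deployment above, and Runtime.maxMemory() can differ slightly from the computed value depending on the garbage collector.

```java
// Hypothetical helper: estimates the heap ceiling implied by MaxRAMPercentage.
public class HeapBudget {

    // Heap ceiling in bytes for a given container memory limit and percentage.
    public static long expectedMaxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long limit = 512L * mib; // memory limit from the example Deployment

        long expected = expectedMaxHeapBytes(limit, 75.0);
        System.out.println("Expected max heap: " + expected / mib + " MiB"); // 384 MiB
        System.out.println("Left for non-heap: " + (limit - expected) / mib + " MiB"); // 128 MiB

        // What this JVM reports; run inside the container to compare.
        System.out.println("Reported max heap: " + Runtime.getRuntime().maxMemory() / mib + " MiB");
    }
}
```

The remaining 128 MiB has to cover metaspace, thread stacks, code cache, and direct buffers, which is why a percentage much higher than 75 tends to end in OOM kills for small containers.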

Monitoring with Prometheus

Add the Micrometer Prometheus dependency:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

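The /actuator/prometheus endpoint exposes JVM metrics, HTTP request metrics, and custom metrics in Prometheus format. If your cluster runs the Prometheus Operator, add a ServiceMonitor so Prometheus scrapes it. The selector matches labels on the Service, and the port field refers to a Service port by name, so the Service needs the app: order-service label and a port named http: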

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: order-service
spec:
  selector:
    matchLabels:
      app: order-service
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 15s

Common Mistakes

No resource limits. Without limits, one misbehaving pod can starve other pods on the same node. Always set memory limits.

Using latest tag. image: order-service:latest means you can’t reliably roll back. Use specific version tags or commit SHAs.

Liveness probe too aggressive. If the liveness probe initialDelaySeconds is shorter than your app’s startup time, Kubernetes will keep killing and restarting it in a loop. Use startup probes for slow-starting apps.

Ignoring graceful shutdown. Without it, requests in progress get terminated mid-response during deployments. Users see 502 errors.

Hardcoded configuration. Connection strings, API keys, and feature flags should come from ConfigMaps and Secrets, not application.yml. This lets you change configuration without rebuilding the image.

Frequently Asked Questions

What is the difference between liveness and readiness probes for a Spring Boot app?

Liveness probes tell Kubernetes whether the application is alive—if the probe fails, Kubernetes restarts the container. Readiness probes tell Kubernetes whether the application is ready to serve traffic—if the probe fails, Kubernetes removes the pod from service endpoints but does not restart it. Spring Boot Actuator exposes /actuator/health/liveness and /actuator/health/readiness for each. Keep liveness simple (just a heartbeat) and configure readiness to check dependencies. If liveness checks dependencies and those go down, Kubernetes will restart healthy pods—causing a cascading failure.

How do I pass Spring Boot configuration to a Kubernetes deployment?

Use a ConfigMap for non-sensitive configuration and a Secret for credentials. Mount them as environment variables or as a file. Spring Boot’s externalized configuration picks up environment variables automatically—SPRING_DATASOURCE_URL overrides spring.datasource.url. Never put credentials in ConfigMaps—they’re not encrypted. Use Sealed Secrets, External Secrets Operator, or Vault for secret lifecycle management in a GitOps workflow.

How do I achieve zero-downtime deployments for Spring Boot on Kubernetes?

Configure a Rolling Update strategy, add a readiness probe that only passes when the app is fully started, and enable graceful shutdown (server.shutdown=graceful plus spring.lifecycle.timeout-per-shutdown-phase). When Kubernetes sends SIGTERM, Spring Boot stops accepting new requests but finishes in-flight ones. Set terminationGracePeriodSeconds to at least as long as your shutdown phase timeout. Without graceful shutdown, rolling updates will drop requests during pod termination.

What resource limits should I set for a Spring Boot pod?

Start with requests of 250m CPU and 512Mi memory and limits of 1000m CPU and 1Gi memory for a medium-load service; the example manifest in this guide uses a leaner 256Mi request and 512Mi limit. Set -XX:MaxRAMPercentage=75 so the heap fits within the container limit. Do not set CPU limits too tightly: JVM garbage collection is CPU-intensive, and throttling causes latency spikes. Profile your application under realistic load before right-sizing.

How do I use Spring Cloud Kubernetes to read config from ConfigMaps automatically?

Add spring-cloud-starter-kubernetes-client-config. Spring Cloud Kubernetes watches ConfigMaps and Secrets matching your application name and namespace. Enable with spring.config.import=kubernetes:. Set spring.cloud.kubernetes.config.reload.enabled=true for live reload without pod restarts. Ensure your pod’s ServiceAccount has RBAC permissions to read ConfigMaps and Secrets in its namespace.

Quick Reference

What	Recommendation
Dockerfile	Multi-stage, JRE-alpine, non-root user
Health checks	`/actuator/health/liveness` and `/actuator/health/readiness`
Resource requests	256Mi memory, 250m CPU (adjust based on profiling)
Resource limits	512Mi memory, 1000m CPU (adjust based on profiling)
Graceful shutdown	`server.shutdown=graceful` + matching termination grace period
JVM settings	`-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC`
Config	ConfigMaps for non-sensitive, Secrets for sensitive
Monitoring	Micrometer + Prometheus + ServiceMonitor

Start with this setup and adjust based on your application’s actual resource usage. Profile in staging before setting production limits — guessing leads to either wasted resources or OOM kills.

If you haven’t containerized your app yet, start with our guide on Dockerizing Spring Boot — it covers multi-stage builds and layered jars that feed directly into this Kubernetes setup. Once you’re running in production, you’ll want proper observability; our Spring Boot monitoring with Actuator and Prometheus guide covers the metrics and alerting side.
