Deploying Spring Boot to Kubernetes: A Production-Ready Guide
Getting a Spring Boot app running in Kubernetes isn’t hard. Getting it running well — with proper health checks, resource limits, graceful shutdown, and zero-downtime deployments — takes more thought.
This guide covers the full path from Dockerfile to production-ready Kubernetes deployment. No toy examples. Everything here has been tested in real environments.
The Dockerfile
Start with a multi-stage build that produces a small, secure image:
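# Build stage
FROM eclipse-temurin:21-jdk-alpine AS build
WORKDIR /app
# The Maven wrapper must be copied in, otherwise ./mvnw does not exist in the image
COPY mvnw pom.xml ./
COPY .mvn ./.mvn
COPY src ./src
RUN ./mvnw package -DskipTests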
# Runtime stage
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Run as non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
COPY --from=build /app/target/*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
Key decisions here:
- Alpine-based image — Smaller attack surface, ~150MB vs ~400MB for the full image
- JRE, not JDK — Runtime doesn’t need the compiler. Saves ~200MB
- Non-root user — Kubernetes security policies often require this. Set it up from the start
- Multi-stage build — Build artifacts don’t end up in the final image
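For faster rebuilds and smaller image pushes, consider using Spring Boot’s layered jar support. The runtime stage below assumes the build stage has already extracted the jar’s layers into target/extracted (for example with the jar’s layertools mode; the exact command depends on your Spring Boot version):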
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
COPY --from=build /app/target/extracted/dependencies/ ./
COPY --from=build /app/target/extracted/spring-boot-loader/ ./
COPY --from=build /app/target/extracted/snapshot-dependencies/ ./
COPY --from=build /app/target/extracted/application/ ./
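# Launcher class for Spring Boot 3.2+; earlier versions use org.springframework.boot.loader.JarLauncher
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]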
Layered jars improve Docker cache efficiency. When only your application code changes (not dependencies), Docker rebuilds only the last layer.
Health Checks with Spring Boot Actuator
Kubernetes needs to know if your app is alive and ready to serve traffic. Spring Boot Actuator provides the endpoints.
Add the dependency:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
Configure the health endpoints in application.yml:
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true
This gives you three critical endpoints:
- /actuator/health/liveness: Is the app running? If this fails, Kubernetes restarts the pod
- /actuator/health/readiness: Can the app handle requests? If this fails, Kubernetes stops sending traffic
- /actuator/health: Overall health including database, disk space, etc.
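As a hedged sketch of how the readiness endpoint could also reflect a dependency, Spring Boot’s health groups let you add indicators to the readiness group. The fragment below assumes a DataSource is configured, so that the db indicator exists; which indicators belong in readiness is a judgment call for your service.

```yaml
# application.yml (sketch): extend the readiness group, keep liveness minimal
management:
  endpoint:
    health:
      group:
        readiness:
          # readinessState is the built-in availability state;
          # db is assumed to exist because a DataSource is configured
          include: readinessState,db
        liveness:
          # liveness stays a simple heartbeat so a database outage
          # does not trigger pod restarts
          include: livenessState
```

With this in place, a database outage would take pods out of rotation without restarting them.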
The Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: your-registry/order-service:1.0.0
          ports:
            - containerPort: 8080
          # Health checks
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 20
            periodSeconds: 5
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30
          # Resource limits
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          # Environment variables
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: "kubernetes"
            # The JVM reads JAVA_TOOL_OPTIONS itself; JAVA_OPTS would be
            # ignored by the exec-form ENTRYPOINT in the Dockerfile
            - name: JAVA_TOOL_OPTIONS
              value: "-XX:MaxRAMPercentage=75.0 -XX:+UseG1GC"
          envFrom:
            - configMapRef:
                name: order-service-config
            - secretRef:
                name: order-service-secrets
Why Each Section Matters
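Startup probe. Spring Boot apps can take a while to start. Without a startup probe, the liveness probe might kill the pod before it finishes starting. Kubernetes holds back the liveness and readiness probes until the startup probe succeeds, and the configuration above gives the app up to 150 seconds (30 failures × 5 seconds) after the 10-second initial delay.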
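The timing arithmetic can be read directly off the probe fields. The fragment below is a sketch that annotates the startup values from the Deployment; dropping initialDelaySeconds from the liveness probe is an assumption made here, on the basis that the startup probe already covers the startup window.

```yaml
# Probe timing sketch (startup values copied from the Deployment)
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 10   # first check 10s after the container starts
  periodSeconds: 5          # then one check every 5s
  failureThreshold: 30      # 30 failed checks allowed: 30 x 5s = 150s
  # Worst-case startup budget: 10s + 150s = roughly 160s before a restart
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 10
  failureThreshold: 3       # once started: 3 x 10s = about 30s to detect a hang
```

If your app needs longer to start, raise failureThreshold on the startup probe rather than loosening the liveness probe.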
Resource requests vs limits. Requests are what the scheduler reserves for the pod; limits are the maximum it may use. Set memory limits to prevent one pod from consuming all node memory. Use MaxRAMPercentage instead of -Xmx so the JVM sizes its heap from the container’s memory limit.
Rolling update strategy. maxUnavailable: 1 means at most one pod can be down during a deployment, and maxSurge: 1 allows one extra pod above the replica count. Combined with 3 replicas, you always have at least 2 pods serving traffic during updates.
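If losing even one replica during a rollout is too much, one commonly used variant sets maxUnavailable to 0. This is a sketch of an alternative, not part of the setup above, and it assumes the cluster has spare capacity for one additional pod.

```yaml
# Alternative strategy sketch: never drop below the desired replica count
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # keep all 3 replicas serving during the rollout
    maxSurge: 1         # add one new pod at a time (up to 4 pods briefly)
```

The trade-off is a slower rollout, since each old pod is removed only after its replacement reports ready.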
ConfigMaps and Secrets
ConfigMap for Non-Sensitive Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
data:
  SERVER_PORT: "8080"
  SPRING_DATASOURCE_URL: "jdbc:postgresql://postgres:5432/orders"
  ORDER_MAX_ITEMS: "500"
  LOGGING_LEVEL_ROOT: "INFO"
Secrets for Sensitive Data
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
stringData:
  SPRING_DATASOURCE_USERNAME: "order_app"
  SPRING_DATASOURCE_PASSWORD: "change-me-in-production"
Spring Boot automatically maps environment variables to properties. SPRING_DATASOURCE_URL becomes spring.datasource.url. No additional configuration needed.
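To make the mapping concrete, here is roughly what the ConfigMap and Secret keys correspond to if written as application.yml properties. This is an illustration only: the values are supplied through the environment, not through this file, and order.max.items is a custom property whose binding depends on your own configuration class.

```yaml
# Property view of the environment variables (illustration, not a file to ship)
server:
  port: 8080                      # SERVER_PORT
spring:
  datasource:
    url: jdbc:postgresql://postgres:5432/orders   # SPRING_DATASOURCE_URL
    username: order_app           # SPRING_DATASOURCE_USERNAME
    # password comes from SPRING_DATASOURCE_PASSWORD and is deliberately omitted
logging:
  level:
    root: INFO                    # LOGGING_LEVEL_ROOT
order:
  max:
    items: 500                    # ORDER_MAX_ITEMS (custom property)
```

The rule of thumb is: lowercase the variable name and replace underscores with dots.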
The Service
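apiVersion: v1
kind: Service
metadata:
  name: order-service
  labels:
    app: order-service   # lets a ServiceMonitor select this Service
spec:
  selector:
    app: order-service
  ports:
    - name: http         # named so monitoring can reference the port
      port: 80
      targetPort: 8080
  type: ClusterIP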
Internal services use ClusterIP. For external access, put an Ingress or LoadBalancer in front.
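A minimal Ingress in front of this Service might look like the sketch below. The hostname is hypothetical, and the ingress class assumes an NGINX ingress controller is installed in the cluster; substitute whatever controller you actually run.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: order-service
spec:
  ingressClassName: nginx          # assumption: an NGINX ingress controller is installed
  rules:
    - host: orders.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80       # the Service port, not the container port
```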
Graceful Shutdown
When Kubernetes terminates a pod (during scaling or deployment), you want the app to finish processing in-flight requests before shutting down.
In application.yml:
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
In the Deployment’s pod template, set a termination grace period that covers it:
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 45
Set the Kubernetes grace period slightly higher than the Spring shutdown timeout. This gives Spring time to finish in-flight requests, then Kubernetes sends SIGKILL if it hasn’t stopped.
JVM Tuning for Containers
The JVM needs to be container-aware. Modern JVMs (Java 10 and later, plus 8u191 and later) handle this by default, but a few settings help:
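env:
  # The JVM reads JAVA_TOOL_OPTIONS itself; JAVA_OPTS would be
  # ignored by the exec-form ENTRYPOINT in the Dockerfile
  - name: JAVA_TOOL_OPTIONS
    value: >-
      -XX:MaxRAMPercentage=75.0
      -XX:+UseG1GC
      -XX:+UseContainerSupport
      -Djava.security.egd=file:/dev/./urandom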
- MaxRAMPercentage=75.0 — Use 75% of the container’s memory limit for heap. Leave room for metaspace, thread stacks, and native memory.
- UseContainerSupport — Default in modern JVMs, but explicit doesn’t hurt. Makes the JVM aware of cgroup limits.
- urandom — Speeds up startup by using a non-blocking entropy source for cryptographic operations.
Monitoring with Prometheus
Add the Micrometer Prometheus dependency:
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
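The /actuator/prometheus endpoint exposes JVM metrics, HTTP request metrics, and custom metrics in Prometheus format. If you run the Prometheus Operator, add a ServiceMonitor so Prometheus scrapes it. The selector matches labels on the Service, and port refers to a named Service port: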
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: order-service
spec:
  selector:
    matchLabels:
      app: order-service
  endpoints:
    - port: http
      path: /actuator/prometheus
      interval: 15s
Common Mistakes
No resource limits. Without limits, one misbehaving pod can starve other pods on the same node. Always set memory limits.
Using the latest tag. image: order-service:latest means you can’t reliably roll back, because the tag does not identify which build is running. Use specific version tags or commit SHAs.
Liveness probe too aggressive. If the liveness probe initialDelaySeconds is shorter than your app’s startup time, Kubernetes will keep killing and restarting it in a loop. Use startup probes for slow-starting apps.
Ignoring graceful shutdown. Without it, requests in progress get terminated mid-response during deployments. Users see 502 errors.
Hardcoded configuration. Connection strings, API keys, and feature flags should come from ConfigMaps and Secrets, not application.yml. This lets you change configuration without rebuilding the image.
Frequently Asked Questions
What is the difference between liveness and readiness probes for a Spring Boot app?
Liveness probes tell Kubernetes whether the application is alive—if the probe fails, Kubernetes restarts the container. Readiness probes tell Kubernetes whether the application is ready to serve traffic—if the probe fails, Kubernetes removes the pod from service endpoints but does not restart it. Spring Boot Actuator exposes /actuator/health/liveness and /actuator/health/readiness for each. Keep liveness simple (just a heartbeat) and configure readiness to check dependencies. If liveness checks dependencies and those go down, Kubernetes will restart healthy pods—causing a cascading failure.
How do I pass Spring Boot configuration to a Kubernetes deployment?
Use a ConfigMap for non-sensitive configuration and a Secret for credentials. Mount them as environment variables or as a file. Spring Boot’s externalized configuration picks up environment variables automatically—SPRING_DATASOURCE_URL overrides spring.datasource.url. Never put credentials in ConfigMaps—they’re not encrypted. Use Sealed Secrets, External Secrets Operator, or Vault for secret lifecycle management in a GitOps workflow.
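For the file-based option, one possible approach is Spring Boot’s configtree import, which treats each mounted file as a property. The fragment below is a sketch under assumptions: the mount path /etc/secrets is arbitrary, and the items mapping renames the Secret key so that the file name matches the property name.

```yaml
# Pod template fragment (sketch): mount the Secret as files instead of env vars
spec:
  containers:
    - name: order-service
      env:
        - name: SPRING_CONFIG_IMPORT
          value: "optional:configtree:/etc/secrets/"
      volumeMounts:
        - name: secrets
          mountPath: /etc/secrets   # assumption: any unused path works
          readOnly: true
  volumes:
    - name: secrets
      secret:
        secretName: order-service-secrets
        items:
          # under configtree the file name becomes the property name
          - key: SPRING_DATASOURCE_PASSWORD
            path: spring.datasource.password
```

Files keep credentials out of the process environment, at the cost of a little more manifest wiring.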
How do I achieve zero-downtime deployments for Spring Boot on Kubernetes?
Configure a Rolling Update strategy, add a readiness probe that only passes when the app is fully started, and enable graceful shutdown (server.shutdown=graceful plus spring.lifecycle.timeout-per-shutdown-phase). When Kubernetes sends SIGTERM, Spring Boot stops accepting new requests but finishes in-flight ones. Set terminationGracePeriodSeconds to at least as long as your shutdown phase timeout. Without graceful shutdown, rolling updates will drop requests during pod termination.
What resource limits should I set for a Spring Boot pod?
Start with requests of 250m CPU and 512Mi memory and limits of 1000m CPU and 1Gi memory for a medium-load service; the example Deployment in this guide uses a leaner 256Mi request and 512Mi limit. Set -XX:MaxRAMPercentage=75 so the heap fits within the container limit. Do not set CPU limits too tightly, because JVM garbage collection is CPU-intensive and throttling causes latency spikes. Profile your application under realistic load before right-sizing.
How do I use Spring Cloud Kubernetes to read config from ConfigMaps automatically?
Add spring-cloud-starter-kubernetes-client-config. Spring Cloud Kubernetes watches ConfigMaps and Secrets matching your application name and namespace. Enable with spring.config.import=kubernetes:. Set spring.cloud.kubernetes.config.reload.enabled=true for live reload without pod restarts. Ensure your pod’s ServiceAccount has RBAC permissions to read ConfigMaps and Secrets in its namespace.
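The RBAC permissions could be granted with a namespaced Role and RoleBinding along these lines. The names, the namespace, and the ServiceAccount are hypothetical placeholders; narrow the resources list if the app only needs ConfigMaps.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: order-service-config-reader   # hypothetical name
  namespace: default                  # assumption: adjust to your namespace
rules:
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: order-service-config-reader
  namespace: default
subjects:
  - kind: ServiceAccount
    name: order-service               # hypothetical ServiceAccount used by the pod
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: order-service-config-reader
```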
Quick Reference
| Area | Recommendation |
|---|---|
| Dockerfile | Multi-stage, JRE-alpine, non-root user |
| Health checks | /actuator/health/liveness and /actuator/health/readiness |
| Resource requests | 256Mi memory, 250m CPU (adjust based on profiling) |
| Resource limits | 512Mi memory, 1000m CPU (adjust based on profiling) |
| Graceful shutdown | server.shutdown=graceful + matching termination grace period |
| JVM settings | -XX:MaxRAMPercentage=75.0 -XX:+UseG1GC |
| Config | ConfigMaps for non-sensitive, Secrets for sensitive |
| Monitoring | Micrometer + Prometheus + ServiceMonitor |
Start with this setup and adjust based on your application’s actual resource usage. Profile in staging before setting production limits — guessing leads to either wasted resources or OOM kills.