Spring Boot and Spring Batch: Creating a Job Runner for Database Exports

Most batch jobs get triggered on a schedule — a cron at 2am, a CI pipeline step, something like that. But sometimes you need to kick one off on demand: a user clicks “Export”, an admin hits an endpoint, or an upstream service signals that data is ready. This post walks through wiring up a Spring Batch job that reads from a database and writes a CSV, then wrapping it in a REST controller so you can trigger it with a plain HTTP POST.

Project Setup

Start with these dependencies in your pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>

Spring Batch needs its own metadata tables (BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, and so on) to track job executions. Spring Boot creates them automatically for embedded databases like H2, but for something like Postgres you have to opt in explicitly. These two lines in application.properties cover both cases:

spring.batch.jdbc.initialize-schema=always
spring.batch.job.enabled=false

spring.batch.job.enabled=false is important — it stops Spring Boot from auto-running every registered job on startup, which is almost never what you want when launching jobs via HTTP.

The Domain Object

We’re exporting books. Keep it simple:

public class Book {
    private Long id;
    private String title;
    private String author;

    // getters and setters
}
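
For the example to have anything to export, the book table needs to exist and contain rows. One option (an assumption on my part — any schema management approach works) is Spring Boot's script-based initialization: drop these into src/main/resources as schema.sql and data.sql and they run at startup against the embedded database.

```sql
-- schema.sql
CREATE TABLE book (
    id     BIGINT PRIMARY KEY,
    title  VARCHAR(255) NOT NULL,
    author VARCHAR(255) NOT NULL
);

-- data.sql
INSERT INTO book (id, title, author) VALUES (1, 'Dune', 'Frank Herbert');
INSERT INTO book (id, title, author) VALUES (2, 'Neuromancer', 'William Gibson');
```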

The Three Batch Components

Reader

JdbcCursorItemReader streams rows directly from the database using a JDBC cursor, which keeps memory usage flat regardless of result set size:

@Bean
public JdbcCursorItemReader<Book> reader(DataSource dataSource) {
    JdbcCursorItemReader<Book> reader = new JdbcCursorItemReader<>();
    reader.setName("bookReader"); // gives the reader's restart state a stable key in the execution context
    reader.setDataSource(dataSource);
    reader.setSql("SELECT id, title, author FROM book");
    reader.setRowMapper(new BeanPropertyRowMapper<>(Book.class));
    return reader;
}

For large tables, prefer JdbcPagingItemReader instead — it issues paginated queries rather than holding an open cursor for the full duration of the job.
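
A paging version of the same reader might look like the sketch below (builder values are illustrative, not a tuned configuration). Note the sort key: paging readers require a unique ordering column so that successive pages are stable.

```java
@Bean
public JdbcPagingItemReader<Book> pagingReader(DataSource dataSource) {
    return new JdbcPagingItemReaderBuilder<Book>()
            .name("bookPagingReader")
            .dataSource(dataSource)
            .selectClause("SELECT id, title, author")
            .fromClause("FROM book")
            .sortKeys(Map.of("id", Order.ASCENDING)) // paging needs a unique sort key
            .rowMapper(new BeanPropertyRowMapper<>(Book.class))
            .pageSize(100)
            .build();
}
```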

Processor

No transformation needed here, so this is a pass-through:

@Bean
public ItemProcessor<Book, Book> processor() {
    return book -> book;
}

In practice this is where you’d filter out records (returning null from the processor drops the item from the output entirely), enrich data from another source, or convert to a different output type.
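
For instance, you might replace the pass-through with something like this (purely illustrative logic — the filtering rule is an assumption):

```java
@Bean
public ItemProcessor<Book, Book> processor() {
    return book -> {
        if (book.getAuthor() == null) {
            return null; // null means: skip this item, it never reaches the writer
        }
        book.setTitle(book.getTitle().trim()); // light normalization before writing
        return book;
    };
}
```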

Writer

FlatFileItemWriter handles the CSV output:

@Bean
public FlatFileItemWriter<Book> writer() {
    FlatFileItemWriter<Book> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource("books.csv"));

    DelimitedLineAggregator<Book> aggregator = new DelimitedLineAggregator<>();
    aggregator.setDelimiter(",");

    BeanWrapperFieldExtractor<Book> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames(new String[]{"id", "title", "author"});
    aggregator.setFieldExtractor(extractor);

    writer.setLineAggregator(aggregator);
    return writer;
}

If you want a header row, call writer.setHeaderCallback(w -> w.write("id,title,author")) before returning.

Wiring the Job

@Bean
public Step exportStep(JobRepository jobRepository,
                       PlatformTransactionManager transactionManager,
                       JdbcCursorItemReader<Book> reader,
                       ItemProcessor<Book, Book> processor,
                       FlatFileItemWriter<Book> writer) {
    return new StepBuilder("exportStep", jobRepository)
            .<Book, Book>chunk(100, transactionManager)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}

@Bean
public Job exportBooksJob(JobRepository jobRepository, Step exportStep) {
    return new JobBuilder("exportBooksJob", jobRepository)
            .start(exportStep)
            .build();
}

The chunk size of 100 means Spring Batch reads 100 records, processes them, writes them, commits the transaction, then loops. Tune this based on record size and available memory.
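
Chunk-oriented steps also support fault-tolerance policies. As a sketch (the exception type and limits here are assumptions to adapt to your failure modes), you could retry transient database errors instead of failing the whole export:

```java
@Bean
public Step exportStep(JobRepository jobRepository,
                       PlatformTransactionManager transactionManager,
                       JdbcCursorItemReader<Book> reader,
                       FlatFileItemWriter<Book> writer) {
    return new StepBuilder("exportStep", jobRepository)
            .<Book, Book>chunk(100, transactionManager)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .retry(TransientDataAccessException.class) // e.g. deadlocks, lock timeouts
            .retryLimit(3)                             // give up after three attempts per item
            .build();
}
```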

The HTTP Trigger

JobLauncher is the entry point for starting jobs programmatically. By default it runs synchronously, blocking until the job completes. For short jobs that’s fine; for anything that might take more than a few seconds, configure it to run asynchronously:

@Bean
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    TaskExecutorJobLauncher launcher = new TaskExecutorJobLauncher();
    launcher.setJobRepository(jobRepository);
    launcher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    launcher.afterPropertiesSet();
    return launcher;
}
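
One caveat: SimpleAsyncTaskExecutor spawns a new thread per launch with no upper bound. If the endpoint can be hit repeatedly, a bounded pool is safer — here's a variant (the pool sizes are assumptions to tune for your workload):

```java
@Bean
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(2);   // illustrative sizes, not a recommendation
    executor.setMaxPoolSize(4);
    executor.setQueueCapacity(10); // launches beyond this are rejected instead of piling up
    executor.initialize();

    TaskExecutorJobLauncher launcher = new TaskExecutorJobLauncher();
    launcher.setJobRepository(jobRepository);
    launcher.setTaskExecutor(executor);
    launcher.afterPropertiesSet();
    return launcher;
}
```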

Then the controller:

@RestController
@RequestMapping("/jobs")
public class JobController {

    private final JobLauncher jobLauncher;
    private final Job exportBooksJob;
    private final JobExplorer jobExplorer;

    public JobController(JobLauncher jobLauncher, Job exportBooksJob, JobExplorer jobExplorer) {
        this.jobLauncher = jobLauncher;
        this.exportBooksJob = exportBooksJob;
        this.jobExplorer = jobExplorer;
    }

    @PostMapping("/export")
    public ResponseEntity<String> triggerExport() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addLong("timestamp", System.currentTimeMillis())
                .toJobParameters();

        JobExecution execution = jobLauncher.run(exportBooksJob, params);
        return ResponseEntity.accepted()
                .body("Job started with id: " + execution.getId()); // execution id, not the job instance id
    }

    @GetMapping
    public List<String> listJobs() {
        return jobExplorer.getJobNames();
    }

    @GetMapping("/{jobName}/executions")
    public List<String> getExecutions(@PathVariable String jobName) {
        // Summarize rather than returning JobExecution directly: StepExecution
        // holds a back-reference to its JobExecution, so default JSON
        // serialization would recurse endlessly.
        return jobExplorer.findJobInstancesByJobName(jobName, 0, 20)
                .stream()
                .flatMap(instance -> jobExplorer.getJobExecutions(instance).stream())
                .map(exec -> "id=" + exec.getId() + " status=" + exec.getStatus()
                        + " start=" + exec.getStartTime() + " end=" + exec.getEndTime())
                .collect(Collectors.toList());
    }
}

The timestamp job parameter is intentional. Spring Batch treats jobs with identical parameters as the same logical run and won’t restart a completed one. Adding a unique timestamp means each POST creates a genuinely new execution.
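
Because the launcher is asynchronous, the POST returns before the job finishes, so clients need a way to poll. A small status endpoint (an addition of mine, reusing the same injected JobExplorer) would do it:

```java
@GetMapping("/executions/{id}")
public ResponseEntity<String> status(@PathVariable Long id) {
    JobExecution execution = jobExplorer.getJobExecution(id);
    if (execution == null) {
        return ResponseEntity.notFound().build();
    }
    // Status moves through STARTING/STARTED to COMPLETED or FAILED
    return ResponseEntity.ok("Status: " + execution.getStatus());
}
```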

Testing It

With the app running, trigger an export:

curl -X POST http://localhost:8080/jobs/export
# Job started with id: 1

curl http://localhost:8080/jobs/exportBooksJob/executions
# Returns list of past executions with status, start time, end time

The books.csv file will appear in your working directory after the job finishes.

What to Watch Out For

File path collisions: The writer above always writes to books.csv, so concurrent runs will clobber each other. In production, use a timestamped filename or include the job execution ID in the path.
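
One way to fix that (a sketch) is a step-scoped writer that builds the filename from the job's timestamp parameter via late binding, so each execution writes its own file:

```java
@Bean
@StepScope
public FlatFileItemWriter<Book> writer(
        @Value("#{jobParameters['timestamp']}") Long timestamp) {
    FlatFileItemWriter<Book> writer = new FlatFileItemWriter<>();
    // Late binding: each execution gets a distinct file, e.g. books-1712345678901.csv
    writer.setResource(new FileSystemResource("books-" + timestamp + ".csv"));

    DelimitedLineAggregator<Book> aggregator = new DelimitedLineAggregator<>();
    aggregator.setDelimiter(",");
    BeanWrapperFieldExtractor<Book> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames(new String[]{"id", "title", "author"});
    aggregator.setFieldExtractor(extractor);
    writer.setLineAggregator(aggregator);
    return writer;
}
```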

Transaction boundaries: The chunk commit interval interacts with your database’s locking behavior. If your reader query is slow and your database uses row-level locks, other processes may be blocked for the duration of the read phase.

Job restartability: By default, a failed Spring Batch job can be restarted from the last successful chunk. FlatFileItemWriter cooperates with this as long as state saving stays enabled (the default): on restart it truncates the file back to the last committed position, so you don’t get duplicates. If you disable state saving, or something else appends to the file between runs, a restart can produce duplicate rows. Note that shouldDeleteIfExists is already true by default, so a fresh (non-restart) run starts from an empty file.

