
Spring Batch Cheatsheet

  • Job
    • Container of one or more Steps; defines overall flow and restartability.
    • Identified by name + JobParameters → produces a JobInstance.
  • JobInstance
    • Logical run of a Job for a given parameter set (e.g. “EndOfDay for 2026-03-31”).
    • A JobInstance can have multiple JobExecutions (re-runs with same params).
  • JobExecution
    • One execution attempt of a JobInstance; holds BatchStatus, ExitStatus, timestamps, failure details.
    • Persisted in BATCH_JOB_EXECUTION via JobRepository.
  • Step
    • Single phase of work (tasklet or chunk), executed in sequence/flow within a Job.
    • Has its own StepExecution per JobExecution.
  • StepExecution
    • Execution attempt of a Step; has read/write/commit/skip counts, status, exceptions.
    • Persisted to BATCH_STEP_EXECUTION.
  • ExecutionContext
    • Serializable key–value store scoped to JobExecution or StepExecution.
    • Used for restart (e.g. save cursor, page, last processed ID) and cross-step data sharing.
    • Persisted by JobRepository; updated at every commit boundary in chunk steps.
  • JobRepository
    • Central persistence component for JobInstance, JobExecution, StepExecution, ExecutionContext (CRUD).
    • Backed by JDBC or Mongo via @EnableJdbcJobRepository / @EnableMongoJobRepository in recent versions.
  • JobLauncher
    • Starts Jobs: jobLauncher.run(job, jobParameters).
    • Retrieves a new JobExecution from JobRepository and triggers Job execution.
  • JobExplorer
    • Read‑only access to batch metadata for querying job/step history, executions, and parameters.
    • In newer Spring Batch versions, JobRepository extends JobExplorer, so a separate JobExplorer bean is no longer required.
  1. Create JobParameters (from CLI, scheduler, or API).
  2. JobLauncher asks JobRepository for a new JobExecution for Job + parameters.
  3. Job executes Steps according to flow; each Step creates StepExecutions and persists progress to JobRepository.
  4. On completion, JobExecution and StepExecution statuses are updated (COMPLETED, FAILED, etc.).
  5. Restart of failed JobInstance creates a new JobExecution with same parameters; step restartability rules apply.
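The launch sequence above (steps 1–2) can be sketched as follows. This assumes a configured `Job` and `JobLauncher` are injected; the `businessDate` parameter name is illustrative:

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;

public class EndOfDayRunner {

    private final JobLauncher jobLauncher;
    private final Job endOfDayJob;

    public EndOfDayRunner(JobLauncher jobLauncher, Job endOfDayJob) {
        this.jobLauncher = jobLauncher;
        this.endOfDayJob = endOfDayJob;
    }

    public void runFor(String businessDate) throws Exception {
        // Identifying parameters define the JobInstance; re-running with the
        // same identifying set restarts that instance instead of creating a new one.
        JobExecution exec = jobLauncher.run(endOfDayJob,
                new JobParametersBuilder()
                        .addString("businessDate", businessDate)               // identifying
                        .addLong("attempt", System.currentTimeMillis(), false) // non-identifying
                        .toJobParameters());
        System.out.println(exec.getStatus()); // COMPLETED, FAILED, ...
    }
}
```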

  • Prefer Java Config:
    • Type-safety, IDE refactoring, conditional config, shared DSLs.
    • Use JobBuilder, StepBuilder directly with injected JobRepository and PlatformTransactionManager in Spring Batch 5+.
  • XML may still appear in legacy apps; migrate to Java Config gradually around job boundaries, not step‑by‑step inside a job.

@EnableBatchProcessing internals (high level)

  • @EnableBatchProcessing:
    • Bootstraps core infrastructure beans: JobRepository, JobLauncher, transaction manager integration, job/step builders.
    • In older versions, implicitly configured JDBC JobRepository; in newer versions, it wires a ResourcelessJobRepository unless you add a store‑specific annotation.
  • Spring Batch 6+:
    • Use @EnableBatchProcessing + @EnableJdbcJobRepository or @EnableMongoJobRepository to configure storage and schema.
  • Singleton (default)
    • ItemReader/Writer/Processor, services, DAOs, etc.
  • @JobScope
    • Beans created per JobExecution.
    • Can inject @Value("#{jobParameters['date']}") or JobExecution context values to parameterize readers/writers.
  • @StepScope
    • Beans created per StepExecution.
    • Use for stateful readers/writers that depend on step ExecutionContext or need late‑binding of parameters.
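A minimal sketch of `@StepScope` late binding; the `date` job parameter and the reader's contents are illustrative:

```java
import java.util.List;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ScopedBeans {

    // Created once per StepExecution; 'date' is resolved from JobParameters
    // at step start (late binding), not at application startup.
    @Bean
    @StepScope
    public ListItemReader<String> datedReader(
            @Value("#{jobParameters['date']}") String date) {
        // Hypothetical: a real reader would query rows for 'date'.
        return new ListItemReader<>(List.of("row-for-" + date));
    }
}
```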

  • Single-step
    • ETL‑like jobs where all work is homogeneous; easier to reason about.
    • Good for stateless transformations and single-source/single-sink pipelines.
  • Multi-step
    • Separate phases: extract → transform → aggregate → export → notify.
    • Use when:
      • Different transactional or isolation requirements per phase.
      • Different resources (DB → file → MQ).
      • Need conditional flows based on intermediate results.
  • Flow
    • Directed graph of Steps with transitions based on ExitStatus (e.g. on("FAILED").to(recoveryStep)).
  • Split
    • Execute multiple Flows/Steps in parallel using TaskExecutor.
    • Use for independent workloads sharing same Job (e.g. per‑country ETLs).
  • Parallel processing
    • Choose between:
      • Parallel flows (coarse-grained parallelism).
      • Multi-threaded step.
      • Partitioning.
      • Remote chunking / distributed workers.
  • JobExecutionDecider
    • Encapsulates branching logic outside steps; returns a FlowExecutionStatus.
    • Use for data-driven routing (e.g. skip export if no data, rerun step on specific conditions).
  • Design each Step as idempotent per JobInstance:
    • Re-running after failure must not corrupt data (e.g. use “upsert” instead of “insert only”, track processed IDs).
  • Use ExecutionContext to store cursor positions, last processed keys, and checkpoint metadata.
  • Mark Steps as allowStartIfComplete(true) only if they are safely re‑executable without side effects.
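A `JobExecutionDecider` for the "skip export if no data" case mentioned above might look like this. A sketch: `NO_DATA` is a custom status, and the decider is assumed to run directly after a read step:

```java
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;

public class NoDataDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution,
                                      StepExecution stepExecution) {
        // Route based on what the previous step actually read.
        return stepExecution.getReadCount() == 0
                ? new FlowExecutionStatus("NO_DATA")
                : FlowExecutionStatus.COMPLETED;
    }
}
```

Wired into a flow roughly as `.next(decider).on("NO_DATA").end().from(decider).on("*").to(exportStep)`, keeping the branching logic out of the steps themselves.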

| Aspect                      | Tasklet                              | Chunk-oriented                            |
| --------------------------- | ------------------------------------ | ----------------------------------------- |
| Typical usage               | One-shot tasks (cleanup, file move)  | High-volume record processing             |
| Data volume                 | Small to moderate                    | Large datasets (millions of records)      |
| Transaction mgmt            | Manual/explicit                      | Built-in chunk transaction demarcation    |
| Commit/rollback granularity | Whole step or custom                 | Per chunk (e.g. 100 items)                |
| Complexity                  | Simpler for procedural flows         | Better for streaming structured data      |
| Restart support             | Needs custom logic                   | Built-in checkpoints via ExecutionContext |
  • Use Tasklet:
    • File housekeeping, triggering stored procedures, sending notifications, pre/post ETL tasks.
  • Use Chunk:
    • Record‑by‑record processing from DB/files/queues with checkpointing and partial commits.

  • ItemReader<I>
    • Provides stream of items; returns null to signal end.
    • Must be thread‑safe if used in multi-threaded step.
  • ItemProcessor<I, O>
    • Optional; transforms, validates, enriches items.
    • Returning null filters the item out.
  • ItemWriter<O>
    • Writes list of items (one chunk) atomically within transaction boundary.
  • chunk(size) defines:
    • Max items per transaction and per ExecutionContext update.
    • Trade‑offs:
      • Larger chunks → fewer transactions and better throughput, but a larger rollback (more re-work) on failure.
      • Smaller chunks → more transaction overhead, but less re-work after a rollback.
  • Tune per use case; often 50–1000, depending on DB latency and item cost.
  • Each chunk is:
    • Read → process → write within a single transaction by default.
    • On exception:
      • Chunk is rolled back completely.
      • StepExecution and ExecutionContext updated to reflect rollback.
  • Use stateless readers/writers or ensure internal state is consistent with rollback (e.g. don’t advance external cursors outside transaction).
  • Enable with .faultTolerant() on chunk step.
  • Retry:
    • .retry(Exception.class).retryLimit(3) retries item processing/writing within same chunk.
    • Use for transient errors (DB deadlocks, network blips).
  • Skip:
    • .skip(SomeBusinessException.class).skipLimit(100) to continue processing despite bad records.
    • Configure SkipPolicy for complex decisions.
  • Prefer:
    • Retry for transient, non‑data issues.
    • Skip for bad input data; log to error table or DLQ.

  • Use for CSV/TSV/fixed‑width files.
  • Key configuration:
    • resource, lineMapper, linesToSkip, encoding, strict mode.
  • Production tips:
    • Use DefaultLineMapper + DelimitedLineTokenizer/FixedLengthTokenizer.
    • Validate input (line length, column count) in processor and send bad lines to error file.
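A `FlatFileItemReader` configured via its builder, reflecting the tips above; the file name and the `UserRow` record are illustrative:

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.core.io.FileSystemResource;

public class FlatFileReaderConfig {

    // Hypothetical record matching the CSV columns.
    public record UserRow(String id, String name, String email) {}

    public FlatFileItemReader<UserRow> userFileReader() {
        return new FlatFileItemReaderBuilder<UserRow>()
                .name("userFileReader")            // required; keys ExecutionContext entries
                .resource(new FileSystemResource("users.csv"))
                .linesToSkip(1)                    // skip header row
                .encoding("UTF-8")
                .strict(true)                      // fail fast if the file is missing
                .delimited()
                .names("id", "name", "email")
                .targetType(UserRow.class)
                .build();
    }
}
```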
  • JdbcPagingItemReader
    • Page‑based queries with ORDER BY key; safe for restarts; avoids holding DB cursor.
    • Better for long jobs where DB connections may be interrupted.
  • JdbcCursorItemReader
    • Streaming via DB cursor; fewer queries but long-lived connection.
    • Risk of cursor timeout / network issues; carefully tune fetch size and query timeouts.
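A `JdbcPagingItemReader` sketch with a deterministic sort key, as described above; the `account` table and `Account` record are hypothetical:

```java
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.builder.JdbcPagingItemReaderBuilder;
import org.springframework.jdbc.core.DataClassRowMapper;

public class PagingReaderConfig {

    public record Account(long id, String owner, long balanceCents) {}

    public JdbcPagingItemReader<Account> accountReader(DataSource dataSource) {
        return new JdbcPagingItemReaderBuilder<Account>()
                .name("accountReader")
                .dataSource(dataSource)
                .selectClause("SELECT id, owner, balance_cents")
                .fromClause("FROM account")
                .whereClause("WHERE status = :status")
                .parameterValues(Map.of("status", "ACTIVE"))
                .sortKeys(Map.of("id", Order.ASCENDING)) // deterministic ORDER BY → restart-safe
                .pageSize(500)
                .rowMapper(new DataClassRowMapper<>(Account.class))
                .build();
    }
}
```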
  • Uses JPA with pagination to fetch entities.
  • Prefer when existing domain model and JPA mapping already exist.
  • Avoid N+1 problems:
    • Use fetch joins where possible.
    • Keep entity graphs minimal.
  • For Kafka or MQ:
    • Typically integrate via a custom ItemReader that polls from topic/queue with batch size.
    • Ensure:
      • Offset/ack management is integrated with chunk transaction.
      • ExecutionContext stores last committed offset for restart.
  • Custom readers:
    • Implement ItemStream when you need open/update/close hooks and checkpointing.
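A minimal custom reader with `ItemStream`-style checkpointing, shown here over an in-memory list for illustration; a real implementation would read from an external source the same way:

```java
import java.util.List;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemStreamReader;

public class CheckpointingReader implements ItemStreamReader<String> {

    private static final String POSITION_KEY = "checkpointingReader.position";

    private final List<String> source;
    private int position;

    public CheckpointingReader(List<String> source) {
        this.source = source;
    }

    @Override
    public void open(ExecutionContext ctx) throws ItemStreamException {
        // On restart, resume from the last committed position.
        position = ctx.containsKey(POSITION_KEY) ? ctx.getInt(POSITION_KEY) : 0;
    }

    @Override
    public void update(ExecutionContext ctx) throws ItemStreamException {
        // Called at each commit boundary; persists the checkpoint.
        ctx.putInt(POSITION_KEY, position);
    }

    @Override
    public void close() throws ItemStreamException {
        // Release external resources here (files, cursors, consumers).
    }

    @Override
    public String read() {
        return position < source.size() ? source.get(position++) : null; // null → end of data
    }
}
```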
  • FlatFileItemWriter
    • For CSV/TSV exports; configure header/footer callbacks for metadata.
  • JdbcBatchItemWriter
    • Batch inserts/updates via JDBC batchUpdate; excellent throughput for DB sinks.
    • Use parameterized SQL with ItemPreparedStatementSetter or bean property mapping.
  • Other writers:
    • JPA writers, JMS/Kafka writers, REST writers (via RestTemplate/WebClient) as custom implementations.
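A `JdbcBatchItemWriter` sketch using bean-property mapping; the upsert SQL (PostgreSQL syntax) and `users` table are illustrative:

```java
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;

public class WriterConfig {

    // Simple JavaBean so beanMapped() can resolve :id/:name/:email.
    public static class User {
        private final String id, name, email;
        public User(String id, String name, String email) {
            this.id = id; this.name = name; this.email = email;
        }
        public String getId() { return id; }
        public String getName() { return name; }
        public String getEmail() { return email; }
    }

    public JdbcBatchItemWriter<User> userWriter(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<User>()
                .dataSource(dataSource)
                // Upsert keeps the step idempotent on restart.
                .sql("""
                     INSERT INTO users (id, name, email)
                     VALUES (:id, :name, :email)
                     ON CONFLICT (id) DO UPDATE
                       SET name = EXCLUDED.name, email = EXCLUDED.email
                     """)
                .beanMapped() // binds named parameters to item properties
                .build();
    }
}
```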

  • Start with moderate chunk size (e.g. 100–500).
  • Measure:
    • TPS, DB CPU, I/O, rollback blast radius.
  • Increase until:
    • DB becomes bottleneck (locks, contention) or memory spikes.
  • Use split flows or multiple Jobs scheduled concurrently for independent workloads.
  • Ensure:
    • No shared mutable state in readers/writers.
    • DB indexes support concurrent access patterns.
  • Multi-threaded step
    • One logical step; multiple threads call read/process/write concurrently.
    • Requires thread‑safe reader/writer and careful transactional design.
  • Partitioning
    • Master step divides input domain into partitions; each partition runs its own slave step instance (could be local or remote).
    • Easier to reason about as independent mini-jobs, often better for very large datasets.
  • Rule of thumb:
    • Prefer partitioning when data can be cleanly sharded (ID ranges, date buckets, tenants).
  • Asynchronous processors/writers can offload expensive downstream calls:
    • E.g. AsyncItemProcessor + AsyncItemWriter pattern.
  • Use only when ordering doesn’t matter or can be restored.
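The AsyncItemProcessor + AsyncItemWriter pattern can be sketched as below. It requires the spring-batch-integration module, and the delegate processor/writer are assumed to exist:

```java
import org.springframework.batch.integration.async.AsyncItemProcessor;
import org.springframework.batch.integration.async.AsyncItemWriter;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

public class AsyncConfig {

    // Wraps a delegate so each item is processed on the task executor;
    // the matching writer waits on the resulting Futures.
    public AsyncItemProcessor<String, String> asyncProcessor(
            ItemProcessor<String, String> delegate) {
        AsyncItemProcessor<String, String> async = new AsyncItemProcessor<>();
        async.setDelegate(delegate);                        // e.g. a slow enrichment call
        async.setTaskExecutor(new SimpleAsyncTaskExecutor("proc-"));
        return async;
    }

    public AsyncItemWriter<String> asyncWriter(ItemWriter<String> delegate) {
        AsyncItemWriter<String> async = new AsyncItemWriter<>();
        async.setDelegate(delegate);                        // unwraps each Future, then writes
        return async;
    }
}
```

The step is then typed `.<String, Future<String>>chunk(...)`, since the async processor emits `Future`s that the async writer unwraps at write time.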
  • Create proper indexes on join keys, where clauses, and foreign keys.
  • Use bulk operations (JdbcBatchItemWriter) instead of per-row operations.
  • Separate metadata DB (JobRepository) from business data DB when possible to avoid interference.

  • Master:
    • Reads items, groups into chunks, sends to workers over messaging (e.g. MQ/Kafka).
  • Worker:
    • Receives chunk, processes + writes, sends back result.
  • Use when:
    • You need horizontal scaling across machines and central coordination.
  • Master step builds partitions (e.g. N shards by ID range), each handled by a slave step.
  • Partitions can run:
    • In same JVM (local partitioning).
    • On remote nodes (partition handler over messaging/RPC).
  • Ensure:
    • Each partition is idempotent and handles its own shard boundaries.
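Partitioning by ID range can be sketched with a custom `Partitioner`; the `minId`/`maxId` context keys are a convention that the worker step's reader must also use:

```java
import java.util.HashMap;
import java.util.Map;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Splits an ID range into roughly equal shards; each worker step reads
// its own minId..maxId from its step ExecutionContext.
public class IdRangePartitioner implements Partitioner {

    private final long minId;
    private final long maxId;

    public IdRangePartitioner(long minId, long maxId) {
        this.minId = minId;
        this.maxId = maxId;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        long span = (maxId - minId + gridSize) / gridSize; // ceiling division
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext ctx = new ExecutionContext();
            ctx.putLong("minId", minId + i * span);
            ctx.putLong("maxId", Math.min(minId + (i + 1) * span - 1, maxId));
            partitions.put("partition" + i, ctx);
        }
        return partitions;
    }
}
```

The master step then wires it with `.partitioner("workerStep", partitioner).step(workerStep).gridSize(n)` plus a `TaskExecutor` for local parallelism.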
  • Run same Job on multiple nodes with:
    • Different JobParameters (e.g. region=EMEA/APAC).
    • Or as partition workers coordinated by a master.
  • Use distributed locking around JobRepository (db-level) to avoid duplicate JobExecutions.

  • Configure on step via .faultTolerant() with:
    • .skip(...), .skipLimit(...)
    • .retry(...), .retryLimit(...)
    • Custom SkipPolicy or RetryPolicy if DSL not enough.
  • Attaching listeners:
    • SkipListener, RetryListener, ItemReadListener, ItemWriteListener for auditing and side‑effects.
  • For bad items:
    • Write failed records and error details to:
      • Error table.
      • Error flat file.
      • DLQ topic.
  • Ensure:
    • Error-record/DLQ writes are transactionally tied to the skip (e.g. via a SkipListener, invoked just before the chunk commits) so failure context is not lost.
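A `SkipListener` that records skipped items in a hypothetical error table; its callbacks fire within the transaction of the chunk that commits after the skip, so the audit row is committed (or rolled back) together with the chunk:

```java
import org.springframework.batch.core.SkipListener;
import org.springframework.jdbc.core.JdbcTemplate;

public class ErrorTableSkipListener implements SkipListener<String, String> {

    private final JdbcTemplate jdbcTemplate;

    public ErrorTableSkipListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public void onSkipInRead(Throwable t) {
        log(null, "READ", t);
    }

    @Override
    public void onSkipInProcess(String item, Throwable t) {
        log(item, "PROCESS", t);
    }

    @Override
    public void onSkipInWrite(String item, Throwable t) {
        log(item, "WRITE", t);
    }

    private void log(String item, String phase, Throwable t) {
        // batch_skipped_record is a hypothetical error table.
        jdbcTemplate.update(
            "INSERT INTO batch_skipped_record (item, phase, error) VALUES (?, ?, ?)",
            item, phase, t.getMessage());
    }
}
```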

Transaction boundaries in chunk processing

  • Default:
    • One transaction per chunk: all reads (for that chunk), processing, and write participate.
  • For readers over non-transactional resources (e.g. flat file reads):
    • Reads are not rolled back, but this is still safe: the restart position is tracked in the ExecutionContext, and the critical DB writes remain transactional.
  • Configure via PlatformTransactionManager (e.g. DataSourceTransactionManager).
  • Use:
    • READ_COMMITTED as default for DB interactions.
    • Avoid READ_UNCOMMITTED unless you accept dirty reads.
  • For heavy batch operations:
    • Minimize locking by scanning with appropriate indexes and avoiding long‑running transactions.
  • Performing extra DB writes or side‑effects outside Spring transaction → inconsistencies on rollback.
  • Keeping in‑memory caches that become stale or inconsistent on retry/skips.
  • Long‑running transactions with huge chunks causing lock contention and log growth.

  • Core tables (JDBC JobRepository):
    • BATCH_JOB_INSTANCE: job name + JobParameters identity.
    • BATCH_JOB_EXECUTION: executions, statuses, timestamps.
    • BATCH_STEP_EXECUTION: per-step execution metrics.
    • BATCH_JOB_EXECUTION_PARAMS: stored JobParameters per execution.
    • BATCH_*_CONTEXT: ExecutionContext persistence.
  • Use DBA tools or SQL queries to monitor job states, failures, and durations.
  • Programmatic queries:
    • Find JobInstances by name.
    • Get executions for an instance.
    • Inspect StepExecutions and ExecutionContext for debugging.
  • Useful for:
    • Operations dashboards.
    • Custom admin UIs.
  • Best practices:
    • Log Job/Step start+end with parameters and identifiers.
    • Include read/write/skip counts and timing.
    • Expose metrics via Micrometer (timers, counters per job/step).
    • Tag metrics with jobName, stepName, status.
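The programmatic metadata queries above can be sketched with `JobExplorer`, e.g. for a simple operations view:

```java
import java.util.List;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;

public class JobHistory {

    private final JobExplorer jobExplorer;

    public JobHistory(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;
    }

    // Prints the status of the last few runs of a job by name.
    public void printRecentRuns(String jobName) {
        List<JobInstance> instances = jobExplorer.getJobInstances(jobName, 0, 5);
        for (JobInstance instance : instances) {
            for (JobExecution exec : jobExplorer.getJobExecutions(instance)) {
                System.out.printf("%s #%d: %s (%s -> %s)%n",
                        jobName, instance.getInstanceId(),
                        exec.getStatus(), exec.getStartTime(), exec.getEndTime());
            }
        }
    }
}
```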

  • Provides:
    • JobLauncherTestUtils to launch jobs/steps with test parameters.
    • JobRepositoryTestUtils to clean up executions between tests.
  • Typical usage:
    • Annotate test with @SpringBatchTest + @SpringBootTest/@SpringJUnitConfig.
```java
import static org.assertj.core.api.Assertions.assertThat;
import org.junit.jupiter.api.Test;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.batch.test.context.SpringBatchTest;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBatchTest
@SpringBootTest
class MyJobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    void jobCompletes() throws Exception {
        JobParameters params = jobLauncherTestUtils.getUniqueJobParameters();
        JobExecution exec = jobLauncherTestUtils.launchJob(params);
        assertThat(exec.getExitStatus().getExitCode()).isEqualTo("COMPLETED");
    }
}
```
  • Step tests
    • Use launchStep("stepName", params) to test logic and fault tolerance of a single step.
    • Mock or stub ItemReaders/Writers to isolate business logic.
  • Job tests
    • Validate full flow, transitions, and integration of steps and infrastructure.
  • Use:
    • In-memory implementations (e.g. ListItemReader, ListItemWriter).
    • Mockito to stub I/O boundaries when unit-testing processors/services.
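A plain unit test with in-memory items and no Spring context; the uppercase processor is a hypothetical stand-in for your business logic:

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.support.ListItemReader;

class UppercaseProcessorTest {

    // Processor under test: uppercases items, filters out blanks (null → filtered).
    private final ItemProcessor<String, String> processor =
            item -> item.isBlank() ? null : item.toUpperCase();

    @Test
    void processesAndFiltersWithoutSpringContext() throws Exception {
        ListItemReader<String> reader =
                new ListItemReader<>(List.of("a", " ", "b"));

        List<String> written = new ArrayList<>();
        String item;
        while ((item = reader.read()) != null) {
            String out = processor.process(item);
            if (out != null) {           // null drops the item, as in a real chunk
                written.add(out);
            }
        }
        assertThat(written).containsExactly("A", "B");
    }
}
```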

  • Make Steps idempotent:
    • Use business keys; perform upserts or compare‑and‑set instead of blind inserts.
    • Track processed items (e.g. processed flag, high‑water mark) to avoid double processing.
  • Avoid relying solely on JobExecution status; handle partial retries at item level.
  • For recoverable failures:
    • Restart same JobInstance (same JobParameters) to resume from checkpoints.
  • For parameter changes:
    • New JobInstance with new parameters (e.g. different date range).
  • Keep:
    • JobRepository history long enough for forensic analysis and re-runs.
  • Ensure external side‑effects (emails, remote calls, MQ sends):
    • Are either part of the same transaction (e.g. transactional outbox) or have compensating actions.
  • Use outbox pattern or event table for cross‑system side-effects.
  • Version job names (e.g. userImportV2) when changing semantics or schema.
  • Maintain compatibility:
    • Keep old job definition for in‑flight or historical restarts.
    • Migrate only when repository and data model allow clean cut‑over.

  • Memory issues
    • Holding entire result sets in memory (e.g. reading all rows before writing).
    • Using ListItemWriter in production instead of streaming writers.
  • Improper transaction handling
    • Doing expensive work outside Spring managed transactions.
    • Misconfigured transaction manager (wrong data source, nested transactions).
  • Restart failures
    • Non‑serializable objects in ExecutionContext.
    • Readers/writers that do not implement ItemStream but require restart support.
    • Business logic relying on transient state that is lost after crash.

```java
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Bean
    public Job sampleJob(JobRepository jobRepository, Step sampleStep) {
        return new JobBuilder("sampleJob", jobRepository)
                .start(sampleStep)
                .build();
    }

    @Bean
    public Step sampleStep(JobRepository jobRepository,
                           PlatformTransactionManager txManager) {
        return new StepBuilder("sampleStep", jobRepository)
                .tasklet((contrib, ctx) -> {
                    // do work
                    return RepeatStatus.FINISHED;
                }, txManager)
                .build();
    }
}
```

```java
@Bean
public Step importUsersStep(JobRepository jobRepository,
                            PlatformTransactionManager txManager,
                            ItemReader<UserRow> reader,
                            ItemProcessor<UserRow, User> processor,
                            ItemWriter<User> writer) {
    return new StepBuilder("importUsers", jobRepository)
            .<UserRow, User>chunk(500, txManager)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .build();
}
```

```java
@Bean
public Step faultTolerantStep(JobRepository jobRepository,
                              PlatformTransactionManager txManager,
                              ItemReader<Input> reader,
                              ItemProcessor<Input, Output> processor,
                              ItemWriter<Output> writer) {
    return new StepBuilder("faultTolerant", jobRepository)
            .<Input, Output>chunk(100, txManager)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant()
            .retry(TransientDataAccessException.class)
            .retryLimit(3)
            .skip(BadRecordException.class)
            .skipLimit(100)
            .listener(new CustomSkipListener())
            .build();
}
```

```java
@Bean
public Job parallelJob(JobRepository jobRepository,
                       Flow flowA,
                       Flow flowB) {
    Flow split = new FlowBuilder<Flow>("splitFlow")
            .split(taskExecutor())
            .add(flowA, flowB)
            .build();
    return new JobBuilder("parallelJob", jobRepository)
            .start(split)
            .end()
            .build();
}

@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor exec = new ThreadPoolTaskExecutor();
    exec.setCorePoolSize(4);
    exec.setMaxPoolSize(8);
    exec.initialize();
    return exec;
}
```

Use this sheet as a quick reference during design/debugging: map issues to sections (config, chunking, scaling, transactions, monitoring) and cross-check against your JobRepository metadata and logs when diagnosing production problems.