The batch mechanisms
Most batch processing is done with chunk-oriented steps, which apply a repeating Read-Process-Write pattern to the data.
Typical chunk-oriented step behavior
A chunk-oriented step is made of:
- An Item Reader
- An Item Processor (optional)
- An Item Writer
The data to be processed is split into chunks; the chunk size can optionally be set with the item-count attribute.
Each chunk runs in its own transaction. The figure below shows the transactional behavior of a chunk-oriented step:
Figure 3.1. Chunk-oriented step: basic transactional behavior
This is the basic behavior, when everything runs smoothly and the step completes gracefully. The Read-Process-Write pattern runs within transaction boundaries.
Each time a transaction is committed, the job repository is updated so that the job can be restarted later if needed (provided the repository is kept in persistent storage).
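The loop described above can be modelled in a few lines. The following is a minimal, language-neutral sketch in Python, not Summer Batch code: `run_chunk_step`, the `repository` dictionary and its keys are hypothetical names chosen for illustration, and the "transaction" is reduced to a counter update at the point where a real commit would happen.

```python
from itertools import islice


def run_chunk_step(items, process, write, item_count, repository):
    """Read-Process-Write loop with one (simulated) transaction per chunk."""
    reader = iter(items)
    while True:
        # Read: pull at most item_count items for the next chunk.
        chunk = list(islice(reader, item_count))
        if not chunk:
            break
        # Process: transform each item; the processor is optional.
        processed = [process(item) for item in chunk] if process else chunk
        # Write: hand the whole chunk to the writer in one call.
        write(processed)
        # Commit: the chunk transaction would end here, and the repository
        # records the progress made so far (hypothetical bookkeeping).
        repository["commit_count"] += 1
        repository["items_read"] += len(chunk)
    repository["status"] = "COMPLETED"


written = []
repository = {"commit_count": 0, "items_read": 0, "status": "STARTED"}
run_chunk_step(range(1, 8), lambda x: x * 10, written.extend, 3, repository)
print(written)     # [10, 20, 30, 40, 50, 60, 70]
print(repository)  # {'commit_count': 3, 'items_read': 7, 'status': 'COMPLETED'}
```

With seven items and a chunk size of 3, the loop commits three times: two full chunks and a final chunk holding the remaining item.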
Now, what if something goes wrong?
Any uncaught exception thrown during a Read-Process-Write operation rolls back the current chunk transaction and causes the step to FAIL, and consequently the job to FAIL. If job restartability has been set up, the job repository is updated when the transaction is rolled back, so that the job can be restarted later.
The figure below illustrates this mechanism:
Figure 3.2. Chunk-oriented step: when a transaction rollback occurs
Caution
Whenever a step is left, whether it completed or failed because of an unexpected exception, the job repository is updated; this update uses a transaction of its own, clearly separate from any chunk-related transaction.
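The failure path can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the framework's implementation: `run_step`, `staged`, `store` and the `repository` keys are hypothetical, the rollback is modelled by discarding a per-chunk staging list, and the restart is assumed to resume after the last committed chunk.

```python
def run_step(items, write, item_count, repository, store):
    """Chunk loop in which a failing chunk is rolled back and the step fails."""
    start = repository["items_committed"]  # assumed: resume after last commit
    try:
        for offset in range(start, len(items), item_count):
            chunk = items[offset:offset + item_count]
            staged = []               # uncommitted work of the current chunk
            for item in chunk:
                staged.append(write(item))
            store.extend(staged)      # commit: chunk output becomes visible
            repository["items_committed"] = offset + len(chunk)
        repository["status"] = "COMPLETED"
    except Exception as error:
        # Rollback: 'staged' is discarded, so nothing from the failing chunk
        # reaches the store. The status update below stands for the separate
        # repository update performed when the step is left.
        repository["status"] = "FAILED"
        repository["error"] = str(error)


bad_items = {"d"}


def write(item):
    if item in bad_items:
        raise ValueError(f"cannot write {item!r}")
    return item.upper()


items = ["a", "b", "c", "d", "e"]
store = []
repository = {"items_committed": 0, "status": "STARTED"}

run_step(items, write, 2, repository, store)
first_run = (list(store), repository["status"])
print(first_run)   # (['A', 'B'], 'FAILED')

bad_items.clear()  # the cause of the failure has been fixed
run_step(items, write, 2, repository, store)
second_run = (list(store), repository["status"])
print(second_run)  # (['A', 'B', 'C', 'D', 'E'], 'COMPLETED')
```

In the first run, item "c" is lost together with "d" because both belong to the chunk that was rolled back; the restart then replays that whole chunk.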
The tasklet approach
A chunk-oriented step is not the only option; a tasklet (also known as a batchlet) can be used to cover a whole step. A batchlet is a class that implements the Summer.Batch.Core.Step.Tasklet.ITasklet interface.
Transactional support is provided by the Summer.Batch.Core.Step.Tasklet.TaskletStep class, which is in charge of executing the batchlet code: its DoExecute method delegates to the DoInTransaction method, which wraps the actual execution of the tasklet code (the Execute method implementation).
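The delegation chain can be pictured with a toy model. The sketch below is written in Python and only mirrors the method names mentioned above; `FakeTransaction`, `ToyTaskletStep` and the two sample tasklets are hypothetical, and the real TaskletStep class does considerably more. It also assumes that a failing tasklet triggers a rollback, in the same way as a failing chunk.

```python
class FakeTransaction:
    """Hypothetical stand-in for a real transaction; it only logs events."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("begin")
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Assumed behavior: roll back on any exception, commit otherwise.
        self.log.append("rollback" if exc_type else "commit")
        return False  # let the exception propagate to the caller


class ToyTaskletStep:
    """Toy counterpart of a tasklet step: execute the tasklet in a transaction."""

    def __init__(self, tasklet, log):
        self.tasklet = tasklet
        self.log = log

    def do_execute(self):
        try:
            self.do_in_transaction()
            return "COMPLETED"
        except Exception:
            return "FAILED"

    def do_in_transaction(self):
        with FakeTransaction(self.log):
            self.tasklet.execute()


class HelloTasklet:
    def __init__(self, log):
        self.log = log

    def execute(self):
        self.log.append("tasklet work")


class BrokenTasklet:
    def execute(self):
        raise RuntimeError("boom")


ok_log, bad_log = [], []
ok_status = ToyTaskletStep(HelloTasklet(ok_log), ok_log).do_execute()
bad_status = ToyTaskletStep(BrokenTasklet(), bad_log).do_execute()
print(ok_status, ok_log)    # COMPLETED ['begin', 'tasklet work', 'commit']
print(bad_status, bad_log)  # FAILED ['begin', 'rollback']
```

The tasklet itself contains no transaction handling; the surrounding step owns the transaction boundaries, which is the point of the wrapping described above.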