Question

We are using Spring Batch to read records from a CSV file and insert them into a database table.

Datasource and transaction manager

<!-- connect to database -->
    <bean id="dataSource"
        class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="oracle.jdbc.OracleDriver" />
        <property name="url" value="**********" />
        <property name="username" value="**********" />
        <property name="password" value="**********" />
    </bean>

    <bean id="transactionManagerTest"
        class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

Job configuration

<!-- stored job-meta in database -->
    <bean id="jobRepository"
        class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
        <property name="dataSource" ref="dataSource" />
        <property name="transactionManager" ref="transactionManagerTest"  />
        <property name="databaseType" value="Oracle" />
    </bean>


    <bean id="jobLauncher"
        class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
        <property name="taskExecutor">
            <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
        </property>
    </bean>

Below is the Spring Batch job configuration:

<batch:job id="reportJob">
        <batch:step id="step1" >
            <batch:tasklet transaction-manager="transactionManagerTest" >
                <batch:chunk reader="cvsFileItemReader" writer="mysqlItemWriter" commit-interval="5" skip-limit="1000" processor-transactional="true">

                    <!-- 
                     <batch:skip-policy>
                        <bean class="org.springframework.batch.core.step.skip.AlwaysSkipItemSkipPolicy" scope="step"/>
                    </batch:skip-policy> -->
                    <batch:skippable-exception-classes>
                        <batch:include class="java.lang.Exception" />
                    </batch:skippable-exception-classes>
                    <!-- <batch:retry-policy>
                        <bean class="org.springframework.retry.policy.NeverRetryPolicy" scope="step"/>
                    </batch:retry-policy> -->
                    <batch:listeners>
                         <batch:listener ref="itemWriterListner"/>
                    </batch:listeners>
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>

Here we have defined batch:skippable-exception-classes, which should handle the case where the insert statement for a record fails.

For example, say we have 10 records in the CSV file, and we read them one by one and insert them into the database table in chunks of 5. If the insert of, say, the 4th record fails, the job should skip only the 4th record and continue from the 5th record onwards.

But with batch:skippable-exception-classes, if the 4th record fails, processing starts again from the 1st record, so records 1-3 end up in the database table twice (duplicate records).

Please suggest whether I am missing any Spring Batch configuration property.


Solution

Something is wrong with the way your transaction manager is configured (it isn't included in the configuration above). bellabax is correct that when writing an item throws an exception, the entire chunk is rolled back and each item is then processed and written individually to determine which item in the chunk caused the error. The part that doesn't seem to be working for you is the rollback itself.

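UPDATE: The ResourcelessTransactionManager isn't a real transaction manager and is not intended for use with transactional resources (like databases). With it, the rollback of the failed chunk never undoes the inserts that already went through, which is why records 1-3 show up twice. Configure your job with a real transaction manager and you'll be fine.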

OTHER TIPS

I believe this is still a work in progress. See this JIRA issue and this one. It has also been under discussion for quite some time: Spring forum post 1 and post 2. Additional votes on the two JIRA issues might raise their priority and make the feature more likely to be added.

This is the standard behaviour when an exception occurs during the write phase.

Items 1 to 5 are written one by one but committed as a single chunk, so when an error occurs Spring Batch (SB) cannot tell which item caused it. How, then, can SB decide which item(s) to skip?

SB rolls the chunk back and restarts the write phase, this time writing the items one at a time (as if commit-interval="1" were set) to detect the bad item and pass it to SkipListener.onSkipInWrite(item, throwable).
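To make the item-by-item scan visible, a skip listener can be registered on the step, so the SkipListener callback for write failures (onSkipInWrite in the Spring Batch API) receives each rejected item. The fragment below is only a sketch: com.example.batch.LoggingSkipListener is a hypothetical class assumed to implement org.springframework.batch.core.SkipListener, and the transactionManager bean is assumed to be a real DataSourceTransactionManager rather than the one in the question.

```xml
<!-- Hypothetical class: assumed to implement
     org.springframework.batch.core.SkipListener and log skipped items -->
<bean id="skipListener" class="com.example.batch.LoggingSkipListener" />

<batch:job id="reportJob">
    <batch:step id="step1">
        <!-- Assumption: "transactionManager" is a DataSourceTransactionManager -->
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="cvsFileItemReader" writer="mysqlItemWriter"
                         commit-interval="5" skip-limit="1000">
                <batch:skippable-exception-classes>
                    <batch:include class="java.lang.Exception" />
                </batch:skippable-exception-classes>
            </batch:chunk>
            <!-- Step-level listeners; the skip callbacks fire for items
                 rejected during the one-at-a-time rewrite of the chunk -->
            <batch:listeners>
                <batch:listener ref="skipListener" />
            </batch:listeners>
        </batch:tasklet>
    </batch:step>
</batch:job>
```

With the 10-record example from the question, the listener would be expected to be called once, for record 4, while the other records of that chunk are committed.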

To avoid duplicated items, write using a SELECT/UPDATE strategy instead of a simple INSERT.
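On Oracle, one way to get a similar effect without a separate SELECT is a single MERGE statement, which is a substitute for the SELECT/UPDATE approach and not the same thing. The sketch below assumes a hypothetical table REPORT with columns ID and NAME, and items that expose matching id and name bean properties; none of these names come from the question.

```xml
<!-- Sketch only: table REPORT and columns ID/NAME are hypothetical -->
<bean id="mysqlItemWriter"
      class="org.springframework.batch.item.database.JdbcBatchItemWriter">
    <property name="dataSource" ref="dataSource" />
    <!-- MERGE updates the row when the key already exists and inserts it
         otherwise, so writing the same item twice does not create a duplicate -->
    <property name="sql">
        <value>
            MERGE INTO report t
            USING (SELECT :id AS id, :name AS name FROM dual) s
            ON (t.id = s.id)
            WHEN MATCHED THEN UPDATE SET t.name = s.name
            WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name)
        </value>
    </property>
    <!-- Binds :id and :name to the item's bean properties -->
    <property name="itemSqlParameterSourceProvider">
        <bean class="org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider" />
    </property>
</bean>
```

This makes the write idempotent, but it only hides the symptom. The duplicates in the question come from a rollback that never happens, so a real transaction manager is still needed.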

Define the transaction manager as follows:

    <bean id="transactionManager"
        class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
        <property name="dataSource" ref="dataSource" />
    </bean>

This should solve the problem.
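Defining the bean is not enough on its own, because the configuration in the question still points at transactionManagerTest in two places. A sketch of the references that would need to change, assuming the new bean keeps the id transactionManager and everything else stays as in the question:

```xml
<!-- Job repository now uses the DataSourceTransactionManager -->
<bean id="jobRepository"
      class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="transactionManager" ref="transactionManager" />
    <property name="databaseType" value="Oracle" />
</bean>

<batch:job id="reportJob">
    <batch:step id="step1">
        <!-- Step transactions now go through the same manager, so a failed
             chunk is actually rolled back before items are retried one by one -->
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="cvsFileItemReader" writer="mysqlItemWriter"
                         commit-interval="5" skip-limit="1000">
                <batch:skippable-exception-classes>
                    <batch:include class="java.lang.Exception" />
                </batch:skippable-exception-classes>
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
```

The writer must also obtain its connections from the same dataSource bean; otherwise its inserts fall outside the managed transaction and the rollback again has no effect.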

Licensed under: CC-BY-SA with attribution