Spring Batch Admin: Unable to read a 663 MB flat file, OutOfMemoryError
30-05-2021

Question
I am using Spring Batch Admin 1.2.1 on Tomcat 7. I am trying to read a 663 MB file and am getting the following error. I have also increased Tomcat's heap size, but to no avail. The job is straightforward: it reads a flat file and saves it to the database with very little processing. Please help.
15:05:10,535 INFO http-apr-8181-exec-4 SimpleStepHandler:133 - Executing step: [load]
15:05:10,981 ERROR http-apr-8181-exec-4 AbstractStep:212 - Encountered an error executing the step
java.lang.OutOfMemoryError: Java heap space
at org.apache.catalina.loader.WebappClassLoader.findResourceInternal(WebappClassLoader.java:3098)
at org.apache.catalina.loader.WebappClassLoader.findResource(WebappClassLoader.java:1244)
at org.apache.catalina.loader.WebappClassLoader.getResource(WebappClassLoader.java:1407)
at org.springframework.core.io.ClassPathResource.exists(ClassPathResource.java:139)
at org.springframework.batch.item.file.FlatFileItemReader.doOpen(FlatFileItemReader.java:248)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.open(AbstractItemCountingItemStreamItemReader.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:309)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:131)
at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:119)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at $Proxy30.open(Unknown Source)
at org.springframework.batch.item.support.CompositeItemStream.open(CompositeItemStream.java:93)
at org.springframework.batch.core.step.item.ChunkMonitor.open(ChunkMonitor.java:105)
at org.springframework.batch.item.support.CompositeItemStream.open(CompositeItemStream.java:93)
at org.springframework.batch.core.step.tasklet.TaskletStep.open(TaskletStep.java:301)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:192)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:135)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:61)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:144)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:124)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:281)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:120)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:48)
The following is my setenv.bat entry in TOMCAT_HOME\bin:
set JAVA_OPTS=-DENVIRONMENT=dev -Dlog4j.debug -Xms1024m -Xmx1024m -XX:MaxPermSize=128m %JAVA_OPTS%
Below is my job configuration:
<batch:job id="ex_Job" parent="baseJob">
    <batch:step id="load" parent="baseStep" next="Recon">
        <batch:tasklet>
            <batch:chunk reader="Reader" writer="Writer" processor="Processor">
                <batch:skippable-exception-classes merge="true" />
            </batch:chunk>
            <batch:listeners merge="true" />
        </batch:tasklet>
    </batch:step>
    <batch:listeners merge="true" />
</batch:job>

<bean id="Reader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
    <property name="linesToSkip" value="1" />
    <property name="skippedLinesCallback" ref="headerLineCallbackHandler" />
    <property name="resource" value="${extract.input.data.dir}/#{jobParameters['input.file']}" />
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
            <property name="tokenizers">
                <map>
                    <entry key="*" value-ref="LineTokenizer" />
                    <entry key="T*" value-ref="TrailerTokenizer" />
                </map>
            </property>
            <property name="fieldSetMappers">
                <map>
                    <entry key="*" value-ref="fieldSetMapper" />
                </map>
            </property>
        </bean>
    </property>
</bean>

<beans:bean id="fieldSetMapper" class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper" />

<job id="baseJob" abstract="true">
    <listeners>
        <listener ref="skipCheckingJobListener" />
    </listeners>
</job>

<step id="baseStep" abstract="true">
    <tasklet>
        <chunk commit-interval="100" skip-limit="10000">
            <skippable-exception-classes>
                <include class="org.springframework.batch.item.file.FlatFileParseException" />
                <include class="org.springframework.batch.item.file.transform.IncorrectLineLengthException" />
                <include class="org.springframework.dao.DataAccessException" />
                <include class="org.springframework.batch.item.file.transform.ConversionException" />
            </skippable-exception-classes>
        </chunk>
        <listeners>
            <listener ref="stepExecutionListener" />
            <listener ref="genericSkipListener" />
        </listeners>
    </tasklet>
</step>
Solution
I think this is what's happening, but I'm unable to verify it myself.
Because your resource was injected as a ClassPathResource (the stack trace shows the failure inside ClassPathResource.exists), the exists() check resolves the file through the webapp class loader, which pulls the entire resource into memory. Instead, try prepending file: to the resource property, so your reader looks like this:
<property name="resource" value="file:${extract.input.data.dir}/#{jobParameters['input.file']}" />
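If you prefer to keep the property value prefix-free, the same effect can be achieved by injecting a FileSystemResource explicitly. This is a sketch based on the reader configuration above, not a verbatim fix; only the resource property is shown, the other properties stay as in the original bean:

```xml
<bean id="Reader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
    <!-- FileSystemResource streams the file from disk instead of resolving it
         through the webapp class loader as a ClassPathResource would. -->
    <property name="resource">
        <bean class="org.springframework.core.io.FileSystemResource">
            <constructor-arg value="${extract.input.data.dir}/#{jobParameters['input.file']}" />
        </bean>
    </property>
    <!-- linesToSkip, skippedLinesCallback, lineMapper unchanged -->
</bean>
```

Either form works; the file: prefix is just the shorter way to tell Spring's resource loader to create a file-system resource rather than a classpath one.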
I've used Spring Batch to process multi-GB files with this approach before, with no memory issues at all. The only time I did encounter an out-of-memory error was when a file didn't have the correct line endings, and the reader attempted to buffer the entire file as its first line.
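A quick way to rule out the bad-line-endings case before launching the job is to probe the first chunk of the file for a newline byte. This is a hypothetical diagnostic helper (hasNewlineWithin is not part of Spring Batch or any library), using only the JDK:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineEndingCheck {

    /**
     * Scans at most maxBytes of the file for a '\n' byte. If a large file
     * contains no newline at all (e.g. wrong line endings), a line-oriented
     * reader will try to buffer the whole file as a single line, which can
     * exhaust the heap.
     */
    public static boolean hasNewlineWithin(Path file, int maxBytes) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            int read = 0;
            while (read++ < maxBytes && (b = in.read()) != -1) {
                if (b == '\n') {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        // Probe the first 1 MB; any sane flat file has a newline well before that.
        System.out.println(hasNewlineWithin(file, 1024 * 1024)
                ? "newline found - line endings look OK"
                : "no newline in first 1 MB - check the file's line endings");
    }
}
```

If the probe reports no newline, fix the file (e.g. convert its line endings) before handing it to the job.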
Good luck.