Problem

I encountered the following problem when starting a Hama BSP job. The exception occurs while Hama loads and partitions the input data, before it runs any of my own code. This is a known problem that has been discussed on several websites, but unfortunately without a known cause (e.g. see here).

My BSP job works perfectly well when I run it on only part of the data set. However, when I run it on the full data set, the problem occurs.

How can I resolve or avoid this problem?

13/11/18 01:19:30 INFO bsp.FileInputFormat: Total input paths to process : 32
13/11/18 01:19:30 INFO bsp.FileInputFormat: Total input paths to process : 32
13/11/18 01:19:30 INFO bsp.BSPJobClient: Running job: job_201311180115_0002
13/11/18 01:19:33 INFO bsp.BSPJobClient: Current supersteps number: 0
13/11/18 01:19:33 INFO bsp.BSPJobClient: Job failed.
13/11/18 01:19:33 ERROR bsp.BSPJobClient: Error partitioning the input path.
java.io.IOException: Runtime partition failed for the job.
    at org.apache.hama.bsp.BSPJobClient.partition(BSPJobClient.java:465)
    at org.apache.hama.bsp.BSPJobClient.submitJobInternal(BSPJobClient.java:333)
    at org.apache.hama.bsp.BSPJobClient.submitJob(BSPJobClient.java:293)
    at org.apache.hama.bsp.BSPJob.submit(BSPJob.java:228)
    at org.apache.hama.bsp.BSPJob.waitForCompletion(BSPJob.java:235)
    at edu.wisc.cs.db.opener.hama.ConnectedEntityBspDriver.main(ConnectedEntityBspDriver.java:183)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hama.util.RunJar.main(RunJar.java:146)

Solution

After being stuck on this problem for several hours, I found that the error occurs whenever the number of input files is greater than the number of allowed BSP tasks. I think this is probably a bug that Hama should fix in the future.

A quick fix is to raise the maximum number of BSP tasks, which is set by the bsp.tasks.maximum property in the hama-site.xml file. For example, the following uses 10 instead of the default of 3:

<property>
  <name>bsp.tasks.maximum</name>
  <value>10</value>
  <description>The maximum number of BSP tasks that will be run simultaneously
  by a groom server.</description>
</property>
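
To pick a value, a rough sanity check like the one below may help. It is only a sketch and rests on an assumption that is not confirmed above: that the number of allowed BSP tasks for the cluster is the number of groom servers multiplied by bsp.tasks.maximum. The class name BspSlotCheck, its methods, and the groom server count of 4 are hypothetical and not part of Hama; the 32 input files come from the log in the question.

```java
// Hypothetical helper, not part of the Hama API.
// Assumption: allowed BSP tasks = groom servers * bsp.tasks.maximum.
final class BspSlotCheck {

    // True if the assumed total number of task slots covers every input file.
    static boolean hasEnoughSlots(int inputFiles, int groomServers, int tasksPerGroom) {
        return (long) groomServers * tasksPerGroom >= inputFiles;
    }

    // Smallest bsp.tasks.maximum for which hasEnoughSlots(...) holds.
    static int minTasksPerGroom(int inputFiles, int groomServers) {
        if (inputFiles < 1 || groomServers < 1) {
            throw new IllegalArgumentException("both counts must be at least 1");
        }
        // Ceiling division without floating point.
        return (inputFiles + groomServers - 1) / groomServers;
    }

    public static void main(String[] args) {
        int inputFiles = 32;   // "Total input paths to process : 32" in the log
        int groomServers = 4;  // hypothetical cluster size
        int tasksPerGroom = 3; // default bsp.tasks.maximum mentioned above

        System.out.println("enough slots with current setting: "
                + hasEnoughSlots(inputFiles, groomServers, tasksPerGroom));
        System.out.println("suggested bsp.tasks.maximum: "
                + minTasksPerGroom(inputFiles, groomServers));
    }
}
```

Under that assumption, 4 groom servers with the default of 3 give only 12 slots for 32 input files, and a value of at least 8 would be needed. On a single groom server the value would have to be at least 32.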
License: CC-BY-SA with attribution
Not affiliated with StackOverflow