Question

When I try to run my Job I am getting the following exception:

Exception in thread "main" java.io.IOException: Mkdirs failed to create /some/path
    at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:106)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:150)

Here /some/path is the value of hadoop.tmp.dir. However, when I run dfs -ls on /some/path I can see that it exists and that the dataset file is present (it was copied before launching the job). The path is also correctly defined in the Hadoop configs. Any suggestions will be appreciated. I am using Hadoop 0.21.

Solution

The directory being created is on the local disk (your job jar is unpacked into it), not in HDFS, so seeing the path with dfs -ls does not help. Check that you have permission to create this directory locally (try mkdir from the command line).
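
If you would rather check this from Java than from the shell, something like the following could be used. It is only a sketch and not Hadoop's own code: the class name LocalDirCheck and the method ensureLocalDirectory are made up for illustration, and /some/path is the placeholder from the question. It tries to create the directory on the local filesystem and, on failure, reports the nearest existing ancestor, which is usually where the permission problem or name clash is.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper: checks that a directory can be created on the LOCAL disk. */
final class LocalDirCheck {

    /**
     * Creates dir (and any missing parents) locally. If that fails, throws an
     * IOException that names the nearest existing ancestor and says whether it
     * is a directory and whether it is writable.
     */
    static void ensureLocalDirectory(File dir) throws IOException {
        if (dir.mkdirs() || dir.isDirectory()) {
            return; // created just now, or already present as a directory
        }
        // Walk up until we reach something that exists on disk.
        File ancestor = dir.getAbsoluteFile();
        while (ancestor != null && !ancestor.exists()) {
            ancestor = ancestor.getParentFile();
        }
        throw new IOException("Mkdirs failed to create " + dir
                + " (nearest existing ancestor: " + ancestor
                + ", directory=" + (ancestor != null && ancestor.isDirectory())
                + ", writable=" + (ancestor != null && ancestor.canWrite()) + ")");
    }

    public static void main(String[] args) throws IOException {
        // "/some/path" is the placeholder from the question; pass your hadoop.tmp.dir value instead.
        File dir = new File(args.length > 0 ? args[0] : "/some/path");
        ensureLocalDirectory(dir);
        System.out.println("OK: " + dir.getAbsolutePath() + " is a usable local directory");
    }
}
```

Run it as the same user that launches the job. If the reported ancestor is a regular file rather than a directory, the cause is a name clash, not permissions.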

OTHER TIPS

Just ran into this problem running Mahout from CDH4 in standalone mode on my MacBook Air.

The issue is that a /tmp/hadoop-xxx/xxx/LICENSE file and a /tmp/hadoop-xxx/xxx/license directory are being created on a case-insensitive file system when unjarring the mahout jobs.

I was able to work around this by deleting META-INF/LICENSE from the jar file like this:

zip -d mahout-examples-0.6-cdh4.0.0-job.jar META-INF/LICENSE

and then verified it with

jar tvf mahout-examples-0.6-cdh4.0.0-job.jar | grep -i license
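
To find out which entries would clash before deleting anything, a small scan along these lines might help. It is a sketch under assumptions: the class name JarCaseCollisions and the method findCollisions are invented for illustration, and it simply lower-cases paths, which only approximates how a case-insensitive filesystem compares names. It lists every path in the jar, including parent directories implied by an entry, that differs from another path only by letter case.

```java
import java.io.IOException;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

/** Hypothetical helper: finds jar paths that collide on a case-insensitive filesystem. */
final class JarCaseCollisions {

    /** Maps each lower-cased path to the distinct spellings found, keeping only real collisions. */
    static Map<String, SortedSet<String>> findCollisions(List<String> entryNames) {
        Map<String, SortedSet<String>> byFoldedName = new TreeMap<>();
        for (String name : entryNames) {
            // Directory entries end with '/'; drop it so files and directories compare equal.
            String path = name.endsWith("/") ? name.substring(0, name.length() - 1) : name;
            // Register the entry itself and every parent directory it implies.
            for (int end = path.length(); end > 0; end = path.lastIndexOf('/', end - 1)) {
                String prefix = path.substring(0, end);
                byFoldedName
                        .computeIfAbsent(prefix.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                        .add(prefix);
            }
        }
        // A collision is a folded name with more than one original spelling.
        byFoldedName.values().removeIf(spellings -> spellings.size() < 2);
        return byFoldedName;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarCaseCollisions.java <path-to-jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            List<String> names = jar.stream().map(JarEntry::getName).collect(Collectors.toList());
            findCollisions(names).values().forEach(s -> System.out.println("collision: " + s));
        }
    }
}
```

For a jar containing both a META-INF/LICENSE file and entries under META-INF/license/, this reports the two spellings as one collision.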

Hope this helps!

The problem is specific to OS X: by default the filesystem on a Mac is case-insensitive (case-preserving but case-insensitive, which in my opinion is very bad).

A hack to circumvent this is to create a case-sensitive .dmg disk image with Disk Utility and mount it where you need it (i.e. at hadoop.tmp.dir or /tmp) with the following command (as a superuser):

sudo hdiutil attach -mountpoint /tmp <my_image>.dmg

I hope it helps.

I have run into this issue several times in the past, and I believe it is Mac-specific. Since I use Maven to build my project, I was able to get around it by adding the following plugin configuration to my pom.xml:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

In my case, the following plugin configuration in the pom.xml of my Maven project worked on a Mac:

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.0</version>
    <configuration>
      <shadedArtifactAttached>true</shadedArtifactAttached>
    </configuration>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <filters>
            <filter>
              <artifact>*:*</artifact>
              <excludes>
                <exclude>META-INF/*.SF</exclude>
                <exclude>META-INF/*.DSA</exclude>
                <exclude>META-INF/*.RSA</exclude>
                <exclude>META-INF/LICENSE*</exclude>
                <exclude>license/*</exclude>
              </excludes>
            </filter>
          </filters>
        </configuration>
      </execution>
    </executions>
  </plugin>

Check whether the required disk space is available. In most cases this problem is caused by a lack of space.
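
One way to check the free space from Java is sketched below. The class name LocalSpaceCheck and the method usableBytes are hypothetical, and /some/path is again the placeholder from the question. Because File.getUsableSpace() returns 0 for a path that does not exist, the sketch first walks up to the nearest existing ancestor and asks about the partition that holds it.

```java
import java.io.File;

/** Hypothetical helper: reports usable space on the partition that would hold a directory. */
final class LocalSpaceCheck {

    /** Returns the usable bytes on the partition of dir's nearest existing ancestor (0 if none). */
    static long usableBytes(File dir) {
        File existing = dir.getAbsoluteFile();
        while (existing != null && !existing.exists()) {
            existing = existing.getParentFile();
        }
        return existing == null ? 0L : existing.getUsableSpace();
    }

    public static void main(String[] args) {
        // "/some/path" is the placeholder from the question; pass your hadoop.tmp.dir value instead.
        File dir = new File(args.length > 0 ? args[0] : "/some/path");
        System.out.println(usableBytes(dir) / (1024 * 1024) + " MiB usable for " + dir);
    }
}
```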

I ran into this same issue while building MapReduce jobs on a Mac with macOS Sierra. The same code runs without problems on Ubuntu Linux (14.04 LTS and 16.04 LTS). The MapReduce distribution was 2.7.3, configured for single-node, standalone operation. The problem appears to be related to copying license files into a META-INF directory. My problem was solved by adding a transformer to the Maven Shade plugin configuration, specifically ApacheLicenseResourceTransformer.

Here is the relevant section of the pom.xml, which goes inside the <build> section:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.0.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>path.to.your.main.class.goes.here</mainClass>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer">
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

Notice that I also use the ManifestResourceTransformer to specify the main class for the MapReduce job.

In my case I just renamed the file "log_test.txt", because the OS (Ubuntu) was trying to create a folder with the same name: "log_test.txt/__results.json".

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow