Question

I have to archive HDFS files frequently. The files have to be compressed in the bzip2 format using Java code. What I currently do is the following:

  1. Move the input files to a local location with hdfs.moveToLocalFile.
  2. Compress them with the bzip2 command.
  3. Move the .bz2 files back to HDFS, to another location, with hdfs.moveFromLocalFile.

I'm using Hadoop 1.1.2. Is there an API that can bzip2 the files directly in HDFS, without the local copy and the separate bzip2 step?

Also, I currently use the Linux shell command to bzip2 the files. Can somebody show me how to do the compression from Java code instead?


Solution

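import java.io.IOException;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.BZip2Codec;
import org.apache.hadoop.io.compress.CompressionOutputStream;

// Reads `source` from the configured file system and writes a bzip2-compressed copy under `destination`.
public void addFile(String source, String destination, Configuration paramConfiguration) throws IOException, URISyntaxException {
    FileSystem localFileSystem = FileSystem.get(paramConfiguration);
    // The file name is everything after the last '/' in the source path.
    String str1 = source.substring(source.lastIndexOf('/') + 1, source.length());
    if (destination.charAt(destination.length() - 1) != '/') {
        destination = destination + "/" + str1;
    } else {
        destination = destination + str1;
    }
    BZip2Codec localBZip2Codec = new BZip2Codec();
    // The codec's default extension is ".bz2".
    String str2 = localBZip2Codec.getDefaultExtension();
    Path localPath = new Path(destination + str2);

    // Wrap the target file's output stream so bytes are compressed as they are written.
    CompressionOutputStream localCompressionOutputStream = localBZip2Codec.createOutputStream(localFileSystem.create(localPath));

    // Copy in 4096-byte buffers; the final `true` closes both streams when the copy ends.
    IOUtils.copyBytes(localFileSystem.open(new Path(source)), localCompressionOutputStream, 4096, true);
}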
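
To make it clearer where addFile puts its output, here is a minimal sketch of the same destination-path logic on its own, using only the JDK. The class name Bz2Paths and the method target are hypothetical and only for illustration; the ".bz2" suffix is hard-coded here in place of the value that BZip2Codec.getDefaultExtension() returns.

```java
public final class Bz2Paths {
    private Bz2Paths() {}

    // Hypothetical helper: builds the same target path that addFile writes to.
    public static String target(String source, String destinationDir) {
        // File name is the text after the last '/'; with no '/', the whole string is used.
        String fileName = source.substring(source.lastIndexOf('/') + 1);
        // Add a separator only when the destination does not already end with one.
        String dir = destinationDir.endsWith("/") ? destinationDir : destinationDir + "/";
        return dir + fileName + ".bz2";
    }

    public static void main(String[] args) {
        System.out.println(target("/data/in/events.log", "/archive"));
        System.out.println(target("/data/in/events.log", "/archive/"));
    }
}
```

Both calls print /archive/events.log.bz2, so addFile("/data/in/events.log", "/archive", conf) would create that file. Note that addFile only copies: the source file stays in place, so an archive workflow that needs a true move would have to delete the source separately (for example with FileSystem.delete) once the copy has succeeded.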
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow