Question

I need to extract extensions from file names.

I know this can be done for single extensions like .gz or .tar by using filePath.lastIndexOf('.') or using utility methods like FilenameUtils.getExtension(filePath) from Apache commons-io.

But, what if I have a file with an extension like .tar.gz? How can I manage files with extensions that contain . characters?

Solution

If you know what extensions are important, you can simply check for them explicitly. You would have a collection of known extensions, like this:

List<String> EXTS = Arrays.asList("tar.gz", "tgz", "gz", "zip");

You could get the (first) longest matching extension like this:

String getExtension(String fileName) {
  String found = null;
  for (String ext : EXTS) {
    if (fileName.endsWith("." + ext)) {
      if (found == null || found.length() < ext.length()) {
        found = ext;
      }
    }
  }
  return found;
}

So calling getExtension("file.tar.gz") would return "tar.gz".

If you have mixed-case names, try changing the check inside the loop to fileName.toLowerCase().endsWith("." + ext). This assumes every entry in EXTS is lower case.
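Putting the pieces above together, here is a minimal, self-contained sketch of the case-insensitive variant. The class name KnownExtensions is only an illustrative wrapper, and Locale.ROOT is my own addition so that lower-casing does not depend on the default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class KnownExtensions {
    // Known extensions, lower case, without the leading dot.
    static final List<String> EXTS = Arrays.asList("tar.gz", "tgz", "gz", "zip");

    // Returns the longest known extension that fileName ends with,
    // ignoring case, or null if no known extension matches.
    static String getExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        String found = null;
        for (String ext : EXTS) {
            if (lower.endsWith("." + ext)
                    && (found == null || found.length() < ext.length())) {
                found = ext;
            }
        }
        return found;
    }
}
```

Note that the returned value comes from EXTS, so it is always lower case, whatever the case of the input name.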

Other answers

A file can have only one extension.

If you have a file test.tar.gz,

  • .gz is the extension and
  • test.tar is the basename.

In this case .tar is part of the basename, not part of the extension.

If you want a file that is both tar-archived and gzip-compressed, you should name it .tgz. Using .tar.gz is bad practice; if you need to handle such files, work around it, for example by renaming the file to test.tgz.

Found a simple way: use substring to get only the file name, then use indexOf instead of lastIndexOf to find the first '.'; the extension is everything after it.
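A rough sketch of that idea, under a few assumptions of mine: the class name FirstDotExtension is hypothetical, both '/' and '\' are treated as path separators, and a name with no dot (or only a leading dot, such as .bashrc) yields an empty string.

```java
class FirstDotExtension {
    // Returns everything after the first '.' in the file name part of path,
    // or "" if the name has no dot or starts with its only relevant dot.
    static String getExtension(String path) {
        // Drop any directory part so dots in folder names are ignored.
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(sep + 1);
        // indexOf finds the FIRST dot, unlike lastIndexOf.
        int dot = name.indexOf('.');
        return dot <= 0 ? "" : name.substring(dot + 1);
    }
}
```

Be aware that this treats every dot in the name as part of the extension, so a name like my.report.txt gives report.txt rather than txt.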

You can get the filename part of the path, split on . and take the final 0, 1, or 2 elements in the array as the extension.
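The answer does not say how to choose between 0, 1, or 2 elements, so the sketch below assumes one possible rule: take two elements when the second-to-last one is in a small set of "inner" extensions. The class name SplitExtension and the INNER set are hypothetical, and the input is assumed to be a bare file name with the directory part already removed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SplitExtension {
    // Assumed rule: these middle parts signal a two-part extension.
    static final Set<String> INNER = new HashSet<>(Arrays.asList("tar"));

    static String getExtension(String fileName) {
        // split takes a regex, so the dot must be escaped.
        String[] parts = fileName.split("\\.");
        int n = parts.length;
        if (n < 2) {
            return ""; // no dot: take zero elements
        }
        if (n >= 3 && INNER.contains(parts[n - 2])) {
            return parts[n - 2] + "." + parts[n - 1]; // take two elements
        }
        return parts[n - 1]; // take one element
    }
}
```

Requiring at least three parts for the two-element case means a file literally named tar.gz is reported as gz, with tar as its base name.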

Of course if .tar.* (gz, bz2, etc.) is your only edge case it may be pragmatic to just build a solution that filters filenames for .tar. and use that as the point at which to extract the extension (to include the .tar portion).
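As an illustration of that pragmatic approach, here is a small sketch. The class name TarAwareExtension is hypothetical, the input is assumed to be a bare file name, and I assume .tar. only counts when exactly one more segment follows it; otherwise the code falls back to the usual lastIndexOf('.') logic.

```java
class TarAwareExtension {
    static String getExtension(String fileName) {
        int tar = fileName.lastIndexOf(".tar.");
        // ".tar." is 5 characters long; accept it only if no further dot
        // follows, i.e. the name ends in ".tar.<something>".
        if (tar >= 0 && fileName.indexOf('.', tar + 5) < 0) {
            return fileName.substring(tar + 1); // keeps the "tar." prefix
        }
        // Ordinary case: everything after the last dot.
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }
}
```

Unlike a fixed list of known extensions, this picks up .tar.bz2, .tar.xz, and similar suffixes without having to list each one.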

Licensed under: CC-BY-SA with attribution