How to override a function in Codec.Archive.Tar

Question

You cannot monkey patch or otherwise override a function from a Haskell module, and therefore no workaround will let you avoid the safety measures of the library. What you can do, however, is use the functionality in Codec.Archive.Tar to modify the tar entry paths before unpacking so that they won't be absolute any more. Specifically, there is a mapEntriesNoFail function with type

mapEntriesNoFail :: (Entry -> Entry) -> Entries e -> Entries e

Entries is the type of the argument to Tar.unpack, while Entry is the type of an individual entry. Thanks to mapEntriesNoFail, our problem becomes writing an Entry -> Entry function to adjust the paths. For that, first we will need some extra imports:

import qualified Codec.Archive.Tar.Entry as Tar
import System.FilePath.Posix (dropDrive, hasTrailingPathSeparator)
import Data.Either (either)

The function can look like this:

dropDriveFromEntry :: Tar.Entry -> Tar.Entry
dropDriveFromEntry entry =
    either (error "Resulting tar path is somehow too long")
        (\tp -> entry { Tar.entryTarPath = tp })
        drivelessTarPath
    where
    tarPath = Tar.entryTarPath entry
    path = Tar.fromTarPath tarPath
    toTarPath' p = Tar.toTarPath (hasTrailingPathSeparator p) p
    drivelessTarPath = toTarPath' $ dropDrive path

This may seem a little long-winded; however, the hoops we jump through are there to ensure the resulting tar paths are sane. You can read about the gory details of tar handling in the Codec.Archive.Tar.Entry documentation. The key function in this definition is dropDrive, which makes an absolute path relative (on Linux, it strips the leading slash of an absolute path).
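To see what dropDrive does on its own, here is a small sketch using only System.FilePath.Posix. The example paths and the names examplePaths, relativized and trailingFlags are made up for illustration; they are not part of your program.

```haskell
import System.FilePath.Posix (dropDrive, hasTrailingPathSeparator)

-- Made-up sample paths: an absolute file, a relative file, an absolute directory.
examplePaths :: [FilePath]
examplePaths = ["/home/someuser/myfile", "dir/myfile", "/dir/"]

-- dropDrive strips the leading slash of an absolute POSIX path
-- and leaves relative paths untouched.
relativized :: [FilePath]
relativized = map dropDrive examplePaths

-- hasTrailingPathSeparator is what dropDriveFromEntry uses to decide
-- whether to pass True (directory) or False (file) to toTarPath.
trailingFlags :: [Bool]
trailingFlags = map hasTrailingPathSeparator relativized
```

Evaluating relativized in GHCi should give ["home/someuser/myfile","dir/myfile","dir/"], and trailingFlags should be True only for the last element.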

It is worth spending a few words on the use of either. toTarPath produces a value of type Either String TarPath to account for the possibility of failure. Specifically, the conversion to a tar path fails if the provided path is too long. In our case, however, the path cannot be too long, as it is a path that was already in a tar file, at most with its leading slash removed. That being so, it is good enough to eliminate the Either wrapping with either, passing a call to error in place of the function that would handle the (impossible) Left case.
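If the either idiom is unfamiliar, the following sketch shows the same pattern in isolation. Note that toShortName is a hypothetical stand-in for toTarPath, with an arbitrary length limit of 10 chosen purely for illustration; unsafeShortName is likewise an invented name. As a small variation on the code above, the Left handler here includes the failure message in the error rather than discarding it.

```haskell
-- Hypothetical stand-in for Tar.toTarPath: fails on names longer than
-- 10 characters. The limit is arbitrary and only for illustration.
toShortName :: String -> Either String String
toShortName s
    | length s > 10 = Left ("name too long: " ++ s)
    | otherwise     = Right s

-- Collapse the Either with `either`. The first argument handles Left
-- (here by raising an error that carries the message), the second
-- handles Right (here by returning the value unchanged).
unsafeShortName :: String -> String
unsafeShortName =
    either (\msg -> error ("Unexpected failure: " ++ msg)) id . toShortName
```

With these definitions, unsafeShortName "myfile" simply returns "myfile", while a name longer than 10 characters would raise the error when the result is evaluated.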

With dropDriveFromEntry in hand, we just have to map it over the entries before unpacking. The relevant line of your program would become:

    Tar.unpack searchPath . Tar.mapEntriesNoFail dropDriveFromEntry
        . Tar.read . GZip.decompress =<< BS.readFile filePath

Note that if there were relevant errors to be accounted for in dropDriveFromEntry, we would make it return Either String Tar.Entry, and then use mapEntries instead of mapEntriesNoFail.

With these changes, the entry in your tar file will be extracted to /home/someuser/tarball/dir/tarball_testing/myfile. If that is not what you intended, you can modify dropDriveFromEntry so that it performs whatever extra path processing you need.

P.S.: Regarding the alternate title of your question, and considering the sensible little program you have shown us, I do not think you should be worried :)