Question

I would like to run a bash command from Haskell that involves Unicode file paths.

Haskell shows non-ASCII characters in strings as \ escapes, e.g.

"beißen" -> "bei\223en"

Bash seems to accept the following formats:

$'bei\xC3\x9Fen.avi' and 'beißen.avi'

Since runCommand from System.Process has the type

runCommand :: String -> IO System.Process.Internals.ProcessHandle

how do I encode the Haskell String to one of the formats that Bash accepts?

I am using Mac OS X 10.8.4, which has bash 3.2.

EDIT

My problem seems to be with bash escaping.

I am using Text.ShellEscape (http://hackage.haskell.org/packages/archive/shell-escape/0.1.2/doc/html/Text-ShellEscape.html) to escape the characters that need to be escaped for bash,

e.g.

import qualified Data.ByteString.Char8 as B
import qualified Text.ShellEscape as Esc
let cmd = B.unpack $  Esc.bytes    $  Esc.bash . B.pack $ "beißen.txt"

which gives me "$'bei\\xDFen.txt'"

when running runCommand $ "ls " ++ cmd

it gives me ls: bei�en.txt: No such file or directory

Is there a better way to escape strings for bash?


Solution

Data.ByteString.Char8 is almost never the right choice if you want to deal with non-ASCII text. Its pack keeps only the low 8 bits of each Char, so it does not produce UTF-8 and it mangles anything outside ASCII. In your case you should probably use Data.ByteString.UTF8 instead (provided you use a UTF-8 locale, which is the case for most modern desktop Unix-y OSes).

Example of Data.ByteString.Char8 mangling data (the combining accent \769, U+0301, is truncated to \SOH, byte 0x01):

Prelude Data.ByteString.Char8> "été"
"e\769te\769"
Prelude Data.ByteString.Char8> unpack $ pack "été"
"e\SOHte\SOH"
Prelude Data.ByteString.Char8> Prelude.putStrLn "été"
été
Prelude Data.ByteString.Char8> Prelude.putStrLn $ unpack $ pack "été"
ete
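To see the same effect on the file name from the question, the bytes produced by each encoding can be compared directly. This is only a minimal sketch: it assumes the text package (Data.Text.Encoding.encodeUtf8) as a stand-in UTF-8 encoder instead of Data.ByteString.UTF8, and char8Bytes and utf8Bytes are hypothetical helper names.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- Hypothetical helper: the bytes Char8.pack produces (each Char cut to 8 bits).
char8Bytes :: String -> [Word8]
char8Bytes = B.unpack . C.pack

-- Hypothetical helper: the bytes of the UTF-8 encoding of the same String.
-- Assumption: uses the text package rather than utf8-string.
utf8Bytes :: String -> [Word8]
utf8Bytes = B.unpack . TE.encodeUtf8 . T.pack

-- char8Bytes "bei\223en" gives [98,101,105,223,101,110]      (0xDF, not UTF-8)
-- utf8Bytes  "bei\223en" gives [98,101,105,195,159,101,110]  (0xC3 0x9F)
```

The single byte 0xDF is what ended up in the escaped string "$'bei\\xDFen.txt'" in the question, while bash expects the two bytes 0xC3 0x9F.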

Use Data.ByteString.UTF8.toString and not Data.ByteString.Char8.unpack.

These invocations

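import Data.ByteString.UTF8 (fromString, toString)
import System.Process (runCommand)
import Text.ShellEscape (bash, bytes)
let s = toString $ bytes $ bash $ fromString "мама.sh"
runCommand s
runCommand $ "ls -l " ++ s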

work for me from within ghci ("мама.sh" is a shell script with some Cyrillic characters in the name).

Of course, if you escape the entire command, the whitespace is escaped as well and the command will not work. Escape each word of the command individually.
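Escaping word by word could look roughly like the sketch below. It assumes the utf8-string and shell-escape packages are installed and a UTF-8 locale; escapeWord, shellCommand and listFile are hypothetical helper names, not part of either library.

```haskell
import qualified Data.ByteString.UTF8 as UTF8   -- from the utf8-string package
import System.Process (ProcessHandle, runCommand)
import qualified Text.ShellEscape as Esc        -- from the shell-escape package

-- Hypothetical helper: encode one word as UTF-8, then escape it for bash.
escapeWord :: String -> String
escapeWord = UTF8.toString . Esc.bytes . Esc.bash . UTF8.fromString

-- Hypothetical helper: keep the program name as is, escape each argument
-- separately, and join with plain spaces so bash still splits the words.
shellCommand :: String -> [String] -> String
shellCommand prog args = unwords (prog : map escapeWord args)

-- Hypothetical helper: list one file through the shell.
listFile :: FilePath -> IO ProcessHandle
listFile path = runCommand (shellCommand "ls" ["-l", path])
```

From ghci, listFile "мама.sh" would then correspond to the "ls -l" invocation above, with only the arguments passed through the escaper.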

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow