Question

I know I can convert a single file's encoding on OS X using:

iconv -f ISO-8859-1 -t UTF-8 myfilename.xxx > myfilename-utf8.xxx

I have to convert a bunch of files with a specific extension, so I want to convert the encoding from ISO-8859-1 to UTF-8 for all *.ext files in the folder /mydisk/myfolder.

Perhaps someone knows the syntax for doing this?

thanks

ekke

Solution

Adam's comment showed me how to solve it, but this was the only syntax I could get to work:

find /mydisk/myfolder -name \*.xxx -type f | \
    (while IFS= read -r file; do
        iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%.xxx}-utf8.xxx";
    done);

The -i ... -o ... options don't work, but redirecting the output with > does.
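
One thing to watch with the redirect approach: the output files also end in .xxx, so find may pick them up while it is still walking the tree, and a second run would convert them again. Here is a minimal sketch that skips them. The function name convert_tree and its two arguments are my own choice for illustration, not something from Adam's comment.

```shell
# convert_tree DIR EXT (hypothetical helper)
# Writes NAME-utf8.EXT next to every NAME.EXT found under DIR.
convert_tree() {
    dir=$1
    ext=$2
    find "$dir" -name "*.$ext" -type f | while IFS= read -r file; do
        case $file in
            *-utf8."$ext") continue ;;   # output of this or an earlier run
        esac
        iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%."$ext"}-utf8.$ext"
    done
}
```

Call it as convert_tree /mydisk/myfolder xxx. Filenames containing newlines would still break the read loop.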

thx again

ekke

OTHER TIPS

If your shell is bash, something like this:

for files in /mydisk/myfolder/*.xxx
do
  # iconv writes to stdout, so redirect it into the new file
  iconv -f ISO-8859-1 -t UTF-8 "$files" > "${files%.xxx}-utf8.xxx"
done
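
A caveat with the plain glob loop: if nothing matches, the shell leaves the pattern as a literal string and the redirect creates a file literally named *-utf8.xxx. A rough sketch that guards against that and reports failures follows. The function name convert_dir is made up for this example.

```shell
# convert_dir DIR (hypothetical helper, non-recursive)
# Returns non-zero if any conversion failed.
convert_dir() {
    status=0
    for f in "$1"/*.xxx; do
        [ -f "$f" ] || continue                # unmatched glob stays literal; skip it
        case $f in *-utf8.xxx) continue ;; esac # do not re-convert earlier output
        iconv -f ISO-8859-1 -t UTF-8 "$f" > "${f%.xxx}-utf8.xxx" || status=1
    done
    return "$status"
}
```

Usage would be convert_dir /mydisk/myfolder. Unlike the find variants, this does not descend into subfolders.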

Here is an example tested on Mac OS X 10.10. It finds files by name, converts the encoding, then replaces the original file. It works perfectly. Thanks to Roman Truba for the example. Copy the full code below into your shell script.

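    #!/bin/bash
    # Convert every *.java file under the current directory in place.
    find . -name '*.java' -type f | \
    (while IFS= read -r file; do
        # Skip Finder metadata and leftover temporary files.
        if [[ "$file" != *.DS_Store* && "$file" != *-utf8* ]]; then
            # Replace the original only if the conversion succeeded.
            iconv -f ISO-8859-1 -t UTF-8 "$file" > "$file-utf8" &&
                echo mv "$file-utf8" "$file" &&
                mv "$file-utf8" "$file"
        fi
    done)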
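
Since this replaces the originals, it may be worth making sure a failed conversion can never destroy a file. A minimal sketch under my own assumptions: the function name to_utf8_inplace and the optional second argument are hypothetical, and the temporary file comes from mktemp.

```shell
# to_utf8_inplace FILE [FROM_ENCODING] (hypothetical helper)
# Converts FILE to UTF-8 in place; FILE is left untouched if iconv fails.
to_utf8_inplace() {
    target=$1
    from=${2:-ISO-8859-1}
    scratch=$(mktemp "${target}.XXXXXX") || return 1
    if iconv -f "$from" -t UTF-8 "$target" > "$scratch"; then
        # Copy the bytes back instead of mv so the original keeps its permissions.
        cat "$scratch" > "$target"
        rm -f "$scratch"
    else
        rm -f "$scratch"
        return 1
    fi
}
```

The copy-back is not atomic, so this trades crash safety for keeping the file mode. Swap in mv if you care more about atomicity.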

Try this; it's tested and working:

First step (ICONV): find /var/www/ -name "*.php" -type f | (while IFS= read -r file; do iconv -f ISO-8859-2 -t UTF-8 "$file" > "${file%.php}.phpnew"; done)

Second step (REWRITE - MV): find /var/www/ -name "*.phpnew" -type f | (while IFS= read -r file; do mv "$file" "$(echo "$file" | sed 's/\(.*\.\)phpnew/\1php/')"; done)
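
The rename step can probably be done without sed, using the shell's suffix removal. This is only a sketch; restore_names is a name I made up, and it assumes the converted files all end in .phpnew as in the first step.

```shell
# restore_names DIR (hypothetical helper)
# Renames every x.phpnew under DIR to x.php, overwriting the old x.php.
restore_names() {
    find "$1" -name '*.phpnew' -type f | while IFS= read -r new; do
        mv "$new" "${new%new}"   # strip the trailing "new": x.phpnew -> x.php
    done
}
```

For the paths above you would run restore_names /var/www/ after checking a few of the .phpnew files by hand.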

It's just the conclusion of my research :)

Hope it helps. Jakub Rulec

I extended Albert.Qing's script:

  • autodetect the current file encoding
  • added a command parameter to do a dry/exec-run
  • added a parameter for the directory and filename pattern

    #!/bin/bash
    command=${1-"usage"}
    searchPattern=${2-"*.java"}
    searchDirectory=${3-"."}
    if [[ "$command" == "usage" ]]; then
        echo "convert-file-to-utf8.sh [usage|dry|exec] [searchPattern=$searchPattern] [searchDirectory=$searchDirectory]"
        exit
    fi
    find "$searchDirectory" -type f -name "$searchPattern" | \
    (while IFS= read -r file; do
        # Skip Finder metadata and leftover temporary files.
        if [[ "$file" != *.DS_Store* && "$file" != *-utf8* ]]; then
            currentEncoding="$(file --brief --mime-encoding "$file")"
            if [[ "$currentEncoding" != "utf-8" ]]; then
                echo "command:$command / iconv -f $currentEncoding -t UTF-8 $file"
                if [[ "$command" == "exec" ]]; then
                    # Replace the original only if the conversion succeeded.
                    iconv -f "$currentEncoding" -t UTF-8 "$file" > "$file-utf8" &&
                        echo mv "$file-utf8" "$file" &&
                        mv "$file-utf8" "$file"
                fi
            fi
        fi
    done)
    

Tested on MacOS X 10.12.6 / Sierra.
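
One possible refinement of the autodetection: file can also report charsets that iconv has nothing useful to do with. A small sketch of a filter follows. The function names are hypothetical, and the list of skipped values (us-ascii, binary, unknown-8bit) is my assumption about what file prints in those cases, so check it against your version.

```shell
# needs_conversion CHARSET (hypothetical helper)
# Succeeds when a file reported with this charset should be passed to iconv.
needs_conversion() {
    case $1 in
        utf-8|us-ascii) return 1 ;;        # already valid UTF-8
        binary|unknown-8bit|"") return 1 ;; # not something iconv can take as -f
        *) return 0 ;;
    esac
}

# plan_conversion FILE (hypothetical helper)
# Dry run: prints the iconv command that would be used, if any.
plan_conversion() {
    charset=$(file --brief --mime-encoding "$1") || return 1
    if needs_conversion "$charset"; then
        printf 'iconv -f %s -t UTF-8 %s\n' "$charset" "$1"
    fi
}
```

Running plan_conversion on each file gives roughly what the dry mode above prints, minus the files that would only produce iconv errors.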

You could write a script in any scripting language to iterate over every file in /mydisk/myfolder, check the extension with a regex such as \.([^.]*)$, and if it's "ext", run the following (or equivalent) from a system call.

"iconv -f ISO-8859-1 -t UTF-8 " + file.getName() + " > " + file.getName() + "-utf8.xxx"

This would only be a few lines in Python, but I leave it as an exercise for the reader to look up the specifics of directory iteration and regular expressions.

If you want to do it recursively, you can use find(1):

find /mydisk/myfolder -name \*.xxx -type f | \
    (while read file; do
        # If your iconv rejects -i/-o, pass "$file" directly and redirect stdout with > instead.
        iconv -f ISO-8859-1 -t UTF-8 -i "$file" -o "${file%.xxx}-utf8.xxx"
    done)

Note that I've used | while read instead of the -exec option of find (or piping into xargs) because of the manipulations we need to do with the filename, namely, chopping off the .xxx extension (using ${file%.xxx}) and adding -utf8.xxx.
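
To see what that filename manipulation does on its own, here is a tiny sketch. The function name utf8_name is just for illustration.

```shell
# utf8_name PATH (hypothetical helper)
# ${1%.xxx} removes one trailing ".xxx" if present, then "-utf8.xxx" is appended.
utf8_name() {
    printf '%s\n' "${1%.xxx}-utf8.xxx"
}

utf8_name "/mydisk/myfolder/report.xxx"   # prints /mydisk/myfolder/report-utf8.xxx
utf8_name "notes.txt"                     # no .xxx suffix to strip: prints notes.txt-utf8.xxx
```

The % operator only strips a matching suffix, so names without the extension pass through unchanged before the new ending is added.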

Licensed under: CC-BY-SA with attribution