osx change file encoding (iconv) recursive
Question
I know I can convert a single file encoding under OSX using:
iconv -f ISO-8859-1 -t UTF-8 myfilename.xxx > myfilename-utf8.xxx
I have to convert a bunch of files with a specific extension: I want to convert the encoding from ISO-8859-1 to UTF-8 for all *.ext files in the folder /mydisk/myfolder.
Perhaps someone knows the syntax for doing this.
thanks
ekke
Solution
Adam's comment showed me how to solve it, but this was the only syntax I could get to work:
find /mydisk/myfolder -name \*.xxx -type f | \
(while IFS= read -r file; do
iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%.xxx}-utf8.xxx";
done);
Using -i ... -o ... doesn't work, but redirecting with > does.
thx again
ekke
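For file names containing spaces or newlines, a NUL-delimited variant of the same loop may be more robust. This is only a sketch, assuming bash and a find that supports -print0 (both the macOS and GNU versions do); convert_tree is a made-up helper name, and it skips files already ending in -utf8 so a second run does not convert its own output.

```shell
#!/bin/bash
# convert_tree is a hypothetical helper name, not something from the thread.
# Usage: convert_tree DIRECTORY EXTENSION   (e.g. convert_tree /mydisk/myfolder xxx)
convert_tree() {
  local dir=$1 ext=$2 file
  # -print0 and read -d '' pass names NUL-delimited, so spaces and
  # newlines in file names survive intact.
  find "$dir" -type f -name "*.$ext" ! -name "*-utf8.$ext" -print0 |
  while IFS= read -r -d '' file; do
    iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%."$ext"}-utf8.$ext"
  done
}
```

Calling convert_tree /mydisk/myfolder xxx would then write a name-utf8.xxx file next to each name.xxx.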
OTHER TIPS
If your shell is bash, something like this (not recursive):
for files in /mydisk/myfolder/*.xxx
do
iconv -f ISO-8859-1 -t UTF-8 "$files" > "${files%.xxx}-utf8.xxx"
done
Here is an example, tested on Mac OS X 10.10. It finds files by name, converts the encoding, then replaces the original file. It works perfectly. Thanks to Roman Truba for his example; copy the full code below into your shell script.
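#!/bin/bash
# Quote the pattern so the shell does not expand it before find sees it.
find ./ -name '*.java' -type f | \
(while IFS= read -r file;
do if [[ "$file" != *.DS_Store* ]]; then
  if [[ "$file" != *-utf8* ]]; then
    # Skip the rename if iconv fails, so the original is never lost.
    iconv -f ISO-8859-1 -t UTF-8 "$file" > "$file-utf8" || continue;
    echo mv "$file-utf8" "$file";
    mv "$file-utf8" "$file";
  fi
fi
done);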
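When replacing files in place, it helps to overwrite the original only after iconv has succeeded. A minimal sketch of that idea follows; to_utf8_in_place is a hypothetical helper name, and ISO-8859-1 is assumed as the source encoding, as in the script above.

```shell
#!/bin/bash
# to_utf8_in_place is a hypothetical helper; the source encoding
# (ISO-8859-1) is an assumption carried over from the thread.
to_utf8_in_place() {
  local file=$1 tmp
  # Create the temporary file next to the original so mv stays on one volume.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if iconv -f ISO-8859-1 -t UTF-8 "$file" > "$tmp"; then
    mv "$tmp" "$file"      # conversion succeeded: replace the original
  else
    rm -f "$tmp"           # conversion failed: leave the original untouched
    return 1
  fi
}
```

Two caveats: the replaced file gets the permissions of the temporary file created by mktemp, and running the helper twice on the same file converts the already-converted text again.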
Try this; it's tested and working:
First step (ICONV):
find /var/www/ -name "*.php" -type f | (while read file; do iconv -f ISO-8859-2 -t UTF-8 "$file" > "${file%.php}.phpnew"; done)
Second step (REWRITE - MV):
find /var/www/ -name "*.phpnew" -type f | (while read file; do mv "$file" "$(echo "$file" | sed 's/\(.*\.\)phpnew/\1php/')"; done)
It's just the conclusion of my research :)
Hope it helps Jakub Rulec
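The rename step can probably be done without sed, using the shell's own suffix removal. A sketch, assuming bash; rename_phpnew is a made-up helper name.

```shell
#!/bin/bash
# rename_phpnew is a hypothetical helper name.
# Usage: rename_phpnew DIRECTORY   (e.g. rename_phpnew /var/www)
rename_phpnew() {
  local dir=$1 file
  find "$dir" -type f -name '*.phpnew' -print0 |
  while IFS= read -r -d '' file; do
    # ${file%.phpnew} drops the trailing .phpnew; then .php is appended.
    mv "$file" "${file%.phpnew}.php"
  done
}
```

Note that this overwrites the existing .php files, just like the second step above.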
I extended Albert.Qing's script:
- autodetect the current file encoding
- added a command parameter to do a dry/exec run
- added a parameter for the directory and filename pattern
#!/bin/bash
command=${1-"usage"}
searchPattern=${2-"*.java"}
searchDirectory=${3-"."}
if [[ "$command" == "usage" ]]; then
  echo "convert-file-to-utf8.sh [usage|dry|exec] [searchPattern=$searchPattern] [searchDirectory=$searchDirectory]"
  exit
fi
find "$searchDirectory" -type f -name "$searchPattern" | \
(while IFS= read -r file; do
  if [[ "$file" != *.DS_Store* ]]; then
    if [[ "$file" != *-utf8* ]]; then
      currentEncoding="$(file --brief --mime-encoding "$file")"
      if [[ "$currentEncoding" != "utf-8" ]]; then
        echo "command:$command / iconv -f $currentEncoding -t UTF-8 $file"
        if [[ "$command" == "exec" ]]; then
          # Skip the rename if iconv fails, so the original is never lost.
          iconv -f "$currentEncoding" -t UTF-8 "$file" > "$file-utf8" || continue;
          echo mv "$file-utf8" "$file";
          mv "$file-utf8" "$file";
        fi
      fi
    fi
  fi
done);
Tested on MacOS X 10.12.6 / Sierra.
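One thing to watch with autodetection: file(1) can report values such as binary or unknown-8bit, which are not encodings iconv can convert from, and us-ascii files are already valid UTF-8. A small filter along these lines might help; needs_conversion is a hypothetical helper, and the list of skipped values is my assumption about what is worth excluding, not part of the script above.

```shell
#!/bin/bash
# needs_conversion is a hypothetical helper. It succeeds (returns 0) only when
# file(1) reports a text encoding other than UTF-8 or plain ASCII.
needs_conversion() {
  local enc
  [ -f "$1" ] || return 1
  enc=$(file --brief --mime-encoding "$1") || return 1
  case $enc in
    utf-8|us-ascii|binary|unknown-8bit) return 1 ;;  # nothing useful to convert
    *) return 0 ;;
  esac
}
```

In the loop it could be used as: if needs_conversion "$file"; then ... fi.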
You could write a script in any scripting language to iterate over every file in /mydisk/myfolder, check the extension with the regex \.([^.]*)$, and if it's "ext", run the following (or equivalent) from a system call.
"iconv -f ISO-8859-1 -t UTF-8 " + file.getName() + " > " + file.getName() + "-utf8.xxx"
This would only be a few lines in Python, but I leave it as an exercise to the reader to go through the specifics of looking up directory iteration and regular expressions.
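For what it's worth, the same idea can be sketched in bash rather than Python, to match the rest of the thread. The helper names file_extension and convert_folder are invented for illustration, and the loop only looks at one directory level.

```shell
#!/bin/bash
# file_extension is a hypothetical helper: it prints the text after the last
# dot in a file name, or fails when the name contains no dot.
file_extension() {
  local name=${1##*/} re='\.([^.]*)$'
  [[ $name =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[1]}"
}

# convert_folder is also hypothetical: non-recursive conversion of DIR/*.EXT.
convert_folder() {
  local dir=$1 ext=$2 file
  for file in "$dir"/*; do
    [[ -f $file ]] || continue
    [[ $file == *-utf8."$ext" ]] && continue        # skip earlier output
    [[ $(file_extension "$file") == "$ext" ]] || continue
    iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%."$ext"}-utf8.$ext"
  done
}
```

Something like convert_folder /mydisk/myfolder ext would then cover the non-recursive case.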
If you want to do it recursively, you can use find(1):
find /mydisk/myfolder -name \*.xxx -type f | \
(while read file; do
iconv -f ISO-8859-1 -t UTF-8 "$file" > "${file%.xxx}-utf8.xxx"
done)
Note that I've used | while read instead of the -exec option of find (or piping into xargs) because of the manipulations we need to do with the filename, namely, chopping off the .xxx extension (using ${file%.xxx}) and adding -utf8.xxx.
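To make the filename manipulation concrete, here is a small sketch of what the expansion does. utf8_name is a hypothetical helper, and the sample path is only an illustration.

```shell
#!/bin/bash
# utf8_name is a hypothetical helper that applies the same expansion:
# ${file%.ext} removes the shortest trailing match of ".ext" from $file.
utf8_name() {
  local file=$1 ext=$2
  printf '%s\n' "${file%."$ext"}-utf8.$ext"
}

utf8_name "/mydisk/myfolder/sub dir/notes.xxx" xxx
# prints: /mydisk/myfolder/sub dir/notes-utf8.xxx
```

Only a trailing .xxx is removed, so dots elsewhere in the path are left alone.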