Question

I would like to generate a list of files within a directory. Some of the filenames contain Chinese characters.

e.g. [试验].Test.txt

I am using the following code:

require 'find'
dirs = ["TestDir"]
for dir in dirs
    Find.find(dir) do |path|
        if FileTest.directory?(path)
        else
            p path
        end
    end
end

Running the script produces a list of files but the Chinese characters are escaped (replaced with backslashes followed by numbers). Using the example filename above would produce:

"TestDir/[\312\324\321\351]Test.txt" instead of "TestDir/[试验].Test.txt".

How can the script be altered to output the Chinese characters?


Solution

Ruby needs to know which character encoding your strings use. On Ruby 1.8, set it with the $KCODE global, as below:

$KCODE = 'utf-8'

UTF-8 can represent Chinese characters. From Ruby 1.9 onward $KCODE no longer has any effect; each string carries its own encoding instead.
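
One detail in the question is worth checking: the escaped bytes \312\324\321\351 are how 试验 is encoded in GBK, not UTF-8, so the names coming back from the file system may need transcoding. Separately, p prints the result of inspect, which is where the backslash escapes come from, while puts writes the string itself. A minimal sketch, assuming Ruby 1.9 or later and that the names really are GBK (the variable names are only for illustration):

```ruby
# Assumption: the file system hands back GBK bytes, as the escaped output
# in the question suggests. Requires Ruby 1.9+ for the encoding methods.

# The four bytes from the question (\312\324\321\351 in octal).
gbk_name = [0xCA, 0xD4, 0xD1, 0xE9].pack('C*').force_encoding('GBK')

# Transcode from GBK to UTF-8.
utf8_name = gbk_name.encode('UTF-8')

# puts writes the string itself; p would write utf8_name.inspect instead.
puts utf8_name
```

Whether the characters then show up correctly still depends on the terminal expecting UTF-8.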

OTHER TIPS

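The following code is simpler and doesn't need require 'find'. It lists the files (but not directories) in the working directory. Note that Dir.entries returns bare names, so the directory test only works as written when the listed directory is also the working directory. String#encode requires Ruby 1.9 or later.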

Dir.entries(Dir.pwd).each do |x|
  p x.encode('UTF-8') unless FileTest.directory?(x)
end
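
For a directory other than the working one, each entry has to be joined with the directory path before it is tested. A rough sketch (files_in is a made-up helper name, not part of Ruby):

```ruby
# Hypothetical helper: names of the non-directory entries directly inside dir.
def files_in(dir)
  Dir.entries(dir).reject do |name|
    # Dir.entries returns bare names (including "." and ".."), so build the
    # full path before asking whether the entry is a directory.
    File.directory?(File.join(dir, name))
  end.sort
end

# Usage: list the files in the working directory, one per line.
files_in(Dir.pwd).each { |name| puts name }
```

Using puts here rather than p keeps non-ASCII names from being shown in escaped form.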

To list the files exactly one level down (inside the immediate subdirectories), use:

Dir.glob('*/*').each do |x|
  p x.encode('UTF-8') unless FileTest.directory?(x)
end

To go all the way down, use Dir.glob('**/*'). It recurses through everything beneath the working directory, not the whole file system, so run it from the directory you care about or prefix the pattern with that directory.
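
A recursive listing limited to one directory tree might look like the sketch below. files_under is a hypothetical helper name, and the sketch assumes the root path contains no glob metacharacters such as [ or *; by default the pattern also skips hidden (dot) files.

```ruby
# Hypothetical helper: every regular file beneath root, at any depth.
def files_under(root)
  # "**" matches directories recursively, starting at root only.
  pattern = File.join(root, '**', '*')
  Dir.glob(pattern).select { |path| File.file?(path) }.sort
end

# Usage: print each file under TestDir (prints nothing if it does not exist).
files_under('TestDir').each { |path| puts path }
```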

Licensed under: CC-BY-SA with attribution