Question

Let's say I have three files in a folder: file9.txt, file10.txt and file11.txt, and I want to read them in this particular order. Can anyone help me with this?

Right now I am using the code

import glob, os
for infile in glob.glob(os.path.join( '*.txt')):
    print "Current File Being Processed is: " + infile

and it reads first file10.txt then file11.txt and then file9.txt.

How can I get the right order?


Solution

Files on the filesystem are not sorted. You can sort the resulting filenames yourself using the sorted() function:

for infile in sorted(glob.glob('*.txt')):
    print "Current File Being Processed is: " + infile

Note that the os.path.join call in your code is a no-op; with only one argument it doesn't do anything but return that argument unaltered.
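A quick way to convince yourself of the no-op is the sketch below. It is written for Python 3, and the folder name "data" is a hypothetical example that does not come from the question.

```python
import os

# With a single argument, os.path.join has nothing to join
# and returns the argument unchanged.
pattern = os.path.join('*.txt')

# The call only becomes useful when there is a directory to prepend.
# "data" is a made-up folder name used purely for illustration.
joined = os.path.join('data', '*.txt')

print(pattern)
print(joined)
```

The second result uses the platform's path separator, so it differs between POSIX systems and Windows.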

Note also that sorted() compares the filenames as strings, character by character, so file10.txt sorts before file9.txt because "1" comes before "9". You can use a custom key function to sort by the embedded numbers instead:

import re
numbers = re.compile(r'(\d+)')
def numericalSort(value):
    parts = numbers.split(value)
    parts[1::2] = map(int, parts[1::2])
    return parts

for infile in sorted(glob.glob('*.txt'), key=numericalSort):
    print "Current File Being Processed is: " + infile

The numericalSort function splits each filename into runs of digits and non-digits, turns every run of digits into an actual integer, and returns the resulting list as the sort key:

>>> files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
>>> sorted(files)
['32foo10.txt', '32foo9.txt', 'file10.txt', 'file11.txt', 'file9.txt']
>>> sorted(files, key=numericalSort)
['32foo9.txt', '32foo10.txt', 'file9.txt', 'file10.txt', 'file11.txt']
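To see why this works, it can help to look at the keys themselves. The following is a minimal Python 3 sketch of the same idea, not a replacement for the code above; the names natural_key and _digits are made up for this illustration.

```python
import re

_digits = re.compile(r'(\d+)')

def natural_key(name):
    # Splitting on a capturing group keeps the digit runs in the result,
    # so the list alternates text, digits, text, ... and always starts
    # with text (an empty string if the name begins with a digit).
    parts = _digits.split(name)
    # Every odd index holds a digit run; convert those to integers.
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
key_of_file10 = natural_key('file10.txt')
key_of_32foo9 = natural_key('32foo9.txt')
ordered = sorted(files, key=natural_key)
print(ordered)
```

Because text always sits at even positions and integers at odd positions, two keys are only ever compared str against str or int against int, which is why the same approach also works in Python 3, where comparing a str with an int raises a TypeError.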

Other tips

You can wrap your glob.glob( ... ) expression inside a sorted( ... ) call to sort the resulting list of files. Example:

for infile in sorted(glob.glob('*.txt')):

You can give sorted a comparison function or, better, use the key= ... argument to give it a custom key that is used for sorting.

Example:

There are the following files:

x/blub01.txt
x/blub02.txt
x/blub10.txt
x/blub03.txt
y/blub05.txt

The following code will produce the following output:

for filename in sorted(glob.glob('[xy]/*.txt')):
        print filename
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# x/blub10.txt
# y/blub05.txt

Now with key function:

def key_func(x):
        return os.path.split(x)[-1]
for filename in sorted(glob.glob('[xy]/*.txt'), key=key_func):
        print filename
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# y/blub05.txt
# x/blub10.txt

EDIT: Possibly this key function can sort your files:

pat = re.compile(r"(\d+)\D*$")  # raw string so \d and \D reach the regex engine unchanged
...
def key_func(x):
        mat=pat.search(os.path.split(x)[-1]) # match last group of digits
        if mat is None:
            return x
        return "{:>10}".format(mat.group(1)) # right align to 10 digits.

It can certainly be improved, but I think you get the point. Paths without numbers are left alone; paths with numbers are converted to a string that is 10 characters wide and contains the number, right-aligned.
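One possible improvement, sketched here for Python 3 and not taken from the original answer, is to return a tuple with an integer instead of a padded string. That avoids the 10-character limit and keeps every key the same shape. Placing unnumbered files last is an assumption made for this example, as are the name last_number_key and the sample path x/readme.txt.

```python
import os
import re

_last_number = re.compile(r'(\d+)\D*$')

def last_number_key(path):
    name = os.path.basename(path)
    match = _last_number.search(name)  # last run of digits in the file name
    if match is None:
        # Assumption: files without a number sort after numbered ones, by name.
        return (1, 0, name)
    return (0, int(match.group(1)), name)

paths = ['x/blub10.txt', 'x/blub02.txt', 'y/blub05.txt',
         'x/readme.txt', 'x/blub01.txt']
ordered = sorted(paths, key=last_number_key)
print(ordered)
```

Every key is a (flag, number, name) tuple, so numbered and unnumbered paths can be compared with each other without mixing types.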

glob.glob(os.path.join( '*.txt'))

returns a list of strings, so you can easily sort the list using Python's sorted() function.

sorted(glob.glob(os.path.join( '*.txt')))

You need to change the sort from 'ASCIIBetical' to numeric by isolating the number in the filename. You can do that like so:

import re

def keyFunc(afilename):
    nondigits = re.compile(r"\D")  # matches any single non-digit character
    return int(nondigits.sub("", afilename))

filenames = ["file10.txt", "file11.txt", "file9.txt"]

for x in sorted(filenames, key=keyFunc):
   print x

Here you can replace filenames with the result of glob.glob("*.txt").

Additionally, the keyFunc function assumes that the filename has a number in it, and that digits appear only in the filename (not elsewhere in the path). You can make that function as complex as you need to isolate the number you want to sort on.
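The two assumptions can be made concrete with a short Python 3 sketch. The paths run2/file10.txt and notes.txt are hypothetical, and basename_key is only one possible way to relax the assumptions; returning -1 for unnumbered files is an arbitrary choice for this example.

```python
import os
import re

_nondigits = re.compile(r'\D')

def key_func(path):
    # Same idea as keyFunc above: delete every non-digit, convert the rest.
    return int(_nondigits.sub('', path))

# Digits anywhere in the path are concatenated: "2" and "10" become 210.
digits_from_whole_path = key_func('run2/file10.txt')

# A name without digits leaves an empty string, and int('') raises ValueError.
no_digit_fails = False
try:
    key_func('notes.txt')
except ValueError:
    no_digit_fails = True

def basename_key(path):
    # Look only at the file name, and fall back to -1 when it has no digits
    # (assumption: unnumbered files should sort first).
    digits = _nondigits.sub('', os.path.basename(path))
    return int(digits) if digits else -1
```

basename_key still concatenates several digit runs within one file name, so it fits names like fileN.txt but would need more work for names with more than one number.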

for fname in ['file9.txt','file10.txt','file11.txt']:
   with open(fname) as f: # default open mode is for reading
      for line in f:
         # do something with line
         pass
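
Instead of hard-coding the list, the loop above can be combined with glob and a numeric sort key. The following Python 3 sketch assumes the fileN.txt naming from the question; the helper names number_in_name and read_in_order are invented here, and the demo writes its own files into a temporary directory so that it can run anywhere.

```python
import glob
import os
import re
import tempfile

def number_in_name(path):
    # First run of digits in the file name; -1 if there is none (assumption).
    match = re.search(r'(\d+)', os.path.basename(path))
    return int(match.group(1)) if match else -1

def read_in_order(folder):
    # glob.escape protects against glob metacharacters in the folder name.
    pattern = os.path.join(glob.escape(folder), '*.txt')
    contents = []
    for path in sorted(glob.glob(pattern), key=number_in_name):
        with open(path) as f:  # default open mode is for reading
            contents.append((os.path.basename(path), f.read()))
    return contents

# Demo: create file10.txt, file11.txt and file9.txt, then read them back.
with tempfile.TemporaryDirectory() as folder:
    for n in (10, 11, 9):
        with open(os.path.join(folder, 'file%d.txt' % n), 'w') as f:
            f.write('contents of %d' % n)
    result = read_in_order(folder)

print([name for name, _ in result])
```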
License: CC-BY-SA with attribution
Not affiliated with StackOverflow