readdir() doesn't return entries in any particular order. As others mentioned, the order will depend on the particular file system in question.
For example, the Berkeley UFS file system uses an unsorted linked-list. See the description of the direct structure on page 744 of http://ptgmedia.pearsoncmg.com/images/0131482092/samplechapter/mcdougall_ch15.pdf. The binary content of a directory consists of a stream of variable-length records, each of which contains the inode number, record length, string length (of the filename) and the string data itself. readdir() works by walking the linked list (using the record length to know where each record begins relative to the previous record) and returning whatever it finds.
The list of records is not normally reordered or compacted, so filenames appear on the list (more or less) in the order the files were created. Not exactly, though: holes left behind by deleted files are reused for new filenames that are short enough to fit.
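To make the record walk concrete, here is a rough Python sketch. The layout is only loosely modeled on the direct structure described above: the field widths (32-bit inode, 16-bit record length, 16-bit name length), the little-endian byte order, the 4-byte alignment, the use of inode 0 for an unused slot, and the helper names pack_entry and walk_directory are all assumptions made for illustration, not the real on-disk format.

```python
import struct

# Simplified record header, loosely modeled on the UFS "direct" structure:
# 32-bit inode number, 16-bit record length, 16-bit name length.
# Byte order and 4-byte alignment are assumptions made for this sketch.
HEADER = struct.Struct("<IHH")


def pack_entry(ino, name, reclen=None):
    """Build one record; reclen may exceed the minimum to leave slack space."""
    raw = name.encode()
    needed = (HEADER.size + len(raw) + 3) & ~3  # round up to a multiple of 4
    reclen = needed if reclen is None else reclen
    if reclen < needed:
        raise ValueError("record too small for name")
    body = HEADER.pack(ino, reclen, len(raw)) + raw
    return body.ljust(reclen, b"\0")


def walk_directory(block):
    """Yield (inode, name) by hopping from record to record via reclen."""
    offset = 0
    while offset < len(block):
        ino, reclen, namlen = HEADER.unpack_from(block, offset)
        if reclen == 0:
            break  # malformed data; stop rather than loop forever
        if ino != 0:  # inode 0 marks an unused slot in this sketch
            start = offset + HEADER.size
            yield ino, block[start:start + namlen].decode()
        offset += reclen


# Entries come back in storage order, not alphabetical order.
block = pack_entry(12, "zebra") + pack_entry(7, "apple") + pack_entry(31, "mango")
print(list(walk_directory(block)))
# [(12, 'zebra'), (7, 'apple'), (31, 'mango')]
```

Nothing in the walk looks at the names, which is why the output order is simply whatever order the records happen to sit in.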
Now, not all file systems represent directories the way UFS does. A file system that keeps directory data in a binary tree may choose to implement readdir() as an in-order traversal of that tree, which would present files sorted by whatever attribute it uses as the key for the tree. Or it might use a pre-order traversal, which would not return the records in sorted order.
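The difference between the two traversals is easy to see on a toy tree. This is only a sketch: it uses a plain unbalanced binary search tree keyed on the filename, and the Node class and function names are made up for the example. Real file systems that use trees tend to use more elaborate structures and keys.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def in_order(node):
    """Left subtree, node, right subtree: keys come out sorted."""
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)


def pre_order(node):
    """Node first, then its subtrees: order follows the tree's shape."""
    if node is not None:
        yield node.key
        yield from pre_order(node.left)
        yield from pre_order(node.right)


root = None
for name in ["mango", "apple", "zebra", "banana"]:
    root = insert(root, name)

print(list(in_order(root)))   # ['apple', 'banana', 'mango', 'zebra']
print(list(pre_order(root)))  # ['mango', 'apple', 'banana', 'zebra']
```

Same tree, same data, two different listing orders, and the caller has no way of knowing which one it will get.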
Because applications cannot know the nature of the file system's implementation (and each mounted volume can potentially use a different file system), applications should never assume anything about the order of entries that readdir() returns. If they require the entries to be sorted, they must read the entire directory into memory and do their own sorting.
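In practice that looks something like the following Python sketch. os.scandir() is a real standard-library call that hands back entries in whatever order the system provides; the sorted_entries name and the choice of a plain string sort are just assumptions for the example.

```python
import os


def sorted_entries(path):
    """Read every name in the directory, then sort them in the application."""
    with os.scandir(path) as it:
        names = [entry.name for entry in it]  # arrival order is unspecified
    names.sort()  # plain code-point order; pass key=... for other orderings
    return names


print(sorted_entries("."))
```

The sort can only start once the last entry has been read, so memory use and latency both grow with the size of the directory.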
This is why, for example, the ls command can take a long time to display output when run against a large directory. It needs to read and sort the entire list of names (and determine the longest name, in order to compute the column width) before it can display any output. This is also why ls -1U (disable sorting and display in one column, with -U as GNU ls defines it) will produce output immediately on such directories.
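The two behaviours can be contrasted with a small Python sketch. This is not how ls is actually implemented; list_unsorted and list_sorted_columns are hypothetical helpers that only mirror the streaming versus read-everything-first distinction.

```python
import os


def list_unsorted(path):
    """Yield names as the directory is read, so output can start at once."""
    with os.scandir(path) as it:
        for entry in it:
            yield entry.name


def list_sorted_columns(path):
    """Return (names, width): nothing can be shown until every name is read."""
    names = sorted(list_unsorted(path))
    width = max((len(n) for n in names), default=0)  # longest name sets the column width
    return names, width


# Streaming: each name can be printed the moment it arrives.
for name in list_unsorted("."):
    print(name)

# Buffered: the whole directory is read and sorted before the first line appears.
names, width = list_sorted_columns(".")
for name in names:
    print(name.ljust(width))
```

On a directory with a very large number of entries, the first loop starts printing right away while the second stays silent until the read and the sort have finished.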