Question

I have the following snippet to map the contents of the current directory recursively:

$files = new RecursiveIteratorIterator
(
    new RecursiveDirectoryIterator('./',
        FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS),
    RecursiveIteratorIterator::SELF_FIRST
);

$files = array_values(array_map('strval', iterator_to_array($files)));

It returns something like this:

Array
(
    [0] => ./1.png
    [1] => ./a.php
    [2] => ./adminer
    [3] => ./adminer/adminer.css
    [4] => ./adminer/adminer.php
)

Is there any way I can get RecursiveDirectoryIterator / FilesystemIterator to emulate the GLOB_MARK behavior that exists in the glob() function? From the manual:

GLOB_MARK - Adds a slash to each directory returned.

I know I could emulate that by simply doing:

foreach ($files as $key => $value)
{
    $files[$key] .= (is_dir($value) ? '/' : '');
}

But that would require (unnecessarily?) hitting the disk lots of times. I'm looking for a way to quickly determine if the path is a directory or a regular file, and the ending slash seems like the ideal solution.

I plan to traverse tens (if not hundreds) of thousands of files with this, so performance is critical.

Bonus question: Is there any way to get only the directories (recursively)?


Solution

You can create your own GlobMarkIterator. Advantages:

  • Adds a trailing slash to each directory, just like GLOB_MARK
  • No need to use array_map with strval to convert entries to strings
  • No extra foreach loop with is_dir
  • Still as fast as the original
  • Yes, I know I cheated

Example

$ri = new RecursiveIteratorIterator(
    new GlobMarkIterator('./', FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS),
    RecursiveIteratorIterator::SELF_FIRST
);
$files = array_values(iterator_to_array($ri));

echo "<pre>";
print_r($files);

Output

Array
(
    [0] => ./test/backups/ <----------- Note ending slash 
    [1] => ./test/CSV/
    [2] => ./test/CSV/abc.csv
    [3] => ./test/final/
    [4] => ./test/thumb/
    [5] => ./test/thumb/a.png
    [6] => ./test/thumb/s.svg
    [7] => ./test/thumb/sample.svg
)
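
Once every directory carries a trailing slash, telling directories and files apart is a pure string check with no further disk access. A minimal sketch, assuming a listing shaped like the output above; partitionByTrailingSlash is a hypothetical helper, not part of SPL:

```php
// Hypothetical helper: split a GLOB_MARK-style listing into directories and
// files by looking only at the last character of each path.
function partitionByTrailingSlash(array $paths): array
{
    $dirs = [];
    $files = [];
    foreach ($paths as $path) {
        if (substr($path, -1) === '/') {
            $dirs[] = $path;   // trailing slash: directory
        } else {
            $files[] = $path;  // anything else: regular entry
        }
    }
    return [$dirs, $files];
}

// Sample paths taken from the output above.
list($dirs, $files) = partitionByTrailingSlash([
    './test/CSV/',
    './test/CSV/abc.csv',
    './test/thumb/',
    './test/thumb/a.png',
]);
```

Here $dirs holds the two entries ending in a slash and $files holds the other two.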



Bonus question: Is there any way to get only the directories (recursively)?

This should have been a separate question, but all the same: I hope you don't get satisfied and put a bounty on this.

Solution :

$ri = new RecursiveIteratorIterator(new GlobMarkDirectory('./test'), RecursiveIteratorIterator::SELF_FIRST);
$dir = array_values(iterator_to_array($ri));

echo "<pre>";
print_r($dir);

Output

Array
(
    [0] => ./test/backups/
    [1] => ./test/CSV/
    [2] => ./test/final/
    [3] => ./test/thumb/
)

Classes Used

GlobMarkIterator

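class GlobMarkIterator extends RecursiveDirectoryIterator {
    // Return the pathname as a string, with a trailing slash for directories.
    // The explicit return type avoids a deprecation notice on PHP 8.1+.
    public function current(): string {
        return $this->isDir() ? $this->getPathname() . "/" : $this->getPathname();
    }
}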

GlobMarkDirectory Class

class GlobMarkDirectory extends RecursiveFilterIterator {
    public function __construct($path) {
        parent::__construct(new GlobMarkIterator($path, FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS));
    }
    // Keep only directories; the return types avoid deprecation notices on PHP 8.1+.
    public function accept(): bool {
        return $this->getInnerIterator()->isDir();
    }
    // Descend by building a new filter for the current sub-directory.
    public function getChildren(): ?RecursiveFilterIterator {
        return new GlobMarkDirectory($this->getInnerIterator()->getPathname());
    }
}
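
As an alternative to the custom GlobMarkDirectory filter, SPL's built-in ParentIterator keeps only the elements that have children, which for a RecursiveDirectoryIterator means directories. This is a different technique from the class above, shown only as a sketch; listDirectories is a hypothetical helper name and the trailing slash is appended by hand:

```php
// Hypothetical helper: list every directory below $root, each with a trailing slash.
function listDirectories(string $root): array
{
    $inner = new RecursiveDirectoryIterator(
        $root,
        FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS
    );

    // ParentIterator accepts only elements for which hasChildren() is true.
    $iterator = new RecursiveIteratorIterator(
        new ParentIterator($inner),
        RecursiveIteratorIterator::SELF_FIRST
    );

    $dirs = [];
    foreach ($iterator as $info) {
        // $info is an SplFileInfo for the current directory.
        $dirs[] = $info->getPathname() . '/';
    }
    return $dirs;
}
```

Calling listDirectories('./test') on the tree shown earlier should list the same directories, although the order depends on the filesystem.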



EDIT: If you don't care about empty directories and want to avoid isDir() because of its overhead, here is another solution.

Solution

$ri = new RecursiveIteratorIterator(new GlobMarkFastDirectory(__DIR__), RecursiveIteratorIterator::SELF_FIRST);
$dir = array_values(array_unique(iterator_to_array($ri)));

GlobMarkFastDirectory

class GlobMarkFastDirectory extends RecursiveDirectoryIterator {
    // Return the parent directory of the current entry, with a trailing slash.
    public function current(): string {
        return dirname($this->getPathname()) . "/";
    }
}
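
The reason GlobMarkFastDirectory can miss empty directories is that it never reports an entry itself, only the parent of each entry, so a directory appears only when something inside it is visited. A rough illustration on plain strings, assuming a listing without "." and ".." entries; parentDirectories is a hypothetical helper that mirrors current() plus the array_unique step:

```php
// Hypothetical helper mirroring GlobMarkFastDirectory::current() on plain strings.
function parentDirectories(array $paths): array
{
    $parents = array_map(function ($path) {
        return dirname($path) . '/'; // string operation only, no disk access
    }, $paths);

    // Many entries share a parent, so remove duplicates and reindex.
    return array_values(array_unique($parents));
}

// './test/final' is an empty directory in this made-up listing.
$entries = ['./test/CSV', './test/CSV/abc.csv', './test/final'];
print_r(parentDirectories($entries));
```

The result contains ./test/ and ./test/CSV/ but not ./test/final/, because nothing inside ./test/final was ever visited.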

OTHER TIPS

This is my best effort so far:

$files = new RecursiveIteratorIterator
(
    new RecursiveDirectoryIterator
    (
        str_replace('\\', '/', realpath('./')),
        FilesystemIterator::SKIP_DOTS | FilesystemIterator::UNIX_PATHS
    ),
    RecursiveIteratorIterator::LEAVES_ONLY
);

$files = array_keys(iterator_to_array($files));
$folders = array();

/*
foreach ($files as $key => $value) // not needed anymore
{
    $files[$key] .= (is_dir($value) === true) ? '/' : '';
}
*/

$files = array_flip($files);

foreach ($files as $key => $value)
{
    $folder = dirname($key) . '/'; // doesn't issue a stat call

    if (array_key_exists($folder, $folders) !== true)
    {
        $folders[$folder] = 0;
    }

    $folders[$folder] += $files[$key] = sprintf('%u', filesize($key));
}

I could merge the $folders array with the $files array to answer the question precisely, but distinguishing one from the other was exactly my main objective, so there's no point in doing that.
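
The aggregation step in the loop above can be looked at in isolation. The following is only a sketch with made-up paths and sizes, and sumSizesByFolder is a hypothetical helper; like the loop above, it credits each file to its immediate parent folder only and does not roll sizes up into ancestor folders:

```php
// Hypothetical helper: sum file sizes per immediate parent folder.
// $sizes maps a file path to its size in bytes.
function sumSizesByFolder(array $sizes): array
{
    $folders = [];
    foreach ($sizes as $path => $bytes) {
        $folder = dirname($path) . '/'; // string operation only, no stat call
        $folders[$folder] = ($folders[$folder] ?? 0) + $bytes;
    }
    return $folders;
}

// Made-up sample data for illustration.
$sizes = [
    '/srv/www/1.png'               => 2048,
    '/srv/www/a.php'               => 512,
    '/srv/www/adminer/adminer.css' => 1024,
    '/srv/www/adminer/adminer.php' => 4096,
];
print_r(sumSizesByFolder($sizes));
```

With this sample, /srv/www/ totals 2560 bytes and /srv/www/adminer/ totals 5120 bytes.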

Licensed under: CC-BY-SA with attribution