Question

[EDITED: original post trimmed out; here is the short version]

Looping through a file to read its contents and then writing to it causes the function to fail. It appeared to be a memory issue. These are the three versions I tried.

First tried this:

$file = new SplFileObject($this->getDirectoryPath() . $this->getFileName(), "a+");
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);

if ($this->exists()) {
    foreach ($file as $line) {
        $tempArray = unserialize($line);
        if ($tempArray['Key'] == $arrayOfData['Key']) {
            foreach ($totalsToBeAdded as $key) {
                $arrayOfData[$key] += $tempArray[$key];
            }
        }
    }
}

$tempString = serialize($arrayOfData);

$file->fwrite("$tempString\r\n");

$this->numLines++;

Then I tried this:

$file = new SplFileObject($this->getDirectoryPath() . $this->getFileName(), "a+");
$file->setFlags(SplFileObject::DROP_NEW_LINE | SplFileObject::SKIP_EMPTY);

if ($this->exists()) {
    while (!$file->eof()) {
        $tempArray = unserialize($file->current());
        if ($tempArray['PartNumber'] == $arrayOfData['PartNumber']) {
            foreach ($totalsToBeAdded as $key) {
                $arrayOfData[$key] += $tempArray[$key];
            }
        }

        $file->next();
    }
}

$tempString = serialize($arrayOfData);

$file->fwrite("$tempString\r\n");

$this->numLines++;

And finally I abandoned SplFileObject and just went with a plain fopen, etc.:

$handle = fopen($this->getDirectoryPath() . $this->getFileName(), "a+");

if ($this->exists()) {
    while (false !== ($line = fgets($handle))) {
        $tempArray = unserialize(trim($line));
        if ($tempArray['Key'] == $arrayOfData['Key']) {
            foreach ($totalsToBeAdded as $key) {
                $arrayOfData[$key] += $tempArray[$key];
            }
        }
    }
}

$tempString = serialize($arrayOfData);
fwrite($handle, "$tempString\r\n");
fclose($handle);
$this->numLines++;
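
To double-check whether it really is memory, I could log usage while looping (a debugging sketch only, using PHP's built-in memory_get_usage() and memory_get_peak_usage(); it isn't part of the versions above):

$handle = fopen($this->getDirectoryPath() . $this->getFileName(), "a+");

if ($this->exists()) {
    $lineCount = 0;
    while (false !== ($line = fgets($handle))) {
        $lineCount++;
        // Log memory usage every 10,000 lines to see whether it keeps climbing
        if ($lineCount % 10000 === 0) {
            error_log(sprintf(
                "line %d: %d bytes in use, %d bytes peak",
                $lineCount,
                memory_get_usage(),
                memory_get_peak_usage()
            ));
        }
    }
}

fclose($handle);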

EDIT FOR MORE INFO:

I was curious whether the underlying PHP code uses an array for the iterators when stepping line by line through a file, which could kill it.

Also, the file does begin building; I can watch it write until it gets to about 500-600k, then it dies.

The final file size will be around 10 MB.

One final update:

This works (notice the lack of opening and reading the file):

public function writeUnique($arrayOfData, $totalsToBeAdded) {
    $tempArray = array();

    $handle = fopen($this->fullPath, "a+");

    $tempString = serialize($arrayOfData);
    fwrite($handle, "$tempString\r\n");
    fclose($handle);
    $this->numLines++;
}

While this breaks (notice that ALL that is being done is looping through the whole file, THEN writing to the file):

public function writeUnique($arrayOfData, $totalsToBeAdded) {
    $tempArray = array();

    $handle = fopen($this->fullPath, "a+");

    if ($this->exists()) {
        while (false !== ($line = fgets($handle))) {

        }
    }

    $tempString = serialize($arrayOfData);
    fwrite($handle, "$tempString\r\n");
    fclose($handle);
    $this->numLines++;
}

UPDATE NUMBER THREE:

I have now tested this:

public function writeUnique($arrayOfData, $totalsToBeAdded) {

    $handle = fopen($this->fullPath, "a+");

    if ($this->exists()) {
        while (false !== ($line = fgets($handle))) {

        }
    }

    $tempString = serialize($arrayOfData);
//        fwrite($handle, "$tempString\r\n"); Commented out the writing.
    fclose($handle);
    $this->numLines++;
}

This worked. No failure, memory error, or otherwise.

So it appears that it is either a problem with iterating over and rereading the same lines of a large file, OR the write portion of the function is in some way stepping on the toes of the read portion, which honestly doesn't make sense. I know everyone was thinking it had something to do with my arrays, but I've pretty much taken out ALL my logic and I'm just trying to read/write a large file.
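
If it really is the write stepping on the read, the workaround I'm considering (just a sketch, untested; it assumes the same $this->fullPath, exists() and numLines members used above) is to open the file twice, once read-only for the scan and once append-only for the write:

public function writeUnique($arrayOfData, $totalsToBeAdded) {
    // Scan with a dedicated read-only handle so the append handle is never used for reads
    if ($this->exists()) {
        $readHandle = fopen($this->fullPath, "r");
        while (false !== ($line = fgets($readHandle))) {
            $tempArray = unserialize(trim($line));
            if ($tempArray['Key'] == $arrayOfData['Key']) {
                foreach ($totalsToBeAdded as $key) {
                    $arrayOfData[$key] += $tempArray[$key];
                }
            }
        }
        fclose($readHandle);
    }

    // Write with a separate append-only handle
    $writeHandle = fopen($this->fullPath, "a");
    fwrite($writeHandle, serialize($arrayOfData) . "\r\n");
    fclose($writeHandle);
    $this->numLines++;
}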


Solution 2

So I finally just broke down and did the math to figure out how many loops I'm requiring PHP to complete on this file, and the number is 8,788,338,000,000.

This in turn caused PHP to time out. To keep it from timing out, this line of code needed to be added:

set_time_limit(0); // ignore php timeout

Now the temp files can all be read and parsed line by line. However, on large files (10 MB+), the time to complete the function is well over an hour so far (it's still running, as I can see the temp file growing larger).
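
For reference, this is roughly where the call goes, assuming it sits at the top of the function (it could equally go at the top of the script):

public function writeUnique($arrayOfData, $totalsToBeAdded) {
    set_time_limit(0); // ignore php timeout so the long line-by-line scan can finish

    $handle = fopen($this->fullPath, "a+");

    if ($this->exists()) {
        while (false !== ($line = fgets($handle))) {
            // ... same per-line comparison and totalling as before ...
        }
    }

    fwrite($handle, serialize($arrayOfData) . "\r\n");
    fclose($handle);
    $this->numLines++;
}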

I have come to the conclusion that if speed is of the essence, then it will probably be better to store LARGE data sets in a temporary SQL table. This previously wasn't an option for me, but now I'm forcing the issue with the powers that be to allow it. Worst case scenario, this will at least allow it to run.

BE WARNED: THIS WILL ALLOW AN INFINITE LOOP TO RUN FOREVER AND POSSIBLY KILL THE SERVER. MAKE SURE YOU KNOW HOW TO KILL THE PROCESS THROUGH UNIX BEFORE ATTEMPTING.
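
If the temporary SQL table route is approved, the rough idea (only a sketch; the connection details, table name, and column names here are hypothetical, and it assumes PDO with MySQL) is to let the database accumulate the totals per key instead of rescanning the whole file on every write:

// Hypothetical sketch: accumulate totals per key in a temporary table
$pdo = new PDO("mysql:host=localhost;dbname=mydb", "user", "pass");

$pdo->exec("CREATE TEMPORARY TABLE IF NOT EXISTS part_totals (
    part_key VARCHAR(64) PRIMARY KEY,
    qty INT NOT NULL DEFAULT 0,
    cost DECIMAL(12,2) NOT NULL DEFAULT 0
)");

// One upsert per record replaces the full-file rescan on every write
$stmt = $pdo->prepare(
    "INSERT INTO part_totals (part_key, qty, cost) VALUES (:k, :q, :c)
     ON DUPLICATE KEY UPDATE qty = qty + VALUES(qty), cost = cost + VALUES(cost)"
);

// 'Qty' and 'Cost' stand in for whatever keys are actually in $totalsToBeAdded
$stmt->execute([
    ':k' => $arrayOfData['Key'],
    ':q' => $arrayOfData['Qty'],
    ':c' => $arrayOfData['Cost'],
]);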

OTHER TIPS

Try:

if ($this->exists()) {
    while (false !== ($line = fgets($handle))) {
        $tempArray = unserialize(trim($line));
        unset($line);
        if ($tempArray['Key'] == $arrayOfData['Key']) {
            foreach ($totalsToBeAdded as $key) {
                $arrayOfData[$key] += $tempArray[$key];
            }
        }
        unset($tempArray);
    }
}

The only persistent arrays I can see here are $totalsToBeAdded and $arrayOfData, which look to be one-dimensional from your += operator, so there isn't much you can do but micro-optimize.
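
As an extra guard (beyond what the snippet above does), unserialize() returns false on a corrupt or partial line, so it may be worth skipping those before indexing into the result:

while (false !== ($line = fgets($handle))) {
    $tempArray = unserialize(trim($line));
    unset($line);
    // unserialize() returns false if the line is corrupt or truncated; skip it
    if ($tempArray === false) {
        continue;
    }
    if ($tempArray['Key'] == $arrayOfData['Key']) {
        foreach ($totalsToBeAdded as $key) {
            $arrayOfData[$key] += $tempArray[$key];
        }
    }
    unset($tempArray);
}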

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow