Question

I am using the current code to read a csv file and add it to an array:

    echo "starting CSV import<br>";
    $products = array();     // initialize so array_push() has a target
    $header_array = array();
    $current_row = 1;
    $handle = fopen($csv, "r");
    while (($data = fgetcsv($handle, 10000, ",")) !== FALSE)
    {
        $number_of_fields = count($data);
        if ($current_row == 1) {
            // Header line
            for ($c = 0; $c < $number_of_fields; $c++)
            {
                $header_array[$c] = $data[$c];
            }
        } else {
            // Data line
            $data_array = array(); // reset per row so stale keys don't leak across rows
            for ($c = 0; $c < $number_of_fields; $c++)
            {
                $data_array[$header_array[$c]] = $data[$c];
            }

            array_push($products, $data_array);
        }
        $current_row++;
    }
    fclose($handle);
    echo "finished CSV import <br>";

However, when using a very large CSV, this times out on the server or hits the memory limit.

I'd like a way to do it in stages, so that after, say, the first 100 lines it refreshes the page and continues at line 101.

I will probably be doing this with a meta refresh and a URL parameter.

I just need to know how to adapt that code above to start at the line I tell it to.

I have looked into fseek() but I'm not sure how to implement this here.

Can you please help?

Solution

The timeout can be circumvented using:

    ignore_user_abort(true);
    set_time_limit(0);

When you run into the memory limit, it may be wise to take a step back and look at what you're actually doing with the data you're processing. Are you pushing it into a database? Calculating something from it without needing to keep the raw rows? …

Do you really need to push (array_push($products, $data_array);) the rows into an array for later processing? Can you instead write to the database directly? Or calculate directly? Or build an HTML <table> directly? Or do whatever you're doing right then and there, within the while() loop, without pushing everything into an array first?
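To illustrate processing inside the loop, here is a minimal sketch that inserts each row straight into a database as it is read, so no row stays in memory after its INSERT. The table name, column names, and the in-memory SQLite connection are assumptions for illustration only; substitute your own schema and DSN.

```php
<?php
// Assumed schema for illustration: a "products" table with sku and price.
$pdo = new PDO('sqlite::memory:');
$pdo->exec("CREATE TABLE products (sku TEXT, price TEXT)");
$stmt = $pdo->prepare("INSERT INTO products (sku, price) VALUES (?, ?)");

// Build a small sample CSV so the sketch is self-contained.
$csv = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($csv, "sku,price\nA1,9.99\nB2,4.50\n");

$handle = fopen($csv, "r");
$header = fgetcsv($handle);                    // consume the header line once
while (($data = fgetcsv($handle)) !== false) {
    $row = array_combine($header, $data);
    // Insert immediately; the row can be garbage-collected right after.
    $stmt->execute(array($row['sku'], $row['price']));
}
fclose($handle);
```

With a prepared statement, each iteration costs one execute() and constant memory, regardless of how many lines the CSV has.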

If you're able to chunk the processing, I'd guess you don't need that array at all; otherwise you'd have to rebuild the array for every chunk, which doesn't solve the memory issue one bit.

If you can manage to change your processing algorithm to waste less memory/time, you should seriously consider that over any chunked processing that requires a round-trip to the browser (for many performance and security reasons…).

Anyway, you can identify the current stream offset at any time with ftell() and jump back to that position with fseek(). You'd only need to pass that integer on to the next iteration (e.g. as your URL parameter).
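A minimal sketch of that idea, assuming the byte offset is carried between requests (e.g. in the URL parameter you mentioned). The helper name import_chunk() and the chunk size are made up for illustration; note the header row also has to be carried along, since later chunks never see line 1:

```php
<?php
// Build a small sample CSV so the sketch is self-contained.
$csv = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($csv, "id,name\n1,apple\n2,banana\n3,cherry\n");

// Hypothetical helper: read at most $limit data rows starting at byte $offset.
function import_chunk($csv, $offset, $limit, $cols = null)
{
    $handle = fopen($csv, "r");
    if ($offset > 0) {
        fseek($handle, $offset);    // resume where the previous chunk stopped
    } else {
        $cols = fgetcsv($handle);   // first chunk: consume the header line
    }

    $rows = array();
    while ($limit-- > 0 && ($data = fgetcsv($handle)) !== false) {
        $rows[] = array_combine($cols, $data);
    }

    $offset = ftell($handle);       // remember the position for the next chunk
    $done   = feof($handle);
    fclose($handle);

    return array($rows, $offset, $done, $cols);
}

// First request: process 2 rows, remember offset and header.
list($rows, $offset, $done, $cols) = import_chunk($csv, 0, 2);

// "Next request": resume at $offset, reusing the carried-along header.
list($more, $offset, $done, $cols) = import_chunk($csv, $offset, 2, $cols);
```

Because fgetcsv() advances the file pointer one line at a time, ftell() right after the loop is guaranteed to sit at a clean line boundary, so fseek() in the next request lands exactly on line 101 (or wherever the previous chunk stopped).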


Also, there is no need for those inner for() loops; array_combine() does the same job. This should produce the same result:

    <?php

    $products = array();
    $cols = null;
    $first = true;

    $handle = fopen($csv, "r");
    while (($data = fgetcsv($handle, 10000, ",")) !== false) {
        if ($first) {
            $cols = $data;
            $first = false;
        } else {
            $products[] = array_combine($cols, $data);
        }
    }

    fclose($handle);
    echo "finished CSV import <br>";
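One caveat worth knowing: array_combine() fails when a data line has a different number of fields than the header (it returns false in PHP 7 and throws a ValueError in PHP 8). A sketch of one possible guard, using array_pad() to fill missing trailing fields with null (the column names here are made up):

```php
<?php
// Hypothetical header and a ragged data line that is one field short.
$cols = array('sku', 'name', 'price');
$data = array('A1', 'apple');

if (count($data) === count($cols)) {
    $row = array_combine($cols, $data);
} else {
    // Pad missing trailing fields with null instead of aborting the import.
    $row = array_combine($cols, array_pad($data, count($cols), null));
}
```

Whether padding, skipping, or logging the ragged line is right depends on your data; the point is just to decide before array_combine() decides for you.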
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow