Question

For one of my projects I need to import a very large text file (~950 MB). I'm using Symfony2 & Doctrine 2 for my project.

My problem is that I get errors like:

Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 24 bytes)

The error even occurs if I increase the memory limit to 1GB.

I tried to analyze the problem by using XDebug and KCacheGrind (as part of PHPEdit), but I don't really understand the values :(

I'm looking for a tool or a method (quick & simple, since I don't have much time) to find out why memory is allocated and not freed again.

Edit

To clear some things up here is my code:

        $handle = fopen($geonameBasePath . 'allCountries.txt','r');

        $i = 0;
        $batchSize = 100;

        if($handle) {
            while (($buffer = fgets($handle,16384)) !== false) {

                if( $buffer[0] == '#') //skip comments
                    continue;
                //split parts
                $parts = explode("\t",$buffer);


                if( $parts[6] != 'P')
                    continue;

                if( $i%$batchSize == 0 )    {
                    echo 'Flush & Clear' . PHP_EOL;
                    $em->flush();
                    $em->clear();
                }

                $entity = $em->getRepository('MyApplicationBundle:City')->findOneByGeonameId( $parts[0] );
                if( $entity !== null)   {
                    $i++;
                    continue;
                }

                //create city object
                $city = new City();

                $city->setGeonameId( $parts[0] );
                $city->setName( $parts[1] );
                $city->setInternationalName( $parts[2] );
                $city->setLatitude($parts[4] );
                $city->setLongitude( $parts[5] );
                $city->setCountry( $em->getRepository('MyApplicationBundle:Country')->findOneByIsoCode( $parts[8] ) );

                $em->persist($city);

                unset($city);
                unset($entity);
                unset($parts);
                unset($buffer);

                echo $i . PHP_EOL;


                $i++;
            }
        }

        fclose($handle);

Things I have tried, but nothing helped:

  1. Adding a second parameter to fgets
  2. Increasing memory_limit
  3. Unsetting vars

Solution

Increasing the memory limit is not going to be enough. When importing a file like that, you should buffer the reading, i.e. read and process it in small chunks:

$f = fopen('yourfile', 'r');
while (($data = fread($f, 4096)) !== false && $data !== '') {
    // Do your stuff using the read $data
}
fclose($f);

Update:

When working with an ORM, you have to understand that nothing is actually inserted into the database until the flush call. Until then, all those objects are kept by the ORM, tagged as "to be inserted". Only when flush is called does the ORM go through the collection and start inserting.

Solution 1: Flush often. And clear.
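
A minimal sketch of what "flush often, and clear" can look like, assuming the same $em (Doctrine EntityManager), $handle and City entity as in the question; the batch size is only illustrative:

// Sketch: assumes $em, $handle and the City entity from the question.
$batchSize = 100;
$i = 0;

while (($buffer = fgets($handle, 16384)) !== false) {
    $parts = explode("\t", $buffer);

    $city = new City();
    $city->setGeonameId($parts[0]);
    $city->setName($parts[1]);
    $em->persist($city);

    // Write the pending inserts and detach the managed entities so PHP
    // can garbage-collect them, instead of letting the UnitOfWork grow.
    if (++$i % $batchSize === 0) {
        $em->flush();
        $em->clear();
    }
}

$em->flush(); // flush the last, partial batch
$em->clear();

Keep in mind that clear() detaches everything the EntityManager knows about, so entities loaded before it (such as the Country entity in the question) have to be fetched again afterwards.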

Solution 2: Don't use the ORM. Go for plain SQL commands. They take up far less memory than the object + ORM solution.
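
A rough sketch of the plain-SQL route, using the DBAL connection behind the EntityManager; the city table and column names are illustrative, not taken from the question:

$conn = $em->getConnection();

// Table and column names below are illustrative placeholders.
$stmt = $conn->prepare(
    'INSERT INTO city (geoname_id, name, international_name, latitude, longitude)
     VALUES (?, ?, ?, ?, ?)'
);

while (($buffer = fgets($handle, 16384)) !== false) {
    $parts = explode("\t", $buffer);
    // One prepared statement executed per line: no entity objects,
    // no UnitOfWork, so memory usage stays roughly constant.
    $stmt->execute([$parts[0], $parts[1], $parts[2], $parts[4], $parts[5]]);
}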

OTHER TIPS

33554432 bytes is 32 MB.

Change the memory limit in php.ini, for example to 75 MB:

memory_limit = 75M

and restart the server.

Instead of reading the whole file at once, you should read it line by line. Every time you read a line, process that data immediately. Do NOT try to fit everything in memory; you will fail. The reason is that even if you can fit the text file in RAM, you will not be able to hold the same data as PHP objects/variables at the same time, since PHP itself needs far more memory for each of them.

What I suggest instead is: a) read a new line, b) parse the data in the line, c) create the new object to store in the database, d) go to step a, unset()ting the old object first or reusing its memory.
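
A minimal sketch of that a)-d) loop, where processLine() and saveRow() are hypothetical helpers standing in for your own parsing and database code:

$handle = fopen('allCountries.txt', 'r');

while (($line = fgets($handle)) !== false) {   // a) read a new line
    $row = processLine($line);                 // b) parse the data in the line (placeholder)
    saveRow($row);                             // c) store it in the database (placeholder)
    unset($row, $line);                        // d) free it before the next iteration
}

fclose($handle);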

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow