Question

In Python (and other languages), you can incrementally process large volumes of data by using the 'yield' keyword in a function. What is the equivalent in PHP?

For example, let's say in Python, if I wanted to read a potentially very large file, I could work on each line one at a time like so (this example is contrived, as it is basically the same thing as 'for line in file_obj'):

def file_lines(fname):
    # 'with' closes the file even if the caller stops iterating early
    with open(fname) as f:
        for line in f:
            yield line

for line in file_lines('somefile'):
    pass  # process the line

Right now (in PHP) I'm using a private instance variable to keep track of state and acting accordingly each time the function is called, but it seems like there must be a better way.


Solution

Since version 5.5, PHP has a direct equivalent: generators, which use the same yield keyword.
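A minimal sketch of the question's example written as a generator, assuming PHP 5.5 or later. The function name file_lines is carried over from the question, and the temp file exists only so the sketch runs as-is; neither comes from a library.

```php
<?php
// Lazily yields one line at a time instead of loading the whole file.
function file_lines($fname)
{
    $f = fopen($fname, 'r');
    if ($f === false) {
        throw new RuntimeException("Cannot open $fname");
    }
    try {
        while (($line = fgets($f)) !== false) {
            yield $line;
        }
    } finally {
        // Runs when the generator finishes or is destroyed early.
        fclose($f);
    }
}

// Demo on a throwaway temp file (an assumption made for illustration).
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "first\nsecond\nthird\n");

$count = 0;
foreach (file_lines($path) as $line) {
    $count++; // process the line here
}
unlink($path);

echo $count, "\n";
```

Calling file_lines() does not open the file; the body only starts running on the first iteration, which is also when a failed fopen would raise the exception.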

Old answer (pre-PHP 5.5):

Unfortunately, there isn't a language-level equivalent. The easiest way is either to do what you're already doing, or to create an object that uses instance variables to maintain state.

There is, however, a good option if you want to use the function in conjunction with the foreach statement: SPL Iterators. They can be used to achieve something quite similar to Python generators.
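As a rough sketch of the iterator approach, the class below implements the built-in Iterator interface so that foreach can pull lines one at a time. The class name FileLineIterator and the temp file are hypothetical, chosen for illustration. The return type declarations assume PHP 8.0 or later; on older versions they would need to be dropped.

```php
<?php
// Hypothetical class name; it is not part of SPL itself.
class FileLineIterator implements Iterator
{
    private $path;
    private $handle = null;
    private $line = false;
    private $index = 0;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    // foreach calls rewind() first: (re)open the file and read line 0.
    public function rewind(): void
    {
        if ($this->handle !== null) {
            fclose($this->handle);
        }
        $this->handle = fopen($this->path, 'r');
        if ($this->handle === false) {
            $this->handle = null;
            throw new RuntimeException("Cannot open {$this->path}");
        }
        $this->index = 0;
        $this->line = fgets($this->handle);
    }

    public function valid(): bool
    {
        return $this->line !== false;
    }

    public function current(): mixed
    {
        return $this->line;
    }

    public function key(): mixed
    {
        return $this->index;
    }

    public function next(): void
    {
        $this->line = fgets($this->handle);
        $this->index++;
    }

    public function __destruct()
    {
        if ($this->handle !== null) {
            fclose($this->handle);
        }
    }
}

// Demo on a throwaway temp file (an assumption made for illustration).
$path = tempnam(sys_get_temp_dir(), 'itr');
file_put_contents($path, "a\nb\n");

$seen = array();
foreach (new FileLineIterator($path) as $n => $line) {
    $seen[$n] = rtrim($line);
}
unlink($path);

print_r($seen);
```

The state that a generator would keep implicitly (the open handle, the current line, the position) lives in instance variables here, which is the trade-off the answer describes.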

OTHER TIPS

There is an RFC at https://wiki.php.net/rfc/generators addressing just that, which might be included in PHP 5.5.

In the meantime, check out this proof-of-concept of a poor man's "generator function" implemented in userland.

namespace Functional;

error_reporting(E_ALL|E_STRICT);

const BEFORE = 1;
const NEXT = 2;
const AFTER = 3;
const FORWARD = 4;
const YIELD = 5; // 'yield' is a reserved word as of PHP 5.5; rename this constant (e.g. YIELDED) on newer versions

class Generator implements \Iterator {
    private $funcs;
    private $args;
    private $key;
    private $result;

    public function __construct(array $funcs, array $args) {
        $this->funcs = $funcs;
        $this->args = $args;
    }

    public function rewind() {
        $this->key = -1;
        $this->result = call_user_func_array($this->funcs[BEFORE], 
                                             $this->args);
        $this->next();
    }

    public function valid() {
        return $this->result[YIELD] !== false;
    }

    public function current() {
        return $this->result[YIELD];
    }

    public function key() {
        return $this->key;
    }

    public function next() {
        $this->result = call_user_func($this->funcs[NEXT], 
                                       $this->result[FORWARD]);
        if ($this->result[YIELD] === false) {
            call_user_func($this->funcs[AFTER], $this->result[FORWARD]);
        }
        ++$this->key;
    }
}

function generator($funcs, $args) {
    return new Generator($funcs, $args);
}

/**
 * A generator function that lazily yields each line in a file.
 */
function get_lines_from_file($file_name) {
    $funcs = array(
        BEFORE => function($file_name) { return array(FORWARD => fopen($file_name, 'r'));   },
        NEXT   => function($fh)        { return array(FORWARD => $fh, YIELD => fgets($fh)); },
        AFTER  => function($fh)        { fclose($fh);                                       },
    );
    return generator($funcs, array($file_name));
}

// Output the content of this file with padded line numbers.
foreach (get_lines_from_file(__FILE__) as $k => $v) {
    echo str_pad($k, 8), $v;
}
echo "\n";

I prototype everything in Python before implementing it in any other language, including PHP. I ended up using callbacks to achieve what I would do with yield.

function doSomething($items, $callback)
{
    foreach ($items as $item) {
        // do some computation that generates $data
        $data = $item * 2; // placeholder computation

        call_user_func($callback, $data);
    }
}

function myCallback($input)
{
    // save $input to DB
    // log
    // send through a webservice
    // etc.
    var_dump($input);
}

doSomething(array(1, 2, 3), 'myCallback');

This way each $data is passed to the callback function and you can do what you want.

Extending @Luiz's answer - another cool way is to use anonymous functions:

function iterator($n, $cb)
{
    for($i=0; $i<$n; $i++) {
        call_user_func($cb, $i);
    }
}

$sum = 0;
iterator(10,
    function($i) use (&$sum)
    {
        $sum += $i;
    }
);

print $sum;

There may not be an equivalent operator, but the following code behaves similarly for one file at a time (the handle is kept in a static variable, so two files cannot be read in an interleaved fashion):

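function file_lines($file) {
  // The handle persists between calls, so only one file can be read at a time.
  static $fhandle;

  if ( is_null($fhandle) ) {
    $fhandle = fopen($file, 'r');

    if ( $fhandle === false ) {
      $fhandle = null; // reset so a later call can retry
      return false;
    }
  }

  if ( ($line = fgets($fhandle)) !== false ) {
    return $line;
  }

  // End of file: release the handle and signal completion.
  fclose($fhandle);
  $fhandle = null;
  return false;
}

// Compare against false explicitly so a line such as "0" does not end the loop.
while ( ($line = file_lines('some_file')) !== false ) {
  // ...
}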

That looks about right. Sorry, I haven't tested it.

The same 'yield' statement now exists in PHP, as of version 5.5:

http://php.net/manual/en/language.generators.syntax.php

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow