Given your sample code:
```python
def main(init):
    def report(x):
        print(x)
    bigop(init, report)
```
However, I don't think that's what you're looking for here. Presumably you want `report` to feed data into `view` in some way.

You can do that by turning things around: instead of `view` being a generator that drives another generator, it's a generator that's driven by an outside caller calling `send` on it. For example:
```python
def view():
    while True:
        value = yield
        print(value)

def main(init):
    v = view()
    next(v)  # advance to the first yield so send() will work
    def report(x):
        v.send(x)
    bigop(init, report)
```
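To see how the pieces fit together, here's a self-contained Python 3 sketch of that pattern. The `bigop` and `view` bodies are hypothetical stand-ins for yours, and values are recorded in a list rather than printed so the effect is visible:

```python
collected = []

def view():
    # Driven from outside via send(); records each value it receives.
    while True:
        value = yield
        collected.append(value)

def bigop(init, report):
    # Stand-in for the long-running computation: reports each partial result.
    for step in range(init, init + 3):
        report(step * step)

def main(init):
    v = view()
    next(v)  # advance to the first yield so send() works
    def report(x):
        v.send(x)
    bigop(init, report)

main(2)
# collected is now [4, 9, 16]
```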
But you said that `view` can't be changed. Of course you can write a `viewdriver` that `yield`s a new object whenever you `send` it one. Or, more simply, just repeatedly call `view([data])` and let it iterate over a single object.
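That simpler variant can be sketched like this, assuming `view` is an ordinary, unmodifiable consumer of an iterable (both `view` and `bigop` below are hypothetical stand-ins):

```python
lines = []

def view(iterable):
    # The unmodifiable consumer: it just iterates over whatever it's given.
    for value in iterable:
        lines.append('got %r' % value)

def bigop(init, report):
    # Stand-in computation reporting three partial results.
    for i in range(3):
        report(init + i)

def report(x):
    # Wrap each datum in a one-element list and let view iterate over it.
    view([x])

bigop(10, report)
# lines is now ['got 10', 'got 11', 'got 12']
```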
Anyway, I don't see how you expect this to help anything. `bigop` is not a coroutine, and you cannot turn it into one. Given that, there's no way to force it to cooperatively share with other coroutines.
If you want to interleave processing and reporting concurrently, you have to use threads (or processes). And the fact that "REPORT must finish at each step before BIGOP continues" is already part of your requirements implies that you can't safely do anything concurrent here anyway, so I'm not sure what you're looking for.
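If you do go the thread route, the usual shape is a queue between the two: `report` pushes from the `bigop` thread, and `view` consumes a queue-backed iterator on another thread. A minimal sketch, with hypothetical stand-ins for `bigop` and `view` and a sentinel to mark the end of the stream (note that this does *not* enforce the "report finishes before bigop continues" constraint):

```python
import queue
import threading

results = []
_SENTINEL = object()

def view(iterable):
    # Unmodified consumer: iterates over whatever it's handed.
    for value in iterable:
        results.append(value)

def bigop(init, report):
    # Stand-in computation reporting partial results.
    for i in range(3):
        report(init * (i + 1))

def queue_iter(q):
    # Turn a Queue into an iterator that stops at the sentinel.
    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item

q = queue.Queue()
t = threading.Thread(target=view, args=(queue_iter(q),))
t.start()
bigop(5, q.put)   # report is just q.put
q.put(_SENTINEL)  # tell view we're done
t.join()
# results is now [5, 10, 15]
```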
If you just want to interleave processing and reporting without concurrency, or periodically hook into `bigop`, or other similar things, you can do that with a coroutine, but it will have exactly the same effect as using a subroutine: the two examples above are pretty much equivalent. So, you're just adding complexity for no reason.
(If `bigop` is I/O-bound, you could use greenlets, and monkeypatch the I/O operations to asyncify them, as `gevent` and `eventlet` do. But if it's CPU-bound, there would be no benefit to doing so.)
Elaborating on the `viewdriver` idea: what I was describing above was equivalent to calling `view([data])` each time, so it won't help you. If you want to make it an iterator, you can, but it's just going to lead to either blocking `bigop` or spinning `view`, because you're trying to feed a consumer with a consumer.
It may be hard to understand as a generator, so let's build it as a class:
```python
class Reporter(object):
    def __init__(self):
        self.data_queue = []
        self.viewer = view(self)
    def __call__(self, data):
        self.data_queue.append(data)
    def __iter__(self):
        return self
    def __next__(self):
        return self.data_queue.pop(0)  # FIFO: pop from the front

bigop(init, Reporter())
```
Every time `bigop` calls `report(data)`, that calls our `__call__`, adding a new element to our queue. Every time `view` goes through its loop, it calls our `__next__`, popping an element off the queue. If `bigop` is guaranteed to go faster than `view`, everything will work, but the first time `view` gets ahead, it will get an `IndexError`.
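Concretely, here's a stripped-down demo of that failure mode, with hypothetical stand-ins for `bigop` and `view`. To keep it runnable, the `view(self)` wiring is dropped from `__init__` and `bigop` runs to completion before `view` consumes, so the queue only underflows at the end:

```python
class Reporter(object):
    def __init__(self):
        self.data_queue = []
    def __call__(self, data):
        self.data_queue.append(data)
    def __iter__(self):
        return self
    def __next__(self):
        # Raises IndexError as soon as the consumer gets ahead of the producer.
        return self.data_queue.pop(0)

def bigop(init, report):
    # Stand-in computation: report three partial results.
    for i in range(3):
        report(init + i)

def view(iterable):
    # Stand-in consumer; a real view would not be expecting IndexError.
    seen = []
    try:
        for value in iterable:
            seen.append(value)
    except IndexError:
        pass  # the queue underflowed: exactly the failure mode described above
    return seen

r = Reporter()
bigop(1, r)        # producer runs first, filling the queue with 1, 2, 3
drained = view(r)  # consumer drains [1, 2, 3], then underflows
```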
The only way to fix that is to make `__next__` try until `data_queue` is non-empty. But just doing that will spin forever, not letting `bigop` do the work to produce a new element. And you can't make `__next__` into a generator, because `view` is expecting an iterator over values, not an iterator over iterators.
Fortunately, `__call__` can be a generator, because `bigop` doesn't care what value it gets back. So, you could turn things around that way; but you can't, because then there's nothing to drive that generator.
So, you have to add another level of coroutines, underneath the iteration. Then, `__next__` can wait on a `next_coro` (by calling `next` on it), which yields to a `call_coro` and then yields the value it got. Meanwhile, `__call__` has to `send` to the same `call_coro`, wait on it, and yield.
So far, that doesn't change anything, because you've got two routines both trying to drive `next_coro`, and one of them (`__next__`) isn't blocking anywhere else, so it's just going to spin; its `next` call will look like a `send(None)` from `__call__`.
The only way to fix that is to build a trampoline (PEP 342 includes source for a general-purpose trampoline, although in this case you could build a simpler special-purpose one), schedule `next_coro` and `call_coro` to explicitly alternate, make sure `next_coro` properly handles alternating between two different entry points, then drive the scheduler's `run` from `__next__` (and `__init__`).
Confused? You won't be, after this week's episode of… Nah, who am I kidding. You're going to be confused. Writing all of this is one thing; debugging it is another. (Especially since every important stack trace just terminates immediately at the trampoline.) And what does all that work get you? The exact same benefit as using greenlets or threads, with the exact same downsides.
Since your original question is whether there's a simpler way than using threads, the answer is: No, there isn't.