Python: flush a buffer before program termination via a finalizer
23-08-2019
Question
I keep a cache of transactions to flush (to persistent storage) on the event of a watermark or object finalization. Since __del__ is no longer guaranteed to be called on every object, is the appropriate approach to hook a similar function (or __del__ itself) into atexit.register (during initialization)?
If I'm not mistaken, this will cause the object to which the method is bound to hang around until program termination. This isn't likely to be a problem, but maybe there's a more elegant solution?
Note: I know using __del__ is non-ideal because it can cause uncatchable exceptions, but I can't think of another way to do this short of cascading finalize() calls all the way through my program. TIA!
Solution
If you don't need your object to be alive at the time you perform the flush, you could use weak references.
This is similar to your proposed solution, but rather than storing a real reference, store a list of weak references, with a callback function to perform the flush. This way, the references aren't going to keep those objects alive, and you won't run into any circular-garbage problems with __del__ methods.
You can run through the list of weak references on termination and manually flush any that are still alive, if this needs to be guaranteed done at a certain point.
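A sketch of this idea using weakref.finalize (available since Python 3.4), which is built for exactly this pattern: the finalizer holds a strong reference to the pending data but only a weak reference to the cache, and any finalizers still outstanding are run at interpreter exit. The TxCache class and _flush function are illustrative names, not from the question.

```python
import weakref

def _flush(pending):
    # Stand-in for a real write to persistent storage. weakref.finalize
    # calls this when the cache is garbage collected; finalizers that are
    # still alive at interpreter exit are run then as well.
    pending.clear()

class TxCache:
    """Hypothetical transaction cache from the question."""
    def __init__(self):
        self.pending = []
        # The finalizer references self.pending (the data) strongly but
        # self only weakly, so the cache itself can still be collected --
        # no circular-garbage problems as with __del__.
        self._finalizer = weakref.finalize(self, _flush, self.pending)
```

Calling self._finalizer() manually flushes early and marks the finalizer dead, so the flush won't run twice.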
OTHER TIPS
If you have to handle resources, the preferred way is to have an explicit call to a close() or finalize() method. Have a look at the with statement to abstract that. In your case the weakref module might be an option: the cached objects can be garbage collected by the system and have their __del__() method called, or you finalize them if they are still alive.
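A minimal sketch of that explicit-close approach, with __enter__/__exit__ so the with statement guarantees the flush even when the block raises (the class and method names here are illustrative, not from the question):

```python
class TxCache:
    def __init__(self):
        self.pending = []

    def close(self):
        # Flush self.pending to persistent storage (stubbed here).
        self.pending.clear()

    # Support `with TxCache() as cache: ...`
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never suppress exceptions from the block

with TxCache() as cache:
    cache.pending.append("tx-1")
# close() has run here, whether or not the block raised
```

The benefit over a finalizer is that the flush point is explicit and deterministic, independent of when the garbage collector runs.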
I would say atexit, or try and see whether you can refactor the code so it can be expressed using the with statement, which is in __future__ in Python 2.5 and enabled by default in 2.6. Python 2.5 also includes a contextlib module to simplify things a bit. I've done something like this when using Canonical's Storm ORM.
from __future__ import with_statement
import contextlib

@contextlib.contextmanager
def start_transaction(db):
    db.start()
    try:
        yield db
    finally:
        # runs even if the with-block raises
        db.end()

with start_transaction(db) as transaction:
    ...
For a non-db case, you could just register the objects to be flushed with a global and then use something similar. The benefit of this approach is that it keeps things explicit.
Put the following in a file called destructor.py:
import atexit

objects = []

def _destructor():
    global objects
    for obj in objects:
        obj.destroy()
    del objects

atexit.register(_destructor)
Now use it this way:
import destructor

class MyObj(object):
    def __init__(self):
        destructor.objects.append(self)
        # ... other init stuff

    def destroy(self):
        # clean up resources here
        pass
I think atexit is the way to go here.
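For completeness, the questioner's own proposal (registering a bound method with atexit during initialization) looks roughly like this; note that, as the question suspects, the registration stores a strong reference to the object, keeping it alive until program termination. The class name is again illustrative.

```python
import atexit

class TxCache:
    def __init__(self):
        self.pending = []
        # atexit holds a reference to the bound method, and therefore to
        # self, so this object hangs around until the interpreter exits.
        atexit.register(self.flush)

    def flush(self):
        # Stand-in for writing to persistent storage; safe to call more
        # than once, since a repeat call just clears an empty list.
        self.pending.clear()
```

In Python 3, atexit.unregister(self.flush) can drop that reference once the cache has been flushed explicitly.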