Question

I want to record the last access time and mark the object as modified, so I wrote a class and overrode its __setattr__ method.

import time

class CacheObject(object):
    __slots__ = ('modified', 'lastAccess')
    def __init__(self):
        object.__setattr__(self,'modified',False)
        object.__setattr__(self,'lastAccess',time.time())

    def setModified(self):
        object.__setattr__(self,'modified',True)
        object.__setattr__(self,'lastAccess',time.time())

    def resetTime(self):
        object.__setattr__(self,'lastAccess',time.time())

    def __setattr__(self,name,value):
        if (not hasattr(self,name)) or object.__getattribute__(self,name)!=value: 
            object.__setattr__(self,name,value)
            self.setModified()

class example(CacheObject):
    __slots__ = ('abc',)
    def __init__(self,i):
        self.abc = i
        super(example,self).__init__()

t = time.time()
f = example(0)
for i in range(100000):
    f.abc = i

print(time.time()-t)

I measured the running time, and it took 2 seconds. When I commented out the overridden method, it took 0.1 seconds. I know the override would be slower, but a gap of almost 20x seems too much. I think I must be getting something wrong.
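For reference, the same workload can be measured with timeit (a minimal sketch, assuming the example class above is already defined; the alternating values just make sure the comparison in __setattr__ always sees a change):

import timeit

f = example(0)

# 50,000 repetitions of two assignments = 100,000 assignments, same as the loop above.
# Comment the __setattr__ override out to get the baseline figure.
print(timeit.timeit('f.abc = 1; f.abc = 2', globals=globals(), number=50000))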

Taking the suggestions from cfi:

1. Eliminate the if condition

def __setattr__(self,name,value):
    # if (not hasattr(self,name)) or object.__getattribute__(self,name)!=value:
    object.__setattr__(self,name,value)
    self.setModified()

The running time went down to 1.9 seconds, a small improvement, but marking the object as modified when nothing actually changed would cost more in other parts of the program, so this is not an option.

2. Change self.func() to classname.func(self)

def __setattr__(self,name,value):
    if (not hasattr(self,name)) or object.__getattribute__(self,name)!=value: 
        object.__setattr__(self,name,value)
        CacheObject.setModified(self)

The running time is 2.0 seconds, so nothing really changed.

3. Inline the setModified function into __setattr__

def __setattr__(self,name,value):
    if (not hasattr(self,name)) or object.__getattribute__(self,name)!=value: 
        object.__setattr__(self,name,value)
        object.__setattr__(self,'modified',True)
        object.__setattr__(self,'lastAccess',time.time())

The running time went down to 1.2 seconds! That's great, it saves almost 50% of the time, though the cost is still high.


Solution

Not a complete answer but some suggestions:

  1. Can you eliminate the value comparison? Of course that's a feature change of your implementation. But the overhead in runtime will be even worse if more complex objects than integers are being stored in attributes.

  2. Every call to a method via self needs to go through full method resolution order (MRO) checking. I don't know whether Python can do any MRO caching itself; probably not, because of the types-being-dynamic principle. Thus, you should be able to reduce some overhead by changing any self.method(args) to classname.method(self, args). That removes the MRO overhead from the calls. This applies to self.setModified() in your __setattr__() implementation. In most places you have done this already with references to object.

  3. Every single function call takes time. You could eliminate them and e.g. move setModified's functionality into __setattr__ itself.

Let us know how the timing changes for each of these; I'd measure each change separately.

Edit: Thanks for the timing numbers.

The overhead may seem drastic (still a factor of 10, it seems). However, put that into the perspective of overall runtime. In other words: how much of your overall runtime will be spent setting those tracked attributes, and how much is spent elsewhere?

In a single-threaded application, Amdahl's Law is a simple rule to set expectations straight. An illustration: if 1/3 of the time is spent setting attributes and 2/3 doing other stuff, then slowing the attribute setting down by 10x only slows down that one third. The smaller the percentage of time spent on the attributes, the less we have to care. But this may not help you at all if your percentage is high...
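To put numbers on that illustration (a quick sketch of the arithmetic; the 1/3 and 10x figures are just the example values from above):

# Relative total runtime when a fraction p of the work is slowed down by factor s:
#   new_total = (1 - p) + p * s      (old total normalised to 1.0)
def slowdown(p, s):
    return (1 - p) + p * s

print(slowdown(1 / 3, 10))   # 4.0  -> the whole program takes ~4x as long
print(slowdown(0.05, 10))    # 1.45 -> barely noticeable if attributes are only 5% of the work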

OTHER TIPS

Overriding __setattr__ here seems to serve no purpose. You only have two attributes, modified and lastAccess. That means these are the only attributes you can set, so why override __setattr__? Just set the attributes directly.

If you want something to happen when setting an attribute, make it a property with a setter and a getter. It's easier and much less magical.

import time

class CacheObject(object):
    __slots__ = ('modified', 'lastAccess')

    def __init__(self):
        self.modified = False
        self.lastAccess = time.time()

    def setModified(self):
        self.modified = True
        self.lastAccess = time.time()

    def resetTime(self):
        self.lastAccess = time.time()

class example(CacheObject):
    __slots__ = ('_abc',)
    def __init__(self,i):
        self._abc = i
        super(example,self).__init__()

    @property
    def abc(self):
        self.resetTime()
        return self._abc


    @abc.setter
    def abc(self, value):
        self.setModified()
        self._abc = value
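A quick usage sketch (not part of the original answer) showing that the property behaves like the original attribute while still updating the tracking fields:

f = example(0)
print(f.modified)        # False right after construction

f.abc = 1                # goes through the abc setter, which calls setModified()
print(f.modified)        # True

t = time.time()
for i in range(100000):  # same loop as in the question
    f.abc = i
print(time.time() - t)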

Old question but worth an update.

I ran into the same problem with pydantic on Python 3.6.

object.__setattr__(self, name, value) is just slower than setting an attribute on a class normally. There is no apparent way around that.

If performance is important, the only option is to keep calls to object.__setattr__(self, name, value) to an absolute minimum in classes where you need to override __setattr__.
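To see the gap for yourself, here is a minimal micro-benchmark sketch (the class names and iteration count are mine, not from the original post):

import timeit

class Plain(object):
    __slots__ = ('x',)

class Tracked(object):
    __slots__ = ('x',)
    def __setattr__(self, name, value):
        # the Python-level override plus the object.__setattr__ call add the overhead
        object.__setattr__(self, name, value)

p, t = Plain(), Tracked()
print(timeit.timeit('p.x = 1', globals={'p': p}, number=1000000))  # plain attribute set
print(timeit.timeit('t.x = 1', globals={'t': t}, number=1000000))  # via overridden __setattr__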

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow