Question

I'm writing a test system that uses, amongst other things, a data source. In production it will read a bunch of instruments, but for testing and development of the back end I want it to read a file or return random numbers. I know that new data sources will need to be created in the future, and how they will work is not yet known. I am trying to set the system up so that I can move on to a new department and not have to come back and support it, so I want to be as Pythonic as possible and leave as few surprises as possible. A main requirement is a consistent API for the sources, and ABC seems the obvious choice here. The sources don't have enough in common for a base class to hold any worthwhile chunk of shared code.

I don't want a large source module that selects what to do; I want small standalone sources that can be chosen from, so the old ones that work can be left alone. I also want to be able to select which source to use with a parameter, so I can run the same test script and switch sources easily. I forget exactly how I came across __new__, but it wasn't obvious, and few other people had heard of it. It works. But is it the obvious or Pythonic way to do what I'm trying to do? Is there a way of doing this that is more familiar to my colleagues? I should point out that I am now working slightly above my comfort level for meta-programming, so anything more complicated will probably whoosh straight over my head.

from abc import ABCMeta, abstractmethod
import random

class BaseSource:
    __metaclass__ = ABCMeta

    @abstractmethod
    def get(self):
        pass    

class ManualSrc(BaseSource):
    def get(self):
        return float(raw_input('gimme a number - '))

class RandomSrc(BaseSource):
    def get(self):
        return random.random()

class Source(BaseSource):
    """generic Source choice"""
    def __new__(BaseSource, choice):
        if choice == 0:
            return ManualSrc()
        elif choice == 1:
            return RandomSrc()
        else:
            raise ValueError('source choice parameter {} not valid'.format(choice))

if __name__ == '__main__':
    for use_src in range(4):
        print 'using source choice {}'.format(use_src)
        src = Source(use_src)
        print src.get()

Solution

This is not quite an answer; it's more of a code review, so I'd probably wait for a different opinion.

Personally speaking (no objective confirmation here), I've usually seen __new__ used to create instances of a class when you're using your own __metaclass__ (check this answer on S.O. and this great thread about Python's metaclasses).

In your example, adding a new source (a new WhateverSrc() thingy) means editing the __new__ method of your Source class anyway, so using a class that inherits from BaseSource to create the other sources looks like overkill. Also, the question is: is the Source class really a BaseSource? As far as I understand, not really... Source is a factory of sources, right? If that's the case, you can try this implementation if you want (the link is the answer I mentioned in my second paragraph, so I don't deserve much credit for "finding" it), although factories sound very Java-esque to me. Again, just a personal opinion here.
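To make the "is Source really a BaseSource?" point concrete, here is a minimal sketch, assuming Python 3 (the question's code is Python 2). The `__init__` method and the `choice` attribute are additions for illustration only; they are not in the question. It shows that the object handed back by `Source(...)` is never a `Source`, so Python does not even call `Source.__init__` on it.

```python
from abc import ABC, abstractmethod
import random


class BaseSource(ABC):
    @abstractmethod
    def get(self):
        """Return one reading."""


class RandomSrc(BaseSource):
    def get(self):
        return random.random()


class Source(BaseSource):
    """Selector written the way the question does it, via __new__."""

    def __new__(cls, choice):
        if choice == 1:
            return RandomSrc()
        raise ValueError('source choice parameter {} not valid'.format(choice))

    def __init__(self, choice):
        # Illustration only: this is skipped, because __new__ returned
        # an object that is not an instance of Source.
        self.choice = choice


src = Source(1)
print(type(src).__name__)       # the class that was actually instantiated
print(isinstance(src, Source))  # is it a Source at all?
print(hasattr(src, 'choice'))   # did Source.__init__ run?
```

A reader who sees `Source(1)` will reasonably expect a `Source` instance back, which is the surprise a plain function avoids.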

Instead of the Source(BaseSource) class the way you have it there, I'd go with a simple create_source function:

## [ . . . ]

class RandomSrc(BaseSource):
    def get(self):
        return random.random()

def create_source(choice):
    if choice == 0:
        return ManualSrc()
    elif choice == 1:
        return RandomSrc()
    else:
        raise ValueError('source choice parameter {} not valid'.format(choice))

if __name__ == '__main__':
    for use_src in range(4):
        print 'using source choice {}'.format(use_src)
        src = create_source(use_src)
        print src.get()

And if you need a new source, you'd edit that create_source function like this:

## [ . . . ]

class RandomSrc(BaseSource):
    def get(self):
        return random.random()

class WhateverSrc(BaseSource):
    def get(self):
        return "Foo Bar??"

def create_source(choice):
    if choice == 0:
        return ManualSrc()
    elif choice == 1:
        return RandomSrc()
    elif choice == 2:
        return WhateverSrc()
    else:
        raise ValueError('source choice parameter {} not valid'.format(choice))
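If the growing if/elif chain bothers you, one possible variation (a sketch, assuming Python 3; the `SOURCES` dictionary is a hypothetical name, not something from the question) keeps the same create_source function but looks the class up in a table. Adding a source then means adding one entry.

```python
import random


class ManualSrc:
    def get(self):
        return float(input('gimme a number - '))


class RandomSrc:
    def get(self):
        return random.random()


class WhateverSrc:
    def get(self):
        return "Foo Bar??"


# Hypothetical lookup table: one entry per available source.
SOURCES = {
    0: ManualSrc,
    1: RandomSrc,
    2: WhateverSrc,
}


def create_source(choice):
    """Return a new instance of the source registered under `choice`."""
    try:
        source_class = SOURCES[choice]
    except KeyError:
        raise ValueError('source choice parameter {} not valid'.format(choice))
    return source_class()


print(create_source(2).get())
```

The keys are integers only to match the examples above; strings such as 'manual' or 'random' would work the same way and may read better in a test script.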

Or go even further: forget about @abstractmethod completely, and just use a bunch of regular concrete classes. If someone creates a new *Src class which doesn't implement the get method, that person will see a pretty descriptive failure anyway...

import random

class ManualSrc(object):
    def get(self):
        return float(raw_input('gimme a number - '))

class RandomSrc(object):
    def get(self):
        return random.random()

class BreakingSrc(object):
    pass

def create_source(choice):
    if choice == 0:
        return ManualSrc()
    elif choice == 1:
        return RandomSrc()
    elif choice == 2:
        return BreakingSrc()
    else:
        raise ValueError('source choice parameter {} not valid'.format(choice))

if __name__ == '__main__':
    for use_src in range(4):
        print 'using source choice {}'.format(use_src)
        src = create_source(use_src)
        print src.get()

That outputs:

using source choice 0
gimme a number - 1
1.0
using source choice 1
0.702223268052
using source choice 2
Traceback (most recent call last):
  File "./stack26.py", line 28, in <module>
    print src.get()
AttributeError: 'BreakingSrc' object has no attribute 'get'
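For comparison, here is a small sketch (assuming Python 3, where `abc.ABC` is available) of what the abstract base class buys you in the same situation: the failure moves from the first call to get() to the moment the object is created. The exact wording of the error message differs between Python versions, so it is printed rather than quoted.

```python
from abc import ABC, abstractmethod


class BaseSource(ABC):
    @abstractmethod
    def get(self):
        """Return one reading."""


class BreakingSrc(BaseSource):
    pass  # forgot to implement get()


error = None
try:
    BreakingSrc()  # fails here, before get() is ever called
except TypeError as exc:
    error = exc
    print(exc)
```

Whether failing earlier is worth the extra base class is a judgment call; both approaches tell the author of the new source what is missing.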

All that said... Using metaclasses you can register a class in some kind of list or dictionary when you define class Whatever (see this answer), which could also give you some ideas :-)

In your case, following the idea of registering a class through metaclasses, the snippet below works, but as you can see, the code gets more and more confusing:

from abc import ABCMeta, abstractmethod
import random
import inspect

available_srcs = []

def register(newclass):
    if inspect.isabstract(newclass):
        print ("newclass %s is abstract, and has abstract"
                " methods: %s. Refusing to register"
                % (newclass, newclass.__abstractmethods__))
        return
    if newclass not in available_srcs:
        available_srcs.append(newclass)
        print "Registered %s as available source" % newclass

class MyMetaClass(ABCMeta):
    def __new__(cls, clsname, bases, attrs):
        newclass = super(MyMetaClass, cls).__new__(cls, clsname, bases, attrs)
        register(newclass)  # here is your register function
        return newclass

class BaseSource(object):
    __metaclass__ = MyMetaClass

    @abstractmethod
    def get(self):
        pass    

class ManualSrc(BaseSource):
    def get(self):
        return float(raw_input('gimme a number - '))

class RandomSrc(BaseSource):
    def get(self):
        return random.random()

if __name__ == '__main__':
    for use_src in range(4):
        print 'using source choice {}'.format(use_src)
        src = available_srcs[use_src]()
        print src.get()
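If the registration idea is appealing but the metaclass is not, a class decorator can do the same bookkeeping. This is a swapped-in technique, not the metaclass approach above: a sketch, assuming Python 3, in which each source has to opt in with `@register`. `ConstantSrc` is a hypothetical source added only so the example has a predictable value.

```python
import inspect
import random
from abc import ABC, abstractmethod

available_srcs = []


def register(newclass):
    """Class decorator: record concrete sources, skip abstract ones."""
    if not inspect.isabstract(newclass) and newclass not in available_srcs:
        available_srcs.append(newclass)
    return newclass  # hand the class back unchanged


class BaseSource(ABC):
    @abstractmethod
    def get(self):
        """Return one reading."""


@register
class RandomSrc(BaseSource):
    def get(self):
        return random.random()


@register
class ConstantSrc(BaseSource):  # hypothetical source for illustration
    def get(self):
        return 42.0


src = available_srcs[1]()
print(src.get())
```

The trade-off is that forgetting the decorator silently leaves a source out of the list, whereas the metaclass registers every subclass automatically.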

EDIT 1:

The OP (Neil_UK) asked in a comment on this answer: "Which will be more confusing, capitalising something that isn't a class, or calling a non-capitalised name to instantiate a specific object?"

Before starting: the following examples make heavy use of the built-in type and vars functions. Make sure you're familiar with what they do before continuing.

To me (and this is just my opinion, since capitalized and non-capitalized function names are both syntactically OK in Python), a capitalized function name would be more confusing. Bear in mind that you're not actually returning a class (although you could, because classes are instances of type too). What you're returning is an instance, and there's nothing wrong with a function (lower-case, per the PEP 8 naming convention) returning an instance. That's what the logging module does, for example:

>>> import logging
>>> log = logging.getLogger('hello')
>>> vars(log)
{'name': 'hello', 'parent': <logging.RootLogger object at 0x17ce850>, 'handlers': [], 'level': 0, 'disabled': 0, 'manager': <logging.Manager object at 0x17ce910>, 'propagate': 1, 'filters': []}
>>> type(log)
<class 'logging.Logger'>

Going back to your particular scenario: if I didn't know anything about your code (if I were just importing CreateSource somewhere), and I knew I had to use CreateSource like this: src = CreateSource(use_src), what I would automatically think is that src is an instance of the CreateSource class, and also that the integer I passed in the use_src parameter will be stored in an attribute somewhere. Check the logging example I copied above: the 'hello' string happens to be the name attribute of the log instance that was created through the getLogger function. OK... nothing weird about the getLogger function.

Let's go to an extreme example. I know you didn't do anything like what I'm about to do (I think yours is actually a valid concern), but maybe it'll help show what I mean.

Consider the following code:

 a = A()
 a.x = 5
 print "a.x is %s" % a.x

If you just saw that, what would you think is happening there? You'd think that you're creating an empty instance of a class A and setting its x attribute to 5, so you'd expect the print to output a.x is 5, right?

Wrong. Here's what's going on (totally proper Python):

class B(object):
    def __init__(self):
        self.x = 10
    @property
    def x(self):
        return "I ain't returning x but something weird, and x is %s... FYI"\
                % self._x
    @x.setter
    def x(self, x):
        self._x = int(self._x if hasattr(self, '_x') else 0 + 2 * x)

def A():
    return B()

So a is actually an instance of class B and because of the ability that Python provides to "mask" getters and setters through properties, I am creating a horrible mess that is not intuitive at all. You'll hear a lot of times when dealing with Python that the fact that you can actually do something doesn't mean you should do it. I personally always quote Uncle Ben: With great power comes great responsibility (well... or Voltaire, but meh, I find quoting Uncle Ben cooler, whaddup!!? :-D)

That said, you may wanna create an account on https://codereview.stackexchange.com/. I'm sure there are a lot of knowledgeable people there who can answer this kind of question much better than me.

Oh, earlier I mentioned that a class is also an instance. Wait, woooot?? Yep. And functions are instances too! Check this out:

>>> class C(object):
...     pass
... 
>>> vars(C)
dict_proxy({'__dict__': <attribute '__dict__' of 'C' objects>, '__module__': '__main__', '__weakref__': <attribute '__weakref__' of 'C' objects>, '__doc__': None})
>>> type(C)
<type 'type'>
>>> def get_me_a_c_class():
...     return C
... 
>>> my_class = get_me_a_c_class()
>>> my_instance = my_class()
>>> type(my_instance)
<class '__main__.C'>
>>> type(get_me_a_c_class)
<type 'function'>
>>> vars(get_me_a_c_class)
{}
>>> get_me_a_c_class.random_attribute = 5
>>> print "Did I just put an attribute to a FUNCTION??: %s" % get_me_a_c_class.random_attribute
Did I just put an attribute to a FUNCTION??: 5

In the few years I've been dealing with Python, I've found that it relies heavily on the common sense of the programmers. And while I was initially hesitant to believe that this approach wouldn't lead to horrible messes, it turns out it doesn't (in most cases ;-) ).

Licensed under: CC-BY-SA with attribution