Question

I want to replicate boolean NA values as they behave in R:

"NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE." (Source: http://stat.ethz.ch/R-manual/R-devel/library/base/html/Logic.html)

I have seen None recommended for missing values, but Python treats None as false when evaluating boolean expressions, so None or False evaluates to False. The result should instead be None, since no conclusion can be drawn when a value is missing.
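
The behavior of plain None can be checked directly. The snippet below is only an illustration of the problem, not a fix; the dictionary name `results` is made up for this example.

```python
# Plain None is simply falsy; it does not propagate as "unknown".
results = {
    "None or False": None or False,   # `or` returns its last operand here
    "None and True": None and True,   # `and` stops at the falsy None
    "None or True": None or True,
    "not None": not None,             # R would give NA for !NA
}

for expression in sorted(results):
    print("%s -> %r" % (expression, results[expression]))
```

Only `None and True` happens to match R's answer; `None or False` and `not None` both give a definite result where R would return NA.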

How do I achieve this in Python?

EDIT: The accepted answer computes the correct results with the bitwise operators &, | and ^, but getting the same behavior from the logical operators not, or and and seems to require a change to the Python language itself.
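
A small experiment suggests why the logical operators cannot be customized the same way: `&` dispatches to `__and__`, while `and` only asks the left operand for its truth value and then returns one of the operands unchanged. `Probe` is a hypothetical helper class written for this sketch.

```python
class Probe(object):
    """Hypothetical helper that records which hooks Python calls."""

    def __init__(self):
        self.calls = []

    def __bool__(self):
        self.calls.append("bool")
        return True

    __nonzero__ = __bool__  # Python 2 looks up this name instead

    def __and__(self, other):
        self.calls.append("and")
        return "custom result"


p = Probe()
bitwise = p & False    # calls Probe.__and__, which may return anything
logical = p and False  # only asks for the truth value, then yields an operand

print(bitwise)
print(logical)
print(p.calls)
```

Because `and` never gives the class a chance to return a third value, the closest an object can get is to refuse truth testing altogether, which is what raising from `__nonzero__` does.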

Solution

As others have said, you can define your own class.

class NA_(object):
    instance = None # Singleton (so `val is NA` will work)
    def __new__(cls):
        if NA_.instance is None:
            NA_.instance = super(NA_, cls).__new__(cls)
        return NA_.instance
    def __str__(self): return "NA"
    def __repr__(self): return "NA_()"
    def __and__(self, other):
        if self is other or other:
            return self
        else:
            return other
    __rand__ = __and__
    def __or__(self, other):
        if self is other or other:
            return other
        else:
            return self
    __ror__ = __or__
    def __xor__(self, other):
        return self
    __rxor__ = __xor__
    def __eq__(self, other):
        return self is other
    # No __req__ hook exists: Python retries == with the operands swapped.
    def __nonzero__(self):
        raise TypeError("bool(NA) is undefined.")
    __bool__ = __nonzero__ # Python 3 calls __bool__ instead of __nonzero__
NA = NA_()

Use:

>>> print NA & NA
NA
>>> print NA & True
NA
>>> print NA & False
False
>>> print NA | True
True
>>> print NA | False
NA
>>> print NA | NA
NA
>>> print NA ^ True
NA
>>> print NA ^ NA
NA
>>> if NA: print 3
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 28, in __nonzero__
TypeError: bool(NA) is undefined.
>>> if NA & False: print 3
...
>>>
>>> if NA | True: print 3
...
3
>>>
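
The session above covers &, | and ^ but not negation. Since `not` cannot be overloaded and `~True` is the integer -2 for plain booleans, one possible workaround is a small helper function. This is only a sketch: `_NA` is a minimal stand-in for the singleton defined above, and `na_not` is a name invented for this example.

```python
class _NA(object):
    """Minimal stand-in for the NA singleton, just enough for this sketch."""

    def __repr__(self):
        return "NA"

    def __invert__(self):
        return self  # negating an unknown value is still unknown


NA = _NA()


def na_not(value):
    """Three-valued NOT: NA stays NA, anything else goes through `not`."""
    if value is NA:
        return NA
    return not value


print(na_not(NA))
print(na_not(True))
print(na_not(False))
```

Adding `__invert__` to the class makes `~NA` work too, but a function is needed to treat NA and ordinary booleans uniformly.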

OTHER TIPS

You can do this by creating a class and overriding the boolean operation methods.

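>>> class NA_type(object):
...     def __and__(self, other):
...         # NA & NA and NA & True stay unknown; NA & False is False.
...         if other is self or other:
...             return self
...         return False
...     __rand__ = __and__  # so that True & NA and False & NA also work
...     def __str__(self):
...         return 'NA'
...
>>> NA = NA_type()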
>>> print NA & True
NA
>>> print NA & False
False

You can define a custom class (perhaps a singleton) and give it a custom __and__ method, plus whichever other operator methods you need. See the Python data model documentation:

http://docs.python.org/2/reference/datamodel.html#emulating-numeric-types
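
One detail from that documentation matters here: when the custom object is on the right-hand side, the built-in left operand does not know how to handle it, so Python falls back to the reflected method. A minimal sketch, with `Unknown` as a hypothetical class name:

```python
class Unknown(object):
    """Hypothetical class that reports which operator hook was used."""

    def __and__(self, other):
        return ("__and__", other)

    def __rand__(self, other):
        return ("__rand__", other)


u = Unknown()
left = u & True    # Unknown is the left operand, so __and__ runs
right = True & u   # bool cannot handle Unknown, so __rand__ runs

print(left)
print(right)
```

Without `__rand__`, the second expression would raise a TypeError, which is why a class meant to mimic NA usually defines both forms for each operator.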

Licensed under: CC-BY-SA with attribution