Question

I'm implementing a linear algebra library in Python (I know something probably exists, but I'm doing it to learn about Python and the maths I need for my exams), and I want to be able to access elements/subsets of a matrix like so:

(My matrix class is a subclass of tuple.)

  • M = Matrix([list of rows of elements])
  • M[1, 2] Gets element at (1, 2)
  • M[3] Gets row 3

These are pretty easy to do, but I also want to implement slicing, like so:

  • M[:,:] returns the entire matrix
  • M[1:6:2] returns the rows 1, 3 and 5
  • M[1:6:2, 0:2] returns a matrix consisting of rows 1, 3 and 5 intersected with the first two columns.

I have done this, but my answer seems very un-pythonic:

def __getitem__ (self, idx):
    if isinstance(idx, numbers.Integral):
        # Code to return the row at idx
    elif (isinstance(idx, tuple) and len(idx) == 2 and
            all(isinstance(i, numbers.Integral) for i in idx)):
        # Code to return element at idx
    elif (isinstance(idx, tuple) and len(idx) == 2 and
            all(isinstance(i, slice) for i in idx)):
        # Code to parse slices

The other problem with this is that both indices have to be numbers, or slices, I can't mix. To do it this way would require two more elif blocks, and that seems like two many. Already the code is really ugly.

I think the answer involves duck typing, but I'm not completely sure how to implement that. I've been looking at try:except: blocks, but I'm not sure how to chain those, and I don't really want to nest too much.

So, SO, thank you for reading. What is the best way to implement a function like this?

Was it helpful?

Solution

You pretty much have to do something like this… but at least you can remove some duplication.

First, it's probably reasonable to consider [1,] to mean "row 1", just like [1]. (numpy does this.) That means you don't need the tuple-vs.-int thing; just treat an int as a 1-element tuple. In other words:

def __getitem__(self, idx):
    if isinstance(idx, numbers.Integral):
        idx = (idx, slice(None, None, None))
    # now the rest of your code only needs to handle tuples

Second, although your sample code only handles the case of two slices, your real code has to handle two slices, or a slice and an int, or an int and a slice, or two ints, or a slice, or an int. If you can factor out the slice-handling code, you don't need to duplicate it over and over again.

One trick for handling int-vs.-slice is to treat [n] as a wrapper that does, in essence, [n:n+1][0], which lets you reduce everything even more. (It's a tiny bit trickier than this, because you have to special-case either negative numbers in general, or just -1, because obviously n[-1] != n[-1:0][0].) For 1-D arrays this may not be worth it, but for 2D arrays it probably is, because it means while you're dealing with the column, you've always got a list of rows rather than just a row.

On the other hand, you may want to share some code between __getitem__ and __setitem__… which makes some of these tricks either impossible or a lot harder. So, there's a tradeoff.

At any rate, here's an example that does all the simplification and pre/postprocessing I could think of (possibly more than you want) so that ultimately you're always looking up a pair of slices:

class Matrix(object):
    def __init__(self):
        self.m = [[row + col/10. for col in range(4)] for row in range(4)]
    def __getitem__(self, idx):
        if isinstance(idx, (numbers.Integral, slice)):
            idx = (idx, slice(None, None, None))
        elif len(idx) == 1:
            idx = (idx[0], slice(None, None, None))
        rowidx, colidx = idx
        rowslice, colslice = True, True
        if isinstance(rowidx, numbers.Integral):
            rowidx, rowslice = slice(rowidx, rowidx+1), False
        if isinstance(colidx, numbers.Integral):
            colidx, colslice = slice(colidx, colidx+1), False
        ret = self.m[rowidx][colidx]
        if not colslice:
            ret = [row[0] for row in ret]
        if not rowslice:
            ret = ret[0]
        return ret

Or it might be nicer if you refactored things along the other axis: Get the row(s), and then get the column(s) within it/them:

def _getrow(self, idx):
    return self.m[idx]

def __getitem__(self, idx):
    if isinstance(idx, (numbers.Integral, slice)):
        return self._getrow(idx)
    rowidx, colidx = idx
    if isinstance(rowidx, numbers.Integral):
        return self._getrow(rowidx)[colidx]
    else:
        return [row[colidx] for row in self._getrow(rowidx)]

This looks a whole lot simpler, but I'm cheating here by forwarding the second index to the normal list, which only works because my underlying storage is a list of lists. But if you have any kind of indexable row object to defer to (and it doesn't waste unacceptable time/space to create those objects unnecessarily), you can use the same cheat.


If you're objecting to the need to type-switch on the index parameter, yes, that does seem generally unpythonic, but unfortunately it's how __getitem__ generally works. If you want to use the usual EAFTP try logic, you can, but I don't think it's more readable when you have to try two different APIs (e.g., [0] for tuples, and .start for slices) in multiple places. You end up doing "duck-type-switching" up at the top, like this:

try:
    idx[0]
except AttributeError:
    idx = (idx, slice(None, None, None))

… and so on, and this is just twice as much code as normal type-switching without any of the usual benefits.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top