Question

Given an Excel sheet with the following layout:

  A                     | B
1 Name of First Column  | Name of Second Column
2 Value in First Column | Value in Second Column

xlrd sure has an odd way of indexing into it. First, some setup ...

import xlrd

# Open in binary mode; an .xlsx file is a zip archive, not text.
with open("example.xlsx", "rb") as f:
    wb = xlrd.open_workbook(file_contents=f.read())
sh = wb.sheet_by_index(0)  # first worksheet

Let's see what's in the first row, using row_values:

print sh.row_values(0, start_colx=0, end_colx=1)

Result?

[u'Name of First Column']

What went wrong? The first, unlabeled parameter of row_values is rowx, and the docs say: "'rowx' is a row index, counting from zero." And the two colx parameters? "'colx' is a column index, counting from zero."

So both colx parameters should count from zero, you might think. If I pass end_colx=1, I expect the row to end on column 1, that is, the second column when counting from zero. Instead I get only the first column.


Solution

xlrd's observed behavior reminds me of Python's slice notation, which works like this:

 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1

(This is for the string "HelpA"; the diagram comes from the official Python docs.)
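
To make the diagram concrete, here is a small sketch using the same string. The variable names are just for illustration. The indices name the gaps between characters, so the character at the end index is never included.

```python
word = "HelpA"

# The end index is exclusive: [0:1] covers only what lies between gaps 0 and 1.
first = word[0:1]
first_two = word[0:2]

# Start and end on the same gap: nothing lies between them.
empty = word[0:0]

# Negative indices count gaps from the right-hand side.
last_two = word[-2:]

print(first)        # H
print(first_two)    # He
print(repr(empty))  # ''
print(last_two)     # pA
```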

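So if you start at 0 and end at 0 ([0:0]), you retrieve nothing, and [0:1] retrieves only the first element. end_colx appears to work the same way: it names the column to stop before, not the last column to include. That is why end_colx=1 returned only column 0; to get both columns, end_colx would need to be 2.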
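
Applied to the sheet in the question, the same rule would explain the result. The sketch below does not call xlrd at all. It models the first row as a plain list, and row_values_like is a hypothetical stand-in that assumes row_values simply slices the row as [start_colx:end_colx], which is consistent with the output seen above.

```python
# Stand-in for the first row of the sheet in the question (no xlrd needed).
first_row = ["Name of First Column", "Name of Second Column"]


def row_values_like(row, start_colx=0, end_colx=None):
    """Hypothetical helper: assumes row_values behaves like a list slice."""
    return row[start_colx:end_colx]


# end_colx=1 stops before column 1, so only column 0 comes back.
only_first = row_values_like(first_row, start_colx=0, end_colx=1)

# end_colx=2 stops before column 2, so columns 0 and 1 come back.
both = row_values_like(first_row, start_colx=0, end_colx=2)

# Leaving end_colx as None slices through to the end of the row.
to_the_end = row_values_like(first_row, start_colx=0)

print(only_first)  # ['Name of First Column']
print(both)
print(to_the_end)
```

Under that assumption, asking for end_colx=2, or leaving end_colx out, is what returns both columns.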

If anyone has a better answer, I'd love to hear it. Just wanted to document this.

Licensed under: CC-BY-SA with attribution