Question

I'm working with CERN's PyROOT module and trying to store an array of strings as a leaf in a ROOT tree (a TTree). To do so, I have to pass it an array: not a list or a dictionary, but one built with the array module. That module supports standard C arrays of characters, integers, and so forth, but does anyone know of a way to nest them in order to get an array of strings or, effectively, an array of character arrays? Or have I gone too far, and do I need to take a step back from the keyboard for a while :)?

Code:

import ROOT

rowtree = ROOT.TTree("rowstor", "rowtree")

ROOT.gROOT.ProcessLine(
    "struct runLine {\
    Char_t test[20];\
    Char_t test2[20];\
    };" );
from ROOT import runLine
newline = runLine()
rowtree.Branch("test1", newline, "test/C:test2")

newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]

rowtree.Fill()

Error:

python branchtest
Traceback (most recent call last):
  File "branchtest", line 14, in <module>
    newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]
TypeError: expected string or Unicode object, list found

I'm wondering if it's possible to turn the list shown in this example into an array of strings.

Solution

A char array and a Python list of Python strings are two very different things.

If you want a branch containing a char array (one string) then I suggest using Python's built-in bytearray type:

import ROOT
# create an array of bytes (chars) and reserve the last byte for null
# termination (last byte remains zero)
char_array = bytearray(21)
# all bytes of char_array are zeroed by default here (all b'\x00')

# create the tree
tree = ROOT.TTree('tree', 'tree')
# add a branch for char_array
tree.Branch('char_array', char_array, 'char_array[21]/C')
# set the first 20 bytes to the characters of a string of length 20
# (a bytes literal, so the assignment also works on Python 3)
char_array[:20] = b'a' * 20
# important to keep the last byte zeroed for null termination!
tree.Fill()
tree.Scan('', '', 'colsize=21')

The output of tree.Scan('', '', 'colsize=21') is:

************************************
*    Row   *            char_array *
************************************
*        0 *  aaaaaaaaaaaaaaaaaaaa *
************************************

So we know the tree is accepting the bytes correctly.
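
Note that a slice assignment only leaves the bytearray at its original size when both sides have the same length; assigning a shorter string to a longer slice shrinks the buffer. As a minimal sketch (set_c_string is a hypothetical helper, not part of ROOT or PyROOT), you could pad every string with zeros to the full buffer length before copying it in, which keeps the length fixed and the null terminator in place:

```python
def set_c_string(buf, text, encoding="ascii"):
    """Copy text into buf in place as a null-terminated C string.

    Hypothetical helper for illustration; buf is a fixed-size bytearray.
    """
    data = text.encode(encoding)
    capacity = len(buf) - 1  # the last byte is reserved for the terminator
    if len(data) > capacity:
        raise ValueError(
            "string of %d bytes does not fit in a buffer of %d bytes"
            % (len(data), len(buf))
        )
    # Pad with zeros to the full buffer length so that the slice
    # assignment replaces len(buf) bytes with len(buf) bytes.
    buf[:] = data + b"\x00" * (len(buf) - len(data))


char_array = bytearray(21)
set_c_string(char_array, "a" * 20)  # fills all 20 usable bytes
set_c_string(char_array, "Hello")   # shorter string, rest is zeroed
```

Passing a string longer than len(buf) - 1 bytes raises a ValueError instead of overwriting the terminator.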

If you want to store a list of strings, then I suggest using a std::vector<std::string>:

import ROOT

strings = ROOT.vector('string')()

tree = ROOT.TTree('tree', 'tree')
tree.Branch('strings', strings)
strings.push_back('Hello')
strings.push_back('world!')
tree.Fill()
tree.Scan()

The output of tree.Scan() is:

***********************************
*    Row   * Instance *   strings *
***********************************
*        0 *        0 *     Hello *
*        0 *        1 *    world! *
***********************************

In a loop you would want to strings.clear() before filling with a new list of strings in the next entry.
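
A minimal sketch of such a loop follows. fill_entries is a hypothetical helper name; it only assumes that tree has a Fill() method and that strings has clear() and push_back(), which is the case for the TTree and the ROOT.vector('string')() object created above:

```python
def fill_entries(tree, strings, entries):
    """Write one tree entry per list of strings in entries.

    Hypothetical helper for illustration.
    tree    -- object with a Fill() method, e.g. a ROOT.TTree
    strings -- object with clear() and push_back(), e.g. the vector
               that was already bound to the tree with tree.Branch(...)
    """
    for words in entries:
        strings.clear()  # drop the strings left over from the previous entry
        for word in words:
            strings.push_back(word)
        tree.Fill()  # store the vector's current contents as one entry
```

With the objects from the example above, fill_entries(tree, strings, [['Hello', 'world!'], ['another', 'entry']]) should write two entries, each holding its own list of strings.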

The rootpy package (see also its repository on GitHub) provides a friendlier way of creating trees in Python. Here is an example of how you can use char arrays and string vectors with rootpy:

from rootpy import stl
from rootpy.io import TemporaryFile
from rootpy.tree import Tree, TreeModel, CharArrayCol

class Model(TreeModel):
    # define the branches you want here
    # with branchname = branchvalue
    char_array = CharArrayCol(21)
    # the dictionary is compiled and cached for later
    # if not already available
    strings = stl.vector('string')

# create the tree inside a temporary file
with TemporaryFile():
    # all branches are created automatically according to your model above
    tree = Tree('tree', model=Model)

    tree.char_array = 'a' * 20
    # attempting to set char_array with a string of length 21 or longer will
    # result in a ValueError being raised.
    tree.strings.push_back('Hello')
    tree.strings.push_back('world!')
    tree.Fill()
    tree.Scan('', '', 'colsize=21')

The output of tree.Scan('', '', 'colsize=21') is:

***********************************************************************
*    Row   * Instance *            char_array *               strings *
***********************************************************************
*        0 *        0 *  aaaaaaaaaaaaaaaaaaaa *                 Hello *
*        0 *        1 *  aaaaaaaaaaaaaaaaaaaa *                world! *
***********************************************************************

See another example of using TreeModels with rootpy here:

https://github.com/rootpy/rootpy/blob/master/examples/tree/model_simple.py

OTHER TIPS

You've defined the test member of a runLine as an array of 20 chars:

Char_t test[20];\

But then you're trying to pass it a list of two strings:

newline.test = ["AbcDefgHijkLmnOp","aaaaaaaaaaaaaaaaaaa"]

This doesn't make any sense in C (or CINT) or in Python, so of course it doesn't make any sense in PyROOT either.

Also, there seems to be a lot of confusion in your question. You say you need to pass PyROOT "an array, obviously, using not lists or dictionaries, but the array module"… but PyROOT doesn't particularly care about the Python array module. You've tagged your question numpy, which implies that you may be thinking of numpy rather than array as "the array module", but the last time I checked (which, admittedly, was quite some time ago), numpy and PyROOT didn't interact at all; you had to explicitly ask numpy to export buffers if you wanted something you could pass to PyROOT.

Licensed under: CC-BY-SA with attribution