What I want is to be able to handle sets of data that have a fixed set of keys. All keys are strings. The data will never be edited. I know this can be done with normal dicts like so:
data_a = {'key1': 'data1a', 'key2': 'data2a', 'key3': 'data3a'}
data_b = {'key1': 'data1b', 'key2': 'data2b', 'key3': 'data3b'}
data_c = {'key1': 'data1c', 'key2': 'data2c', 'key3': 'data3c'}
They must be able to be called like so:
data_a['key1'] # Returns 'data1a'
However, this looks to be a waste of memory (since dictionaries apparently keep themselves 1/3 empty or something like that, along with also storing the keys multiple times) and also tedious to create as well since I need to keep entering the same keys over and over again in my code. I also risk accidentally changing something in the datasets.
My current solution is to have a set of keys stored in a tuple first, then store the data as tuples too. It looks like this:
keys = ('key1', 'key2', 'key3')
data_a = ('data1a', 'data2a', 'data3a')
data_b = ('data1b', 'data2b', 'data3b')
data_c = ('data1b', 'data2c', 'data3c')
To retrieve data, I would do this:
data_a[keys.index('key1')] # Returns 'data1a'
Then, I learned about this thing called namedtuples which seem to be able to do what I needed:
import collections
Data = collections.namedtuple('Data', ('key1', 'key2', 'key3'))
data_a = Data('data1a', 'data2a', 'data3a')
data_b = Data('data1b', 'data2b', 'data3b')
data_c = Data('data1b', 'data2c', 'data3c')
However, it appears I can't simply call the value by the key. Instead, to retrieve the data by the key, I have to use getattr, which doesn't seem very intuitive:
getattr(data_a,'key1') # Returns 'data1a'
My criteria is for memory efficiency first, then performance efficiency. Of these 3 methods, which would be the best way to do things? Or am I missing something and there's a more pythonic idiom to get what I want?
EDIT: I've now recently also learned about the existence of __slots__
, which apparently runs more efficiently for key:value pairs while pretty much consuming the same(?) amount of memory. Would an implementation acting similar to this be a suitable alternative to namedtuples?