The problem with this code is that if I have n "constructs" (a.k.a. plasmids) that I'm trying to build by "Gibson" assembly, it will process the first n-1 plasmids, but not the last one.
This is actually a general problem, and the simplest way around it is to add a check after the loop, like this:
    for row in construct_list:
        do all your existing code
    if we have a current Gibson list:
        repeat the code to process it
Of course you don't want to repeat yourself… so you move that work into a function, which you call in both places.
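As a minimal sketch of that refactoring, assuming each row is a [construct number, strategy, name] list like the sample rows in this answer: process_gibson and process_ipcr are hypothetical stand-ins for your existing code, and the strings they return are only there to make the flow visible.

```python
def process_gibson(rows):
    # Hypothetical stand-in for the existing Gibson-handling code.
    return "Gibson assembly of {} fragments".format(len(rows))

def process_ipcr(row):
    # Hypothetical stand-in for the existing iPCR-handling code.
    return "iPCR of {}".format(row[2])

def process_constructs(construct_list):
    results = []
    temp_list = []
    for row in construct_list:
        # A new construct number means the pending Gibson group is complete.
        if temp_list and temp_list[0][0] != row[0]:
            results.append(process_gibson(temp_list))
            temp_list = []
        if row[1] == "Gibson":
            temp_list.append(row)
        else:
            results.append(process_ipcr(row))
    # The check after the loop: without it, a trailing Gibson group is lost.
    if temp_list:
        results.append(process_gibson(temp_list))
    return results
```

The function is called once inside the loop and once after it, so the Gibson logic lives in a single place.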
However, I'd probably write this differently, using itertools.groupby. I know this will probably seem "way too advanced" at first glance, but it's worth trying to see if you can understand it, because it makes things a lot simpler.
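    import itertools

    def get_strategy(row):
        return row[1]

    for construct_strategy, rows in itertools.groupby(construct_list, key=get_strategy):
        group = list(rows)
Now, you'll get each construct as a separate list (groupby actually yields a key and an iterator over the matching rows, hence the list call), so you don't need the temp_list at all. Note that this groups consecutive rows by strategy, so it assumes two adjacent constructs never share a strategy; if they can, group on the construct number in row[0] instead. For example, the first group will be: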
    [[1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller'],
     [1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller'],
     [1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller']]
The next will be:
    [[2, 'iPCR', 'P(cpcG2)-K1F controller with K1F pos. feedback']]
And there won't be a left-over group at the end to worry about.
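To see this in action, here is a small runnable sketch. The first four rows are the sample rows shown above; the third construct is made up purely so that the data ends on a Gibson group, which is the case the original loop missed. This version of get_strategy keys on the strategy column, row[1].

```python
import itertools

construct_list = [
    [1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller'],
    [1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller'],
    [1, 'Gibson', 'P(OmpC)-cI::P(cI)-LacZ controller'],
    [2, 'iPCR', 'P(cpcG2)-K1F controller with K1F pos. feedback'],
    # Hypothetical extra construct so that the input ends on a Gibson group.
    [3, 'Gibson', 'hypothetical third construct'],
    [3, 'Gibson', 'hypothetical third construct'],
]

def get_strategy(row):
    return row[1]

# groupby yields (key, iterator) pairs; list() turns each iterator into a list.
groups = [(strategy, list(rows))
          for strategy, rows in itertools.groupby(construct_list, key=get_strategy)]

for strategy, group in groups:
    print(strategy, len(group))
# prints:
# Gibson 3
# iPCR 1
# Gibson 2
```

The trailing Gibson group shows up like any other, with no extra check after the loop.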
So:
    for construct_strategy, rows in itertools.groupby(construct_list, key=get_strategy):
        group = list(rows)  # groupby yields (key, iterator) pairs
        if construct_strategy == "Gibson":
            # your existing code, using group instead of temp_list,
            # and no need to maintain temp_list at all
            pass
        elif construct_strategy == "iPCR":
            # your existing code, using group[0] instead of row
            pass
Once you get over the abstraction hurdle, it's a whole lot simpler to think about the problem this way.
In fact, once you start to grasp iterators intuitively, you'll start finding that itertools (and the recipes on its docs page, and the third-party library more_itertools, and similar code you can write yourself) turn a lot of complicated questions into very simple ones. The answer to "How do I keep track of the current group of matching rows within a list of rows?" is "Keep a temporary list, and remember to check it every time the group changes and then check again at the end for leftovers", but the answer to the equivalent question "How do I transform row iteration into row-group iteration?" is "Wrap the iterator in groupby."
You also might want to add an assert or other check that all(row[1] == construct_strategy for row in group[1:]), that len(group) == 1 in the iPCR case, that there is no unexpected third strategy, etc., so when you inevitably run into an error, it'll be easier to tell whether it was bad data or bad code.
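One way those checks might look, as a sketch rather than a definitive list: check_group and KNOWN_STRATEGIES are hypothetical names, the rows are assumed to be [construct number, strategy, name] lists, and the construct-number check is an extra I'm assuming you'd want. It raises ValueError instead of using assert so the checks still run under python -O.

```python
KNOWN_STRATEGIES = {"Gibson", "iPCR"}  # assumption: these are the only two

def check_group(construct_strategy, group):
    """Raise ValueError if a group of rows looks like bad data."""
    if construct_strategy not in KNOWN_STRATEGIES:
        raise ValueError("unexpected strategy: {!r}".format(construct_strategy))
    if not all(row[1] == construct_strategy for row in group[1:]):
        raise ValueError("mixed strategies within one group")
    if construct_strategy == "iPCR" and len(group) != 1:
        raise ValueError("iPCR construct with {} rows".format(len(group)))
    # Extra check (an assumption): every row belongs to the same construct.
    if len({row[0] for row in group}) != 1:
        raise ValueError("group spans more than one construct number")
```

Calling check_group(construct_strategy, group) at the top of the loop body means bad data fails loudly before any primer code runs.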
Meanwhile, instead of using a csv.reader, skipping the first row, and referring to the columns by meaningless numbers, it might be better to use a DictReader:
    import csv
    import itertools

    primer_list = []

    def get_strategy(row):
        return row["Strategy"]

    # In Python 3, open CSV files with newline='' (the old 'rU' mode is gone).
    with open('constructs-to-make.csv', newline='') as constructs:
        reader = csv.DictReader(constructs)
        for construct_strategy, rows in itertools.groupby(reader, key=get_strategy):
            group = list(rows)
            # same as before, but with
            # ... row["Construct"] instead of row[0]
            # ... row["Strategy"] instead of row[1]
            # ... row["Name"] instead of row[2]
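If you want to try the DictReader version without touching the real file, here is a self-contained sketch. It assumes the header row is exactly Construct,Strategy,Name, and it uses an in-memory io.StringIO in place of constructs-to-make.csv; the summary list is only there for illustration.

```python
import csv
import io
import itertools

# In-memory stand-in for the CSV file; the header names are an assumption.
sample = io.StringIO(
    "Construct,Strategy,Name\n"
    "1,Gibson,P(OmpC)-cI::P(cI)-LacZ controller\n"
    "1,Gibson,P(OmpC)-cI::P(cI)-LacZ controller\n"
    "2,iPCR,P(cpcG2)-K1F controller with K1F pos. feedback\n"
)

def get_strategy(row):
    return row["Strategy"]

summary = []
for construct_strategy, rows in itertools.groupby(csv.DictReader(sample), key=get_strategy):
    group = list(rows)
    # DictReader consumes the header itself, so there is no row to skip,
    # and every value (including the construct number) comes back as a string.
    summary.append((construct_strategy, [row["Construct"] for row in group]))

print(summary)
# prints: [('Gibson', ['1', '1']), ('iPCR', ['2'])]
```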