Question

I have some csv data and a piece of code that automatically removes rows within the csv. What I need is a pice of code that can re rank the csv in colum 10, column 10 is originally ranking column 11 however this was achieved in excel. Note in the data will always be organised lowest ranked to highest down the page, and note that the unique data that defines each dataset within the csv is in column 3. What I need is some code to re rank column 10 in accordance with column 3 defining each dataset after some rows have been removed.

1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,1,141.88,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,2,141.85,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,3,140.81,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,4,131.86,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,1,163.85,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,2,163.24,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,3,162.93,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,4,161.23,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,5,159.83,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,6,156.71,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,7,155.49,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,8,154.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,9,147.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,10,142.34,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,11,140.09,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,12,129.7,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,1,169.5,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,2,165.2,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,3,165.1,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,4,160.45,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,5,159.8,data data data

Upon running a piece of code I remove particular rows from the csv for instance it could now look as follows:

1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,2,141.85,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,3,140.81,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,4,131.86,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,1,163.85,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,3,162.93,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,4,161.23,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,5,159.83,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,6,156.71,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,8,154.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,9,147.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,11,140.09,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,1,169.5,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,2,165.2,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,3,165.1,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,5,159.8,data data data

We can now see as a result of removing these rows that the ranks for unique datat1 in column 10 reads 2,3,4 this needs adjusting to 1,2,3 similarly in unique data3 column 10 reads 1,2,3,5 this needs adjusting to 1,2,3,4 So the adjusted csv would read:

1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,1,141.85,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,2,140.81,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,3,131.86,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,1,163.85,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,2,162.93,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,3,161.23,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,4,159.83,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,5,156.71,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,6,154.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,7,147.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,8,140.09,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,1,169.5,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,2,165.2,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,3,165.1,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,4,159.8,data data data

Kind regards SMNALLY

Was it helpful?

Solution

Store a set of itertools.count() iterables in a dictionary, to keep a counter per unique column value. If you use collections.defaultdict(), you can have the count start at 1 automatically whenever you encounter a new unique value.

Your data is already sorted, so all you need to do is replace the 10th column:

import csv
from itertools import count
from collections import defaultdict
from functools import partial

counts = defaultdict(partial(count, 1))  # create a new count starting at 1

with open(output_csv_filename, 'wb') as outfile:
    writer = csv.writer(outfile)
    for row in your_list_of_rows:
        row[9] = next(counts[row[2]])  # get the next count value
        writer.writerow(row)

That's it. row[9] is the 10th column; it'll be filled with numbers starting at 1 for each unique value found in row[2] (the 3rd column).

Quick demo of the counters dictionary:

>>> from itertools import count
>>> from collections import defaultdict
>>> from functools import partial
>>> counts = defaultdict(partial(count, 1))
>>> next(counts['foo'])
1
>>> next(counts['foo'])
2
>>> next(counts['bar'])
1
>>> next(counts['foo'])
3

Running the above code on your sample data set results in:

1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,1,141.85,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,2,140.81,data data data
1-20,data1,Unique data1,4,data2,14,data3.65,data4,data5,3,131.86,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,1,163.85,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,2,162.93,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,3,161.23,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,4,159.83,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,5,156.71,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,6,154.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,7,147.96,data data data
1-30,data1,Unique data2,4,data2,12,data3.30,data4,data5,8,140.09,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,1,169.5,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,2,165.2,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,3,165.1,data data data
1-50,data1,Unique data3,2,data2,16,data3.00,data4,data5,4,159.8,data data data

OTHER TIPS

So, you want to rank your rows by 3rd then by 10th element, right?

Read file

ext = "C:\Users\Me\Desktop\\test.txt"
readL = []

f = open(ext)

for line in f:

    readL += [line.strip().split(',')]   

f.close()

sort list of lines by elements 3 then 10:

from operator import itemgetter
print sorted(readL, key=itemgetter(3,10))
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top