My question is very similar to this previously asked question:
Output whole line once for each unique value of a column (Bash)
but with one major difference. In his example:
pep> AEYTCVAETK 2 genes ADUm.1024,ADUm.5198,ADUm.750
pep> AIQLTGK 1 genes ADUm.1999,ADUm.3560
pep> AIQLTGK 8 genes ADUm.1999,ADUm.3560
pep> KHEPPTEVDIEGR 5 genes ADUm.367
pep> VSSILEDKTT 9 genes ADUm.1192,ADUm.2731
pep> AIQLTGK 10 genes ADUm.1999,ADUm.3560
pep> VSSILEDKILSR 3 genes ADUm.2146,ADUm.5750
pep> VSSILEDKILSR 2 genes ADUm.2146,ADUm.5750
The goal was to "print a line for each distinct value of the peptides in column 2, meaning the above input would become:"
pep> AEYTCVAETK 2 genes ADUm.1024,ADUm.5198,ADUm.750
pep> AIQLTGK 1 genes ADUm.1999,ADUm.3560
pep> KHEPPTEVDIEGR 5 genes ADUm.367
pep> VSSILEDKTT 9 genes ADUm.1192,ADUm.2731
pep> VSSILEDKILSR 3 genes ADUm.2146,ADUm.5750
But what I would like to do is to print one line for each unique entry in column 2, however I would like to print the line with the highest value in column 3, so the output would look like this:
pep> AEYTCVAETK 2 genes ADUm.1024,ADUm.5198,ADUm.750
pep> AIQLTGK 10 genes ADUm.1999,ADUm.3560
pep> KHEPPTEVDIEGR 5 genes ADUm.367
pep> VSSILEDKTT 9 genes ADUm.1192,ADUm.2731
pep> VSSILEDKILSR 3 genes ADUm.2146,ADUm.5750
Thanks in advance.