Question

Hi, I am new to programming and want to learn Python. I am working on code that should return the items that occur most often in a list. If more than one item is tied for the highest count, it should return all of them. For example:

List = ['a','b','c','b','d','a'] #then it should return both a and b.
List = ['a','a','b','b','c','c','d'] #then it should return a b and c.
List = ['a','a','a','b','b','b','c','c','d','d','d'] #then it should return a b and d.

Note: We don't know in advance which element is most common, so the code has to find it, and if several elements are tied it should return all of them. The code also has to work when the list contains numbers or other strings.

I have no idea how to proceed. I could use a little help.

Here is the whole program:

from collections import Counter

def redundant(List):
    c = Counter(List)
    maximum = c.most_common()[0][1]
    return [k for k, v in c.items() if v == maximum]

def find_kmers(DNA_STRING, k):
    length = len(DNA_STRING)
    a = 0
    List_1 = []
    string_1 = ""
    while a <= length - k:
        string_1 = DNA_STRING[a:a+k]
        List_1.append(string_1)
        a = a + 1
    return redundant(List_1)  # return the result instead of discarding it

This program should take a DNA string and a k-mer length, and find which k-mers of that length occur most often in that DNA string.

Sample Input:

ACGTTGCATGTCGCATGATGCATGAGAGCT
4

Sample Output:

CATG GCAT  

Solution

You can use collections.Counter:

from collections import Counter
def solve(lis):
    c = Counter(lis)
    mx = c.most_common()[0][1]
    #or mx = max(c.values())
    return [k for k, v in c.items() if v == mx]

print(solve(['a','b','c','b','d','a']))
print(solve(['a','a','b','b','c','c','d']))
print(solve(['a','a','a','b','b','b','c','c','d','d','d']))

Output:

['a', 'b']
['a', 'c', 'b']
['a', 'b', 'd']

A slightly different version of the above code using itertools.takewhile:

from collections import Counter
from itertools import takewhile
def solve(lis):
    c = Counter(lis)
    mx = max(c.values())
    return [k for k, v in takewhile(lambda x: x[1]==mx, c.most_common())]
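
Applied to the k-mer program from the question, the same counting idea could look like the sketch below. The name most_frequent_kmers is hypothetical (it is not from the question), and the sketch assumes k is at least 1. It builds the list of k-mers with a slice per position and returns the tied winners instead of discarding them.

```python
from collections import Counter


def most_frequent_kmers(dna, k):
    """Return every k-mer that occurs the maximum number of times in dna."""
    # Slide a window of width k across the string, one position at a time.
    kmers = [dna[i:i + k] for i in range(len(dna) - k + 1)]
    if not kmers:
        # k is longer than the string (or the string is empty).
        return []
    counts = Counter(kmers)
    mx = max(counts.values())
    return [kmer for kmer, n in counts.items() if n == mx]


# Sorted only so the printed order does not depend on the Python version.
print(" ".join(sorted(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))))
```

With the sample input from the question this prints CATG GCAT.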

OTHER TIPS

inputData = [['a','b','c','b','d','a'], ['a','a','b','b','c','c','d'], ['a','a','a','b','b','b','c','c','d','d','d'] ]
from collections import Counter
for myList in inputData:
    temp, result = -1, []
    for char, count in Counter(myList).most_common():
        if temp == -1: temp = count
        if temp == count: result.append(char)
        else: break
    print(result)

Output

['a', 'b']
['a', 'c', 'b']
['a', 'b', 'd']

>>> import collections
>>> def maxs(L):
...   counts = collections.Counter(L)
...   maxCount = max(counts.values())
...   return [k for k,v in counts.items() if v==maxCount]
... 
>>> L = ['a','b','c','b','d','a']
>>> maxs(L)
['a', 'b']
>>> L = ['a','a','b','b','c','c','d']
>>> maxs(L)
['a', 'b', 'c']
>>> L = ['a','a','a','b','b','b','c','c','d','d','d']
>>> maxs(L)
['d', 'a', 'b']
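
The outputs in this thread list the tied items in different orders because the order comes from how the Counter is iterated. Older Python versions do not guarantee any particular order, while Python 3.7 and later keep insertion order. If a stable order matters, one option is to sort the result, as in this minimal sketch. The name most_common_sorted is hypothetical, and the sketch assumes the items can be compared with each other, which would not hold for a list mixing numbers and strings.

```python
from collections import Counter


def most_common_sorted(items):
    """Return all items tied for the highest count, in sorted order."""
    counts = Counter(items)
    if not counts:
        return []
    mx = max(counts.values())
    # sorted() assumes the items can be compared with each other.
    return sorted(k for k, v in counts.items() if v == mx)


print(most_common_sorted(['a','a','a','b','b','b','c','c','d','d','d']))  # ['a', 'b', 'd']
print(most_common_sorted([3, 1, 3, 2, 1]))  # [1, 3]
```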

Just for the sake of giving a solution that uses list comprehensions instead of collections:

given_list = ['a','b','c','b','d','a']
redundant = [(each, given_list.count(each)) for each in set(given_list) if given_list.count(each) > 1]
count_max = max(redundant, key=lambda x: x[1])[1]
final_list = [char for char, count in redundant if count == count_max]
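
The snippet above calls given_list.count once or twice per distinct element, and max() raises ValueError when no element repeats, because redundant is then empty. A possible variant that still avoids collections is to count in one pass with a plain dict. This is only a sketch: most_common_items is a hypothetical name, and returning every element when nothing repeats is an assumption about the desired behaviour.

```python
def most_common_items(items):
    """Return all items tied for the highest count, without collections."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time an item is seen.
        counts[item] = counts.get(item, 0) + 1
    if not counts:
        # An empty input has no most common item.
        return []
    count_max = max(counts.values())
    return [item for item, count in counts.items() if count == count_max]


print(most_common_items(['a','b','c','b','d','a']))
```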

PS - I myself haven't used Counters yet :( Time to learn!

Licensed under: CC-BY-SA with attribution