List the most common lists, from a list

Question

As mentioned in one of the comments, you can simply use a tuple of tags instead of a list. Tuples are hashable, so they work as keys with the Counter class in the collections module. Here's how to do that using the list-based approach of the code in your question, along with a few optimizations, since you have to process a large number of POS tags:

from collections import Counter

GROUP_SIZE = 5
counter = Counter()
mylist = []

with open("tags.txt", "r") as tagfile:
    tags = (line.strip() for line in tagfile)
    try:
        # prime the window with the first GROUP_SIZE-1 tags
        while len(mylist) < GROUP_SIZE-1:
            mylist.append(next(tags))  # next() works in Python 2.6+ and 3
    except StopIteration:
        pass

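    for tag in tags:   # main loop
        mylist.append(tag)            # window now holds GROUP_SIZE tags
        counter[tuple(mylist)] += 1   # tuples are hashable, lists are not
        mylist.pop(0)                 # drop the oldest tag for the next pass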

if not counter:
    print('too few tags in file')
else:
    for tags, count in counter.most_common(10):  # top 10
        print('{}, count = {:,d}'.format(list(tags), count))
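
The tuple conversion in the code above matters because Counter is a dict subclass and needs hashable keys, which a mutable list is not. A minimal illustration, using made-up tag values:

```python
from collections import Counter

counter = Counter()
group = ["DT", "NN", "VBZ"]  # made-up POS tags, for illustration only

# A list is unhashable, so using it as a Counter key raises TypeError.
try:
    counter[group] += 1
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple with the same contents is hashable and works as a key.
counter[tuple(group)] += 1
counter[tuple(group)] += 1
```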

However, it would be even better to use a deque from the collections module instead of a list for what you're doing, because deques have very efficient, O(1), appends and pops at either end, whereas removing the first element of a list with pop(0) is O(n). Deques are also memory efficient.
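
As a rough sketch of the difference, sliding a window by one position with a deque looks like this (the items are made up):

```python
from collections import deque

window = deque(["a", "b", "c"])  # made-up items, for illustration only

window.append("d")   # O(1) append on the right
window.popleft()     # O(1) removal on the left; list.pop(0) would be O(n)
```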

In addition, since Python 2.6 deques have supported a maxlen parameter, which eliminates the need to explicitly pop() elements off the end once the desired size has been reached. Here's an even more efficient version based on that:

from collections import Counter, deque

GROUP_SIZE = 5
counter = Counter()
mydeque = deque(maxlen=GROUP_SIZE)

with open("tags.txt", "r") as tagfile:
    tags = (line.strip() for line in tagfile)
    try:
        # prime the window with the first GROUP_SIZE-1 tags
        while len(mydeque) < GROUP_SIZE-1:
            mydeque.append(next(tags))  # next() works in Python 2.6+ and 3
    except StopIteration:
        pass

    for tag in tags:   # main loop
        mydeque.append(tag)            # once full, maxlen discards the oldest tag
        counter[tuple(mydeque)] += 1   # count this group of GROUP_SIZE tags

if not counter:
    print('too few tags in file')
else:
    for tags, count in counter.most_common(10):  # top 10
        print('{}, count = {:,d}'.format(list(tags), count))
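
If you want to try the idea without a tags file, the same deque-based counting can be wrapped in a function that accepts any iterable of tags. This is only a sketch: the function name most_common_groups, its parameters, and the sample tags are hypothetical and not from your code.

```python
from collections import Counter, deque

def most_common_groups(tags, group_size=5, top=10):
    """Count each run of `group_size` consecutive tags; return the `top` most frequent."""
    window = deque(maxlen=group_size)
    counter = Counter()
    for tag in tags:
        window.append(tag)               # oldest tag is dropped once the deque is full
        if len(window) == group_size:    # skip the partial windows at the start
            counter[tuple(window)] += 1
    return counter.most_common(top)

sample = ["A", "B", "A", "B", "A", "B"]  # made-up tags
result = most_common_groups(sample, group_size=2, top=2)
```

Checking the window length inside the loop replaces the separate priming step, at the cost of one extra comparison per tag.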