Question

I am building a quiz application which pulls questions randomly from a pool of questions. However, there is a requirement that the pool of questions be limited to questions that the user has not already seen. If, however, the user has seen all the questions, then the algorithm should "reset" and only show questions the user has seen once. That is, always show the user questions that they have never seen or, if they have seen all of them, always show them questions they have seen less frequently before showing questions they have seen more frequently.

The list (L) is created in a such a way that the following is true: any value in the list (I), may exist once or be repeated in the list multiple times. Let's define another value in the list, J, such that it's not the same value as I. Then 0 <= abs(frequency(I) - frequency(J)) <= 1 will always be true.

To put it another way: if a value is repeated in the list 5 times, and 5 times is the maximum number of times any value has been repeated in the list, then all values in the list will be repeated either 4 or 5 times. The algorithm should return all values in the list with frequency == 4 before it returns any with frequency == 5.

Sorry this is so verbose, I'm struggling to define this problem succinctly. Please feel free to leave comments with questions and I will further qualify if needed.

Thanks in advance for any help you can provide.

Clarification

Thank you for the proposed answers so far. I don't think any of them are there yet. Let me further explain.

I'm not interacting with the user and asking them questions. I'm assigning the question ids to an exam record so that when the user begins an exam, the list of questions they have access to is determined. Therefore, I have two data structures to work with:

  • List of possible question ids that the user has access to
  • List of all question ids this user has ever been assigned previously. This is the list L described above.

So, unless I am mistaken, the algorithm/solution to this problem will need to involve list &/or set based operations using the two lists described above.

The result will be a list of question ids I can associate with the exam record and then insert into the database.

Was it helpful?

Solution 5

This is what I came up with:

from collections import Counter
import random

# the number of question ids I need returned to
# assign to the exam
needed = 3

# the "pool" of possible question ids the user has access to
possible = [1,2,3,4,5]

# examples of lists of question ids I might see that represent
# questions a user has already answered
answered1 = []
answered2 = [1,3]
answered3 = [5,4,3,2]
answered4 = [5,4,3,2,1,1,2]
answered5 = [5,4,3,2,1,1,2,3,4,5,1]
answered6 = [5,4,3,2,1]

def getdiff(answered):
    diff = set(possible) - set(answered)
    still_needed = needed - len(diff)
    if still_needed > 0:
        not_already_selected = list(set(possible) - diff)
        random.shuffle(not_already_selected)
        diff = list(diff) + not_already_selected[0:still_needed]
        random.shuffle(diff)
        return diff
    diff = list(diff)
    random.shuffle(diff)
    if still_needed == 0:
        return diff
    return diff[0:needed]

def workit(answered):
    """ based on frequency, reduce the list down to only
        those questions we want to consider "answered"
    """
    have_count = 0
    if len(possible) > len(answered):
        return getdiff(answered)
    counted = Counter(answered)
    max_count = max(counted.values())
    # the key here is to think of "answered" questions as
    # only those that have been seen with max frequency
    new_answered = []
    for value, count in counted.iteritems():
        if count == max_count:
            new_answered.append(value)
    return getdiff(new_answered)

print 1, workit(answered1)
print 2, workit(answered2)
print 3, workit(answered3)
print 4, workit(answered4)
print 5, workit(answered5)
print 6, workit(answered6)

"""
>>> 
1 [2, 4, 3]
2 [2, 5, 4]
3 [5, 2, 1]
4 [5, 3, 4]
5 [2, 4, 3]
6 [2, 3, 5]
>>> ================================ RESTART ================================
>>> 
1 [3, 1, 4]
2 [5, 2, 4]
3 [2, 4, 1]
4 [5, 4, 3]
5 [4, 5, 3]
6 [1, 5, 3]
>>> ================================ RESTART ================================
>>> 
1 [1, 2, 3]
2 [4, 2, 5]
3 [4, 1, 5]
4 [5, 4, 3]
5 [2, 5, 4]
6 [2, 1, 4]
"""

OTHER TIPS

Rewritten with the database stuff in fill-in pseudocode.

If I understand the problem correctly, I'd treat the questions (or their IDs as proxies) like a physical deck of cards: for each user, shuffle the deck and deal them a question at a time; if they want more than len(deck) questions, just start over: shuffle the deck into a new order and do it again. When a question is seen for the nth time, all other questions will have been seen n or n-1 times.

To keep track of what questions the user has had available, we put the unused question IDs back in the database and increment the "how many times through" counter whenever we need a new deal.

Something like:

from random import shuffle

def deal():
    question_IDs = get_all_questions(dbconn) # all questions
    shuffle(question_IDs)
    increment_deal_count(dbconn, userID) # how often this student has gotten questions
    return question_IDs


count_deals = get_stored_deals(dbconn, userID) # specific to this user
if count_deals: 
    question_IDs = get_stored_questions(dbconn, userID) # questions stored for this user 
else: # If 0 or missing, this is the first time for this student
    question_IDs = deal()


while need_another_question(): #based on exam requirements
    try:
        id = question_IDs.pop()
    except IndexError:
        question_IDs = deal()
        id = question_IDs.pop() # Trouble if db is ever empty. 

    use_question(id) # query db with the ID, then put question in print, CMS, whatever

# When we leave that while loop, we have used at least some of the questions
# question_IDs lists the *unused* ones for this deal
# and we know how many times we've dealt.

store_in_db(dbconn, userinfo, question_IDs)
# If you want to know how many times a question has been available, it's
# count_deals - (ID in question_IDs)
# because True evaluates to 1 if you try to subtract it from an integer. 

Why not have two lists, one for questions not yet picked and one for questions that have been picked. Initially, the not-yet-picked list will be full, and you will pick elements from it which will be removed and added to the picked list. Once the not-yet-picked list is empty, repeat the same process as above, this time using the full picked list as the not-yet-picked list and vice versa.

To implement your algorithm, you only need to shuffle the list and pass through it, and when done, repeat.

No need to copy the list or juggle items between two lists, just use the following control flow, for example:

import random

def ask_questions(list_of_questions):
    while True:
        random.shuffle(list_of_questions)
        for question in list_of_questions:
            print(question)
            # Python 3 use input not raw_input
            cont = raw_input('Another question?') 
            if not cont:
                break
        if not cont:
            break

Let's define a "pivot" that separates a list into two sections. The pivot partitions the array such that all numbers before the pivot has been picked one more than the numbers after the pivot (or more generally, all numbers before pivot are ineligible for picking, while all numbers after the pivot are eligible for picking).

You simply need to pick a random item from the list of numbers after the pivot, swap it with the number on the pivot, and then increment the pivot. When the pivot reaches the end of the list, you can reset it back to the start.

Alternatively, you can also use two lists, which is much easier to implement, but is slightly less efficient as it needs to expand/shrink the list. Most of the time, the ease of implementation would trump the inefficiency so the two-list would usually be my first choice.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top