Question

Pascal's rule on counting the subset's of a set works great, when the set contains unique entities.

Is there a modification to this rule for when the set contains duplicate items?

For instance, when I try to find the count of the combinations of the letters A,B,C,D, it's easy to see that it's 1 + 4 + 6 + 4 + 1 (from Pascal's Triangle) = 16, or 15 if I remove the "use none of the letters" entry.

Now, what if the set of letters is A,B,B,B,C,C,D? Computing by hand, I can determine that the sum of subsets is: 1 + 4 + 8 + 11 + 11 + 8 + 4 + 1 = 48, but this doesn't conform to the Triangle I know.

Question: How do you modify Pascal's Triangle to take into account duplicate entities in the set?

Was it helpful?

Solution

It looks like you want to know how many sub-multi-sets have, say, 3 elements. The math for this gets very tricky, very quickly. The idea is that you want to add together all of the combinations of ways to get there. So you have C(3,4) = 4 ways of doing it with no duplicated elements. B can be repeated twice in C(1,3) = 3 ways. B can be repeated 3 times in 1 way. And C can be repeated twice in C(1,3) = 3 ways. For 11 total. (Your 10 you got by hand was wrong. Sorry.)

In general trying to do that logic is too hard. The simpler way to keep track of it is to write out a polynomial whose coefficients have the terms you want which you multiply out. For Pascal's triangle this is easy, the polynomial is (1+x)^n. (You can use repeated squaring to calculate this more efficiently.) In your case if an element is repeated twice you would have a (1+x+x^2) factor. 3 times would be (1+x+x^2+x^3). So your specific problem would be solved as follows:

(1 + x) (1 + x + x^2 + x^3) (1 + x + x^2) (1 + x)
  = (1 + 2x + 2x^2 + 2x^3 + x^4)(1 + 2x + 2x^2 + x^3)
  = 1    + 2x   + 2x^2 +  x^3 +
    2x   + 4x^2 + 4x^3 + 2x^4 +
    2x^2 + 4x^3 + 4x^4 + 2x^5 +
    2x^3 + 4x^4 + 4x^5 + 2x^6 +
    x^4  + 2x^5 + 2x^6 +  x^7
  = 1 + 4x + 8x^2 + 11x^3 + 11x^4 + 8x^5 + 4x^6 + x^7

If you want to produce those numbers in code, I would use the polynomial trick to organize your thinking and code. (You'd be working with arrays of coefficients.)

OTHER TIPS

A set only contains unique items. If there are duplicates, then it is no longer a set.

Yes, if you don't want to consider sets, consider the idea of 'factors.' How many factors does:

p1^a1.p2^a2....pn^an

have if p1's are distinct primes. If the ai's are all 1, then the number is 2^n. In general, the answer is (a1+1)(a2+1)...(an+1) as David Nehme notes.

Oh, and note that your answer by hand was wrong, it should be 48, or 47 if you don't want to count the empty set.

You don't need to modify Pascal's Triangle at all. Study C(k,n) and you'll find out -- you basically need to divide the original results to account for the permutation of equivalent letters.

E.g., A B1 B2 C1 D1 == A B2 B1 C1 D1, therefore you need to divide C(5,5) by C(2,2).

Without duplicates (in a set as earlier posters have noted), each element is either in or out of the subset. So you have 2^n subsets. With duplicates, (in a "multi-set") you have to take into account the number the number of times each element is in the "sub-multi-set". If it m_1,m_2...m_n represent the number of times each element repeats, then the number of sub-bags is (1+m_1) * (1+m_2) * ... (1+m_n).

Even though mathematical sets do contain unique items, you can run into the problem of duplicate items in 'sets' in the real world of programming. See this thread on Lisp unions for an example.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top