Maximum number of similar groups of a given size that can be made from a given array
Question
I am given an array of numbers, not necessarily unique, and the size of a group. Let the array be denoted by $B$ and the size of the group be $A$.
I need to find the maximum number of groups with the exact same contents and of size $A$ that can be made from the elements of $B$. Once an element is used in a group it is exhausted. A group can have more than one element with the same value as long as the element is still available in $B$.
Example:
- If the input array is $\{1, 2, 3, 1, 2, 3, 4\}$, and the group size is $3$ the maximum number of groups that can be made is $2$, which are $\{1, 2, 3\}$ and $\{1, 2, 3\}$.
- If the input array is $\{1, 3, 3, 3, 6, 3, 10\}$, and the group size is $4$ the maximum number of groups that can be made is $1$, which is $\{1, 3, 3, 3\}$.
What I have tried so far, is to frame some equations ( given below ) but after that, I am struggling to come up with an algorithm to solve them.
Let $F_1$ be the frequency of the element $B_1$, $F_2$ be the frequency of the element $B_2$ and so on till $B_n$, where $B_1 \dots B_n$ are distinct elements from the array $B$.
Now I need to choose $n_1, n_2, \dots n_i$ such that
- $n_1 + n_2 + \dots + n_i = A$
- $k\cdot n_1 \leq F_1\text{ , } k\cdot n_2 \leq F_2\text{ , }\dots \text{ , }k\cdot n_i \leq F_i$
- $k$ is the number of groups and we need to maximize it.
Length of $B$ can be as large as $10^5$ and $A$ can also be as large as $10^5$.
Please help me find a greedy or dynamic approach to the problem.
Solution
The function $$f:\mathbb{N}\rightarrow\{0, 1\}:f(k) = \begin{cases} 1; &\text{if there is a solution of size $k$,}\\ 0; &\text{otherwise} \end{cases} $$ is monoton, since if there is no solution of size $k$ then there is no solution of size $k+1$. That means we can binary search the value of $k$ in the interval $[1, |B|]$, and output the greatest $k$ for which there is a solution of size $k$. Thereby, we turned the problem into a decision problem with an $O(\log |B|)$ factor in the running time.
For a fixed value of $k$, a greedy solution suffices. If we have $F_i$ copies of some element $B_i$, then each set can contain at most $\left\lfloor\frac{F_i}{k}\right\rfloor$ of this element. So we have a solution of size $k$ if and only if $$\sum\limits_{i=1}^{n} \left\lfloor\frac{F_i}{k}\right\rfloor \geq A.$$
As an exercise, try to prove that the greedy solution is optimal. The running time can be bounded in $O(|B|\log|B|).$