You have already reduced the problem to considering one column at a time, so let us start with that.
Instead of considering particular operations that can happen in a column, let us look at the process in general. Initially, we have the given column. At the end, we have the resulting column. What's left of the given column in the resulting column? It is a subsequence (since we can remove a box from anywhere) of the given column which transferred to a prefix of the resulting column ("prefix" as in "located at the bottom", since we can add new boxes only on top of what was there initially).
Naturally, we want to maximize the length of this sequence (subsequence or prefix, depending on where you look at). That sure looks like a dynamic programming problem, similar to edit distance or longest common subsequence. Perhaps you will want to work out the details yourself from this point. Good luck!