It depends on the structure of your new grids, and also on that of your old one.
Let's take the worst case: a normal rectangular grid (like an image) where every odd item is of type 1 and every even item is of type 2. Here roughly half of your threads will sit idle on the GPU (while the type-1 items are being processed, the type-2 threads idle, and vice versa), because the items within a workgroup generally share their program counter.
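To make this concrete, here is a toy cost model of lockstep (SIMT) execution. It is a sketch under stated assumptions, not real hardware: the warp size of 32 and the per-type cycle costs are made up for illustration, and the model simply says that a warp must pay for every branch path that at least one of its lanes takes.

```python
WARP = 32
COST = {1: 10, 2: 10}  # hypothetical cycles per item type

def warp_cycles(item_types):
    """Cycles one warp spends: the sum over the distinct paths taken,
    because lanes sharing a program counter walk every taken branch."""
    return sum(COST[t] for t in set(item_types))

def grid_cycles(items):
    """Total cycles for a grid processed warp by warp."""
    return sum(warp_cycles(items[i:i + WARP])
               for i in range(0, len(items), WARP))

# Worst case: types alternate, so every warp contains both types
# and pays for both branch paths (half the lanes idle in each).
alternating = [1 if i % 2 else 2 for i in range(1024)]
print(grid_cycles(alternating))  # 32 warps * (10 + 10) = 640

# Uniform grid: every warp takes a single path.
uniform = [1] * 1024
print(grid_cycles(uniform))      # 32 warps * 10 = 320
```

Even though each thread only does its own item's work, the interleaved grid costs twice as many cycles as the uniform one in this model.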
If your new grids are just 2 kernel calls over the full data with a simple "not of type 2? return" check, that's worse than the first case. However, if you manage to build 2 grids in which every item is of the correct type, then splitting is far better.
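The difference between the two splitting strategies can be sketched with the same kind of toy SIMT model (again an assumption for illustration: warps of 32 lanes, a made-up per-item cost, and one cycle charged for evaluating the filter condition).

```python
WARP, COST = 32, 10  # hypothetical warp size and per-item cost in cycles

def cycles(flags):
    """Cycles for a grid of per-item 'do the work?' flags, warp by warp.
    Every lane evaluates the condition (1 cycle); the warp then pays the
    full work cost if any of its lanes takes the work path."""
    total = 0
    for i in range(0, len(flags), WARP):
        warp = flags[i:i + WARP]
        total += 1            # condition check happens in every warp
        if any(warp):
            total += COST     # whole warp walks the work path
    return total

n = 1024
mixed = [i % 2 == 0 for i in range(n)]  # types interleaved in the grid

# Two "filtering" kernel calls over the full grid: every warp still
# contains items of the wanted type, so both passes pay full price.
two_passes = cycles(mixed) + cycles([not f for f in mixed])

# Two compacted grids, each holding only items of one type: half the
# threads per pass, and every warp is uniform.
compacted = cycles([True] * (n // 2)) + cycles([True] * (n // 2))

print(two_passes, compacted)  # 704 352
```

The filtering variant launches twice as many threads as there is work and still pays for the work in every warp, while the compacted variant does the same work in half the warps per pass.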
If your original grid is an image split into exactly 2 halves, it probably doesn't matter: only the workgroups straddling the boundary will perform extra work.
Branches are not that evil. Think of it this way: whenever you have a branch, and even a single thread within a workgroup (or whatever the unit of scheduling is in your hardware) takes a different direction from the others, the code of both branches is effectively executed everywhere.
That is also the reason why optimizations such as skipping an expensive computation when some special condition applies generally do not work on a GPU: if even one thread fulfills the condition while the others don't, you still effectively pay for the computation in every thread of that group.
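A tiny sketch of that last point, using the same kind of toy lockstep model (the warp size and cycle costs are invented for illustration):

```python
WARP, CHEAP, EXPENSIVE = 32, 1, 100  # hypothetical cycle costs

def warp_cost(needs_expensive):
    """One warp: the expensive path runs if any lane needs it,
    and every lane waits while it does."""
    return CHEAP + (EXPENSIVE if any(needs_expensive) else 0)

# Only one item in 32 hits the special condition, but because it shares
# a warp with the others, the whole warp waits out the expensive path.
one_hot = [lane == 0 for lane in range(WARP)]
print(warp_cost(one_hot))        # 101
print(warp_cost([False] * WARP)) # 1: skipped only when no lane needs it
```

So the "skip when cheap" optimization only pays off when the condition is uniform across the whole scheduling unit, e.g. when the special items are clustered together in memory.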