Question

For load balancing reasons, I want to create more partitions than reducers in a Hadoop environment. Is there a way to assign partitions to specific reducers, and if so, where can I define this? I wrote a custom Partitioner and now want to address specific reducers with specific partitions.

Thank you in advance for the help!

Solution

Hadoop doesn't lend itself to this kind of control.

As explained on pages 43-44 of this excellent book, the programmer has little control over:

  1. Where a mapper or reducer runs (i.e., on which node in the cluster).
  2. When a mapper or reducer begins or finishes.
  3. Which input key-value pairs are processed by a specific mapper.
  4. Which intermediate key-value pairs are processed by a specific reducer (the control you are asking for).

BUT

You can influence number 4 by implementing a cleverly designed custom Partitioner that splits your data exactly the way you want, distributing your load across reducers as expected. Check out how a custom partitioner is implemented to calculate relative frequencies in chapter 3.3; a minimal sketch of the idea follows.
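A minimal sketch of such a partitioner, assuming Text keys and IntWritable values; the class name LoadAwarePartitioner and the "hot_" prefix rule are illustrative assumptions, not the book's example:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class LoadAwarePartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Send known-heavy keys to a dedicated partition so they do not
        // share a reducer with the rest of the data.
        if (key.toString().startsWith("hot_")) {
            return 0;
        }
        // Spread all remaining keys evenly over the other partitions.
        // Assumes numPartitions >= 2.
        return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
    }
}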

OTHER TIPS

The partitioning is done for the reducers: as many partitions are created as there are reducers. You can choose the number of reducers with

job.setNumReduceTasks(n);

The number n need not be limited by the number of physical reduce slots you have; the extra tasks will simply wait for the next free slot. In your partitioner code, you can implement the logic required to assign a key to a specific partition.
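A short job-configuration sketch showing how these pieces fit together, assuming the hypothetical LoadAwarePartitioner from above and an illustrative count of 20 reduce tasks:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class JobSetup {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "load-balanced job");
        // Ask for more reduce tasks than there are physical slots.
        job.setNumReduceTasks(20);
        // Register the custom partitioner so keys are routed by our logic.
        job.setPartitionerClass(LoadAwarePartitioner.class);
        // ... set mapper, reducer, and input/output paths as usual ...
    }
}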

However, I do not see any efficiency gain from going far beyond the number of physically available reduce slots, as it will only result in tasks waiting for the next free slot.
