Rectangles are usually represented by a pair of min and max values in each dimension. So the "lower" and "upper" values are simply the minimum and maximum in that dimension.
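The margin is the perimeter (implementations often just sum the side lengths, which differs only by a constant factor). It is used because among all rectangles of the same area, the square has the smallest perimeter, so minimizing the margin favors square-like rectangles. And in many situations, squares are the preferable type of rectangle, for example when you do Euclidean (or Manhattan, or pretty much any Lp norm) nearest neighbor search. The reason is that they are to some extent "unbiased": they do not favor one axis over another.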
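To make this concrete, here is a minimal sketch comparing a square with a long slice of the same area. The function names `margin` and `area` and the two example rectangles are my own, chosen for illustration; the margin is computed as the sum of the side lengths, which I assume is close to what most implementations do.

```python
from typing import Sequence

def margin(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Sum of the side lengths (half the perimeter in 2D)."""
    return sum(hi - lo for lo, hi in zip(lower, upper))

def area(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Product of the side lengths."""
    result = 1.0
    for lo, hi in zip(lower, upper):
        result *= hi - lo
    return result

# Each rectangle is (lower corner, upper corner), i.e. min and max per dimension.
square = ([0.0, 0.0], [4.0, 4.0])    # 4 x 4
slice_ = ([0.0, 0.0], [16.0, 1.0])   # 16 x 1

print(area(*square), margin(*square))  # 16.0 8.0
print(area(*slice_), margin(*slice_))  # 16.0 17.0
```

Both rectangles cover the same area, but the slice has more than twice the margin, so a margin-based criterion prefers the square.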
Other split strategies such as the "linear" split by Ang and Tan neglect this, and tend to produce very long and thin slices. Wikipedia has an example for this:
https://en.wikipedia.org/wiki/File:Zipcodes-Germany-AngTanSplit.svg
These are the kind of splits the R*-tree tries to avoid: most queries will intersect a lot of these slices, so the index lets you skip very little of the data.
Note that the R*-tree uses a number of heuristics and tie breakers. Furthermore, it makes a two-step decision: first it only chooses the axis to use for splitting, and this is where the margin comes in. Once the axis has been determined, it actually uses a different logic (least overlap, with area as a tie breaker) to choose the split along this axis.
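The two-step decision can be sketched roughly as follows. This is a simplified illustration, not a reference implementation: the names (`rstar_split`, `distributions`, `min_fill`) are hypothetical, and the criteria (sum of margins to pick the axis, then least overlap with total area as tie breaker) reflect my reading of the original R*-tree paper. Real implementations add further details.

```python
def mbr(rects):
    """Minimum bounding rectangle of a list of (lower, upper) rectangles."""
    dims = range(len(rects[0][0]))
    lower = tuple(min(r[0][d] for r in rects) for d in dims)
    upper = tuple(max(r[1][d] for r in rects) for d in dims)
    return lower, upper

def margin(r):
    return sum(hi - lo for lo, hi in zip(*r))

def area(r):
    result = 1.0
    for lo, hi in zip(*r):
        result *= hi - lo
    return result

def overlap(a, b):
    """Volume of the intersection of two rectangles (0 if disjoint)."""
    result = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        side = min(hi_a, hi_b) - max(lo_a, lo_b)
        if side <= 0:
            return 0.0
        result *= side
    return result

def distributions(rects, axis, min_fill):
    """All candidate splits along one axis, sorted by lower and by upper value."""
    for key in (lambda r: r[0][axis], lambda r: r[1][axis]):
        ordered = sorted(rects, key=key)
        for k in range(min_fill, len(ordered) - min_fill + 1):
            yield ordered[:k], ordered[k:]

def rstar_split(rects, min_fill):
    dims = range(len(rects[0][0]))
    # Step 1: choose the axis whose candidate splits have the smallest total margin.
    axis = min(dims, key=lambda d: sum(
        margin(mbr(a)) + margin(mbr(b))
        for a, b in distributions(rects, d, min_fill)))
    # Step 2: on that axis, choose the split with the least overlap,
    # breaking ties by the smallest total area.
    best = min(distributions(rects, axis, min_fill), key=lambda ab: (
        overlap(mbr(ab[0]), mbr(ab[1])),
        area(mbr(ab[0])) + area(mbr(ab[1]))))
    return axis, best

# Four unit squares: two near y=0 and two near y=10.
rects = [((0, 0), (1, 1)), ((2, 0), (3, 1)),
         ((0, 10), (1, 11)), ((2, 10), (3, 11))]
axis, (group_a, group_b) = rstar_split(rects, min_fill=2)
print(axis)  # 1
```

In this example, splitting along axis 1 (y) gives two compact 3 x 1 boxes, whereas splitting along axis 0 (x) would give two tall 1 x 11 slices with a much larger margin, so the margin criterion picks the y axis.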