How many units does the hidden layer have?

Question 1

Roughly speaking:

more linear problem => fewer hidden nodes; more non-linear problem => more hidden nodes.

more generalisation => fewer hidden nodes; less generalisation => more hidden nodes.

accurate answer (at least on your training set) => more hidden nodes; approximate answer => fewer hidden nodes.

FYI: in the case of XOR, if both inputs are also connected directly to the output, a single hidden node is enough. If no input-to-output connections are allowed, two hidden nodes are the minimum.
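To make the XOR case concrete, here is a minimal sketch of both layouts with hand-picked weights. The step activation, the function names, and the specific weights and thresholds are my own assumptions for illustration; a trained network would usually end up with different values.

```python
def step(z):
    """Threshold activation: 1 if z is positive, otherwise 0."""
    return 1 if z > 0 else 0


def xor_two_hidden(x1, x2):
    """XOR with two hidden nodes and no input-to-output connections."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output: "OR but not AND"
    return step(h_or - h_and - 0.5)


def xor_one_hidden(x1, x2):
    """XOR with one hidden node plus direct input-to-output connections."""
    h_and = step(x1 + x2 - 1.5)   # the single hidden node detects (1, 1)
    # The output sees both inputs directly; the hidden node cancels the (1, 1) case
    return step(x1 + x2 - 2 * h_and - 0.5)


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, xor_two_hidden(x1, x2), xor_one_hidden(x1, x2))
```

In both versions a hidden node is needed to single out the (1, 1) input, which no single threshold on the raw inputs can separate from (0, 1) and (1, 0).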

As for whether there is a formula giving the exact number of hidden nodes for problems in general: no.

Question 2

No

The short, but correct, answer is that there is no definition of "the right number" of hidden nodes in a layer. There are a few guidelines, though, such as not using more hidden nodes in a given layer than there are input signals.

Configuring your network

The bottom line is that you have to calibrate the number of hidden nodes for your particular dataset or problem instance. It is important to remember that using as few hidden nodes as possible is preferable, since a smaller network is less likely to overfit and so tends to generalize better.
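One way to do that calibration is to try several hidden-layer sizes, measure the error on held-out data for each, and keep the smallest size that is close enough to the best. The sketch below assumes you supply your own `validation_error(n)` function that trains a network with `n` hidden nodes and returns its validation error; the function name, the `tolerance` parameter, and the stand-in error curve are all hypothetical and not taken from any library.

```python
def smallest_adequate_size(candidates, validation_error, tolerance=0.01):
    """Return the smallest hidden-layer size whose validation error is
    within `tolerance` of the best error seen across all candidates."""
    errors = {n: validation_error(n) for n in candidates}
    best = min(errors.values())
    for n in sorted(errors):
        if errors[n] <= best + tolerance:
            return n


def stand_in_error(n):
    """Placeholder only: a made-up curve that improves with size and then
    degrades slightly. Replace with real training and evaluation."""
    return 1.0 / n + 0.002 * n


if __name__ == "__main__":
    sizes = [1, 2, 4, 8, 16, 32]
    print(smallest_adequate_size(sizes, stand_in_error, tolerance=0.05))
```

A larger `tolerance` trades a little accuracy for a smaller network, which matches the preference for as few hidden nodes as possible.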