1) Yes, each input gets its own node, and that node is always the node for that input type. The order doesn't matter; you just need to keep it consistent. The network is symmetric with respect to its inputs: whatever ordering you pick, training will find weights that fit it, so there is no particular order you need to put the nodes in for it to work.
2 and 3) You need to collect all the values from a single layer before any node in the next layer fires. This is especially important if you're using any activation function other than a simple step function, because the activation is applied to the complete weighted sum of a node's inputs, and that sum determines the value that gets propagated forward. Thus, you need to know the full sum before you propagate anything.
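To sketch that ordering constraint in code (the sigmoid activation and the `(weights, bias)` layer representation here are my own illustration, not something from your setup):

```python
import math

def forward(layers, x):
    """Propagate input x through a list of layers.

    Each layer is a list of (weights, bias) tuples, one per node.
    Note the two separate passes: every node's full weighted sum is
    collected before any activation value moves to the next layer.
    """
    values = x
    for layer in layers:
        # First, compute the complete weighted sum for every node...
        sums = [sum(w * v for w, v in zip(weights, values)) + bias
                for weights, bias in layer]
        # ...only then apply the activation and hand the whole layer forward.
        values = [1.0 / (1.0 + math.exp(-s)) for s in sums]
    return values

# A one-node layer with weights [1, 1] and bias 0 on input [0, 0]
# just computes sigmoid(0) = 0.5.
print(forward([[([1.0, 1.0], 0.0)]], [0, 0]))
```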
4) Which nodes to connect to which other nodes is up to you. Since your net won't be excessively large and XOR is a fairly straightforward problem, it will probably be simplest to connect every node in one layer to every node in the next layer (i.e. a fully-connected neural net). There are specialized cases in other problems where a different topology works better, but there isn't an easy way to figure that out (most people use either trial and error or a genetic algorithm, as in NEAT), and you don't need to worry about it for this problem.
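For concreteness, here is what a fully-connected net solving XOR can look like. The weights are hand-picked for illustration, not learned: hidden node 0 behaves roughly like OR, hidden node 1 like AND, and the output fires when OR is on but AND is off, which is exactly XOR.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

# Every node in a layer connects to every node in the previous layer,
# so each node carries one weight per input plus a bias.
HIDDEN = [([20.0, 20.0], -10.0),   # ~OR:  fires if either input is 1
          ([20.0, 20.0], -30.0)]   # ~AND: fires only if both inputs are 1
OUTPUT = [([20.0, -20.0], -10.0)]  # fires if OR is on and AND is off

def xor_net(a, b):
    values = [a, b]
    for layer in (HIDDEN, OUTPUT):
        values = [sigmoid(sum(w * v for w, v in zip(ws, values)) + bias)
                  for ws, bias in layer]
    return values[0]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", round(xor_net(a, b)))
```

The large weight magnitudes just push the sigmoid close to 0 or 1 so the rounded outputs match the XOR truth table; a trained net would land on different values with the same topology.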