A (2,2,2) neural network can easily be trained for this. I just tried it in Encog, and it trained in the same amount of time as the single-output version. What you have above is really a network configured for one-of-n classification: one output neuron for each expected value.
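To make the one-of-n idea concrete, here is a minimal sketch in plain Python. The helper names `one_of_n` and `decode` are my own, not part of Encog or of the code in the question; they only show how a class index maps to a target vector and back.

```python
def one_of_n(index, n):
    """Return a target vector with 1.0 at `index` and 0.0 everywhere else."""
    if not 0 <= index < n:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(n)]


def decode(outputs):
    """Pick the class whose output neuron has the highest activation."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


# Two expected values, so two output neurons.
targets = [one_of_n(c, 2) for c in (0, 1)]
print(targets)              # [[1.0, 0.0], [0.0, 1.0]]
print(decode([0.2, 0.9]))   # 1
```

With two expected values you train against `[1.0, 0.0]` and `[0.0, 1.0]` instead of a single 0/1 target, and you read the answer off as the index of the strongest output.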
I am not very fluent in Python, but I would guess the problem lies somewhere in your adaptation of the code. It is not an inherent limitation of the ANN.
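One way to convince yourself that the architecture is not the obstacle is to set the weights by hand. The sketch below assumes an XOR-style problem purely for illustration (the actual task is whatever is in the question), and the weights are hand-picked by me rather than produced by Encog or any training run. It runs a forward pass through a 2-2-2 sigmoid network and reads the class off the two outputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Each row is (weight for input 1, weight for input 2, bias).
# Hand-picked, illustrative values: hidden neuron 0 acts like OR,
# hidden neuron 1 acts like AND.
HIDDEN = [(20.0, 20.0, -10.0), (20.0, 20.0, -30.0)]
# Output neuron 1 fires for "OR but not AND"; output neuron 0 is its opposite.
OUTPUT = [(-20.0, 20.0, 10.0), (20.0, -20.0, -10.0)]


def layer(inputs, rows):
    """Apply one fully connected sigmoid layer to a two-element input."""
    return [sigmoid(w1 * inputs[0] + w2 * inputs[1] + b) for w1, w2, b in rows]


def predict(x1, x2):
    """Forward pass through the 2-2-2 network; return the winning output index."""
    outputs = layer(layer([x1, x2], HIDDEN), OUTPUT)
    return outputs.index(max(outputs))


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, predict(a, b))
```

Since weights like these exist, a two-output network can represent the mapping. If training in the Python version never gets there, the place to look is the adapted backpropagation code (output-layer error terms, target vectors, weight initialisation), not the network shape.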