Your main function is fine. However, either your training vectors or your backpropagation code is not (assuming your network is big enough to learn this). So this is going to be a bunch of questions instead of an answer, but they may point you in the right direction:
- How many samples does your training vector include?
- Are those samples roughly classified half/half or is there a bias?
- Are there identical training samples that are classified ambiguously?
- How is the error calculated? As a mean absolute error or a mean squared error?
- Do you randomize the initial network weights?
- What is the initial error before training?
- Does the error change in the first iteration?
- Can you post the code on pastebin?
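
If it helps, here is a minimal Python sketch of how the data-related questions above could be checked before you touch the backpropagation code. Everything in it is an assumption for illustration: the function names (`class_balance`, `ambiguous_samples`, `mean_absolute_error`, `mean_squared_error`) and the toy data are made up, and it assumes your inputs are sequences of numbers and your labels are simple class values. Adapt it to whatever language and data layout your network uses.

```python
from collections import Counter, defaultdict


def class_balance(labels):
    """Return the fraction of training samples per class label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def ambiguous_samples(inputs, labels):
    """Return identical inputs that occur with more than one label."""
    seen = defaultdict(set)
    for x, y in zip(inputs, labels):
        seen[tuple(x)].add(y)
    return {x: sorted(ys) for x, ys in seen.items() if len(ys) > 1}


def mean_absolute_error(predictions, targets):
    """Average of |prediction - target| over all samples."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def mean_squared_error(predictions, targets):
    """Average of (prediction - target)^2 over all samples."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical toy data: the last sample repeats (1, 1) with a different label.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]
labels = [0, 1, 1, 0, 1]

print("samples:", len(inputs))
print("class balance:", class_balance(labels))
print("ambiguous:", ambiguous_samples(inputs, labels))

# Stand-in for an untrained network that outputs 0.5 for every sample.
initial_outputs = [0.5] * len(labels)
print("initial MAE:", mean_absolute_error(initial_outputs, labels))
print("initial MSE:", mean_squared_error(initial_outputs, labels))
```

If you log the error once before training and again after the first iteration, an error that does not move at all usually points at the weight update, while a heavy class bias or ambiguous duplicates point at the training data.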