OK, so I found the source of my problem and a possible solution, albeit not one in PyBrain.
The source of my problem was the derivative of my custom cost (a.k.a. performance) function ... or rather the lack of one. The cost function PyBrain was using appears to be:
0.5 * (error ** 2) # half the squared error
And the derivative of this is simply:
error
Since I was implementing a more complex error function with a correspondingly more complex derivative, and I hadn't changed the hardcoded derivative to match, gradient descent was unable to take reasonable steps down the error gradient.
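To make the mismatch concrete, here's a minimal sketch (plain NumPy; the function names are mine, not PyBrain's) of why the cost and its derivative have to change together:

import numpy as np

# The default pair: cost and derivative match each other.
def cost(error):
    return 0.5 * np.sum(error ** 2)  # 0.5 * e^2, summed over outputs

def cost_deriv(error):
    return error  # d/de [0.5 * e^2] = e

# A custom cost needs its own matching derivative, e.g. absolute error:
def abs_cost(error):
    return np.sum(np.abs(error))

def abs_cost_deriv(error):
    return np.sign(error)  # d/de |e| = sign(e) (undefined at e = 0)

# Pairing abs_cost with cost_deriv (which is effectively what my
# hardcoded-derivative bug amounted to) makes gradient descent step
# down a surface other than the one being measured.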
The solution I found was to use neurolab, which makes it much easier to implement custom error functions in a modular way. Some hacking in the core files was still needed, but only three or four lines of core code changed: specifically, I modified ff_grad_step in tool.py and the last line of the Train class in core.py, implemented my custom cost function as a new function in error.py, and hooked my network into it in net.py.
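For anyone attempting the same thing, the new function in error.py ends up looking roughly like the built-in error classes. This is only a sketch of that shape: the argument convention and sign convention vary between neurolab versions, so copy whatever the bundled MSE class in your error.py actually does:

import numpy as np

class MAE:
    """Mean absolute error, sketched in the style of neurolab's
    built-in error classes in error.py. The (target, output)
    signature is an assumption: some neurolab versions pass only
    the error array, so mirror your version's MSE class."""

    def __call__(self, target, output):
        e = target - output  # assumes numpy arrays
        return np.sum(np.abs(e)) / e.size

    def deriv(self, target, output):
        # The matching derivative that the gradient step
        # (ff_grad_step in tool.py) ultimately consumes; its sign
        # convention must also match the built-ins.
        e = target - output
        return np.sign(e) / e.size

In the neurolab versions I've seen, networks store their error function in an errorf attribute, so pointing that at an instance of the new class is a natural place to wire it in.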
I hope this isn't too specific to my own problem to help someone else in a similar situation, but this was a huge pain in the ass for something that can be so critical to training a neural network!