How much neural network theory is required to design one? [closed]

https://datascience.stackexchange.com/questions/33255

31-10-2019

Question

I have looked at some of the literature on neural networks and read a few chapters, but the learning curve is so steep that I have had trouble even getting started on designing a neural network to solve my problem.

From what I understand, the architecture (or arrangement of neurons and their connections) should be designed according to the nature of the problem to be solved. Other parameters to be set according to the nature of the problem include the loss index (how the error will be calculated and if there should be a regularization term), whether or not there should be any scaling/unscaling, bounding, or conditions, and the training algorithm (such as the quasi-Newton method).

The particular type of problem I am interested in is using neural networks to learn unknown functions (of unknown complexity) that take integers as input and return integers as output (as opposed to continuous values), given a large collection of input/output pairs.

An example function takes a 4-byte input and returns a 2-byte output. It first XORs the first 2 bytes of the input with the last 2 bytes to produce an intermediate 2-byte value. This value is then XORed with a copy of itself shifted left by 5 bits. That result is XORed with a copy of itself shifted right by 7 bits. Finally, that result is XORed with a copy of itself shifted left by 2 bits, and this value is the function's 2-byte output. Note: more than one unique input can produce the same output.
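To make the example concrete, here is a minimal Python sketch of the function described above. Two details are not stated in the description and are assumptions here: the bytes are read in big-endian order, and each left shift is truncated to 16 bits so the value stays 2 bytes wide. The name `example_function` is just for illustration.

```python
MASK16 = 0xFFFF  # assumption: intermediate values are kept to 16 bits


def example_function(data: bytes) -> bytes:
    """Map a 4-byte input to a 2-byte output using XORs and shifts."""
    if len(data) != 4:
        raise ValueError("expected exactly 4 input bytes")

    # Assumption: bytes are interpreted in big-endian order.
    first = int.from_bytes(data[:2], "big")
    last = int.from_bytes(data[2:], "big")

    x = first ^ last            # XOR the first 2 bytes with the last 2 bytes
    x ^= (x << 5) & MASK16      # XOR with itself shifted left by 5 bits
    x ^= x >> 7                 # XOR with itself shifted right by 7 bits
    x ^= (x << 2) & MASK16      # XOR with itself shifted left by 2 bits
    return x.to_bytes(2, "big")


# Two different inputs that collapse to the same intermediate value
# give the same output, as noted above.
print(example_function(bytes([0x00, 0x01, 0x00, 0x00])).hex())
print(example_function(bytes([0x00, 0x00, 0x00, 0x01])).hex())
```

Under these assumptions, every step is a bitwise operation, so each output bit depends only on an XOR of some of the input bits.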

So given a large set of inputs and outputs of an unknown function, the neural network should optimize itself to reproduce this function on new inputs. I am not sure how to get started designing this neural network, and I am not sure that reading neural network textbooks cover to cover is the best way to begin. I am using a software library meant for designing neural networks, so I can simply set the network architecture and the parameters described above. How much theory do I need to know in order to get started solving my problem? Where should I start with learning how to design this neural network?
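As an illustration of the setup described above, the following sketch builds a set of input/output pairs and holds some of them back as "new inputs" for checking a trained network. It is only one possible arrangement and makes several assumptions not in the question: integers are encoded as lists of bits (most significant bit first), `stand_in` is a hypothetical placeholder for the unknown function, and the 80/20 split is arbitrary.

```python
import random


def to_bits(value: int, width: int) -> list:
    """Encode an integer as a list of 0/1 values, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(width))]


def make_dataset(func, n_samples, in_bits=32, out_bits=16, seed=0):
    """Sample random integer inputs and record (input bits, output bits) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        x = rng.getrandbits(in_bits)
        y = func(x)
        pairs.append((to_bits(x, in_bits), to_bits(y, out_bits)))
    return pairs


def stand_in(x: int) -> int:
    """Hypothetical placeholder for the unknown 32-bit -> 16-bit function."""
    return (x >> 16) ^ (x & 0xFFFF)


data = make_dataset(stand_in, 1000)
split = int(0.8 * len(data))                   # arbitrary 80/20 split
train, held_out = data[:split], data[split:]   # held_out plays the role of "new inputs"
```

The held-out pairs are never shown during training, so accuracy on them gives a rough indication of whether the network has reproduced the function or only memorized the training pairs.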

EDIT: The main goal of all this is to use an existing tool to simplify producing an output (a 2-byte code, in the example above) given a new input, in a situation where the function is unknown. Neural networks seem to match the function-finding-through-trial-and-error characteristics that I need. This tool should be able to try all sorts of possible functions of increasing complexities in order to mimic the working of the actual unknown function.

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange