I am not sure I understand your input correctly. From what I gather, wavread reads a .wav file as a "vector of amplitudes".
First of all, with 4837 inputs, a hidden layer of size k, and 8 classes, this network has 4837*k + 8*k weights (not counting biases), which can be huge. That is far too many for 800 training inputs. It is often agreed (though this is more art than science) that the hidden layer shouldn't be much smaller than the input layer, so you cannot simply make k tiny to compensate.
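To make the arithmetic concrete, here is a small sketch of the weight count for a few hidden layer sizes. The function name `weight_count` and the candidate values of k are my own choices for illustration; the 4837, 8 and 800 come from the question.

```python
def weight_count(n_inputs, n_hidden, n_classes, include_biases=False):
    """Number of weights in a network with a single hidden layer."""
    weights = n_inputs * n_hidden + n_hidden * n_classes
    if include_biases:
        # one bias per hidden unit and one per output unit
        weights += n_hidden + n_classes
    return weights

n_inputs, n_classes, n_train = 4837, 8, 800

# The values of k below are arbitrary examples, not recommendations.
for k in (10, 50, 100):
    w = weight_count(n_inputs, k, n_classes)
    print(f"k={k}: {w} weights, {w / n_train:.1f} weights per training input")
```

Even with k = 10 there are far more weights than training inputs, which is why shrinking the input is the first thing to try.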
I am also not sure why you need a neural network if logistic regression performed well.
Given those doubts, I am not sure I am answering your question, but I will try. You need to decrease the input size. This can be done in many ways. One is wavelet/Fourier analysis, which re-expresses the signal in a basis where it can be summarized with far fewer numbers: after the Fourier analysis you can "bucket" the frequencies into a small number of bands. A simpler way out is dimensionality reduction (a single function call in MATLAB, something like PCA). It is motivated by the fact that nearby values in the signal are very highly correlated. In image analysis a closely related preprocessing step is called "whitening".
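Both ideas can be sketched in a few lines. This is only an illustration in Python with NumPy rather than MATLAB; the helper names `bucketed_spectrum` and `pca_reduce`, the 64 buckets, the 10 components and the random stand-in data are all my assumptions, not something from the question.

```python
import numpy as np

def bucketed_spectrum(signal, n_buckets):
    """Average the FFT magnitudes of a real signal inside n_buckets frequency bands."""
    magnitudes = np.abs(np.fft.rfft(signal))
    bands = np.array_split(magnitudes, n_buckets)
    return np.array([band.mean() for band in bands])

def pca_reduce(X, n_components):
    """Project the rows of X onto their first n_components principal directions."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Random stand-in for the real recordings: 20 signals of 4837 samples each.
rng = np.random.default_rng(0)
X_raw = rng.standard_normal((20, 4837))

X_freq = np.array([bucketed_spectrum(x, 64) for x in X_raw])  # 4837 -> 64 features
X_small = pca_reduce(X_freq, 10)                              # 64 -> 10 features
print(X_freq.shape, X_small.shape)
```

The two steps are independent: you can use the bucketing alone, PCA alone on the raw amplitudes, or one after the other as here.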
The size of the hidden layer is very hard to estimate in advance. The best way is to run experiments with different hidden layer sizes and pick the best one (run a loop overnight and look at the results).
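The loop itself can be very simple. Below is a minimal sketch, assuming you have some function that trains a network with a given hidden size and returns its accuracy on held-out data. `pick_hidden_size` is a hypothetical helper, and `toy_score` is a made-up placeholder so that the sketch runs; it does not represent real results.

```python
def pick_hidden_size(candidate_sizes, validation_score):
    """Try each hidden layer size and return the best one plus all the scores."""
    results = {k: validation_score(k) for k in candidate_sizes}
    best = max(results, key=results.get)
    return best, results

# Placeholder only: replace with a function that trains a network with k hidden
# units and returns its accuracy on a validation set.
def toy_score(k):
    return 1.0 - abs(k - 40) / 100.0

best_k, scores = pick_hidden_size([5, 10, 20, 40, 80, 160], toy_score)
print(best_k, scores)
```

Score each size on data that was not used for training, otherwise the largest network will tend to win simply by memorizing the training set.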