How can I plot the spectrogram of a wav file without using predefined functions?

Question 1

Here is how you can compute the DFT for an arbitrary waveform. It's slow but it works. I have updated it so you can even see how to normalize (both amplitude and frequency).

Note - THIS IS SLOW. The "Fast Fourier Transform" got its name for a reason... By cleverly rearranging the calculation, it performs the same calculation as the one below, but much faster.

n = 4096;
waveform = 0.2*rand(1, n); % a bit of background noise...

alpha = 0.54; beta = 1 - alpha; % hamming coefficients
hwin = alpha - beta * cos(2 * pi * (1:n) / (n - 1)); % from http://en.wikipedia.org/wiki/Window_function#Hamming_window
fSample = 8192; % sampling frequency

t = (0.5:1:n) / fSample; % actual time

% add some spectral components to the noise:
f1 = 440;    % Hz - A on the piano
f2 = 261.62; % Hz - middle C on the piano
waveform = waveform + 10 * cos(2 * pi * f1 * t) + 3 * cos(2 * pi * f2 * t);

waveformHW = waveform .* hwin; % the windowed form of the waveform

frequencies = (0:(n/2)-1) * fSample / n;
myFFT = zeros(size(frequencies));
for fi = 1:n/2
  c = cos(2*pi*frequencies(fi) * t);
  s = sin(2*pi*frequencies(fi) * t);
  myFFT(fi) = (sum(c .* waveformHW) + 1i * sum(s .* waveformHW))/sum(c.*c.*hwin);
end
figure; semilogy(frequencies, abs(myFFT))
xlim([0 1000])
xlabel 'frequency'
ylabel 'amplitude'
title 'two tones plus noise'

Now tested and (believed to be) working:

enter image description here

As you can see in the picture, the tone that falls exactly in a bin (440 Hz) has exactly the right amplitude. The one that falls between two bins (261.62 Hz) looks too low - this is because the energy ends up spread among several bins in the spectrogram. There is an initial fast fall-off, and then a rather broad (if very low amplitude) residual. This is called "spectral leakage" - you will notice that it doesn't occur on the 440 Hz signal which is exactly on top of a bin. This is an inevitable consequence of the way this calculation is done. One nice thing about the DFT is that you do not need the frequencies to be evenly spaced - so you could, for example, compute this with the "exact frequencies of the piano", and this "partial binning" problem would go away.

But using the FFT is much faster.

Note also that the scale is now correct - both for amplitude and frequency. You can adjust the number of samples and the sampling frequency, and the scaling will follow along.

Do check that the math is right... don't use this for "real work", but only to understand the underlying principles.

Did I mention that this is much slower than an FFT?

Question 2

n=256;
wholeFile=wavread('C:\Users\alireza\Desktop\alireza bahrami\New folder\Sound clip     32.wav');

for ii = 1:4096
waveform = wholeFile(128*ii + (1:256)); % a bit of background noise...

alpha = 0.54; beta = 1 - alpha; % hamming coefficients
hwin = alpha - beta * cos(2 * pi * (1:n) / (n - 1)); % from http://en.wikipedia.org  /wiki/Window_function#Hamming_window
fSample = 8192; % sampling frequency

t = (0.5:1:n) / fSample; % actual time

% add some spectral components to the noise:
%f1 = 440;    % Hz - A on the piano
%f2 = 261.62; % Hz - middle C on the piano
%waveform = waveform + 10 * cos(2 * pi * f1 * t) + 3 * cos(2 * pi * f2 * t);

waveformHW = waveform .* hwin; % the windowed form of the waveform

frequencies = (0:(n/2)-1) * fSample / n;
myFFT = zeros(size(frequencies));
for fi = 1:n/2
c = cos(2*pi*frequencies(fi) * t);
s = sin(2*pi*frequencies(fi) * t);
myFFT(fi) = (sum(c .* waveformHW) + 1i * sum(s .* waveformHW))/sum(c.*c.*hwin);
end
end
figure; semilogy(frequencies, abs(myFFT))
xlim([0 1000])
xlabel 'frequency'
ylabel 'amplitude'
title 'two tones plus noise'