From the MATLAB documentation:
[S,F,T] = spectrogram(...) returns a vector of frequencies, F, and a vector of times, T, at which the spectrogram is computed.
S, F and T are exactly what you need. T contains the times and F the frequencies at which the spectrogram is computed, and S holds the corresponding STFT: S(i,j) is the STFT value at frequency F(i) and time T(j). On a logarithmic scale (for arguably better visibility of the frequency content), you calculate Z = log10(abs(S)).
X and Y are only needed to create the mesh plot, but if you want to know: they contain T and F in matrix form, with a copy of T in each row of X and a copy of F in each column of Y.
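
To make the relationship between S, F, T, X, Y and Z concrete, here is a minimal sketch. It is not the code from the question: the synthetic sweep signal stands in for the wav data, the window, overlap and FFT sizes are illustrative choices, and it assumes X and Y were built with meshgrid(T, F), which matches the layout described above. spectrogram requires the Signal Processing Toolbox.

```matlab
% Synthetic stand-in for the wav data (assumption): a 1 s sine sweep
% from about 100 Hz to 2000 Hz, sampled at 8 kHz.
fs = 8000;
t  = (0:fs-1) / fs;
x  = sin(2*pi*(100*t + 950*t.^2));

% Window length 256, overlap 128, FFT size 256 are illustrative values.
% Rows of S follow F (frequency), columns of S follow T (time).
[S, F, T] = spectrogram(x, 256, 128, 256, fs);

% Log magnitude; eps is added here only to avoid log10(0).
Z = log10(abs(S) + eps);

% Assumed construction of X and Y: each row of X is T, each column of Y is F.
[X, Y] = meshgrid(T, F);

mesh(X, Y, Z);
xlabel('Time (s)');
ylabel('Frequency (Hz)');
zlabel('log10(|S|)');
```

X, Y and Z all have the same size as S, so mesh can pair each value Z(i,j) with its time X(i,j) and frequency Y(i,j).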