First I would like to suggest this tutorial if you have not already seen it.
Yes, you have to apply the DFT, the Mel filter bank, the log and the DCT to EACH AND EVERY FRAME, and then keep the first 13 DCT coefficients. The coefficients can be stored in an array of arrays of doubles (say vector< vector<double> > mfcc). Then mfcc[i].size() == 13 for every frame i: each mfcc[i] holds the first 13 coefficients of frame i, and mfcc is the vector of these per-frame coefficient vectors.
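To make the per-frame loop concrete, here is a minimal sketch of the pipeline described above. Everything not mentioned in the answer is an assumption chosen for illustration: the function names (computeMfcc, powerSpectrum, logMelEnergies, dctFirstCoeffs), the Hamming window, the 400-sample frame with a 160-sample hop (25 ms / 10 ms at 16 kHz), the 26 Mel filters, and the naive O(N^2) DFT used in place of an FFT to keep the code short. Treat it as a way to see the data layout, not as a tuned implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Frame = std::vector<double>;  // one frame of samples, or one frame's coefficients
const double kPi = 3.14159265358979323846;

double hzToMel(double hz) { return 2595.0 * std::log10(1.0 + hz / 700.0); }
double melToHz(double mel) { return 700.0 * (std::pow(10.0, mel / 2595.0) - 1.0); }

// Step 1: Hamming window + naive DFT, returning the power in bins 0..N/2.
std::vector<double> powerSpectrum(const Frame& frame) {
    const std::size_t n = frame.size();
    std::vector<double> power(n / 2 + 1);
    for (std::size_t k = 0; k < power.size(); ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double w = 0.54 - 0.46 * std::cos(2.0 * kPi * t / (n - 1));
            const double angle = 2.0 * kPi * k * t / n;
            re += w * frame[t] * std::cos(angle);
            im -= w * frame[t] * std::sin(angle);
        }
        power[k] = (re * re + im * im) / n;
    }
    return power;
}

// Steps 2 and 3: triangular Mel filter bank followed by the log.
std::vector<double> logMelEnergies(const std::vector<double>& power,
                                   int sampleRate, int numFilters) {
    const std::size_t numBins = power.size();
    const double nyquist = sampleRate / 2.0;
    const double melMax = hzToMel(nyquist);
    // Filter edges, evenly spaced on the Mel scale, as fractional bin positions.
    std::vector<double> edges(numFilters + 2);
    for (int i = 0; i < numFilters + 2; ++i) {
        const double hz = melToHz(melMax * i / (numFilters + 1));
        edges[i] = hz / nyquist * (numBins - 1);
    }
    std::vector<double> out(numFilters);
    for (int m = 0; m < numFilters; ++m) {
        double energy = 0.0;
        for (std::size_t k = 0; k < numBins; ++k) {
            const double rise = (k - edges[m]) / (edges[m + 1] - edges[m]);
            const double fall = (edges[m + 2] - k) / (edges[m + 2] - edges[m + 1]);
            energy += std::max(0.0, std::min(rise, fall)) * power[k];
        }
        out[m] = std::log(energy + 1e-10);  // small offset avoids log(0)
    }
    return out;
}

// Step 4: DCT-II of the log energies, keeping only the first numCoeffs outputs.
Frame dctFirstCoeffs(const std::vector<double>& logMel, int numCoeffs) {
    const std::size_t m = logMel.size();
    Frame c(numCoeffs);
    for (int i = 0; i < numCoeffs; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < m; ++j)
            sum += logMel[j] * std::cos(kPi * i * (j + 0.5) / m);
        c[i] = sum;
    }
    return c;
}

// Runs all four steps on every frame; mfcc[i] holds the coefficients of frame i.
std::vector<Frame> computeMfcc(const std::vector<double>& signal, int sampleRate,
                               std::size_t frameLen = 400, std::size_t hop = 160,
                               int numFilters = 26, int numCoeffs = 13) {
    assert(frameLen > 1 && hop > 0 && numCoeffs <= numFilters);
    std::vector<Frame> mfcc;
    for (std::size_t start = 0; start + frameLen <= signal.size(); start += hop) {
        const Frame frame(signal.begin() + start, signal.begin() + start + frameLen);
        const std::vector<double> logMel =
            logMelEnergies(powerSpectrum(frame), sampleRate, numFilters);
        mfcc.push_back(dctFirstCoeffs(logMel, numCoeffs));
    }
    return mfcc;
}
```

Calling computeMfcc(samples, 16000) on a vector of samples gives one inner vector of 13 values per complete frame, so mfcc.size() is the number of frames and mfcc[i][j] is coefficient j of frame i. A trailing partial frame is simply dropped here; padding it with zeros would be an equally reasonable choice.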
I would suggest using a C++ library for MFCC extraction instead of implementing everything from scratch.