If you want to follow the implementation steps, I suggest reading:
Regarding your questions:
1) The JPEG standard knows nothing about color spaces and does not care whether you use RGB, YCbCr, or CMYK. There are several JPEG file formats (e.g., JFIF, EXIF, Adobe) that specify the color space, usually YCbCr.
The reason for using YCbCr is that it follows the JPEG trend of concentrating information. There tends to be more useful information in the Y component than in the Cb or Cr components. Using YCbCr, you can sample 4 Ys (or even 16) for every Cb and Cr. At 4 Ys per Cb and Cr, that reduces the amount of data to be compressed by 1/2.
Note that the JPEG file formats specify limits on sampling (the JPEG standard itself allows 2:3 sampling, while most implementations do not).
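To make the arithmetic in 1) concrete, here is a minimal Python sketch. The function names (`rgb_to_ycbcr`, `subsample`, `sample_count`) are my own for illustration, the conversion constants are the full-range ones used by JFIF, and averaging each block is just one simple way to downsample chroma; real encoders may filter differently.

```python
from math import ceil

def rgb_to_ycbcr(r, g, b):
    """Full-range RGB -> YCbCr for 8-bit samples (JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

def subsample(plane, h, v):
    """Replace each h-by-v block of a 2D list with its average (one sample)."""
    rows, cols = len(plane), len(plane[0])
    out = []
    for top in range(0, rows, v):
        out_row = []
        for left in range(0, cols, h):
            block = [plane[y][x]
                     for y in range(top, min(top + v, rows))
                     for x in range(left, min(left + h, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def sample_count(width, height, h, v):
    """Total samples when Y is kept at full resolution and Cb, Cr are
    each reduced by a factor of h horizontally and v vertically."""
    luma = width * height
    chroma = ceil(width / h) * ceil(height / v)
    return luma + 2 * chroma

full = sample_count(16, 16, 1, 1)     # no subsampling: 3 samples per pixel
quarter = sample_count(16, 16, 2, 2)  # 4 Ys for every Cb and Cr
print(quarter / full)                 # 0.5
```

With 16 Ys for every Cb and Cr (`h=4, v=4`) the same count comes out at 3/8 of the original, so the saving past 4:1 is smaller than it might first appear.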
2) The DCT coefficients are run-length encoded, then Huffman (or arithmetic) encoded. You have to use both.
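As a rough illustration of the run-length stage in 2), the sketch below turns the 63 AC coefficients of one block (already quantized and in zigzag order) into (run, size) symbols plus values. The function name and the tuple layout are my own choices, not anything from a library. In baseline JPEG the Huffman coder then assigns a code to each (run, size) pair and the value bits follow it; the DC coefficient is handled separately and is not shown here.

```python
EOB = (0, 0)    # end of block: every remaining coefficient is zero
ZRL = (15, 0)   # zero run length: stands for 16 consecutive zeros

def run_length_encode_ac(ac):
    """Run-length encode AC coefficients given in zigzag order.

    Returns a list of ((run, size), value) tuples, where run is the number
    of zeros skipped before the value and size is the number of bits needed
    for abs(value). EOB and ZRL carry None as their value.
    """
    # Index of the last nonzero coefficient, or -1 if the block is all zeros.
    last_nonzero = max((i for i, c in enumerate(ac) if c != 0), default=-1)

    symbols = []
    run = 0
    for c in ac[:last_nonzero + 1]:
        if c == 0:
            run += 1
            if run == 16:           # a run field only holds 0..15
                symbols.append((ZRL, None))
                run = 0
            continue
        size = abs(c).bit_length()
        symbols.append(((run, size), c))
        run = 0

    # Trailing zeros are not coded one by one; a single EOB covers them.
    if last_nonzero < len(ac) - 1:
        symbols.append((EOB, None))
    return symbols

block = [5, 0, 0, -3] + [0] * 59
print(run_length_encode_ac(block))
# [((0, 3), 5), ((2, 2), -3), ((0, 0), None)]
```

This is why you need both stages: the run-length step collapses the long zero runs that quantization produces, and the entropy coder then shortens the symbols that occur most often.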