Yes, the machine code is recompiled on the fly: the driver compiles the PTX code embedded in the executable into machine code for the device. This is called the JIT-compile step, and it occurs whenever the executable contains no machine code that matches the device being used (assuming valid PTX code exists in the executable).
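As a rough way to see what the runtime actually loaded, the sketch below compares the device's compute capability with the versions reported by `cudaFuncGetAttributes`. The kernel name `dummyKernel` and the choice of device 0 are assumptions made for illustration. The printed values are only a hint, not proof that JIT compilation happened.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical empty kernel; only its compiled attributes are inspected.
__global__ void dummyKernel() {}

int main() {
    const int device = 0;  // assumption: inspect the first device
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaFuncAttributes attr;
    err = cudaFuncGetAttributes(&attr, dummyKernel);
    if (err != cudaSuccess) {
        // Typically fails if no compatible machine code or PTX is available.
        std::fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Versions are encoded as major * 10 + minor (e.g. 35 for sm_35).
    std::printf("device compute capability: %d\n", prop.major * 10 + prop.minor);
    std::printf("kernel binary version:     %d\n", attr.binaryVersion);
    std::printf("kernel PTX version:        %d\n", attr.ptxVersion);
    return 0;
}
```

If the executable carries only sm_20 machine code plus PTX and is run on an sm_35 device, the binary version reported would be expected to follow the device rather than the original build target.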
You can learn more about JIT compilation here. Note the discussion of the cache, which should alleviate the issue after the first run.
If you specify compilation for both sm_20 and sm_35, you can build a binary/executable that starts quickly on both types of devices, since neither needs the JIT-compile step. The compiler will also notify you during the compile process if you use an sm_35 feature that is not supported on sm_20.
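A minimal sketch of such a dual-target build follows. The file name `scale.cu`, the kernel `scale`, and the use of `__ldg` as the sm_35-only feature are assumptions chosen for illustration. The build line in the comment assumes a toolkit that still accepts sm_20 (support for it was dropped in CUDA 9).

```cuda
// Assumed build command (one -gencode per target architecture):
//   nvcc -gencode arch=compute_20,code=sm_20 \
//        -gencode arch=compute_35,code=sm_35 scale.cu -o scale
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(const float* in, float* out, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
#if __CUDA_ARCH__ >= 350
        // Read-only cache load; compiled only into the sm_35 code path.
        out[i] = factor * __ldg(&in[i]);
#else
        // Plain load for sm_20 (and for the host compilation pass).
        out[i] = factor * in[i];
#endif
    }
}

int main() {
    const int n = 256;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 127) / 128, 128>>>(d_in, d_out, n, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("out[10] = %f\n", h_out[10]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Without the `__CUDA_ARCH__` guard, the compute_20 compilation pass would be expected to reject `__ldg`, which is the compile-time notification described above.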