The code you are compiling requires a static allocation of 28880 bytes (0x70d0) of shared memory per block. For compute capability 2.x and newer GPUs, this is no problem because they support up to 48 KB of shared memory per block. However, for compute capability 1.x devices, the shared memory limit is 16 KB (and up to 256 bytes of that can be consumed by kernel arguments). Because of this, the code cannot be compiled for compute 1.x devices, and the compiler is generating an error telling you so. In other words, the error comes from specifying sm_13/compute_13 to the compiler. You can remove that target and the build should work.
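To make the arithmetic concrete, here is a minimal sketch of the size check described above. The constant names and the `fits_in_shared` helper are hypothetical, introduced only for illustration. The limits are the 16 KB and 48 KB figures quoted above, and the 256 bytes that kernel arguments may consume on compute 1.x are ignored, since the request already exceeds the limit without them.

```cpp
#include <cstddef>

// Per-block shared memory limits quoted in the answer above.
constexpr std::size_t kSharedLimitCc1x = 16 * 1024;  // 16 KB on compute 1.x
constexpr std::size_t kSharedLimitCc2x = 48 * 1024;  // 48 KB on compute 2.x

// Static shared memory the kernels request, as reported by the compiler.
constexpr std::size_t kRequiredBytes = 0x70d0;

// Hypothetical helper: does a static allocation fit under a given limit?
constexpr bool fits_in_shared(std::size_t required, std::size_t limit) {
    return required <= limit;
}

// 0x70d0 is 28880 bytes in decimal.
static_assert(kRequiredBytes == 28880, "0x70d0 should equal 28880");

// Too large for a compute 1.x device such as one targeted by sm_13.
static_assert(!fits_in_shared(kRequiredBytes, kSharedLimitCc1x),
              "28880 bytes exceeds the 16 KB limit");

// Fits on compute 2.x, which is why removing sm_13 lets the build succeed.
static_assert(fits_in_shared(kRequiredBytes, kSharedLimitCc2x),
              "28880 bytes fits within the 48 KB limit");
```

The request overshoots the compute 1.x limit by 12496 bytes, so trimming a few arguments would not help. On a real system, the per-block limit of the installed GPU can be read from the `sharedMemPerBlock` field filled in by `cudaGetDeviceProperties`.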
However, it gets worse. The Tesla C1060 is a compute capability 1.3 device. As a result, even if the build succeeds, you will not be able to run those kernels on your GPUs. There is no solution short of omitting those kernels from the build (if you don't need them), back-porting the code to the compute 1.x architecture (I have no idea whether that is feasible), or finding more modern hardware to run the code on.