CUDA kernel printf of (int) -1 gives wrong output with %d specifier

Question 1

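Appendix B.32.1, "Format Specifiers", of the CUDA C Programming Guide lists only h, l and ll as size modifiers, so the z modifier in %zd is not supported. You will have to cast to unsigned long and use %lu as the format specifier (where unsigned long is narrower than size_t, as on 64-bit Windows, cast to unsigned long long and use %llu instead):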

printf("On DEV: Size: %lu, Minus One: %d\n",(unsigned long)(sp->size), (int)-1);

Question 2

The basic problem seems to be that the %zd format specifier isn't being honoured by the device printf (I am not sure, off the top of my head, whether it is supported).

EDIT:

The documentation for printf in CUDA 5 says this:

As for standard printf(), format specifiers take the form: %[flags][width][.precision][size]type

The following fields are supported (see widely-available documentation for a complete description of all behaviors):
Flags: ‘#’ ‘ ’ ‘0’ ‘+’ ‘-’
Width: ‘*’ ‘0-9’
Precision: ‘0-9’
Size: ‘h’ ‘l’ ‘ll’
Type: ‘%cdiouxXpeEfgGaAs’
Note that CUDA’s printf() will accept any combination of flag, width, precision, size and type, whether or not overall they form a valid format specifier. In other words, “%hd” will be accepted and printf will expect a double-precision variable in the corresponding location in the argument list.

So the %zd format specifier for size_t isn't supported. Modifying your kernel like this:

__global__ void f(SpacePtr<int>* sp)
{
    const int minus_one = -1;
    printf("This _is_ a 'minus one' or ain't it: %d\n", minus_one);
    printf("On DEV: Size: %d, Minus One: %d\n",int(sp->size), minus_one);
}

works fine, provided the size fits in an int (the cast truncates anything larger).
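Since the device printf accepts a bad specifier silently, it can help to check format strings on the host before they go into a kernel. Below is a minimal sketch of such a check, based only on the field list quoted above. The function name device_printf_supports is hypothetical (it is not part of CUDA), and the code is plain host C++ that nvcc compiles as well.

```cpp
#include <cstring>

// Hypothetical helper, not a CUDA API: returns true when every conversion in
// `fmt` uses only the fields quoted from the documentation above, i.e.
// flags "#0+- ", width "*" or digits, precision digits, size h / l / ll,
// and a type character from "cdiouxXpeEfgGaAs" (plus the literal "%%").
bool device_printf_supports(const char* fmt)
{
    const char* types = "cdiouxXpeEfgGaAs";
    for (const char* p = fmt; *p != '\0'; ++p) {
        if (*p != '%') continue;                  // ordinary character
        ++p;
        if (*p == '%') continue;                  // literal "%%"
        // Flags. The '\0' guard matters: strchr also matches the terminator.
        while (*p != '\0' && std::strchr("#0+- ", *p)) ++p;
        // Width: '*' or digits.
        while (*p == '*' || (*p >= '0' && *p <= '9')) ++p;
        // Precision: '.' followed by digits.
        if (*p == '.') {
            ++p;
            while (*p >= '0' && *p <= '9') ++p;
        }
        // Size: h, l or ll only.
        if (*p == 'h') {
            ++p;
        } else if (*p == 'l') {
            ++p;
            if (*p == 'l') ++p;
        }
        // Anything else here (such as the 'z' in "%zd") is not in the list.
        if (*p == '\0' || std::strchr(types, *p) == nullptr)
            return false;
    }
    return true;
}
```

With this sketch, the original format string containing %zd is rejected, while the %lu and %llu variants pass. It only checks the shape of the specifier; it cannot tell whether the argument you pass actually matches it.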

Also note that your host code has a fairly serious mistake, although it doesn't affect the behaviour shown in your example: you allocate and copy only one byte for devPtr, which is incorrect. It should look something like:

cudaMalloc((void **)&devPtr, sizeof(data));
cudaMemcpy(devPtr, &data, sizeof(data), cudaMemcpyHostToDevice);

to transfer the full contents of data from host to device.
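Putting both fixes together, a complete program might look roughly like the sketch below. The SpacePtr definition is a hypothetical stand-in, since the real one is in the question and only its size member matters here. I have used unsigned long long with %llu rather than unsigned long with %lu so that the value is not truncated on platforms where unsigned long is 32 bits; treat that as my choice, not something the documentation mandates.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the question's SpacePtr<T>.
template <typename T>
struct SpacePtr {
    size_t size;
    T*     ptr;
};

__global__ void f(const SpacePtr<int>* sp)
{
    // The device printf has no size modifier for size_t, so widen the value
    // explicitly and print it with the supported "ll" modifier.
    printf("On DEV: Size: %llu, Minus One: %d\n",
           (unsigned long long)sp->size, -1);
}

int main()
{
    SpacePtr<int> data;
    data.size = 10;        // arbitrary value for the demonstration
    data.ptr  = nullptr;

    SpacePtr<int>* devPtr = nullptr;

    // Allocate and copy the whole struct, not a single byte.
    cudaError_t err = cudaMalloc((void**)&devPtr, sizeof(data));
    if (err == cudaSuccess)
        err = cudaMemcpy(devPtr, &data, sizeof(data), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    f<<<1, 1>>>(devPtr);

    // Wait for the kernel; this also flushes the device printf buffer.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaFree(devPtr);
    return 0;
}
```

Checking the return codes is optional for a toy example, but it makes an undersized allocation or a failed copy show up immediately instead of as garbage in the printed output.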