OpenCL select() function with double

https://stackoverflow.com/questions/20936092

24-09-2022

Question

I'm porting some complex engineering code to OpenCL and have run into a problem with the select() ternary function with doubles. I'm just using scalars for now so I could use the simple C ternary operator ()?: but I plan to move to vector types soon.

My problem is that select with doubles requires a (long) type as the comparison but the scalar relational functions (e.g., isgreater) only return (int) for doubles. The prototypes for these functions are ...

int   isgreater (double a, double b);
longn isgreater (doublen a, doublen b);

double  select (double a, double b, long cmp);
doublen select (doublen a, doublen b, longn cmp);

I can get the scalar code to compile and run only if I cast the result of isgreater() to long, since select requires the element types to be the same size.

double hi = ...;
double lo = ...;
double res = select (lo, hi, (long)isgreater(T, T_cutoff));

Otherwise, I get a compiler error since select is ambiguous. There seems to be a mismatch in the specification regarding the relational mask types for scalar and vector doubles.

Q1: Is this an oversight in the specification or a bug in the implementation? Both the Intel and AMD OpenCL compilers fail for builds on the CPU, so I'm guessing it is the former.

Q2: OpenCL scalar relational functions return 0/1 and vector relational functions return 0/-1 (that is, all bits set). The (int)->(long) conversion appears to be consistent with this requirement but not (int)->(ulong), right? Is the (int)->(long) conversion costly?

Q3: When (if) I switch to vector doubles, will the compiler toss out the unnecessary explicit conversion? I want to retain both scalar and vector types so I can target CUDA GPUs and SIMD devices (MIC, CPUs) w/o having to keep two massive code sets.

Thanks for any advice here.

Solution

Q1: I'd say that not implicitly converting the result of isgreater into long is an oversight in the specification.

In the single-element case, select should work exactly like the ternary operator. That is also why isgreater returns 1 in the scalar case. Basically, isgreater should work exactly like > does when scalar operands are used.

In the vectorized case, select looks at the MSB of each component, which is why isgreater returns -1 (all bits are 1, so the MSB is naturally 1 too).
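The difference between the two rules can be modelled in plain C. This is only a sketch of the semantics described above, not the OpenCL built-in: `model_select_scalar` and `model_select_lane` are hypothetical names, and `int64_t` stands in for OpenCL's 64-bit `long`.

```c
#include <stdint.h>

/* Hypothetical model of scalar select(a, b, cmp):
   behaves like the ternary operator, so any nonzero cmp picks b. */
double model_select_scalar(double a, double b, int64_t cmp) {
    return cmp ? b : a;
}

/* Hypothetical model of ONE lane of vector select(a, b, cmp):
   only the most significant bit of the 64-bit mask is inspected. */
double model_select_lane(double a, double b, int64_t cmp) {
    uint64_t bits = (uint64_t)cmp;   /* reinterpret the mask as raw bits */
    return (bits >> 63) ? b : a;     /* MSB set -> b, MSB clear -> a */
}
```

Under this model a mask value of 1 selects `b` in the scalar rule but selects `a` in the per-lane rule, because its MSB is clear. That is why the vector relational functions return -1 rather than 1.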

Q2: The int-to-long conversion shouldn't be costly at all. At most it requires one additional instruction.
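As a rough illustration, the widening is a plain sign extension. The sketch below uses plain C fixed-width types in place of OpenCL's `int` and `long`, and `widen_mask` is a hypothetical helper name. It shows that both the scalar convention (0/1) and the vector convention (0/-1) survive the conversion.

```c
#include <stdint.h>

/* Hypothetical helper: widen a 32-bit comparison result to the
   64-bit mask type. The cast sign-extends, so 1 stays 1 and
   -1 stays -1 (all 64 bits set). */
int64_t widen_mask(int32_t cmp) {
    return (int64_t)cmp;
}
```

On typical 64-bit hardware a cast like this is usually a single sign-extend instruction, or is folded away entirely by the compiler.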

Q3: No, it does not.

Annoyingly, this issue prevents writing code that vectorizes from 1 to n elements; the scalar case requires special handling.
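One possible form of that special handling, sketched here in plain C rather than OpenCL C, is to make the scalar comparison produce a vector-style mask (0 or -1), so a single MSB-based convention covers both cases. This is an assumption for illustration, not something the answer prescribes. `mask_isgreater` and `msb_select` are hypothetical names, and `isgreater` here is the C99 macro from `<math.h>`, which returns 0 or 1 like the OpenCL scalar version.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: scalar comparison that returns a vector-style
   mask, 0 when false and -1 (all bits set) when true. */
int64_t mask_isgreater(double a, double b) {
    return -(int64_t)isgreater(a, b);  /* 0 -> 0, 1 -> -1 */
}

/* Hypothetical stand-in for select(): tests only the MSB of the mask,
   as the vector form does. */
double msb_select(double a, double b, int64_t mask) {
    return ((uint64_t)mask >> 63) ? b : a;
}
```

With this convention, `msb_select(lo, hi, mask_isgreater(T, T_cutoff))` picks `hi` when `T > T_cutoff`. A mask of -1 is also nonzero, so it would satisfy the ternary-style scalar rule as well.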

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow