There is no such API support in VTune. Instead, use the Knights Corner Platform Analysis in the VTune GUI or from the command line: it runs your program on the host but collects hardware counters only from the Xeon Phi card. As a result, you should see performance metrics for the offloaded code only.
You may also find this article useful for interpreting the results: http://software.intel.com/en-us/ARTICLES/OPTIMIZATION-AND-PERFORMANCE-TUNING-FOR-INTEL-XEON-PHI-COPROCESSORS-PART-2-UNDERSTANDING