Question

I'm currently taking a paper on Big Data that has us using R heavily for data analysis. I happen to have a GTX 1070 in my PC for gaming, so I thought it would be really cool to use it to speed up some of the processing my lecturers have me doing, but this doesn't seem easy at all. I've installed gpuR, CUDA, Rtools, and a few other bits and bobs, and I can create gpuMatrix objects from genomic expression data, for example, but I have yet to find a function that both works with gpuMatrix objects and makes a noticeable difference in performance. Perhaps this is just a limitation of the gpuR package. Some other packages do seem to offer functions closer to what I'm looking for, which brings me to the question:

Almost all of those packages are exclusively for Linux. Is it particularly hard to implement GPU support for R on Windows, or is there some other reason that so few packages are available for it? In some sense I'm just curious, but it would also be very cool to get it working. It surprises me that there is so little available for Windows; usually it's the other way around.

Solution

In my experience, setting up GPU processing for R is hard, and setting it up on a Windows machine is even harder. Additionally, GPU processing only pays off for very specific types of calculations, typically large, highly parallel numeric work such as dense matrix operations.

If you just want to set up GPU processing for the sake of it, then my answer is not much use.

If, however, you care about the general performance of your system and your code, I advise checking the following:

  • Use Microsoft R Open instead of base R, because it ships with the multithreaded Intel MKL, so linear-algebra operations use multiple cores automatically.

  • Vectorize your code

  • Use libraries such as data.table instead of plain data frames

  • Avoid growing objects
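As a rough illustration of the "vectorize" and "avoid growing objects" points, here is a minimal base R sketch. The function names (`grow_squares`, `prealloc_squares`, `vector_squares`) and the toy workload are made up for this example, and the actual timings will depend on your machine and R version.

```r
# Hypothetical toy workload: square every element of a numeric vector.
set.seed(1)
x <- runif(1e4)

# Slow: the result is grown with c() on every iteration,
# so R allocates a new vector and copies the old one each time.
grow_squares <- function(x) {
  out <- c()
  for (v in x) {
    out <- c(out, v^2)
  }
  out
}

# Better: allocate the full result once, then fill it in place.
prealloc_squares <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i]^2
  }
  out
}

# Usually best: let the vectorized operator do the loop in compiled code.
vector_squares <- function(x) {
  x^2
}

# Compare elapsed times; all three return the same values.
system.time(r1 <- grow_squares(x))
system.time(r2 <- prealloc_squares(x))
system.time(r3 <- vector_squares(x))
```

On larger inputs the gap between the growing version and the other two tends to widen quickly, because the repeated copying makes its cost grow roughly quadratically with the length of the vector.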

In general, R's performance depends strongly on the quality of your code. A very good summary of what you can and should do is provided in The R Inferno by Patrick Burns.

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange