Question

Using systems such as Parallel LINQ, it's possible to split up execution of anonymous functions, queries, etc. across multiple cores and threads within a single machine. I'd like the ability to extend this to run across multiple machines using standard language constructs such as for loops (like Parallel.For()) and value types like ints, structs, etc., while keeping the application source modifications to a minimum. Ideally, this would allow me to open a project, add an attribute to a method, and recompile to gain access to the enhanced functionality.

It seems that I'd need something along the lines of:

  1. The ability to capture a compiled block of code (such as a lambda) and pass it to a worker process running on another node, along with any data that is required, or

  2. A preprocessor that would capture the code in question, compile it in a sort of template project that replaces variable references, etc., with references to a class that handles network communication, caching, and access to any other required assets, and then send the resulting DLL to any available worker nodes running on other machines.

Roslyn appears to provide some tools that would be useful here. Is there a way to hook into the current compilation pipeline to allow this?

Edit

Okay, I know this is possible, because these guys did it. The question is, how?


Solution

You don't have to extend the language per se to do what Brahma does. Its author implemented a custom query provider that parses expression trees and emits GPGPU code (LINQ to SQL does the same thing, but emits SQL).

I linked a basic guide on MSDN here that can get you up and running implementing an IQueryable provider.

The hard part will be traversing the expression trees and generating OpenCL code. Once you can do that, you pass the generated code to Cloo and you should be up and running.
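To give a feel for the traversal step, here is a minimal sketch that walks a lambda's expression tree with ExpressionVisitor and emits a C-like expression string. The class name KernelTextVisitor and the output format are assumptions made for illustration; this is not Brahma's code, it does not produce a complete OpenCL kernel, and it only handles parameters, constants, and the four arithmetic operators.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;
using System.Text;

// Hypothetical translator (not part of Brahma or Cloo): turns the body of a
// lambda into a C-like expression string.
public sealed class KernelTextVisitor : ExpressionVisitor
{
    private readonly StringBuilder _sb = new StringBuilder();

    public static string Translate(LambdaExpression lambda)
    {
        var visitor = new KernelTextVisitor();
        visitor.Visit(lambda.Body);
        return visitor._sb.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        // Emit "(left op right)" so precedence survives the translation.
        _sb.Append('(');
        Visit(node.Left);
        _sb.Append(' ').Append(OperatorFor(node.NodeType)).Append(' ');
        Visit(node.Right);
        _sb.Append(')');
        return node;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        _sb.Append(node.Name);
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        _sb.Append(Convert.ToString(node.Value, CultureInfo.InvariantCulture));
        return node;
    }

    private static string OperatorFor(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Add: return "+";
            case ExpressionType.Subtract: return "-";
            case ExpressionType.Multiply: return "*";
            case ExpressionType.Divide: return "/";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}
```

Given Expression<Func<int, int, int>> f = (x, y) => x * y + 2, calling KernelTextVisitor.Translate(f) returns "((x * y) + 2)". Node types that are not overridden here (method calls, captured variables, conversions) fall through to the base visitor and are not translated correctly, so a real provider would need to handle or reject them explicitly.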

Edit

You linked a tool that compiles standard .NET code to GPU code using a [Kernel] attribute. It does this with a post-build tool that looks for the attribute in the compiled IL and rewrites the IL to generate GPU calls. This is similar to how PostSharp, an AOP solution, works.

IL rewriting is time-consuming, hard work, but you could also go this route.
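The discovery half of such a tool can be sketched with plain reflection. The KernelAttribute type below is a hypothetical stand-in defined here for illustration, not the attribute from the linked tool, and reflection over a loaded assembly is a simplification: a real post-build step would read the compiled IL with an IL library and then rewrite it, which this sketch does not attempt.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute, standing in for the tool's [Kernel].
[AttributeUsage(AttributeTargets.Method)]
public sealed class KernelAttribute : Attribute { }

public static class SampleKernels
{
    [Kernel]
    public static int AddOne(int x) { return x + 1; }

    public static int NotAKernel(int x) { return x; }
}

public static class KernelScanner
{
    // Returns "Type.Method" names for every method marked with [Kernel].
    public static List<string> FindKernels(Assembly assembly)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.NonPublic |
                                   BindingFlags.Static | BindingFlags.Instance |
                                   BindingFlags.DeclaredOnly;

        return assembly.GetTypes()
            .SelectMany(t => t.GetMethods(flags))
            .Where(m => m.IsDefined(typeof(KernelAttribute), false))
            .Select(m => m.DeclaringType.FullName + "." + m.Name)
            .ToList();
    }
}
```

Calling KernelScanner.FindKernels(typeof(SampleKernels).Assembly) lists AddOne but not NotAKernel. Everything after this point (generating the GPU code and patching the call sites) is where the hard work mentioned above lies.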

OTHER TIPS

"Using systems such as Parallel Linq, it's possible to split up execution of anonymous functions, queries, etc across multiple cores and threads within a single machine. I'd like the ability to extend this to run across multiple machines using standard language constructs such as for loops (like Parallel.For()), value types like ints, structs, etc., and keep the application source modifications to a minimum."

Sounds great. In fact we have a system very much like that over in Microsoft Research, though obviously I cannot discuss the details.

"I need the ability to capture a compiled block of code (such as a lambda) and pass it to a worker process running on another node, along with any data that is required"

OK, you've got it. We added that feature to C# 3. That's how LINQ to SQL works. Somehow the LINQ query has to get to the database. The lambda, compiled into an expression tree, is interrogated on the client machine and transformed into a query, which is sent to the server node, and then the result is sent back.

"Roslyn appears to provide some tools that would be useful here. Is there a way to hook into the current compilation pipeline to allow this?"

That's not the purpose of Roslyn; Roslyn is not about adding new features to the C# language. It's about making it easier to analyze code to build things like refactoring engines.

You don't need to hook into the compilation pipeline. PLINQ doesn't change the compiler, LINQ to SQL doesn't change the compiler, and so on. When you convert a lambda to an expression tree, the compiler emits code that creates an expression tree at runtime that represents the lambda. You can interrogate that expression tree, serialize it across to another machine in your network, deserialize it, turn it into a delegate, and run it, if that's the kind of thing you enjoy doing.
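As a small illustration of that flow on a single machine, the sketch below captures a lambda as an expression tree, inspects it, and compiles it back into a delegate. The class and member names (ExpressionTreeDemo, Square, Describe, Run) are made up for this example; only Expression<TDelegate>, BinaryExpression, and Compile() come from System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Assigning a lambda to Expression<TDelegate> makes the compiler emit
    // code that builds the tree at runtime instead of compiling it to IL.
    public static readonly Expression<Func<int, int>> Square = x => x * x;

    // Interrogate the tree: the body of x => x * x is a Multiply node.
    public static string Describe()
    {
        var body = (BinaryExpression)Square.Body;
        return body.NodeType + " of " + body.Left + " and " + body.Right;
    }

    // Turn the tree into an ordinary delegate and run it.
    public static int Run(int value)
    {
        Func<int, int> compiled = Square.Compile();
        return compiled(value);
    }
}
```

ExpressionTreeDemo.Describe() returns "Multiply of x and x", and ExpressionTreeDemo.Run(7) returns 49. In a distributed setup, the interrogation would happen on the client and the Compile() call on the worker, with a serialization step in between.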

You'll probably need to write your own expression tree serializer and deserializer, but expression trees are pretty straightforward data structures. Being immutable trees should make them pretty easy to serialize and deserialize; they can't really form complex networks, since they are always constructed from the leaf nodes up.
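A toy version of such a serializer might look like the sketch below. The wire format (space-separated prefix tokens, "p" for the parameter, "c:N" for an int constant) and the name TinyExpressionCodec are invented for this example, and it only covers Func<int, int> lambdas built from one parameter, int constants, and Add, Subtract, and Multiply. A usable serializer would also need to cover method calls, member access, captured variables, and type information.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq.Expressions;

public static class TinyExpressionCodec
{
    public static string Serialize(Expression<Func<int, int>> lambda)
    {
        var tokens = new List<string>();
        Write(lambda.Body, tokens);
        return string.Join(" ", tokens);
    }

    // Pre-order walk: node kind first, then its children.
    private static void Write(Expression node, List<string> tokens)
    {
        switch (node)
        {
            case BinaryExpression b when b.NodeType == ExpressionType.Add
                                      || b.NodeType == ExpressionType.Subtract
                                      || b.NodeType == ExpressionType.Multiply:
                tokens.Add(b.NodeType.ToString());
                Write(b.Left, tokens);
                Write(b.Right, tokens);
                break;
            case ParameterExpression _:
                tokens.Add("p");
                break;
            case ConstantExpression c when c.Value is int i:
                tokens.Add("c:" + i.ToString(CultureInfo.InvariantCulture));
                break;
            default:
                throw new NotSupportedException(node.NodeType.ToString());
        }
    }

    public static Expression<Func<int, int>> Deserialize(string text)
    {
        var tokens = new Queue<string>(text.Split(' '));
        ParameterExpression parameter = Expression.Parameter(typeof(int), "x");
        Expression body = Read(tokens, parameter);
        return Expression.Lambda<Func<int, int>>(body, parameter);
    }

    // Rebuilds the tree from the leaves up using the Expression factory methods.
    private static Expression Read(Queue<string> tokens, ParameterExpression parameter)
    {
        string token = tokens.Dequeue();
        if (token == "p") return parameter;
        if (token.StartsWith("c:", StringComparison.Ordinal))
            return Expression.Constant(int.Parse(token.Substring(2), CultureInfo.InvariantCulture));

        var kind = (ExpressionType)Enum.Parse(typeof(ExpressionType), token);
        Expression left = Read(tokens, parameter);
        Expression right = Read(tokens, parameter);
        return Expression.MakeBinary(kind, left, right);
    }
}
```

Serializing x => x * x + 1 yields "Add Multiply p p c:1". On the receiving side, TinyExpressionCodec.Deserialize(text).Compile() gives back a delegate, and invoking it with 3 returns 10.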

Licensed under: CC-BY-SA with attribution