Question

I am starting to code my own machine learning package, adopting ILNumerics.

I will definitely get into neural nets, SVMs, and kernel methods, and then I will start moving into more Bayesian frameworks.

I know that ILNumerics already offers some 'machine learning toolbox', but I would like to add my contribution and code my own algorithms, also because some of the features are not present (yet?)

First, if I understand correctly, no optimization package is included as of now. I hope I am wrong; if I am not, any suggestions on how to implement one with ILNumerics would be highly appreciated. Specifically: would mixing in pre-existing code hurt performance? Is it advisable to mix ILArray with other vector/matrix types? Would adhering to the recommendations in the quick guide be enough to benefit from the excellent performance?

Or, if you prefer, would you recommend any pre-built optimization package/library to use in conjunction with ILNumerics?

Thanks a lot for any hints, advice, or recommendations, as usual,

GL

Solution

You are right on all points. There is currently no optimization package available in ILNumerics. However, as you know, one big advantage of .NET is how easily external packages can be incorporated. Several options exist here:

PInvoke (native modules)

Since most existing optimization packages exist as native modules, PInvoke is your friend. Several tools exist for automatic DllImport signature generation. Personally, I prefer to create those signatures manually, especially since most scientific packages expose a simple signature that is easily incorporated into .NET. However, problems can arise with callbacks from native to managed code and with the marshalling of complex structs. (SO and we will help you to solve them all...)
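As a rough illustration of what a hand-written signature with a native-to-managed callback can look like, here is a minimal sketch. The library name "nativeopt", the export minimize and its parameter list are hypothetical placeholders, not the API of any real package; only the attributes and Marshal calls are standard .NET.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeOptimizer
{
    // Hypothetical callback shape: native code passes a pointer to n doubles
    // and expects the objective value back. Cdecl is an assumption about the library.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate double Objective(IntPtr x, int n);

    // Hypothetical export: "nativeopt" and "minimize" are placeholder names.
    // double[] is blittable, so it is pinned and passed as a pointer for the call.
    [DllImport("nativeopt", CallingConvention = CallingConvention.Cdecl)]
    private static extern int minimize(Objective f, [In, Out] double[] x, int n, double tolerance);

    // Example managed objective: copies the native buffer and returns the sum of squares.
    public static double SumOfSquares(IntPtr x, int n)
    {
        var buffer = new double[n];
        Marshal.Copy(x, buffer, 0, n);
        double sum = 0.0;
        foreach (double v in buffer) sum += v * v;
        return sum;
    }

    // Wrapper that hides the raw signature; start is overwritten with the result.
    // Calling this requires the (hypothetical) native library to be present.
    public static int Minimize(double[] start, double tolerance)
    {
        Objective callback = SumOfSquares;
        int status = minimize(callback, start, start.Length, tolerance);
        // Keep the delegate reachable until the native call has returned.
        GC.KeepAlive(callback);
        return status;
    }
}
```

If the native side stores the function pointer beyond a single call, the delegate should be held in a field for as long as the native code may invoke it; otherwise the garbage collector can reclaim it, which is one typical source of the callback problems mentioned above.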

.NET modules

You may find an existing .NET optimization module. See this post (Free Optimization Library in C#) or try Microsoft Solver Foundation. Better modules may already exist; I haven't looked for a while. Unless the implementation was done very carefully, they may suffer in performance due to poor (or no) memory management. (As far as I know, no other project tracks memory as efficiently as ILNumerics does.) However, interfacing with those libraries is easy enough: no DllImport signatures are needed. But in order to profit from the ILNumerics memory management, you will have to manage the array memory on the ILNumerics side. So the pattern for handing a System.Array to some other .NET function would be:

 .... inside ILNumerics function
 using (ILScope.Enter(inparameter1,inparameter2)) {
    ....
    ILArray<double> A = zeros(1000,1000);  // allocate memory for external use
    var aArray = A.GetArrayForWrite();     // fetch reference to underlying System.Array  
    callOtherLib(aArray);                  // let other lib use and fill the array
    // proceed normally with A... 
    return A + 1 * 2 ... ; 
 }

If the other lib only reads from the given array, A.GetArrayForRead() may give better performance. This scheme ensures the most efficient memory usage, at least on the ILNumerics side of your implementation.

Mixing data structures from both sides does not do any harm, but it usually doesn't bring much advantage either: it often costs you the convenient syntax, since there are no combined operators for mixed matrix implementations. Also, you will often be forced to break your matrix accesses down into elementwise operations, which potentially leads to a less performant solution. So I recommend a modular design with clearly separated APIs.

The memory scheme above is also applicable (and recommended) for interfacing native libs.
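One way to read the advice about clearly separated APIs is to let only plain double[] buffers cross the module boundary, so that neither side depends on the other's matrix type. The following is a sketch in plain C# under that assumption; IMinimizer and FixedStepDescent are hypothetical names, and the toy fixed-step descent with a central-difference gradient merely stands in for whatever external solver is actually used.

```csharp
using System;

// Hypothetical boundary: only double[] crosses between the array library and the solver.
public interface IMinimizer
{
    // Overwrites x in place with the estimated minimizer and returns the objective value there.
    double Minimize(Func<double[], double> objective, double[] x);
}

// Toy stand-in for an external solver: fixed-step gradient descent
// using a central-difference approximation of the gradient.
public sealed class FixedStepDescent : IMinimizer
{
    private readonly double step;
    private readonly int iterations;

    public FixedStepDescent(double step, int iterations)
    {
        this.step = step;
        this.iterations = iterations;
    }

    public double Minimize(Func<double[], double> objective, double[] x)
    {
        const double h = 1e-6;               // finite-difference width
        var grad = new double[x.Length];
        for (int it = 0; it < iterations; it++)
        {
            for (int i = 0; i < x.Length; i++)
            {
                double saved = x[i];
                x[i] = saved + h; double fPlus = objective(x);
                x[i] = saved - h; double fMinus = objective(x);
                x[i] = saved;                // restore before the next coordinate
                grad[i] = (fPlus - fMinus) / (2 * h);
            }
            for (int i = 0; i < x.Length; i++)
                x[i] -= step * grad[i];
        }
        return objective(x);
    }
}
```

Because the solver fills the caller's buffer in place, the array obtained from A.GetArrayForWrite() in the pattern above could be passed as x directly, and all further matrix work stays on the ILNumerics side.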

Using ILNumerics only

Another way, of course, is to reimplement a module on your own, using ILNumerics built-in functions and array features. This route is obligatory for any package that is to be incorporated into the official ILNumerics distribution. It brings several advantages: you can use the convenient ILNumerics syntax, you automatically profit from the efficient ILNumerics memory management, and the code ends up fully platform independent. It also gives you the most flexibility regarding the features your algorithm needs.

Licensed under: CC-BY-SA with attribution