Question

So it's late here, and my Google skills seem to be failing me. I've found some great responses on SO before (time and time again), so I thought you guys could help.

I have a neural network I'm trying to run in native Objective-C. It works, but it's too slow. These networks are not recurrent. I run each network about 20,000 times (128x80 times, or around that). The problem is that these networks really just boil down to math functions (each network is a 4-dimensional function, taking x, y, dist(x,y), and bias as inputs and outputting 3 values).

What I want to do is convert each network (only once) into a function call or a block of code at runtime in Objective-C.

How do I do this? I could make a big string of the math operations that need to be performed, but how do I go about executing that string, or converting the string into a block of code for execution?

Again, my late night search failed me, so sorry if this has already been answered. Any help is greatly appreciated.

-Paul

Edit: Aha! Great success! Nearly 24 hours later, I have working code to turn a neural network with up to 4 inputs into a single 4-dimensional function. I used the block method suggested by Dave DeLong in the answers.

For anybody who ever wants to follow what I've done in the future, here is a (quick) breakdown of what I did (excuse me if this is incorrect etiquette on Stack Overflow): First, I made a few typedefs for the different block functions:

typedef CGFloat (^oneDFunction)(CGFloat x);
typedef CGFloat (^twoDFunction)(CGFloat x, CGFloat y);
typedef CGFloat (^threeDFunction)(CGFloat x, CGFloat y, CGFloat z);
typedef CGFloat (^fourDFunction)(CGFloat x, CGFloat y, CGFloat z, CGFloat w);

A oneDFunction takes the form of f(x), twoD is f(x,y), etc. Then I made functions to combine two fourDFunction blocks (and 2 oneD, 2 twoD, etc, although these were not necessary).

fourDFunction (^combineFourD)(fourDFunction f1, fourDFunction f2) =
^(fourDFunction f1, fourDFunction f2){
    fourDFunction blockToCopy = ^(CGFloat x, CGFloat y, CGFloat z, CGFloat w){
        return f1(x,y,z,w) + f2(x,y,z,w);
    };
    fourDFunction act = [blockToCopy copy];
    [f1 release];
    [f2 release];
    //Need to release act at some point
    return act;
};

And, of course, I needed to apply the activation function to the fourD function for every node, and for each node, I would need to multiply by the weight connecting it:

//for applying the activation function
fourDFunction (^applyOneToFourD)(oneDFunction f1, fourDFunction f2) =
^(oneDFunction f1, fourDFunction f2){
    fourDFunction blockToCopy = ^(CGFloat x, CGFloat y, CGFloat z, CGFloat w){
        return f1(f2(x,y,z,w));
    };
    fourDFunction act = [blockToCopy copy];
    [f1 release];
    [f2 release];
    //Need to release act at some point
    return act;
};

//For applying the weight to the function
fourDFunction (^weightCombineFour)(CGFloat weight, fourDFunction f1) =
^(CGFloat weight, fourDFunction f1){
    fourDFunction blockToCopy = ^(CGFloat x, CGFloat y, CGFloat z, CGFloat w){
        return weight*f1(x,y,z,w);
    };
    fourDFunction act = [blockToCopy copy];
    [f1 release];
    //Need to release act at some point
    return act;
};

Then, for each node in the network, I simply applied the activation function to the sum of the fourD functions from the source neurons multiplied by their connection weight. After composing all those blocks, I took the final functions from each output. Therefore, my outputs are separate 4D functions of the inputs.
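
For example, a single hypothetical node with two inputs in1 and in2, weights w1 and w2, and a sigmoid activation could be wired up roughly like this (sketch only, not lifted verbatim from my code; the input names and values are made up, and it assumes <math.h> plus the typedefs and combinators above):

//Sketch only: builds sigmoid(w1*in1 + w2*in2) as a single fourDFunction.
oneDFunction sigmoid = [^(CGFloat x){
    return (CGFloat)(1.0 / (1.0 + exp(-x)));
} copy];

//Each combinator above releases its block arguments, so a block that is
//shared with other nodes needs an extra retain per use.
fourDFunction node = applyOneToFourD(sigmoid,
    combineFourD(weightCombineFour(w1, [in1 retain]),
                 weightCombineFour(w2, [in2 retain])));

CGFloat output = node(x, y, dist, bias);  //evaluate like a plain C function
//[node release] when finished with it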

Thanks for the help, this was very cool.


Solution

You can do this with blocks. Something like:

//specify some parameters
int parameter1 = 42;
int parameter2 = 54;
//create your block
int (^myBlock)(int) = ^(int parameter3){
  return parameter1 * parameter2 * parameter3;
};
//copy the block off the stack
myBlock = [myBlock copy];
//stash the block somewhere so that you can pull it out later
[self saveBlockOffSomewhereElse:myBlock underName:@"myBlock"];
//balance the call to -copy
[myBlock release];

And then elsewhere...

int (^retrievedBlock)(int) = [self retrieveBlockWithName:@"myBlock"];
int theAnswer = retrievedBlock(2);  //theAnswer is 4536
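
The saveBlockOffSomewhereElse:underName: and retrieveBlockWithName: methods above are just placeholders; one minimal way to back them (a sketch only, with an assumed NSMutableDictionary ivar called savedBlocks, still under manual reference counting) would be:

//Sketch: back the placeholder methods with a dictionary of copied blocks.
- (void)saveBlockOffSomewhereElse:(id)block underName:(NSString *)name {
    if (savedBlocks == nil) {
        savedBlocks = [[NSMutableDictionary alloc] init];
    }
    [savedBlocks setObject:block forKey:name];  //the dictionary retains the copied block
}

- (int (^)(int))retrieveBlockWithName:(NSString *)name {
    return [savedBlocks objectForKey:name];
}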

If you have a string representing some math to evaluate, you could check out GCMathParser (fast but not extensible) or my own DDMathParser (slower but extensible).

OTHER TIPS

Your idea isn't stupid at all. As a matter of fact, LLVM is designed to do exactly that kind of thing (generate code, compile, link, load, and run), and it even has libraries to link against and APIs to use.

While you could go down the path of trying to piece together a bunch of blocks or primitives -- a sort of VM of your own -- it'll be slower and probably require more maintenance. You'll end up having to write some kind of parser, write all the primitive blocks, and then piece it all together.

For code generation you'll obviously still need a parser, but the resulting code is going to be much, much faster, because you can crank up the compiler's optimizer and, as long as you generate just one really big file of code, the optimizer will be even more effective.

I would suggest, though, that you generate your program and then run it externally to your app. That will prevent the hell that is trying to dynamically unload code. It also means that if the generated code crashes, it doesn't take out your application.
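
A rough sketch of that workflow on the Mac (NSTask isn't available on iOS, and the paths, flags, and generated source below are purely illustrative, not a drop-in API):

//Sketch: write generated C source to disk, compile it with clang,
//then run it as a separate process.
NSString *source =
    @"#include <stdio.h>\n"
    @"int main(void) { printf(\"%f\\n\", 0.5 * 3.0 + 1.0); return 0; }\n";
NSString *srcPath = @"/tmp/generated_net.c";
NSString *binPath = @"/tmp/generated_net";
[source writeToFile:srcPath atomically:YES encoding:NSUTF8StringEncoding error:NULL];

NSTask *compile = [[NSTask alloc] init];
[compile setLaunchPath:@"/usr/bin/clang"];
[compile setArguments:[NSArray arrayWithObjects:@"-O3", srcPath, @"-o", binPath, nil]];
[compile launch];
[compile waitUntilExit];
[compile release];

NSTask *run = [[NSTask alloc] init];
[run setLaunchPath:binPath];
[run launch];
[run waitUntilExit];
[run release];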

LLVM.org has a bunch of additional details.

(Historical note -- one early form of Pixar's modeling environment was a TCL based system that would emit, literally, hundreds of thousands of lines of heavily templated C++ code.)

Here's another possibility: Use OpenGL.

The sorts of functions you are executing in a neural network are very similar to those performed by GPUs: multiplication/scaling, distance, sigmoids, etc. You could encode your state in a bitmap, generate a pixel shader as ASCII, compile and link it using the provided library calls, then generate an output "bitmap" with the new state. Then switch the two bitmaps and iterate again.

Writing a pixel shader is not as hard as you might imagine. In the basic case you are given a pixel from the input bitmap/buffer and you compute a value to put in the output buffer. You also have access to all the other pixels in the input and output buffers, as well as arbitrary parameters you set globally, including "texture" bitmaps which might serve as just an arbitrary data vector.
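
As a rough illustration (assuming a legacy desktop OpenGL context; compile-status checks and texture plumbing are omitted), a fragment shader that squashes one channel of the state texture through a sigmoid could be built from a string at runtime like this:

//Sketch: compile a GLSL fragment shader from a string at runtime.
#include <OpenGL/gl.h>

static const char *kFragmentSource =
    "uniform sampler2D state;\n"
    "void main() {\n"
    "    float x = texture2D(state, gl_TexCoord[0].st).r;\n"
    "    gl_FragColor = vec4(1.0 / (1.0 + exp(-x)));\n"
    "}\n";

GLuint compileStateShader(void) {
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    const GLchar *src = kFragmentSource;
    glShaderSource(shader, 1, &src, NULL);
    glCompileShader(shader);
    return shader;  //query GL_COMPILE_STATUS in real code
}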

Modern GPUs have multiple pipelines, so you'd probably get much better performance than even native CPU machine code.

Another vote for blocks. If you start with a bunch of blocks representing primitive operations, you could compose those into larger blocks that represent complex functions. For example, you might write a function that takes a number of blocks as parameters, copies each one in turn and uses it as the first parameter to the next block. The result of the function could be a block that represents a mathematical function.
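
For instance, something along these lines (a sketch only; the composeChain name and the NSArray plumbing are made up, and it reuses the oneDFunction typedef from the question's edit):

//Sketch: fold an array of CGFloat->CGFloat blocks into a single block,
//feeding each result into the next block in the chain (MRC conventions;
//the blocks in the array should already be heap copies).
oneDFunction composeChain(NSArray *steps) {  //array of oneDFunction blocks
    NSArray *owned = [steps copy];
    oneDFunction chain = ^(CGFloat x){
        CGFloat value = x;
        for (id step in owned) {
            value = ((oneDFunction)step)(value);
        }
        return value;
    };
    oneDFunction result = [chain copy];      //copying the block retains 'owned'
    [owned release];
    return result;                           //caller releases when done
}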

Perhaps I'm talking crazy here due to the late hour, but it seems like the ability of blocks to refer to other blocks and to maintain state should make them very good for assembling operations.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow