Question

I'm trying to calculate IRR and NPV for over 600 million records in BigQuery. Since BigQuery has no looping mechanism, no built-in IRR function, and no way to store values in variables, we are stuck. We are running out of options and are considering calling it quits with BigQuery :((

What would be the best way to read each of the 600 million rows, perform certain calculations (IRR), and write the results back to a table? Our current option is to export the data from BigQuery, upload it to Oracle, run the calculations there, export the results, and load them back into BigQuery. Doing it within BigQuery times out after running for a few hours.

We are in a conundrum, and any advice/tips would be really appreciated.


Solution

You may have seen the demo of user-defined functions (UDFs) from the Cloud Platform Live event in March. We're currently looking for trusted testers for this functionality. Essentially, you can write a JavaScript function that reads and emits rows of data. Although this is still an early preview, it is pretty robust and uses the V8 JavaScript execution engine.
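To give a rough idea of what such a row-in, row-out function could look like for this question, here is a minimal sketch. Everything specific in it is an assumption made for illustration: the `(row, emit)` signature, the field names `id`, `cash_flows`, and `discount_rate`, and the choice of bisection to solve for IRR (with a search bracket of -0.99 to 10). The preview's actual interface and registration mechanism may differ.

```javascript
// Net present value of a series of cash flows at a given periodic rate.
// cashFlows[t] is the flow at period t (t = 0 is usually the initial outlay).
function npv(rate, cashFlows) {
  return cashFlows.reduce(function (sum, cf, t) {
    return sum + cf / Math.pow(1 + rate, t);
  }, 0);
}

// IRR by bisection: searches [lo, hi] for the rate where NPV crosses zero.
// The default bracket is an assumption; returns null if NPV has the same
// sign at both ends (no root bracketed, e.g. all flows positive).
function irr(cashFlows, lo, hi, tol, maxIter) {
  lo = lo === undefined ? -0.99 : lo;
  hi = hi === undefined ? 10 : hi;
  tol = tol === undefined ? 1e-9 : tol;
  maxIter = maxIter === undefined ? 200 : maxIter;

  var fLo = npv(lo, cashFlows);
  var fHi = npv(hi, cashFlows);
  if (fLo === 0) return lo;
  if (fHi === 0) return hi;
  if (fLo * fHi > 0) return null;

  for (var i = 0; i < maxIter; i++) {
    var mid = (lo + hi) / 2;
    var fMid = npv(mid, cashFlows);
    if (Math.abs(fMid) < tol || (hi - lo) / 2 < tol) return mid;
    if (fLo * fMid < 0) {
      hi = mid;      // sign change lies in the lower half
    } else {
      lo = mid;      // sign change lies in the upper half
      fLo = fMid;
    }
  }
  return (lo + hi) / 2;
}

// Hypothetical UDF wrapper: reads one input row and emits one output row.
// The row shape and the emit callback are assumptions, not a documented API.
function irrUdf(row, emit) {
  var flows = row.cash_flows.map(Number);
  emit({
    id: row.id,
    npv: npv(row.discount_rate, flows),
    irr: irr(flows)
  });
}
```

Because each row carries its own cash flows, the loop runs inside the function rather than in SQL, which is what makes this approach a fit for a per-row calculation such as IRR.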

If you're interested, please contact your cloud support representative and ask about joining the UDF trusted-tester program. If you don't get a response, please feel free to e-mail me at tigani at google, and I'll route your request to the right place. (We are limiting the number of trusted testers that we sign up, at least at first.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow