Question

How do I let my users apply their own custom formula to a table of data to derive new fields?

I am working on a Django application which is going to store and process a lot of data for subscribed users on the open web. Think 100-10,000 sensor readings in one page request. I am going to be drawing graphs using this data and also showing tables containing it. I expect groups of sensors to be defined by my users, who will have registered themselves on my website (i.e they correspond with a django model).

I would like to allow the user to be able to create fields that are derived from their sensor data (as part of a setup process). For example, the user might know that their average house temperature is (temperature sensor1 + temperature sensor2) / 2 and want to show that on the graph. They might also want something more interesting like solar hot water heated is (temp out - temp in) * flow * conversion constant. I will then save these defined formulas for them and everyone else who views this page of sensor data.

The main question is how do I define the formula at the centre of the system. Do I just have a user-defined string to define the formula (say 100 chars long) and parse it myself - replace the user defined with an input sample and call it toast?

Update

In the end I got just the answer I asked for : A safe way to evaluate a stored user function on the server. Evaluating the same function also on the client when the function is being defined will be a great way to make the UI intuitive.

Was it helpful?

Solution

Depends on who your clients are.

If this is "open to the public" on the WWW, you have to parse expressions yourself. You can use the Python compiler to compile Python syntax. You can also invent your own compiler for a subset of Python syntax. There are lots of examples; start with the ply project.

If this is in-house ("behind the firewall") let the post a piece of Python code and exec that code.

Give them an environment from math import * functionality available.

Fold the following around their supplied line of code:

def userFunc( col1, col2, col3, ... ):
   result1= {{ their code goes here }}
   return result1

Then you can exec the function definition and use the defined function without bad things happening.

While some folks like to crow that exec is "security problem", it's no more a security problem than user's sharing passwords, and admin's doing intentionally stupid things like deleting important files or turning the power off randomly while your programming is running.

exec is only a security problem if you allow anyone access to it. For in-house applications, you know the users. Train them.

OTHER TIPS

I would work out what operations you want to support [+,-,*,/,(,),etc] and develop client side (javascript) to edit and apply those values to new fields of the data. I don't see the need to do any of this server-side and you will end up with a more responsive and enjoyable user experience as a result.

If you allow the user to save their formulas and re-load them when they revisit the site, you can get their browser to do all the calculations. Just provide some generic methods to add columns of data which are generated by applying one of their forumla's to your data.

I imagine the next step would be to allow them to apply those operations to the newly generated columns.

Have you considered posting their data into a google spreadsheet? This would save a lot of the development work as they already allow you to define formulas etc. and apply it to the data. I'm not too sure of the data limit (how much data you can post and operate on) mind you.

Another user asked a similar question in C. In that post, Warren suggested that the formula could be parsed and converted from

(a + c) / b 

Into reverse polish notation

a c + b / 

Which is easier to process.

In this case, you could intercept the formula model's save and generate the postfix notation from the user-defined formula. Once you have postfix notation, it is fairly straightforward to write a loop that evaluates the formula from left to right.

As for implementation in Django, the core question remaining is how to map different input fields into the formula. The simple solution would be a model representing the derived field uses a many-to-many relationship with the symbol name ("a", "b" or "c") defined per-input.

If performance is really critical, you might somehow further pre-process the postfix formula before applying it to the data.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top