Question

I am building an application to store the different formulas used for different subjects in school, and I have drawn the following database diagram to depict that scenario:

[Image: database diagram]

I am having trouble figuring out how to represent an equation in a relational database.

Each equation is made up of variables of different types. For example,

E=mc^2

Has the following components:

  • E, which is energy in joules (J), which is actually kg*m^2/s^2.
  • m, which is mass in kilograms (kg), which cannot be broken down further.
  • c, which is the speed of light, with a value of approximately 3*10^8 and units of m/s.

The problem is not the representation of the string of the equation within the database. Rather, I want to represent the relationship between the individual variables within an equation.

How should I go about representing these different types of variables within the same database?


Solution

Your question is "how can I show how this big thing (an equation) is made up from these little things (variables)?" This is a bill-of-materials problem, which is a special case of hierarchies. Your example is slightly (but only slightly) more complicated because it requires operators as well as variables.

To evaluate the given equation, one would take the value of c, square it, multiply that by m, and assign the result to E. These operations must be performed in order, from the innermost to the outermost. This could be represented as a tree [1]:

    E
    |
 Multiply
  |    |
  |    m
Power
 |  |
 c  2

There are several ways to represent trees in a relational schema. For this problem I think the adjacency list would be best. The table would look something like this (as a rough sketch):

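create table Equation_Tree
(
  row_id         int          not null primary key,
  variable_name  varchar(50)  null,   -- set for operands such as c or m
  function_name  varchar(50)  null,   -- set for operators such as Multiply
  sequence       int          not null,  -- order among siblings
  parent_row_id  int          null references Equation_Tree (row_id),

  -- exactly one of variable_name, function_name is populated
  constraint ck_equation_tree_one_name check (
    (variable_name is not null and function_name is null) or
    (variable_name is null and function_name is not null)
  )
);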

To evaluate the equation, walk the tree from the leaves to the root, storing working values in each intermediate node. Variable values can be substituted from local parameters or retrieved from a second table.
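
As a minimal sketch of that leaf-to-root walk, the Python below loads the E = mc^2 tree into an in-memory SQLite table and evaluates it recursively. Several things here are assumptions rather than part of the schema above: the use of SQLite and its column types, the hypothetical names FUNCTIONS and evaluate, storing the literal 2 in variable_name, and treating a variable node that has a child (E) as the target of an assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table Equation_Tree (
        row_id        integer primary key,
        variable_name text,
        function_name text,
        sequence      integer not null,
        parent_row_id integer references Equation_Tree (row_id),
        -- exactly one of variable_name, function_name is populated
        check ((variable_name is null) <> (function_name is null))
    )
""")
# Rows for the tree E <- Multiply(Power(c, 2), m).
conn.executemany(
    "insert into Equation_Tree values (?, ?, ?, ?, ?)",
    [
        (1, "E", None, 1, None),
        (2, None, "Multiply", 1, 1),
        (3, None, "Power", 1, 2),
        (4, "m", None, 2, 2),
        (5, "c", None, 1, 3),
        (6, "2", None, 2, 3),  # assumption: literals live in variable_name
    ],
)

# Hypothetical operator table; only the two functions in the example.
FUNCTIONS = {
    "Multiply": lambda a, b: a * b,
    "Power": lambda a, b: a ** b,
}


def evaluate(row_id, values):
    """Evaluate the subtree rooted at row_id, children first."""
    variable_name, function_name = conn.execute(
        "select variable_name, function_name from Equation_Tree"
        " where row_id = ?",
        (row_id,),
    ).fetchone()
    child_ids = [
        row[0]
        for row in conn.execute(
            "select row_id from Equation_Tree"
            " where parent_row_id = ? order by sequence",
            (row_id,),
        )
    ]
    args = [evaluate(child_id, values) for child_id in child_ids]
    if function_name is not None:
        return FUNCTIONS[function_name](*args)
    if args:
        # A variable with a child receives the child's value (E here).
        values[variable_name] = args[0]
        return args[0]
    if variable_name in values:
        return values[variable_name]
    return float(variable_name)  # fall back to a numeric literal


values = {"m": 1.0, "c": 3e8}
evaluate(1, values)
print(values["E"])
```

Issuing one query per node keeps the sketch short; a recursive query or a single fetch of all rows would likely suit larger trees better.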

Another approach would be to evaluate using Reverse Polish notation (RPN). The operators and operands then form a sequence of tokens in a logical processing order. A table for this would look like:

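create table Equation_RPN
(
  variable_name  varchar(50)  null,   -- set for operands such as c or m
  function_name  varchar(50)  null,   -- set for operators such as Multiply
  sequence       int          not null,  -- position of the token

  -- exactly one of variable_name, function_name is populated
  constraint ck_equation_rpn_one_name check (
    (variable_name is not null and function_name is null) or
    (variable_name is null and function_name is not null)
  )
);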

While this has a simpler schema, it does not show the relationship between the parts as clearly.
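
To illustrate how such a token sequence might be processed, here is a small stack-based sketch in Python. The rows mirror the three columns of the table as (sequence, variable_name, function_name) tuples. The names BINARY_FUNCTIONS and evaluate_rpn are hypothetical, every function is assumed to take exactly two operands, and the literal 2 is assumed to be stored in variable_name.

```python
import operator

# Hypothetical operator table; each function takes two operands.
BINARY_FUNCTIONS = {
    "Multiply": operator.mul,
    "Power": operator.pow,
}

# m * c^2 in postfix order: c 2 Power m Multiply
rows = [
    (1, "c", None),
    (2, "2", None),
    (3, None, "Power"),
    (4, "m", None),
    (5, None, "Multiply"),
]


def evaluate_rpn(rows, values):
    """Evaluate tokens in sequence order using an operand stack."""
    stack = []
    for _sequence, variable_name, function_name in sorted(
        rows, key=lambda row: row[0]
    ):
        if function_name is not None:
            # Operands were pushed left to right, so pop right first.
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_FUNCTIONS[function_name](left, right))
        elif variable_name in values:
            stack.append(values[variable_name])
        else:
            stack.append(float(variable_name))  # numeric literal
    if len(stack) != 1:
        raise ValueError("malformed token sequence")
    return stack[0]


energy = evaluate_rpn(rows, {"m": 1.0, "c": 3e8})
print(energy)
```

No parent pointers are needed because the sequence column alone determines the evaluation order, which is the trade-off noted above.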


[1] With thanks to the amazing ASCII Flow for the bestest future-retro software in existence.

OTHER TIPS

Here's one suggestion. Since you used the term component, I'll stick with that, but maybe there's a better term?

CREATE TABLE COMPONENTS
( COMPONENT_ID INT NOT NULL PRIMARY KEY
, ABBREVIATION CHAR(1) NOT NULL -- Example 'E'
, DESCRIPTION VARCHAR(50) NOT NULL -- Example 'Energy'
);

CREATE TABLE EQUATION_COMPONENT
( EQUATION_ID ...
, COMPONENT_ID ...
,    PRIMARY KEY ( EQUATION_ID, COMPONENT_ID )
,    FOREIGN KEY ( EQUATION_ID ) REFERENCES EQUATIONS ( EQUATION_ID )
,    FOREIGN KEY ( COMPONENT_ID ) REFERENCES COMPONENTS ( COMPONENT_ID )
);

If you, for example, want to find all equations concerning both Mass and Energy:

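SELECT EQUATION_ID
FROM EQUATION_COMPONENT
WHERE COMPONENT_ID IN
      ( SELECT COMPONENT_ID   -- ids for Mass and Energy
        FROM COMPONENTS       -- assumes DESCRIPTION values are unique
        WHERE DESCRIPTION IN ( 'Mass', 'Energy' ) )
GROUP BY EQUATION_ID
HAVING COUNT(COMPONENT_ID) = 2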
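
To try the query end to end, the following Python sketch builds the tables in an in-memory SQLite database and runs the same GROUP BY / HAVING pattern. Much of it is assumed for illustration: the EQUATIONS table and its NAME column, the INT types for the elided columns, the sample rows, and the hypothetical helper equations_with_all. It also assumes component descriptions are unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EQUATIONS            -- assumed shape, not in the answer
    ( EQUATION_ID INT NOT NULL PRIMARY KEY
    , NAME VARCHAR(50) NOT NULL
    );
    CREATE TABLE COMPONENTS
    ( COMPONENT_ID INT NOT NULL PRIMARY KEY
    , ABBREVIATION CHAR(1) NOT NULL
    , DESCRIPTION VARCHAR(50) NOT NULL
    );
    CREATE TABLE EQUATION_COMPONENT
    ( EQUATION_ID INT NOT NULL        -- assumed type
    , COMPONENT_ID INT NOT NULL       -- assumed type
    , PRIMARY KEY ( EQUATION_ID, COMPONENT_ID )
    , FOREIGN KEY ( EQUATION_ID ) REFERENCES EQUATIONS ( EQUATION_ID )
    , FOREIGN KEY ( COMPONENT_ID ) REFERENCES COMPONENTS ( COMPONENT_ID )
    );
    INSERT INTO EQUATIONS VALUES (1, 'E=mc^2'), (2, 'F=ma');
    INSERT INTO COMPONENTS VALUES
      (1, 'E', 'Energy'), (2, 'm', 'Mass'), (3, 'c', 'Speed of light'),
      (4, 'F', 'Force'), (5, 'a', 'Acceleration');
    INSERT INTO EQUATION_COMPONENT VALUES
      (1, 1), (1, 2), (1, 3), (2, 4), (2, 2), (2, 5);
""")


def equations_with_all(descriptions):
    """Return ids of equations that use every listed component."""
    placeholders = ", ".join("?" for _ in descriptions)
    sql = f"""
        SELECT EC.EQUATION_ID
        FROM EQUATION_COMPONENT EC
        JOIN COMPONENTS C ON C.COMPONENT_ID = EC.COMPONENT_ID
        WHERE C.DESCRIPTION IN ({placeholders})
        GROUP BY EC.EQUATION_ID
        HAVING COUNT(EC.COMPONENT_ID) = ?
        ORDER BY EC.EQUATION_ID
    """
    params = (*descriptions, len(descriptions))
    return [row[0] for row in conn.execute(sql, params)]


print(equations_with_all(["Mass", "Energy"]))
```

Comparing the count against the number of requested components is what turns "any of" into "all of"; the composite primary key guarantees each component is counted at most once per equation.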
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange