Mnesia: how to use indexed operations correctly when selecting rows based on criteria involving multiple, indexed columns

https://stackoverflow.com/questions/12934505

07-07-2021
|

Question

Problem:

How to select records efficiently from a table where the select is based on criteria involving two indexed columns.

Example

I have a record,

#rec{key, value, type, last_update, other_stuff}

I have indexes on key (default), type and last_update columns
type is typically an atom or string
last_update is an integer (unix-style milliseconds since 1970)

I want, for example all records whose type = Type and have been updated since a specific time-stamp.

I do the following (wrapped in a non-dirty transaction)

lookup_by_type(Type, Since) ->
    MatchHead = #rec{type=Type, last_update = '$1', _= '_'},
    Guard = {'>', '$1', Since},
    Result = '$_',
    case mnesia:select(rec,[{MatchHead, [Guard],[Result]}]) of
    []    -> {error, not_found};
    Rslts -> {ok, Rslts}
    end.

Question

Is the lookup_by_type function even using the underlying indexes?
Is there a better way to utilize indexes in this case
Is there an entirely different approach I should be taking?

Thank you all

La solution

One way, which will probably help you, is to look at QLC queries. These are more SQL/declarative and they will utilize indexes if possible by themselves IIRC.

But the main problem is that indexes in mnesia are hashes and thus do not support range queries. Thus you can only efficiently index on the type field currently and not on the last_update field.

One way around that is to make the table ordered_set and then shove the last_update to be the primary key. The key parameter can then be indexed if you need fast access to it. One storage possibility is something like: {{last_update, key}, key, type, ...}. Thus you can quickly answer queries because last_update is orderable.

Another way around it is to store last-update separately. Keep a table {last_update, key} which is an ordered set and use that to limit the amount of things to scan on the larger table in a query.

Remember that mnesia is best used as a small in-memory database. Thus scans are not necessarily a problem due to them being in-memory and thus pretty fast. Its main power though is the ability to do key/value lookups in a dirty way on data for quick query.

Licencié sous: CC-BY-SA avec attribution

Non affilié à StackOverflow