Mnesia Query Cursors - Working with them in Practical applications

https://stackoverflow.com/questions/11967660

26-06-2021

Question

In most applications, it's hard to avoid querying large amounts of information that a user wants to browse through. This is what led me to cursors. With mnesia, cursors are created with qlc:cursor/1 or qlc:cursor/2. After working with them for a while, I have run into this problem many times:

11> qlc:next_answers(QC,3).
** exception error: {qlc_cursor_pid_no_longer_exists,<0.59.0>}
     in function  qlc:next_loop/3 (qlc.erl, line 1359)
12>

It has occurred to me that the whole cursor operation has to happen within one mnesia transaction, executed as a whole in one go, like this:

E:\>erl
Eshell V5.9  (abort with ^G)
1> mnesia:start().
ok
2> rd(obj,{key,value}).
obj
3> mnesia:create_table(obj,[{attributes,record_info(fields,obj)}]).
{atomic,ok}
4> Write = fun(Obj) -> mnesia:transaction(fun() -> mnesia:write(Obj) end) end.
#Fun<erl_eval.6.111823515>
5> [Write(#obj{key = N,value = N * 2}) || N <- lists:seq(1,100)],ok.
ok
6> mnesia:transaction(fun() -> 
            QC = cursor_server:cursor(qlc:q([XX  || XX <- mnesia:table(obj)])),
            Ans = qlc:next_answers(QC,3),
            io:format("\n\tAns: ~p~n",[Ans]) 
    end).
        Ans: [{obj,20,40},{obj,21,42},{obj,86,172}]
{atomic,ok}
7>

When you call, say, qlc:next_answers/2 outside a mnesia transaction, you get an exception. The problem is not limited to leaving the transaction: if that function is executed by a DIFFERENT process than the one that created the cursor, it fails as well.

Another interesting finding is that as soon as you leave a mnesia transaction, one of the processes involved in the cursor (apparently a process is spawned in the background) exits, making the cursor invalid. Look at this:

-module(cursor_server).
-compile(export_all).

cursor(Q)->
    case mnesia:is_transaction() of
        false -> 
            F = fun(QH)-> qlc:cursor(QH,[]) end,
            mnesia:activity(transaction,F,[Q],mnesia_frag);
        true -> qlc:cursor(Q,[])
    end.

%% --- End of module -------------------------------------------

Then, in the shell, I use that function:

7> QC = cursor_server:cursor(qlc:q([XX  || XX <- mnesia:table(obj)])).
{qlc_cursor,{<0.59.0>,<0.30.0>}}
8> erlang:is_process_alive(list_to_pid("<0.59.0>")).
false
9> erlang:is_process_alive(list_to_pid("<0.30.0>")).
true
10> self().
<0.30.0>
11> qlc:next_answers(QC,3).
** exception error: {qlc_cursor_pid_no_longer_exists,<0.59.0>}
     in function  qlc:next_loop/3 (qlc.erl, line 1359)
12>

So this makes it extremely hard to build a web application in which a user browses a set of results group by group: give them the first 20, then the next 20, and so on. This involves getting the first results, sending them to the web page, waiting for the user to click NEXT, and then asking the cursor for the next 20 with qlc:next_answers/2. These operations cannot be done while hanging inside a mnesia transaction! The only way I can see is to spawn a process that hangs there, receiving next_answers requests as messages and sending the answers back as messages, like this:

-define(CURSOR_TIMEOUT,timer:hours(1)).

%% initial request is made here below
request(PageSize)->
    Me = self(),    
    CursorPid = spawn(?MODULE,cursor_pid,[Me,PageSize]),
    receive
        {initial_answers,Ans} -> 
            %% find a way of hiding the Cursor Pid
            %% in the page so that the subsequent requests
            %% come along with it
            {Ans,pid_to_list(CursorPid)}
    after ?CURSOR_TIMEOUT -> timedout
    end.

cursor_pid(ParentPid,PageSize)->
    F = fun(Pid,N)-> 
            QC = cursor_server:cursor(qlc:q([XX  || XX <- mnesia:table(obj)])),
            Ans = qlc:next_answers(QC,N),
            Pid ! {initial_answers,Ans},
            %% stay inside the transaction and serve further requests
            cursor_loop(QC)
        end,
    mnesia:activity(transaction,F,[ParentPid,PageSize],mnesia_frag).

%% a named function lets the receive loop call itself,
%% so no Y-combinator trick is needed
cursor_loop(QC)->
    receive
        {From,{next_answers,Num}} ->
            From ! {next_answers,qlc:next_answers(QC,Num)},
            cursor_loop(QC);
        delete -> 
            %% it could have died already, so we be careful here !
            try qlc:delete_cursor(QC) of 
                _ -> ok 
            catch 
                _:_ -> ok 
            end,
            exit(normal)
    after ?CURSOR_TIMEOUT -> exit(normal)
    end.

next_answers(CursorPid,PageSize)->
    list_to_pid(CursorPid) ! {self(),{next_answers,PageSize}},
    receive
        {next_answers,Ans} ->
            %% CursorPid is already in list (string) form here
            {Ans,CursorPid}
    after ?CURSOR_TIMEOUT -> timedout
    end.

That would create a more complex problem of managing process exits, tracking, monitoring, etc. I wonder why the mnesia implementers did not foresee this!

Now, that brings me to my questions. I have been searching the web for solutions, and the questions below arise from these links: mnemosyne, Ulf Wiger's Solution to Cursor Problems, AMNESIA - an RDBMS implementation of mnesia.

1. Does anyone have an idea for handling mnesia query cursors in a way that differs from what is documented and is worth sharing?

2. Why did the mnesia implementers decide to force cursors to live within a single transaction, including the calls to next_answers?

3. Is there anything in what I have presented that I do not understand clearly (other than my buggy illustration code, which you can ignore)?

4. AMNESIA (in section 4.7 of the link I gave above) has a good implementation of cursors, because the subsequent calls to next_answers need not be in the same transaction NOR made by the same process. Would you advise anyone to switch from mnesia to amnesia because of this, and is this library still supported?

5. Ulf Wiger (the author of many Erlang libraries, notably GPROC) suggests the use of mnesia:select/4. How would I use it to solve cursor problems in a web application?

NOTE: Please do not advise me to leave mnesia and use something else, because I want to use mnesia for this specific problem. I appreciate you taking the time to read through this whole question.

Solution

The motivation is that a transaction grabs (in your case) read locks, and locks cannot be kept outside of transactions.

If you want, you can run it in a dirty context, but you lose the transactional properties, i.e. the table may change between invocations.

make_cursor() ->
    QD = qlc:sort(mnesia:table(person, [{traverse, select}])),
    mnesia:activity(async_dirty, fun() -> qlc:cursor(QD) end, mnesia_frag).

get_next(Cursor) ->
    Get = fun() -> qlc:next_answers(Cursor,5) end,
    mnesia:activity(async_dirty, Get, mnesia_frag).

del_cursor(Cursor) ->
    qlc:delete_cursor(Cursor).
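
Because only the process that created a cursor may call qlc:next_answers/2 on it, a web application would probably still need one long-lived process per browsing session to own the dirty cursor. Below is a minimal sketch of such a holder, not a definitive implementation. The module name cursor_holder, its start/1, next/2 and stop/1 functions, and the one-minute idle timeout are all hypothetical and not part of mnesia or qlc. It also assumes mnesia:activity/2 with the default access module rather than mnesia_frag.

```erlang
-module(cursor_holder).
-behaviour(gen_server).

%% Hypothetical API for illustration; not part of mnesia or qlc.
-export([start/1, next/2, stop/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Assumed idle period after which the cursor is dropped.
-define(IDLE_TIMEOUT, 60000).

%% Start one holder per browsing session; Tab is the table to browse.
start(Tab) ->
    gen_server:start(?MODULE, Tab, []).

%% Return up to N answers; [] means the cursor is exhausted.
next(Pid, N) ->
    gen_server:call(Pid, {next, N}).

stop(Pid) ->
    gen_server:cast(Pid, stop).

init(Tab) ->
    %% The process that calls qlc:cursor/1 becomes the cursor owner,
    %% so the cursor is created here, inside the holder process.
    Create = fun() -> qlc:cursor(mnesia:table(Tab)) end,
    Cursor = mnesia:activity(async_dirty, Create),
    {ok, Cursor, ?IDLE_TIMEOUT}.

handle_call({next, N}, _From, Cursor) ->
    Get = fun() -> qlc:next_answers(Cursor, N) end,
    Answers = mnesia:activity(async_dirty, Get),
    {reply, Answers, Cursor, ?IDLE_TIMEOUT}.

handle_cast(stop, Cursor) ->
    {stop, normal, Cursor}.

%% No request arrived within ?IDLE_TIMEOUT: give up the cursor.
handle_info(timeout, Cursor) ->
    {stop, normal, Cursor};
handle_info(_Other, Cursor) ->
    {noreply, Cursor, ?IDLE_TIMEOUT}.

terminate(_Reason, Cursor) ->
    %% The cursor process may already be gone, so ignore failures.
    _ = (catch qlc:delete_cursor(Cursor)),
    ok.
```

A request handler would call cursor_holder:start(obj) for the first page, keep some reference to the returned pid in the session, and call cursor_holder:next(Pid, 20) on each NEXT click. As with any dirty access, the table may change between pages.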

OTHER TIPS

I think this may help you:

Use async_dirty instead of transaction:

{Record,Cont}=mnesia:activity(async_dirty, fun mnesia:select/4,[md,[{Match_head,[Guard],[Result]}],Limit,read])

Then read the next Limit records:

mnesia:activity(async_dirty, fun mnesia:select/1,[Cont])

Full code:

-record(md,{id,name}).
batch_delete(Id,Limit) ->
    Match_head = #md{id='$1',name='$2'},
    Guard = {'<','$1',Id},
    Result = '$_',
    %% select/4 may return '$end_of_table' right away, so let delete_next/1 match the result
    delete_next(mnesia:activity(async_dirty, fun mnesia:select/4,[md,[{Match_head,[Guard],[Result]}],Limit,read])).

delete_next('$end_of_table') ->
    over;
delete_next({Record,Cont}) ->
    delete(Record),
    delete_next(mnesia:activity(async_dirty, fun mnesia:select/1,[Cont])).

delete(Records) ->
    io:format("delete(~p)~n",[Records]),
    F = fun() ->
        [ mnesia:delete_object(O) || O <- Records]
    end,
    mnesia:transaction(F).

Remember that you cannot use a cursor outside the transaction in which it was created.
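
The select/4 and select/1 pattern above is not specific to deleting. As a hedged sketch, it can be wrapped in a generic fold that hands each page of at most Limit records to a caller-supplied fun. The module name select_pages and the function fold/5 are hypothetical. The sketch keeps every call in one process, as the code above does, and makes no claim that a continuation can be handed to another process.

```erlang
-module(select_pages).
%% Hypothetical helper for illustration; not part of mnesia.
-export([fold/5]).

%% Calls Fun(Page, Acc) for every page of at most Limit records
%% matching MatchSpec in Tab, and returns the final accumulator.
fold(Tab, MatchSpec, Limit, Fun, Acc0) ->
    First = mnesia:activity(async_dirty, fun mnesia:select/4,
                            [Tab, MatchSpec, Limit, read]),
    loop(First, Fun, Acc0).

%% No more pages: return what has been accumulated.
loop('$end_of_table', _Fun, Acc) ->
    Acc;
%% Process the current page, then fetch the next one via the continuation.
loop({Page, Cont}, Fun, Acc) ->
    Acc1 = Fun(Page, Acc),
    Next = mnesia:activity(async_dirty, fun mnesia:select/1, [Cont]),
    loop(Next, Fun, Acc1).
```

For example, select_pages:fold(md, [{#md{_ = '_'}, [], ['$_']}], 20, fun(Page, N) -> N + length(Page) end, 0) would count the records in md twenty at a time. A page may hold fewer than Limit records, so the fun should not rely on full pages.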

Licensed under: CC-BY-SA with attribution
