Using ets:foldl as a poor man's forEach on every record

https://stackoverflow.com/questions/4360655

08-10-2019

Question

Short version: is it safe to use ets:foldl to delete every ETS record as one is iterating through them?

Suppose an ETS table is accumulating information and now it's time to process it all. A record is read from the table, used in some way, then deleted. (Also, assume the table is private, so no concurrency issues.)

In another language, with a similar data structure, you might use a for...each loop, processing every record and then deleting it from the hash/dict/map/whatever. However, the ets module does not have foreach as e.g. lists does.

But this might work:

1> ets:new(ex, [named_table]).
ex
2> ets:insert(ex, {alice, "high"}).
true
3> ets:insert(ex, {bob, "medium"}).
true
4> ets:insert(ex, {charlie, "low"}).
true
5> ets:foldl(fun({Name, Adjective}, DontCare) ->
      io:format("~p has a ~p opinion of you~n", [Name, Adjective]),
      ets:delete(ex, Name),
      DontCare
   end, notused, ex).
bob has a "medium" opinion of you
alice has a "high" opinion of you
charlie has a "low" opinion of you
notused
6> ets:info(ex).
[...
 {size,0},
 ...]
7> ets:lookup(ex, bob).
[]

Is this the preferred approach? Is it at least correct and bug-free?

I have a general concern about modifying a data structure while processing it; however, the ets:foldl documentation implies that ETS is pretty comfortable with you modifying records inside foldl. Since I am essentially wiping the table clean, I want to be sure.

I am using Erlang R14B with a set table; however, I'd like to know if there are any caveats with any Erlang version, or any type of table, as well. Thanks!

Solution

Your approach is safe. The reason is that ets:foldl/3 internally uses ets:first/1, ets:next/2 and ets:safe_fixtable/2. Together these give the guarantee you want: you can delete elements during the traversal and still visit every element. See the CONCURRENCY section of erl -man ets.
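To make the mechanism concrete, here is a minimal sketch of the same first/next traversal done by hand under safe_fixtable. The module name ets_drain and the function drain/2 are hypothetical helpers invented for illustration; they are not part of the ets module, and this is not a copy of the actual foldl implementation.

```erlang
-module(ets_drain).
-export([drain/2]).

%% Hypothetical helper: apply Fun to every object in Tab and delete
%% each key once it has been visited.
drain(Tab, Fun) ->
    %% Fix the table so that first/next stay valid while we delete.
    true = ets:safe_fixtable(Tab, true),
    try
        loop(Tab, ets:first(Tab), Fun)
    after
        %% Always release the fix, even if Fun crashes.
        true = ets:safe_fixtable(Tab, false)
    end.

loop(_Tab, '$end_of_table', _Fun) ->
    ok;
loop(Tab, Key, Fun) ->
    %% lookup/2 returns a list: one object for set tables,
    %% possibly several for bag and duplicate_bag tables.
    lists:foreach(Fun, ets:lookup(Tab, Key)),
    Next = ets:next(Tab, Key),
    true = ets:delete(Tab, Key),
    loop(Tab, Next, Fun).
```

Calling ets_drain:drain(ex, fun({Name, Adjective}) -> io:format("~p has a ~p opinion of you~n", [Name, Adjective]) end) should behave like the foldl call in the question, without the unused accumulator.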

However, if all you want is to remove every element from the table, there is a simpler one-liner:

ets:match_delete(ex, '_').

That does not help if you also want to do the io:format call for each row, in which case your approach with foldl is probably easier.
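If per-row processing and a bulk delete are both wanted, one possible middle ground is to process a snapshot of the table and then clear it in a single call. The sketch below assumes the table is private or otherwise has no concurrent writers, as in the question; the module name ets_bulk and the function process_and_clear/2 are hypothetical. ets:tab2list/1 and ets:delete_all_objects/1 are standard ets functions.

```erlang
-module(ets_bulk).
-export([process_and_clear/2]).

%% Hypothetical helper: run Fun on every object, then empty the table.
%% Returns the number of objects processed.
process_and_clear(Tab, Fun) ->
    %% Copy all objects out of the table into a list.
    Rows = ets:tab2list(Tab),
    lists:foreach(Fun, Rows),
    %% Remove everything in one call instead of one delete per key.
    true = ets:delete_all_objects(Tab),
    length(Rows).
```

The trade-off is that tab2list copies the whole table into the calling process's heap, so for very large tables the foldl approach may be preferable.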

OTHER TIPS

For cases like this we will alternate between two tables or just create a new table every time we start processing. When we want to start a processing cycle we switch the writers to start using the alternate or new table, then we do our processing and clear or delete the old table.

We do this because there might otherwise be concurrent updates to a tuple that we might miss. We're working with high frequency concurrent counters when we use this technique.
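A rough sketch of that table-switching idea follows. Everything here is an assumption made for illustration: the module ets_swap, the registry table name, and the choice to publish the current table id through a small named ETS table are not taken from the answer. It also relies on ets:update_counter/4 with a default object, which requires OTP 18 or later.

```erlang
-module(ets_swap).
-export([init/0, current/0, bump/1, rotate/1]).

%% Hypothetical registry table that holds the id of the active table.
-define(REG, ets_swap_registry).

init() ->
    ?REG = ets:new(?REG, [named_table, public, set]),
    true = ets:insert(?REG, {current, new_table()}),
    ok.

new_table() ->
    ets:new(counters, [public, set, {write_concurrency, true}]).

%% Writers look up the active table on every write.
current() ->
    ets:lookup_element(?REG, current, 2).

%% Increment the counter for Key, creating it at 0 if it is missing.
bump(Key) ->
    ets:update_counter(current(), Key, {2, 1}, {Key, 0}).

%% Point writers at a fresh table, then process and drop the old one.
rotate(Fun) ->
    Old = current(),
    true = ets:insert(?REG, {current, new_table()}),
    %% Caveat: a writer that fetched Old just before the swap may still
    %% write to it; real code would need to wait for or tolerate that.
    lists:foreach(Fun, ets:tab2list(Old)),
    true = ets:delete(Old),
    ok.
```

Note that each table is owned by the process that created it, so in practice init/0 and rotate/1 would most likely be called from one long-lived owner process.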

Licensed under: CC-BY-SA with attribution
