Question

I am looking for a standard way to implement a multiset (a "bag") in MySQL. The values this multiset may contain are strings only.

The reason behind it is counting; I have a list of events, which I cannot predefine, and I wish to count their occurrences. In Python, for example, this can be done via a Counter.
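For reference, here is a minimal sketch of the Counter behaviour I mean. The event names are made up for illustration only.

```python
from collections import Counter

# Hypothetical event names, used only for illustration.
events = ["login", "timeout", "login", "retry", "login"]

counts = Counter()
for name in events:
    counts[name] += 1  # unseen keys start at 0, so nothing is predefined

print(counts["login"])    # 3
print(counts["missing"])  # 0
```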

I previously asked a question about sets in MySQL; the best solution I've found so far is storing comma-delimited strings in a TEXT column and then using FIND_IN_SET to check whether an element is in the set. However, this solution does not suit multisets: storing a string, say, a thousand times in a text field and then counting its occurrences is not efficient.

The use case is this: whenever an event related to certain rows occurs during the run of my (Python) script, I wish to access the database and add that event to each of those rows; in the end, I wish to count the number of occurrences of each event in each row. Efficient insertion matters more to me than efficient final calculations.

EDIT

My original data table contains thousands of rows, each of which should have a "multiset field" (one way or another). Each such multiset contains few distinct values (say, fewer than 20), but each value may appear many times (say, more than 500). Hence storing every occurrence of a value separately, to be counted later, seems inefficient (to my understanding). For example, a separate table in which every event for every (original) row becomes its own row could quickly grow very large (millions of rows).

Solution

Given this is your existing table:

create table table1(thekey int primary key, random_info varchar(10));

Create your event-table:

create table table1_event(thekey int not null, event varchar(100) not null,
                          counter int, primary key(thekey, event));
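To get a rough feel for the size of this table, here is a back-of-the-envelope calculation. The figures are assumptions picked from the ranges mentioned in the question, not measurements.

```python
# Illustrative figures only, picked from the ranges given in the question.
rows = 5_000            # "thousands of rows"
distinct_events = 20    # upper bound on distinct values per row
repeats = 500           # occurrences of each value

counter_rows = rows * distinct_events              # one row per (thekey, event)
per_event_rows = rows * distinct_events * repeats  # one row per occurrence

print(counter_rows)    # 100000
print(per_event_rows)  # 50000000
```

Under these assumptions, the composite key keeps the event table bounded by the number of distinct (row, event) pairs rather than by the number of occurrences.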

For each event:

insert into table1_event (thekey, event, counter) values (<a key>, 'the event', 1)
  on duplicate key update counter=counter+1;
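If one round trip per event turns out to be too slow, one possible variation (not part of the statement above, just a sketch) is to buffer counts in the Python script and flush them in batches, adding the buffered amount instead of 1. The names `record`, `drain` and `pending` are hypothetical, and the `%s` placeholder style is an assumption about your MySQL driver. Note that buffered counts are lost if the script crashes before flushing.

```python
from collections import Counter

# Parameterized form of the upsert; %s is the placeholder style used by
# common MySQL drivers for Python (assumption: adjust for your driver).
UPSERT_SQL = (
    "insert into table1_event (thekey, event, counter) "
    "values (%s, %s, %s) "
    "on duplicate key update counter = counter + values(counter)"
)

pending = Counter()  # (thekey, event) -> occurrences not yet written


def record(event, keys):
    """Note one occurrence of `event` for every row key it relates to."""
    for key in keys:
        pending[(key, event)] += 1


def drain():
    """Return parameter rows for UPSERT_SQL and reset the buffer."""
    params = [(key, event, n) for (key, event), n in sorted(pending.items())]
    pending.clear()
    return params


record("timeout", [1, 2])
record("timeout", [1])
record("retry", [2])
params = drain()
print(params)  # [(1, 'timeout', 2), (2, 'retry', 1), (2, 'timeout', 1)]
# With a DB-API cursor: cursor.executemany(UPSERT_SQL, params)
```

Here `values(counter)` refers to the value the insert would have written; newer MySQL releases deprecate this form in favour of a row alias, so check which one your server version expects.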

Summary of events:

select table1.thekey, table1_event.event, table1_event.counter
from table1 left outer join table1_event on table1.thekey=table1_event.thekey;
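On the Python side, the result rows can be folded back into one Counter per key. This is only a sketch: `to_multisets` is a hypothetical helper, the sample rows are invented, and it assumes each row arrives as a (thekey, event, counter) tuple with a non-NULL counter whenever event is present.

```python
from collections import Counter, defaultdict


def to_multisets(rows):
    """Group (thekey, event, counter) result rows into one Counter per key."""
    bags = defaultdict(Counter)
    for thekey, event, counter in rows:
        bag = bags[thekey]      # creates an empty bag even if no events exist
        if event is not None:   # left join yields NULLs for keys without events
            bag[event] += counter
    return dict(bags)


# Hypothetical rows, shaped like the output of the summary query.
rows = [(1, "timeout", 2), (1, "retry", 5), (2, None, None)]
bags = to_multisets(rows)
print(bags[1]["retry"])  # 5
print(len(bags[2]))      # 0
```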

Edited to reflect changed question and comment from poster

Licensed under: CC-BY-SA with attribution