Redis - Data structure to store frequent itemsets

Question

To achieve behavior similar to a LIKE condition, the following could be done:

Solution 1:

Example Dataset

In - this - case - 3
Other - items - 2
This - is - an - 5
Lorem - ipsum - 3
In - other - terms - 2

Ans 1: A list or a set could be used as the data structure, depending on usage. In your case duplicate keys exist ("In"), hence a list is used here.

Ans 2: This is how a list can be used:

Do keep in mind that a Redis list behaves like a linked list.

$ redis-cli lpush In.list "In - this - case - 3"
(integer) 1

$ redis-cli lpush Other.list "Other - items - 2"
(integer) 1

$ redis-cli lpush This.list "This - is - an - 5"
(integer) 1

$ redis-cli lpush Lorem.list "Lorem - ipsum - 3"
(integer) 1

$ redis-cli lpush In.list "In - other - terms - 2"
(integer) 2

$ redis-cli lrange In.list 0 -1
1) "In - other - terms - 2"
2) "In - this - case - 3"
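
The same grouping can be sketched from application code. This is only an illustration: `FakeListStore` is a hypothetical in-memory stand-in that mimics the LPUSH/LRANGE behavior shown above so the example runs without a server, and `key_for`, `store_itemsets` and `find_starting_with` are made-up helper names. A real client such as redis-py exposes `lpush` and `lrange` methods with the same names.

```python
from collections import defaultdict, deque


class FakeListStore:
    """Hypothetical in-memory stand-in for the two Redis list commands used here."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, key, value):
        # LPUSH inserts at the head and returns the new list length.
        self._lists[key].appendleft(value)
        return len(self._lists[key])

    def lrange(self, key, start, stop):
        # Redis treats stop as inclusive; only -1 ("last element") is handled here.
        items = list(self._lists[key])
        stop = len(items) if stop == -1 else stop + 1
        return items[start:stop]


def key_for(itemset):
    """Derive the list key from the first word, e.g. 'In.list'."""
    first_word = itemset.split(" - ")[0]
    return f"{first_word}.list"


def store_itemsets(store, itemsets):
    """Push every itemset onto the list named after its first word."""
    for itemset in itemsets:
        store.lpush(key_for(itemset), itemset)


def find_starting_with(store, word):
    """Return every stored itemset whose first word is `word`."""
    return store.lrange(f"{word}.list", 0, -1)


DATASET = [
    "In - this - case - 3",
    "Other - items - 2",
    "This - is - an - 5",
    "Lorem - ipsum - 3",
    "In - other - terms - 2",
]

store = FakeListStore()
store_itemsets(store, DATASET)
print(find_starting_with(store, "In"))
```

Because each key is derived from the first word only, this lookup behaves like a prefix match on whole first words; it does not cover matches in the middle of an itemset.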

Solution 2:

Another solution would again use lists:

We will have four main lists, which behave like columns in a database, plus a separate list for each first word (for example In.list). Each of these word lists stores the row indexes at which that word appears in the primary key list (column 1).

Sample data can be depicted as:

Index Column1 Column2 Column3 Column4

 1    In         this    case     3
 2    Other      items   " "      2
 3    This       is      an       5
 4    Lorem      ipsum   " "      3
 5    In         other   terms    2

This depiction is valid if at most 4 values are returned. We can also have dynamic columns. For dynamic columns, the 1st column would be the key, the 2nd column would be the numeric part, and the remaining columns would hold the strings.

Index Column1 Column2 Column3 Column4 Column5

 1    In         3      this    case     " "
 2    Other      2      items   " "      " "
 3    This       5      is      an       " "
 4    Lorem      3      ipsum   " "      " "
 5    In         2      other   terms    " "
 6    Hello      4      world   !         !
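
As a rough sketch of the dynamic layout, one itemset string can be rearranged into key, numeric part, and remaining words, padded with blanks. The helper name `to_dynamic_row` and the `width` parameter are assumptions made for this example, not part of Redis.

```python
def to_dynamic_row(itemset, width):
    """Turn 'In - this - case - 3' into [key, count, word, ...] padded to `width`."""
    *words, count = itemset.split(" - ")
    # Key first, numeric part second, remaining words after that.
    row = [words[0], count, *words[1:]]
    if len(row) > width:
        raise ValueError(f"itemset needs {len(row)} columns, only {width} available")
    # Pad with a single blank so every row has the same number of columns.
    return row + [" "] * (width - len(row))


print(to_dynamic_row("In - this - case - 3", 5))
print(to_dynamic_row("Other - items - 2", 5))
```

The count stays a string here because Redis list elements are stored as strings anyway.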

Continuing with the fixed 4-column solution (RPUSH appends to the tail, so row n ends up at zero-based index n-1 of each column list):

   # first row
   $ redis-cli rpush column1 "In"
   (integer) 1

   $ redis-cli lpush In.list 1
   (integer) 1

   $ redis-cli rpush column2 "this"
   (integer) 1
   $ redis-cli rpush column3 "case"
   (integer) 1
   $ redis-cli rpush column4 3
   (integer) 1

   # second row
   $ redis-cli rpush column1 "Other"
   (integer) 2

   $ redis-cli lpush Other.list 2
   (integer) 1

   $ redis-cli rpush column2 "items"
   (integer) 2
   $ redis-cli rpush column3 " "
   (integer) 2
   $ redis-cli rpush column4 2
   (integer) 2

   # in the same way add the 3rd and 4th rows, and then the 5th row
   $ redis-cli rpush column1 "In"
   (integer) 5

   $ redis-cli lpush In.list 5
   (integer) 2

   $ redis-cli rpush column2 "other"
   (integer) 5
   $ redis-cli rpush column3 "terms"
   (integer) 5
   $ redis-cli rpush column4 2
   (integer) 5

   To fetch data you can do something like:
   $ redis-cli lrange In.list 0 -1
   1) "5"
   2) "1"

   Use these two values as row numbers to query the columns. LINDEX is zero-based, so row 5 is at index 4:
   $ redis-cli lindex column1 4
   "In"

   $ redis-cli lindex column2 4
   "other"
   $ redis-cli lindex column3 4
   "terms"
   $ redis-cli lindex column4 4
   "2"

The second solution introduces the cost of inserting each string into a separate list, although you could use bulk operations to perform the inserts. We also store blanks so that every row has a well-defined shape.
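
The column layout can be sketched in application code as follows. Everything here is illustrative: `FakeListStore` is a hypothetical in-memory stand-in for the list commands involved, and `insert_row` and `rows_for` are made-up helper names. The sketch assumes the column lists are filled with RPUSH, so that row number n sits at zero-based index n-1, and that all four columns are always pushed together.

```python
class FakeListStore:
    """Hypothetical in-memory stand-in for the Redis list commands used below."""

    def __init__(self):
        self._lists = {}

    def rpush(self, key, value):
        # RPUSH appends at the tail and returns the new list length.
        self._lists.setdefault(key, []).append(value)
        return len(self._lists[key])

    def lpush(self, key, value):
        # LPUSH inserts at the head and returns the new list length.
        self._lists.setdefault(key, []).insert(0, value)
        return len(self._lists[key])

    def lindex(self, key, index):
        # LINDEX is zero-based; out-of-range lookups yield None (nil in Redis).
        items = self._lists.get(key, [])
        return items[index] if 0 <= index < len(items) else None

    def lrange(self, key, start, stop):
        # stop is inclusive; only -1 ("last element") is handled here.
        items = self._lists.get(key, [])
        stop = len(items) if stop == -1 else stop + 1
        return items[start:stop]


COLUMNS = ("column1", "column2", "column3", "column4")


def insert_row(store, row):
    """Append one 4-value row and record its row number under '<first word>.list'."""
    if len(row) != len(COLUMNS):
        raise ValueError("row must have exactly 4 values; pad with ' ' if needed")
    lengths = [store.rpush(column, value) for column, value in zip(COLUMNS, row)]
    row_number = lengths[0]  # 1-based, same as the Index column in the table
    store.lpush(f"{row[0]}.list", row_number)
    return row_number


def rows_for(store, word):
    """Rebuild every row whose first column is `word`, newest first."""
    row_numbers = store.lrange(f"{word}.list", 0, -1)
    return [
        [store.lindex(column, int(n) - 1) for column in COLUMNS]
        for n in row_numbers
    ]


store = FakeListStore()
for row in [
    ("In", "this", "case", "3"),
    ("Other", "items", " ", "2"),
    ("This", "is", "an", "5"),
    ("Lorem", "ipsum", " ", "3"),
    ("In", "other", "terms", "2"),
]:
    insert_row(store, row)

print(rows_for(store, "In"))
```

With a real client, the five pushes per row are the part that could be batched, for example in a redis-py pipeline, which is one way to do the bulk operations mentioned above.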

Solution 3:

Create a structure for each row, serialize it, and store it in the list for its key.

Row 1 serialized as "In,this,case,3":

 lpush In.list StructureRepresent1stRow

This solution could be chosen if you want to work with structures and have complex values to store.
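
A minimal sketch of the serialization step, assuming JSON as the format (the answer does not prescribe one; any serializer would do). The helper names `serialize_row`, `deserialize_row` and `key_for` are hypothetical. The resulting string is what would be pushed onto the key list in place of StructureRepresent1stRow.

```python
import json


def serialize_row(words, count):
    """Encode one itemset as a JSON string (JSON is an assumption of this sketch)."""
    return json.dumps({"words": list(words), "count": count})


def deserialize_row(payload):
    """Decode a string produced by serialize_row back into (words, count)."""
    record = json.loads(payload)
    return record["words"], record["count"]


def key_for(words):
    """List key for a row, derived from its first word, e.g. 'In.list'."""
    return f"{words[0]}.list"


words, count = ["In", "this", "case"], 3
payload = serialize_row(words, count)

# The payload would then be stored with: lpush <key_for(words)> <payload>
print(key_for(words), payload)
print(deserialize_row(payload))
```

Compared with the comma-separated string, a structured format keeps the count typed and tolerates words that themselves contain commas.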