Question

I'm trying to create a query that checks whether all items of a given list (a parameter) are contained in a table's column (which is also a list). I also need a query that checks whether at least one item of a given list (a parameter) is contained in a table's column. For example:

JDO:

Table: User
| ID | Name | Interests <List of Strings> |

Query:

List <String> gifts; // Item to query with
  1. How can I query for all users whose interests match ALL gifts? i.e. gifts should be a subset of Interests.

  2. How can I query for all users whose interests match SOME (at least one) gift? i.e. at least one gift appears in Interests.

  3. How can I query for all users ALL of whose interests match gifts? i.e. Interests should be a subset of gifts.

  4. How can I query for all users SOME (at least one) of whose interests match gifts? i.e. at least one interest appears in gifts.

Are these queries possible? If so then how? Can I use the .contains() keyword to do these queries? If so, then how? Can anyone share some examples? Any help would be highly appreciated.

Thank you.

Solution

To understand how these sorts of queries work (or don't), it's important to understand how the datastore indexes entities. When you insert an entity with a list property, the datastore breaks out each list entry into a separate index row. For example, the following entity:

Entity(User):
  id=15
  name="Jim"
  interests=["Drinking", "Banjos"]

Will result in the following index entries in the automatic indexes:

(User, "id", 15, Key("User", 1))
(User, "name", "Jim", Key("User", 1))
(User, "interests", "Drinking", Key("User", 1))
(User, "interests", "Banjos", Key("User", 1))

If you've also defined a composite index on (name, interests), the entries in that will look something like this:

("Jim", "Drinking", Key("User", 1))
("Jim", "Banjos", Key("User", 1))
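The row layout above can be imitated with a few lines of plain Java. This is only an in-memory sketch of the idea, not the datastore's actual implementation; the class and method names (IndexRowsSketch, singlePropertyRows, compositeRows) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: how a list property fans out into index rows.
final class IndexRowsSketch {

    // Automatic index: one row per value of the list property,
    // formatted as (kind, property, value, key).
    static List<String> singlePropertyRows(String kind, String property,
                                           List<String> values, String key) {
        List<String> rows = new ArrayList<>();
        for (String value : values) {
            rows.add("(" + kind + ", \"" + property + "\", \"" + value + "\", " + key + ")");
        }
        return rows;
    }

    // Composite index on (name, interests): still one row per interest,
    // with the single-valued name repeated on every row.
    static List<String> compositeRows(String name, List<String> interests, String key) {
        List<String> rows = new ArrayList<>();
        for (String interest : interests) {
            rows.add("(\"" + name + "\", \"" + interest + "\", " + key + ")");
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> interests = Arrays.asList("Drinking", "Banjos");
        String key = "Key(\"User\", 1)";
        singlePropertyRows("User", "interests", interests, key).forEach(System.out::println);
        compositeRows("Jim", interests, key).forEach(System.out::println);
    }
}
```

The point to take away is that no single row ever holds more than one interest, which is what limits the queries discussed next.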

With that established, we can address your specific queries, in order:

  1. This isn't possible, because each entry is on its own index row. You could index for it by storing a list with one entry for every subset of interests and doing an equality query on that, but the number of subsets doubles with every added interest (2^n subsets for n interests). It's probably a bad idea if you expect a user to have more than, say, 5 interests, unless you can put more constraints on the problem.
  2. You can use an "IN" filter, e.g. 'SELECT * FROM User WHERE Interests IN ["Drinking", "Banjos"]'. This will match any record that has at least one of "Drinking" and "Banjos" as interests. Note that the SDK breaks this out into multiple equality subqueries, so it's equivalent to executing as many queries as you have entries in the query list.
  3. This is the inverse of 1. Again, it's not really practical, unless you've stored the complete (sorted) list of interests as a string, and you're prepared to execute a separate query for every subset of your list of gifts.
  4. If I'm not mistaken, this is the same as 2 - you're looking for any intersection between the two sets, which is a symmetrical operation.
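To make the behaviour described in item 2 concrete, here is a minimal in-memory sketch in plain Java of an "IN" filter run as one equality lookup per gift, with the results merged and de-duplicated. The toy index and the names InFilterSketch, equalityQuery and matchAny are assumptions for illustration only; they are not part of JDO or the datastore SDK.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: "IN" as a union of equality subqueries.
final class InFilterSketch {

    // Toy single-property index: interest value -> ids of users having that value.
    private final Map<String, Set<Long>> interestIndex = new HashMap<>();

    // Index one user: one entry per interest, like the index rows above.
    void put(long userId, List<String> interests) {
        for (String interest : interests) {
            interestIndex.computeIfAbsent(interest, k -> new TreeSet<>()).add(userId);
        }
    }

    // A single equality lookup against the index.
    Set<Long> equalityQuery(String interest) {
        return interestIndex.getOrDefault(interest, Collections.emptySet());
    }

    // One equality subquery per gift; the TreeSet removes duplicates
    // for users that match more than one gift.
    Set<Long> matchAny(List<String> gifts) {
        Set<Long> result = new TreeSet<>();
        for (String gift : gifts) {
            result.addAll(equalityQuery(gift));
        }
        return result;
    }

    public static void main(String[] args) {
        InFilterSketch index = new InFilterSketch();
        index.put(1L, Arrays.asList("Drinking", "Banjos"));
        index.put(2L, Arrays.asList("Chess"));
        index.put(3L, Arrays.asList("Banjos", "Chess"));
        System.out.println(index.matchAny(Arrays.asList("Drinking", "Banjos")));
    }
}
```

Because the result is just the set of users sharing at least one value with gifts, the same lookup answers both item 2 and item 4.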

For numbers 1 and 3, you may be able to get by with a coarser filter (for instance, searching on one or two interests) and then filtering the results in memory for exact matches.
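The in-memory step might look roughly like the following plain-Java sketch. It assumes the coarse query has already returned a list of candidates; the User class here is a hypothetical stand-in for your persisted entity, and the method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: exact subset checks on coarse query results.
final class InMemoryFilterSketch {

    // Stand-in for the persisted User entity.
    static final class User {
        final String name;
        final Set<String> interests;

        User(String name, String... interests) {
            this.name = name;
            this.interests = new HashSet<>(Arrays.asList(interests));
        }
    }

    // Case 1: keep users whose interests contain every gift.
    static List<String> allGiftsMatch(List<User> candidates, Set<String> gifts) {
        List<String> names = new ArrayList<>();
        for (User user : candidates) {
            if (user.interests.containsAll(gifts)) {
                names.add(user.name);
            }
        }
        return names;
    }

    // Case 3: keep users whose every interest is one of the gifts.
    static List<String> allInterestsMatch(List<User> candidates, Set<String> gifts) {
        List<String> names = new ArrayList<>();
        for (User user : candidates) {
            if (gifts.containsAll(user.interests)) {
                names.add(user.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Pretend these came back from a coarse query on "Banjos".
        List<User> candidates = Arrays.asList(
                new User("Jim", "Drinking", "Banjos"),
                new User("Ann", "Banjos"),
                new User("Bob", "Banjos", "Chess", "Drinking"));
        Set<String> gifts = new HashSet<>(Arrays.asList("Drinking", "Banjos"));
        System.out.println(allGiftsMatch(candidates, gifts));
        System.out.println(allInterestsMatch(candidates, gifts));
    }
}
```

How well this works depends on how selective the coarse filter is: the fewer candidates it returns, the less there is to fetch and discard in memory.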

Licensed under: CC-BY-SA with attribution