Question

I have a huge database of Twitter users. Basically I need to save three values for each user.

  1. The time the user account is updated (last_update)
  2. The latest tweet id (latest_tweet_id)
  3. The earliest tweet id (earliest_tweet_id)

I would like to move this data into Redis for faster queries. Here's how it works:

Scenario One: Every time I update or check a user's profile, I need to save the time of this update. At the same time, I need to capture the user's latest and earliest tweet ids (if they have changed). This part is easy, and I can see how Redis hashes would manage this kind of data. My dilemma, however, is how to keep these hashes sorted by the last_update value so that I can fetch the least recently updated records first and thereby rotate through all the records cyclically.

Scenario Two: The other option I have is to save the data twice:

  1. As a sorted set where last_update acts as the score and user_id as the member
  2. As a hash keyed by user_id that holds the tweet ids

This second solution requires querying the sorted set for the lowest-scored (least recently updated) user_id and then using that user_id to fetch the tweet ids from the hash. But this duplicates my data, and RAM is expensive, so I would prefer a solution that sorts the hashes themselves, as in Scenario One.

Currently, these queries are being performed via MySQL and I haven't tried either solution as I cannot find a good answer for the first preferred scenario.

Any insights or solutions will be appreciated. Thanks.

Solution

Scenario Two, which uses a sorted set, is the preferred solution.

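A sorted set is efficient and well suited to range queries by score, such as fetching the n members with the lowest scores (here, the least recently updated users). Redis hashes have no ordering by field value, so Scenario One cannot be done with hashes alone.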
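
As a rough illustration of Scenario Two, here is a minimal Python sketch. The key names (`users:by_last_update`, `user:<id>`), the helper functions, and the `FakeRedis` class are all assumptions made for this example. `FakeRedis` is a small in-memory stand-in that mimics only the few redis-py calls used (`hset`, `hgetall`, `zadd`, `zrange`) so the sketch runs without a server. With a real server you would presumably pass something like `redis.Redis(decode_responses=True)` instead, in which case hash values come back as strings.

```python
import time

INDEX_KEY = "users:by_last_update"  # assumed key name for the sorted set


class FakeRedis:
    """Hypothetical in-memory stand-in for the redis-py calls used below."""

    def __init__(self):
        self.hashes = {}
        self.zsets = {}

    def hset(self, name, mapping):
        self.hashes.setdefault(name, {}).update(mapping)

    def hgetall(self, name):
        return dict(self.hashes.get(name, {}))

    def zadd(self, name, mapping):
        # mapping is {member: score}; re-adding a member overwrites its score.
        self.zsets.setdefault(name, {}).update(mapping)

    def zrange(self, name, start, end):
        # Members ordered by score (ties broken by member); `end` is inclusive.
        items = sorted(self.zsets.get(name, {}).items(), key=lambda kv: (kv[1], kv[0]))
        ordered = [member for member, _ in items]
        stop = None if end == -1 else end + 1
        return ordered[start:stop]


def record_update(client, user_id, latest_tweet_id, earliest_tweet_id, now=None):
    """Store the tweet ids in a per-user hash and the update time as the score."""
    now = time.time() if now is None else now
    client.hset(
        f"user:{user_id}",
        mapping={
            "latest_tweet_id": latest_tweet_id,
            "earliest_tweet_id": earliest_tweet_id,
        },
    )
    client.zadd(INDEX_KEY, {str(user_id): now})


def least_recently_updated(client, count):
    """Return (user_id, fields) for the `count` lowest scores; count must be >= 1."""
    user_ids = client.zrange(INDEX_KEY, 0, count - 1)
    return [(uid, client.hgetall(f"user:{uid}")) for uid in user_ids]


client = FakeRedis()
record_update(client, 101, 9005, 9001, now=300)
record_update(client, 102, 8007, 8002, now=100)
record_update(client, 103, 7009, 7003, now=200)

stale = least_recently_updated(client, 2)  # users 102 and 103, oldest first

# Refreshing user 102 gives it the highest score, moving it to the back of the rotation.
record_update(client, 102, 8010, 8002, now=400)
rotation = client.zrange(INDEX_KEY, 0, -1)
```

In this layout the only data stored twice is the user_id: last_update lives solely in the sorted-set score and the tweet ids live solely in the hash, which may ease the RAM concern raised in the question.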

Licensed under: CC-BY-SA with attribution