Question

I have a huge database of Twitter users. Basically I need to save three values for each user.

  1. The time the user account is updated (last_update)
  2. The latest tweet id (latest_tweet_id)
  3. The earliest tweet id (earliest_tweet_id)

I would like to move this data into Redis for faster queries. Here's how it works:

Scenario One: Every time I update or check a user's profile, I need to save the time of this update. At the same time, I need to capture the user's latest and earliest tweet ids (if they have changed). This part is easy: Redis hashes can manage this kind of data. My dilemma, however, is how to keep these hashes sorted by the last_update value so that I can fetch the least recently updated records first and rotate through all the records cyclically.

Scenario Two: The other option I have is to save the data twice:

  1. As a sorted set where last_update acts as the score and user_id as the member
  2. As a second, hash-based dataset keyed by user_id

This second solution requires querying the sorted set for the lowest-scored (least recently updated) user_id and then using that user_id to fetch the tweet ids from the hashed dataset. But this duplicates my data, and RAM is expensive, so I'm looking first for a solution that lets me sort the hashes themselves.

Currently, these queries are being performed via MySQL and I haven't tried either solution as I cannot find a good answer for the first preferred scenario.

Any insights or solutions will be appreciated. Thanks.


Solution

Scenario Two, which uses a sorted set, is the preferred solution.

A sorted set is efficient and well suited to range queries by score, such as fetching the n members with the lowest (or highest) scores.
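
As a rough illustration of Scenario Two, here is a minimal Python sketch. It assumes a redis-py style client `r` created with `decode_responses=True` (so members come back as strings); the key names `users:by_last_update` and `user:<id>` and the helper names are made up for this example. To limit the duplication the question worries about, this sketch keeps last_update only as the sorted-set score and stores just the two tweet ids in each hash, which is one possible layout rather than the only one.

```python
import time

# Hypothetical key names, chosen for illustration only.
UPDATE_INDEX = "users:by_last_update"


def user_key(user_id):
    return f"user:{user_id}"


def record_update(r, user_id, latest_tweet_id, earliest_tweet_id, now=None):
    """Store the tweet ids in a hash and the update time as a sorted-set score."""
    now = time.time() if now is None else now
    r.hset(user_key(user_id), mapping={
        "latest_tweet_id": latest_tweet_id,
        "earliest_tweet_id": earliest_tweet_id,
    })
    # ZADD overwrites the score of an existing member, so each user
    # appears exactly once in the index.
    r.zadd(UPDATE_INDEX, {str(user_id): now})


def stalest_users(r, count=10):
    """Return (user_id, last_update, fields) for the least recently updated users."""
    result = []
    # ZRANGE orders by ascending score, so index 0 is the oldest update.
    for member, score in r.zrange(UPDATE_INDEX, 0, count - 1, withscores=True):
        result.append((member, score, r.hgetall(user_key(member))))
    return result
```

With a real server, `r` would be something like `redis.Redis(decode_responses=True)`. Calling `record_update` again after processing a user moves that user to the end of the index, which gives the cyclic rotation described in the question.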

Licensed under: CC-BY-SA with attribution