Question

I am working on analytics. To find the location of a user, I am using MaxMind data.

Currently I am using kuno to fetch the location information from an IP address.

For example:

Input is an IP address (1.0.0.0)

Output is a location (Australia)

Problem

kuno uses a .dat or .csv file to look up the location for an IP. My app is supposed to handle 1000 req/sec. Each request has to fetch the location information, so opening the .dat or .csv file 1000 times per second throws errors and hurts performance.

So I decided to port the data (CSV) file to a Redis database.

Below is the format of the CSV file:

Start ip,   End ip,   Location

"1.0.0.0","1.0.0.255","Australia"
"1.0.1.0","1.0.3.255","China"
"1.0.4.0","1.0.7.255","Australia"
"1.0.8.0","1.0.15.255","China"
"1.0.16.0","1.0.31.255","Japan"
"1.0.32.0","1.0.63.255","China" 

Two things confuse me:

  1. How do I find the range that contains a given IP address by matching it against the start and end IP addresses?

  2. Is using Redis for this purpose a good approach?

Any help or suggestion would be great.

I am ready to explain more if anything is confusing.


Solution

Using Redis for this is a great idea.

Assuming you are handling IPv4 addresses only, you could use the first three parts of the IP address. It doesn't look like the last part is significant when looking for the country.

I think you could store all of the data in a single Redis sorted set. The members will be the country names, and the score of each member will be the end IP address of its range, converted to decimal.

For example:

"1.0.0.0","1.0.0.255","Australia"
 1.0.0 ==> score = 1x256x256 + 0x256 + 0, which gives a score of 65536 for Australia

Register this in a Redis Sorted Set named 'countries':

ZADD countries 65536 "Australia@65536"

NB: I append the score to the country name just to ensure the stored member is unique. This makes it possible to register more than one IP range for each country.
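As a minimal sketch of the arithmetic above, the score and the member name could be computed like this in Python. The function names `range_score` and `member_name` are made up for illustration, and the code assumes well-formed dotted IPv4 strings.

```python
def range_score(ip: str) -> int:
    """Score built from the first three octets of a dotted IPv4 address."""
    a, b, c, _ = (int(part) for part in ip.split("."))  # last octet is ignored
    return a * 256 * 256 + b * 256 + c


def member_name(country: str, score: int) -> str:
    """Append the score so that every range gets its own sorted-set member."""
    return f"{country}@{score}"


score = range_score("1.0.0.255")  # end IP of the first CSV row
print(score, member_name("Australia", score))  # 65536 Australia@65536
```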

1.0.3 ==> score = 1x256x256 + 0x256 + 3, i.e. score = 65539 for China (the end IP of the second range is 1.0.3.255)

ZADD countries 65539 "China@65539"

1.0.7 ==> score = 65543 for Australia

1.0.15 ==> score = 65551 for China

And so on...
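The loading step could be sketched as follows, assuming the CSV has exactly the three quoted columns shown in the question and no header row. `build_members` is a hypothetical helper; it only prints the ZADD commands, so it runs without a Redis server. With the redis-py client, a mapping of this shape can be passed to `zadd` instead.

```python
import csv
import io

# Sample rows copied from the question (assumed layout: start IP, end IP, country).
CSV_TEXT = '''"1.0.0.0","1.0.0.255","Australia"
"1.0.1.0","1.0.3.255","China"
"1.0.4.0","1.0.7.255","Australia"
"1.0.8.0","1.0.15.255","China"
'''


def range_score(ip):
    """Score built from the first three octets of a dotted IPv4 address."""
    a, b, c, _ = (int(p) for p in ip.split("."))
    return (a * 256 + b) * 256 + c


def build_members(csv_file):
    """Return {member: score}, scored on the END IP of each range."""
    members = {}
    for _start_ip, end_ip, country in csv.reader(csv_file):
        score = range_score(end_ip)
        members[f"{country}@{score}"] = score
    return members


members = build_members(io.StringIO(CSV_TEXT))
for member, score in members.items():
    print(f'ZADD countries {score} "{member}"')
```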

When you want to check a given IP address, apply the same algorithm. For example, if you want to find the country for 1.0.5.23, you compute a score of

1x65536 + 0x256 + 5 = 65541

Now you make a request to Redis:

ZRANGEBYSCORE countries 65541 +inf LIMIT 0 1

It will return the first member with a score of at least 65541, here "Australia@65543". That gives you the name of the country (of course you'll have to drop the "@score" suffix of the returned string).
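To illustrate what this query does, here is a small Python sketch that mimics it in memory. `InMemorySortedSet` is a hypothetical stand-in, not a Redis API; it exists only so the example runs without a server. With the redis-py client, the equivalent call would be `zrangebyscore("countries", score, "+inf", start=0, num=1)`.

```python
import bisect


class InMemorySortedSet:
    """Hypothetical stand-in for one Redis sorted set (illustration only)."""

    def __init__(self, members):
        # Keep (score, member) pairs ordered by score, as a sorted set does.
        self._pairs = sorted((score, member) for member, score in members.items())
        self._scores = [score for score, _ in self._pairs]

    def first_at_or_above(self, score):
        """Mimic: ZRANGEBYSCORE key <score> +inf LIMIT 0 1."""
        i = bisect.bisect_left(self._scores, score)
        return self._pairs[i][1] if i < len(self._pairs) else None


def ip_score(ip):
    """Score built from the first three octets of a dotted IPv4 address."""
    a, b, c, _ = (int(p) for p in ip.split("."))
    return (a * 256 + b) * 256 + c


def country_for(sorted_set, ip):
    member = sorted_set.first_at_or_above(ip_score(ip))
    if member is None:
        return None  # the IP lies above every stored range
    return member.rsplit("@", 1)[0]  # drop the "@score" suffix


countries = InMemorySortedSet({
    "Australia@65536": 65536,
    "China@65539": 65539,
    "Australia@65543": 65543,
    "China@65551": 65551,
})
print(country_for(countries, "1.0.5.23"))  # Australia
```

Note that this scheme assumes the ranges are contiguous: an IP that falls in a gap between two ranges would be matched to the next range above it.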

The performance of the search will depend on the number of items in the sorted set. If N is this number, the time complexity of the search will be O(Log(N)).

I have no idea of the number of items in your set (that is to say, the number of IP ranges), but if you run into any performance problem, you can split the data across multiple sorted sets.

Use the first part of the IP address as a part of the sorted set key (countries:1 stores the data for IP addresses from 1.0.0.0 to 1.255.255.255, countries:2 stores the data for IP addresses from 2.0.0.0 to 2.255.255.255, etc).

Then use the same principle as above but with a score computed from the second and third parts of the end IP address, and look in the sorted set corresponding to the first part.
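A minimal sketch of the split scheme, with a hypothetical helper `shard_key_and_score`. It assumes that no IP range crosses a first-octet boundary; a range that does would need to be stored in each sorted set it touches, otherwise a lookup in the lower set could come back empty.

```python
def shard_key_and_score(ip, prefix="countries"):
    """Return (sorted-set key, score) for the per-first-octet layout."""
    a, b, c, _ = (int(p) for p in ip.split("."))
    key = f"{prefix}:{a}"   # e.g. "countries:1" for 1.x.x.x
    score = b * 256 + c     # second and third octets only
    return key, score


# Load side: use the END IP of a range.
print(shard_key_and_score("1.0.7.255"))  # ('countries:1', 7)
# Lookup side: use the IP being resolved.
print(shard_key_and_score("1.0.5.23"))   # ('countries:1', 5)
```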

Licensed under: CC-BY-SA with attribution