Question

Maxmind has two free GeoLite2 cvs files that we can use to look up locations associated with IP address ranges. I want to create a stored proc in SQL Server that uses the imported tables ([GeoLite2_City_Blocks] and [GeoLite2_City_Locations]), taking a varchar(45) string passed in IP address and returns the location associated with that IP address. I would have thought related T-SQL code would exist on the web, but I cannot find it.

I checked out Querying GeoLite2 Country CSV in SQL, but I simply don't understand what Prefix_Length (which is called Network_mask_Length in the Blocks table), does or is. If they are the same as CIDR Prefixes (see: http://en.wikipedia.org/wiki/IPv6_subnetting_reference), I am even more confused, since the Network_mask_Length could be 118, 119, or any of many other numbers not shown on that Wikipedia page.

I have seen a web page talking about IPv6 in SQL Server that suggested padding with zeros for doing comparisons. I suppose that the free GeoLite2 cannot simply be queried without at least an additional column being added to the table for the end IP address of the range?

I suppose I could strip out "::ffff:" from in front of the IPv4 addresses, if that is the correct way to say that. But still I don't get how to translate that network_mask_length (prefix) field into something I understand. And then zero pad the IPv6 addresses, but then I still don't know what that prefix is about.

My questions are:

  1. Is there sample code available anywhere which can get me on the right track? Or,
  2. What do I need to do to create a proc that will accept an address in IPv4 or IPv6, and return the relevant location or geoname_id?
Was it helpful?

Solution

The prefix length is the number of significant bits in the network address. For any IPv6 address, this is a number between 0-128 as IPv6 addresses are 128 bits in length. For instance an network with a prefix of 128 would contain just one IP address. A network with a prefix of 127 would contain two addresses. 126 would have 4, and so on.

IPv6 addresses of the form ::FFFF:10.11.12.13, as you note, correspond to the IPv4 address with the '::FFFF:' removed. The '::FFFF:' corresponds to the first 96 bits of the address. If the IPv6 address had a prefix length of 126, the IPv4 address would have a prefix length of 30 (i.e., the last two bits are significant).

A Google search should reveal code snippets to convert CIDR notation to ranges. If you only care about IPv4, it might be easiest to pre-filter the data, removing the ::FFFF:, subtracting 96 from the prefix length, and discarding the address ranges not mapped to IPv4.

OTHER TIPS

I used this query (actually its MySQL) to import the Block-File, convert the IP adresses to from-to ip ranges to do a simple BETWEEN query.

I added two fields to my table: start_ip and end_ip. start_ip is filled with the numeric value of network_start_ip, end_ip is filled with the end of the ip range calculated using the network_mask_length. As @oschwald noted, I substracted 96 from the prefix length and removed the ::FFFF:.

LOAD DATA LOCAL INFILE 'GeoLite2-City-Blocks.csv' 
INTO TABLE geoip_blocks 
COLUMNS TERMINATED BY ',' 
OPTIONALLY ENCLOSED BY '"' 
LINES STARTING BY '::ffff:' -- import only ipv4 and cut off the ipv6 prefix
IGNORE 2 LINES 
(network_start_ip, network_mask_length, geoname_id, 
  registered_country_geoname_id, represented_country_geoname_id, postal_code, 
  latitude, longitude, is_anonymous_proxy, is_satellite_provider)
SET start_ip = inet_aton(network_start_ip),  -- set start ip as given
  end_ip = inet_aton(network_start_ip) + 
   (pow(2, (32 - (network_mask_length -96))) - 1) - 1, -- calc end_ip using mask
  network_mask_length = network_mask_length - 96 -- subtract 96 as we use ipv4

Query the data using

SELECT geoname_id
FROM geoip_blocks
WHERE
INET_ATON('123.123.123.123') BETWEEN start_ip AND end_ip
LIMIT 1
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top