I split this answer into two sections for the sake of clarity.
IP Geolocation
You may want to stick with MaxMind unless you have a very good reason to question the MaxMind data. I built a very similar service to the one you are describing a few years ago and, like you, wanted a way to verify MaxMind's accuracy. I evaluated more than ten IP geolocation solutions running the entire gamut, from free JSON APIs to enterprise-centric database subscriptions. It became apparent rather quickly that most of the platforms were either using MaxMind directly or combining MaxMind data with metadata from other sources. The spelling, capitalization, and common abbreviations of ISP metadata
This paper, despite being a few years old, is also quite telling. The authors assess the accuracy of a handful of IP geolocation tools (including MaxMind) by comparing their results to a dataset they refer to as "ISP Groundtruth", a mashup of EU ISP router data and the actual GPS coordinates of the routers. The paper also offers a technical explanation of why geolocation data is inaccurate at the city level.
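If you want to run a similar comparison yourself, here is a minimal sketch of one way to score a provider against known coordinates. It is not the paper's methodology: the 40 km "city-level" threshold, the function names, and the sample IPs (taken from the documentation range 192.0.2.0/24) are all my own assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def city_level_accuracy(estimates, ground_truth, threshold_km=40.0):
    """Fraction of IPs whose estimate lies within threshold_km of the known location.

    Both arguments map an IP string to a (lat, lon) tuple. Only IPs present in
    both mappings are scored. The 40 km default is an assumed cutoff.
    """
    shared = [ip for ip in estimates if ip in ground_truth]
    if not shared:
        return 0.0
    hits = sum(
        1
        for ip in shared
        if haversine_km(*estimates[ip], *ground_truth[ip]) <= threshold_km
    )
    return hits / len(shared)


if __name__ == "__main__":
    # Hypothetical data: one estimate close to the truth, one far off.
    provider = {"192.0.2.1": (52.52, 13.405), "192.0.2.2": (48.8566, 2.3522)}
    truth = {"192.0.2.1": (52.50, 13.40), "192.0.2.2": (52.52, 13.405)}
    print(city_level_accuracy(provider, truth))
```

Running the same scoring over two providers with the same ground truth gives you a like-for-like number, which is more useful than comparing the providers only against each other.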
Proxy Scanning
With respect to automated proxy scanning, I highly recommend checking out Nmap and its Lua-based scripting engine, the Nmap Scripting Engine (NSE). Here are a few scripts and libraries you may find useful:
- open proxy detection
- proxy testing
- IP geolocation
- database support
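
To automate this from your own service, something along these lines may work as a starting point. It is a rough sketch that shells out to Nmap from Python. The `http-open-proxy` and `socks-open-proxy` scripts ship with Nmap, but they are my picks and not necessarily the ones linked above; the port list and the function names are assumptions as well.

```python
import shutil
import subprocess

# NSE scripts bundled with Nmap; chosen here as an assumption.
PROXY_SCRIPTS = "http-open-proxy,socks-open-proxy"


def build_nmap_command(targets, ports=(1080, 3128, 8080)):
    """Build an nmap argument list that runs the proxy scripts against targets.

    The default ports are common proxy ports picked for illustration only.
    """
    if not targets:
        raise ValueError("at least one target is required")
    port_list = ",".join(str(p) for p in ports)
    # -oX - writes the XML report to stdout so the caller can parse it.
    return ["nmap", "-p", port_list, "--script", PROXY_SCRIPTS, "-oX", "-", *targets]


def run_scan(targets, ports=(1080, 3128, 8080)):
    """Run the scan and return Nmap's XML output as a string."""
    if shutil.which("nmap") is None:
        raise RuntimeError("nmap was not found on PATH")
    result = subprocess.run(
        build_nmap_command(targets, ports),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from the subprocess call makes it easy to log or unit test the exact arguments without touching the network. The XML returned by `run_scan` can then be parsed with `xml.etree.ElementTree` and fed into whatever geolocation and database steps you settle on.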