UPDATE: From the Discogs documentation: Requests are throttled by the server to one per second per IP address. Your application should (but doesn't have to) take this into account and throttle requests locally, too.
The bottleneck seems to be at the (discogs) server end, retrieving individual releases. There is nothing you can really do about that, except give them money for faster servers.
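Throttling locally is still polite and avoids server-side rejections. A minimal sketch of a local throttle (the class name and interface are my own, not part of the discogs module):

```python
import time

class Throttle:
    """Enforce a minimum interval between calls (Discogs allows 1 request/second)."""
    def __init__(self, interval=1.0):
        self.interval = interval
        self._last = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        # Sleep just long enough that calls are at least `interval` apart.
        elapsed = time.monotonic() - self._last
        if elapsed < self.interval:
            time.sleep(self.interval - elapsed)
        self._last = time.monotonic()

throttle = Throttle(1.0)
# Call throttle.wait() immediately before each requests.get(...).
```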
My suggestion would be to cache the results; it's probably the only thing that will help. Rewrite discogs.APIBase._response as follows:
def _response(self):
    if not self._cached_response:
        self._cached_response = self._load_response_from_disk()
    if not self._cached_response:
        if not self._check_user_agent():
            raise UserAgentError("Invalid or no User-Agent set.")
        self._cached_response = requests.get(self._uri, params=self._params, headers=self._headers)
        self._save_response_to_disk()
    return self._cached_response
An alternative approach is to write requests to a log and reply "we don't know, try again later"; then, in another process, read the log, download the data, and store it in a database. When the client comes back later, the requested data will already be there.
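That log-and-worker split could look something like this (the file format, function names, and retry hint are all assumptions; real code would also need locking around the log file):

```python
import json
import time

LOG = "pending_requests.log"

def enqueue_request(uri, params):
    # Front end: record the request and tell the caller to retry later.
    with open(LOG, "a") as f:
        f.write(json.dumps({"uri": uri, "params": params}) + "\n")
    return {"status": "pending", "retry_after": 60}

def process_log(fetch, store):
    # Worker process: replay the log, honouring the 1 request/second limit.
    with open(LOG) as f:
        for line in f:
            req = json.loads(line)
            store(req["uri"], fetch(req["uri"], req["params"]))
            time.sleep(1)
```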
You would need to write _load_response_from_disk() and _save_response_to_disk() yourself. The stored data should be keyed on _uri, _params, and _headers, and should include a timestamp alongside the response. If the data is too old or not found, return None. How old is too old depends on whether the numbering is persistent; I have no idea, so I would start in the order of days to weeks and work up to months. The storage would have to handle concurrent access and have fast indexes, which probably means a database.
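A minimal sketch of what those two helpers could look like, using sqlite3 (the table layout, the 30-day staleness window, and the standalone function signatures are all assumptions; in the class above they would become methods using self._uri etc.):

```python
import json
import sqlite3
import time

MAX_AGE = 30 * 24 * 3600  # assumed staleness window: ~30 days

def _make_key(uri, params, headers):
    # Serialize the request deterministically so it can act as a primary key.
    return json.dumps([uri, sorted((params or {}).items()),
                       sorted((headers or {}).items())])

def load_response_from_disk(db, uri, params=None, headers=None):
    row = db.execute("SELECT body, ts FROM cache WHERE key = ?",
                     (_make_key(uri, params, headers),)).fetchone()
    if row is None or time.time() - row[1] > MAX_AGE:
        return None  # not cached, or too old: caller falls through to requests.get
    return row[0]

def save_response_to_disk(db, uri, params, headers, body):
    db.execute("INSERT OR REPLACE INTO cache (key, body, ts) VALUES (?, ?, ?)",
               (_make_key(uri, params, headers), body, time.time()))
    db.commit()

db = sqlite3.connect("discogs_cache.db")
db.execute("CREATE TABLE IF NOT EXISTS cache "
           "(key TEXT PRIMARY KEY, body TEXT, ts REAL)")
```

SQLite handles the indexing (the primary key) for free; for heavy concurrent access you would want a server database instead.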