The issue might be either a site that takes a long time to load, or a cycle in your website's link graph, i.e. page1 (Main Page) has a link to page2 (Terms of Service), which in turn links back to page1, so a crawler that doesn't remember visited pages never finishes. You could try this snippet to see how long it takes to get a response from a website (snippet usage included).
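To illustrate the cycle case, here is a minimal sketch of a crawl loop that remembers visited URLs and records how long each page took. It is not the linked snippet; `crawl`, `get_links` and `max_pages` are hypothetical names, and `get_links` stands in for whatever your script does to download a page and extract its links.

```python
import time
from collections import deque


def crawl(start_url, get_links, max_pages=100):
    """Breadth-first crawl that never visits the same URL twice.

    get_links is a hypothetical callable (URL -> iterable of linked URLs);
    in a real script it would download the page and extract its links.
    """
    visited = set()
    timings = {}
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue  # already seen: this check is what breaks link cycles
        visited.add(url)
        started = time.perf_counter()
        links = get_links(url)
        timings[url] = time.perf_counter() - started  # seconds spent on this page
        for link in links:
            if link not in visited:
                queue.append(link)
    return visited, timings


# Two pages linking to each other, as in the example above.
graph = {"page1": ["page2"], "page2": ["page1"]}
visited, timings = crawl("page1", lambda url: graph.get(url, []))
```

Without the `visited` set the loop above would bounce between page1 and page2 until `max_pages` stopped it; the `timings` dict should show which pages are slow.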
Regarding your last question:
I'm pretty sure requests doesn't parse your response's content (except for the .json() method). What you might be experiencing is a link to a resource, like <a href="http://www.example.com/very_big_file.exe">Free Cookies!</a>, which your script would visit. requests has a mechanism to counter such a case, see this for reference. Moreover, the aforementioned technique allows checking the Content-Type header to make sure you're downloading only the pages you're interested in.
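A rough sketch of that idea, assuming the mechanism in question is requests' `stream=True` option, which defers downloading the body until it is read. `fetch_html` and `max_bytes` are hypothetical names, and `session` is expected to be a `requests.Session`.

```python
def fetch_html(session, url, timeout=10, max_bytes=1_000_000):
    """Return the page text, or None if the resource should be skipped.

    session is assumed to be a requests.Session; max_bytes is an arbitrary
    illustrative limit, not a requests setting.
    """
    # With stream=True only the headers are fetched up front; the body is
    # downloaded when it is actually read (here, by response.text).
    with session.get(url, stream=True, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        if not content_type.lower().startswith("text/html"):
            return None  # e.g. an .exe or an image: skip without downloading it
        length = response.headers.get("Content-Length")
        if length is not None and length.isdigit() and int(length) > max_bytes:
            return None  # server announces a body larger than we want
        return response.text
```

With the real library this would be called as something like fetch_html(requests.Session(), "http://www.example.com/"). Note that servers may omit or misreport Content-Length, so the size check is only a best-effort guard.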