Question

I want to crawl Indian news websites and their archives (e.g. thehindu.com, indianexpress.com and timesofindia.com).

I have heard of a boilerplate library in Java that is used to extract content. Is there a library in Python that does this, and how do I use it?

If this is a repeat question, please point me to the original.

Solution

Scrapy is a popular web crawling and scraping framework for Python.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow