Question

Is it possible to scrape data from websites using Scrapy and save that data in a Microsoft SQL Server database?

If so, are there any examples of this being done? Is it mainly a Python issue? That is, if I find some Python code that saves to a SQL Server database, can Scrapy do the same?

Solution

Yes, but you'd have to write the code to do it yourself, since Scrapy does not provide a built-in item pipeline that writes to a database.

Have a read of the Item Pipeline page in the Scrapy documentation, which describes the process in more detail (it includes a JsonWriterPipeline as an example). Basically, find some code that writes to a SQL Server database (using something like pyodbc) and you should be able to adapt it into a custom item pipeline that writes items directly to a SQL Server database.
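To make that concrete, here is a minimal sketch of what such a pipeline could look like. The connection string, table name (`items`), and column/field names (`name`, `url`) are all placeholders you would replace with your own:

```python
class SqlServerPipeline:
    """Hypothetical Scrapy pipeline that inserts items into SQL Server."""

    def open_spider(self, spider):
        # pyodbc is imported here so this module still loads where the
        # driver is not installed; install it with `pip install pyodbc`.
        import pyodbc

        # Example connection string; adjust driver, server, and credentials.
        self.conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;DATABASE=scrapydb;"
            "UID=user;PWD=password"
        )
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        # Assumes a table 'items' with columns matching the item fields.
        self.cursor.execute(
            "INSERT INTO items (name, url) VALUES (?, ?)",
            item.get("name"),
            item.get("url"),
        )
        self.conn.commit()
        return item  # pass the item on to any later pipelines

    def close_spider(self, spider):
        self.conn.close()
```

The `open_spider`/`close_spider` hooks keep the connection open for the whole crawl instead of reconnecting per item.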

OTHER TIPS

Super late and complete self-promotion here, but I think this could help someone. I just wrote a small Scrapy extension, scrapy-sqlitem, that saves scraped items to a database.

It is super easy to use.

pip install scrapy_sqlitem

Define Scrapy items using SQLAlchemy tables:

from sqlalchemy import Table, Column, Integer, String, MetaData
from scrapy_sqlitem import SqlItem

metadata = MetaData()

class MyItem(SqlItem):
    sqlmodel = Table(
        'mytable', metadata,
        Column('id', Integer, primary_key=True),
        Column('name', String, nullable=False),
    )

Add the following pipeline:

from sqlalchemy import create_engine

class CommitSqlPipeline(object):

    def __init__(self):
        self.engine = create_engine("sqlite:///")

    def process_item(self, item, spider):
        item.commit_item(engine=self.engine)
        return item

Don't forget to add the pipeline to settings file and create the database tables if they do not exist.
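Those two remaining steps might look like the following sketch, assuming the classes above live in a project named `myproject` (a placeholder):

```python
from sqlalchemy import MetaData, Table, Column, Integer, String, create_engine

# In settings.py: activate the pipeline (the number sets its run order).
ITEM_PIPELINES = {
    "myproject.pipelines.CommitSqlPipeline": 300,
}

# Before the first crawl: create any tables that do not exist yet.
metadata = MetaData()
mytable = Table(
    "mytable", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String, nullable=False),
)
engine = create_engine("sqlite:///")  # swap in your real database URL
metadata.create_all(engine)  # emits CREATE TABLE only for missing tables
```

`metadata.create_all` is idempotent, so it is safe to run on every startup.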

http://doc.scrapy.org/en/1.0/topics/item-pipeline.html#activating-an-item-pipeline-component

http://docs.sqlalchemy.org/en/rel_1_1/core/tutorial.html#define-and-create-tables

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow