Automated data collection from a website into my database?

Question 1

You can use cron to schedule tasks.

Your crontab file could look something like this:

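# Minute   Hour   Day of Month       Month          Day of Week        Command
# (0-59)  (0-23)     (1-31)    (1-12 or Jan-Dec)  (0-6 or Sun-Sat)
    0        1          *             *               *           /usr/bin/python /path/to/manage.py loaddata fixturename.json

This entry runs every day at 1:00 AM. Cron does not start in your project directory, so give the full path to manage.py; /path/to/ is a placeholder for your own project location.

(Or you can use @daily /usr/bin/python /path/to/manage.py loaddata fixturename.json to run at midnight every night.)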

See the WebFaction documentation: http://docs.webfaction.com/software/general.html#scheduling-tasks-with-cron
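Since loaddata reads a fixture file, whatever collects the data has to write it in Django's fixture layout first. Here is a minimal sketch of that step, assuming the collected records are plain dicts. The model label myapp.item and the helper names are made up for illustration; use your own app and model.

```python
import json


def records_to_fixture(records, model="myapp.item"):
    """Wrap plain dicts in the structure Django's JSON fixtures use.

    `model` is a hypothetical "app_label.model_name"; replace it with yours.
    """
    return [
        {"model": model, "pk": index, "fields": dict(record)}
        for index, record in enumerate(records, start=1)
    ]


def write_fixture(records, path="fixturename.json", model="myapp.item"):
    """Write the records to `path` so that `manage.py loaddata` can read them."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records_to_fixture(records, model), handle, indent=2)


# Example with made-up data:
sample = records_to_fixture([{"title": "First"}, {"title": "Second"}])
```

Note that loaddata overwrites any existing row that has the same primary key, so numbering from 1 on every run replaces yesterday's rows rather than appending to them.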

Question 2

You could use YQL to scrape websites for you and return the results in JSON format. I use YQL extensively to get data for my apps. It's fast, and your server doesn't have to carry the load.

http://developer.yahoo.com/yql/

To run the script once a day, you can add it as a cron job:

http://docs.webfaction.com/software/general.html#scheduling-tasks-with-cron

http://garrett.im/django/sysadmin/2011/10/03/cron-django-webfaction.html
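For reference, the script that cron runs could be as small as the sketch below. It is only an illustration using the Python standard library, not the YQL approach: it assumes the data you want is the link text and href of each anchor tag, and the names LinkCollector, parse_links and scrape are invented here.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href and visible text of every <a href="..."> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"url": self._href, "title": title})
            self._href = None


def parse_links(html):
    """Return a list of {"url", "title"} dicts found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def scrape(url):
    """Download `url` and extract its links (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return parse_links(response.read().decode(charset, errors="replace"))
```

Keeping the parsing in parse_links, separate from the download in scrape, lets you check the extraction against a saved page without hitting the site.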

Question 3

You want to run a cron job. It's a simple way to get a server to run a job once or repeatedly on any schedule you set.

Also make sure you have permission to screen scrape someone else's content.
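One signal you can check automatically is the site's robots.txt. The sketch below uses Python's urllib.robotparser; the user agent name my-scraper and the sample rules are made up. Passing this check is not the same as having permission, since the site's terms of service still apply.

```python
from urllib import robotparser


def is_allowed(robots_txt, url, user_agent="my-scraper"):
    """Return True if the given robots.txt text lets `user_agent` fetch `url`."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration; in practice download the site's /robots.txt.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /private/
"""

blocked = is_allowed(SAMPLE_ROBOTS, "http://example.com/private/page")
permitted = is_allowed(SAMPLE_ROBOTS, "http://example.com/public")
```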

Question 4

Cron and celery beat are both good options. Cron is easier to set up; Celery gives you more control.

http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html
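A minimal sketch of what a daily celery beat entry can look like in a Django settings.py. It assumes your project's Celery app loads its settings with app.config_from_object("django.conf:settings", namespace="CELERY"); the task path myapp.tasks.collect_data is hypothetical.

```python
from datetime import timedelta

# Fragment for settings.py. Celery reads it only if the app is configured
# with namespace="CELERY" (an assumption about your project setup).
CELERY_BEAT_SCHEDULE = {
    "collect-data-daily": {
        # Hypothetical task path; point this at your own task function.
        "task": "myapp.tasks.collect_data",
        # Interval schedule: run roughly once every 24 hours.
        "schedule": timedelta(days=1),
    },
}
```

For a fixed clock time, such as 1:00 AM as in a crontab entry, the periodic tasks guide linked above describes celery.schedules.crontab. In both cases a beat process has to run alongside a worker, for example celery -A proj beat and celery -A proj worker, where proj stands for your project module.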