Question

I have a PostgreSQL production database (which contains a lot of data). Now I need to modify the model of the TurboGears app to add a couple of new tables to the database.

How do I do this? I am using SQLAlchemy.

Solution

This always works and requires little thinking -- only patience.

  1. Make a backup.

  2. Actually make a backup. Everyone skips step 1 thinking that they have a backup, but they can never find it or work with it. Don't trust any backup that you can't recover from.

  3. Create a new database schema.

  4. Define your new structure from the ground up in the new schema. Ideally, you'll run a DDL script that builds the new schema. Don't have a script to build the schema? Create one and put it under version control.

    With SA, you can define your tables and it can build your schema for you. This is ideal, since you have your schema under version control in Python.

  5. Move data.

    a. For tables which did not change structure, move data from old schema to new schema using simple INSERT/SELECT statements.

    b. For tables which did change structure, develop INSERT/SELECT scripts to move the data from old to new. Often, this can be a single SQL statement per new table. In some cases, it has to be a Python loop with two open connections.

    c. For new tables, load the data.

  6. Stop using the old schema. Start using the new schema. Find every program that used the old schema and fix the configuration.

    Don't have a list of applications? Make one. Seriously -- it's important.

    Applications have hard-coded DB configurations? Fix that, too, while you're at it. Either create a common config file, or use a common environment variable or something similar to (a) ensure consistency and (b) centralize the notion of "production".
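One way to centralize that notion of "production" is an environment variable every application reads instead of hard-coding its own connection string. `DATABASE_URL` is just an assumed variable name here, not anything mandated by SQLAlchemy:

```python
import os

from sqlalchemy import create_engine

# DATABASE_URL is an assumed, conventional variable name; the fallback
# SQLite URL is only so this sketch runs without any setup.
url = os.environ.get("DATABASE_URL", "sqlite:///:memory:")
engine = create_engine(url)
```

Every application that imports its engine this way points at the same database, and switching "production" means changing one variable, not hunting through configs.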

You can do this kind of procedure any time you do major surgery. It never touches the old database except to extract the data.
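The build-and-move steps above (4 and 5) can be sketched with SQLAlchemy. The table and column names are hypothetical, and in-memory SQLite engines stand in for the old and new Postgres databases:

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, text)

# In-memory engines stand in for the old and new production databases.
old_engine = create_engine("sqlite:///:memory:")
new_engine = create_engine("sqlite:///:memory:")

# Step 4: define the new structure in Python and let SA build it,
# so the schema itself lives under version control.
metadata = MetaData()
users = Table(
    "users", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(100)),
)
metadata.create_all(new_engine)

# Seed the "old" database so there is something to move.
with old_engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO users (id, name) VALUES (1, 'alice')"))

# Step 5: move data over two open connections. For an unchanged table
# this is a straight read-then-insert; a restructured table would
# transform each row inside the loop instead.
with old_engine.connect() as src, new_engine.begin() as dst:
    for row in src.execute(text("SELECT id, name FROM users")):
        dst.execute(users.insert().values(id=row.id, name=row.name))
```

Note that the old database is only ever read from, which is what makes the procedure safe to abort and retry.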

OTHER TIPS

The simplest approach is to simply write some SQL update scripts and use those to update the database. Obviously, that's a fairly low-level (as it were) approach.
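If you want to run such scripts from Python rather than psql, SQLAlchemy can execute raw DDL directly. The statements below are hypothetical examples of what a version-controlled update script might contain:

```python
from sqlalchemy import create_engine, text

# A hypothetical migration script: plain DDL statements, ideally kept
# in version control alongside the application code.
DDL_STATEMENTS = [
    "CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT)",
    "CREATE INDEX ix_audit_log_message ON audit_log (message)",
]

engine = create_engine("sqlite:///:memory:")  # substitute your Postgres URL

# engine.begin() wraps the whole script in one transaction, so a
# failure part-way through rolls everything back (Postgres supports
# transactional DDL, so this works there too).
with engine.begin() as conn:
    for stmt in DDL_STATEMENTS:
        conn.execute(text(stmt))
```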

If you think you will be doing this a lot and want to stay in Python, you might want to look at sqlalchemy-migrate. There was an article about it in a recent issue of Python Magazine.

I'd agree in general with John. One-pass SELECTing and INSERTing would not be practical for a large database, and setting up replication or multi-pass differential SELECT/INSERT runs would probably be harder and more error-prone.

Personally, I use SQLAlchemy as an ORM under TurboGears. To do schema migrations I run:

tg-admin sql status

to see the difference between the live and development schemas, then manually write (and version control) DDL scripts to make the required changes.

For those using SQLAlchemy standalone (i.e. not under TurboGears), the sql status functionality is pretty simple and can be found here in the TG source: http://svn.turbogears.org/branches/1.1/turbogears/command/sacommand.py (there are versions for older Python / SA releases in the 1.0 branch, too).

If you are just adding tables, and not modifying any of the tables which hold the existing data, you can simply add the new SQLAlchemy table definitions to model.py and run:

tg-admin sql create

This will not overwrite any of your existing tables.
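For standalone SQLAlchemy users, the rough equivalent is `MetaData.create_all()`, which checks for each table first by default (`checkfirst=True`) and only creates the ones that are missing. The table names below are hypothetical:

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, inspect)

engine = create_engine("sqlite:///:memory:")  # substitute your production URL

# The existing schema, already deployed and holding data.
metadata = MetaData()
existing = Table("existing", metadata, Column("id", Integer, primary_key=True))
metadata.create_all(engine)

# Later: add a new table definition to the same MetaData...
new_table = Table(
    "new_table", metadata,
    Column("id", Integer, primary_key=True),
    Column("label", String(50)),
)

# ...and call create_all again. checkfirst=True is the default, so
# tables that already exist are left untouched; only the new one is built.
metadata.create_all(engine)

table_names = sorted(inspect(engine).get_table_names())
```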

For schema migration, you might take a look at http://code.google.com/p/sqlalchemy-migrate/ although I haven't used it yet myself.

Always take a backup of the production database before migration activity.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow