Question

I have two Oracle tables, an old one and a new one. The old one was poorly designed (more so than mine, mind you) but there is a lot of current data that needs to be migrated into the new table that I created.

The new table has new columns, different columns.

I thought of just writing a PHP script or something with a whole bunch of string replacement... clearly that's a stupid way to do it though.

I would really like to be able to clean up the data a bit along the way as well. Some of it was stored with markup in it (ex: "First Name" with markup around it), lots of blank space, etc., so I would really like to fix all that before putting it into the new table.

Does anyone have any experience doing something like this? What should I do?

Thanks :)


Solution

I'd check out an ETL tool like Pentaho Kettle. You'll be able to query the data from the old table, transform and clean it up, and re-insert it into the new table, all with a nice WYSIWYG tool.

Here's a previous question I answered regarding data migration and manipulation with Kettle:
Using Pentaho Kettle, how do I load multiple tables from a single table while keeping referential integrity?

OTHER TIPS

I do this quite a bit - you can migrate with a simple select statement:

create table newtable as select
 o.field1,
 trim(o.oldfield2) as field3,
 cast(o.field3 as number(6)) as field4,
 (select l.pk from lookuptable l where l.value = o.field5) as field5
 -- ...and so on for the remaining columns
from
 oldtable o

There's really very little you could do with an intermediate language like php, etc that you can't do in native SQL when it comes to cleaning and transforming data.

For more complex cleanup, you can always create a SQL function that does the heavy lifting, but I have cleaned up some pretty horrible data without resorting to that. Don't forget that in Oracle you have DECODE, CASE expressions, etc.
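As a rough sketch of what such a cleanup function might look like for the markup and whitespace problems described in the question: the function name clean_text and the columns first_name and active_flag are made up for illustration, and REGEXP_REPLACE requires Oracle 10g or later. Treat it as a starting point, not a complete HTML stripper.

```sql
-- Hypothetical helper: strips anything that looks like a tag,
-- collapses runs of whitespace, and trims the result.
CREATE OR REPLACE FUNCTION clean_text (p_raw IN VARCHAR2)
  RETURN VARCHAR2
  DETERMINISTIC
IS
BEGIN
  RETURN TRIM(
           REGEXP_REPLACE(
             REGEXP_REPLACE(p_raw, '<[^>]+>', ' '),  -- tags become a space
             '[[:space:]]+', ' '));                  -- collapse whitespace
END clean_text;
/

-- Using it alongside a CASE expression (column names are assumptions):
SELECT clean_text(o.first_name) AS first_name,
       CASE
         WHEN UPPER(TRIM(o.active_flag)) IN ('Y', 'YES', '1') THEN 'Y'
         ELSE 'N'
       END AS is_active
FROM   oldtable o;
```

Replacing tags with a space instead of an empty string avoids gluing neighbouring words together; the second REGEXP_REPLACE then tidies up any doubled spaces.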

If the data volumes aren't massive and if you are only going to do this once, then it will be hard to beat a roll-it-yourself program, especially if you have some custom logic you need implemented. The time taken to download, learn & use a tool (such as Pentaho etc.) will probably not be worth your while.

Coding a select *, updating columns in memory & doing an insert into will be quickly done in PHP or any other programming language.

That being said, if you find yourself doing this often, then an ETL tool might be worth learning.

I'm working on a similar project myself - migrating data from one model containing a couple of dozen tables to a somewhat different model of similar number of tables.

I've taken the approach of creating a MERGE statement for each target table. The source query gets all the data it needs, formats it as required, then the merge works out if the row already exists and updates/inserts as required. This way, I can run the statement multiple times as I develop the solution.
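A minimal sketch of that kind of re-runnable MERGE, assuming both tables share a key column called id (that column and the field names here are placeholders, not anything from the original tables):

```sql
MERGE INTO newtable tgt
USING (
  -- The source query does all the cleanup and formatting.
  SELECT o.id,
         TRIM(o.oldfield2) AS field3
  FROM   oldtable o
) src
ON (tgt.id = src.id)
WHEN MATCHED THEN
  -- Row was loaded by an earlier run: refresh it.
  UPDATE SET tgt.field3 = src.field3
WHEN NOT MATCHED THEN
  -- Row is new: insert it.
  INSERT (id, field3)
  VALUES (src.id, src.field3);
```

Because matched rows are updated rather than duplicated, the statement can be run again after each tweak to the source query. Note that Oracle does not allow columns used in the ON clause to appear in the UPDATE SET list.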

Depends on how complex the conversion process is. If it is easy enough to express in a single SQL statement, you're all set; just create the SELECT statement and then do the CREATE TABLE / INSERT statement. However, if you need to perform some complex transformation or (shudder) split or merge any of the rows to convert them properly, you should use a pipelined table function. It doesn't sound like that is the case, though; try to stick to the single statement as the other Chris suggested above. You definitely do not want to pull the data out of the database to do the transform as the transfer in and out of Oracle will always be slower than keeping it all in the database.
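For the row-splitting case, a pipelined table function could look roughly like the sketch below. Everything named here is hypothetical: it assumes oldtable has an id column and a names column holding several values separated by semicolons, and that each value should become its own row in newtable.

```sql
-- Row shape returned by the function (hypothetical types).
CREATE OR REPLACE TYPE name_row AS OBJECT (
  id        NUMBER,
  full_name VARCHAR2(4000)
);
/
CREATE OR REPLACE TYPE name_tab AS TABLE OF name_row;
/

CREATE OR REPLACE FUNCTION split_names
  RETURN name_tab PIPELINED
IS
  v_piece VARCHAR2(4000);
  v_pos   PLS_INTEGER;
BEGIN
  FOR r IN (SELECT id, names FROM oldtable) LOOP
    v_pos := 1;
    LOOP
      -- Pick out the v_pos-th piece between semicolons.
      v_piece := REGEXP_SUBSTR(r.names, '[^;]+', 1, v_pos);
      EXIT WHEN v_piece IS NULL;  -- no more pieces (or names was NULL)
      PIPE ROW (name_row(r.id, TRIM(v_piece)));
      v_pos := v_pos + 1;
    END LOOP;
  END LOOP;
  RETURN;
END split_names;
/

-- The function can then be queried like a table.
INSERT INTO newtable (old_id, full_name)
SELECT t.id, t.full_name
FROM   TABLE(split_names()) t;
```

The data still never leaves the database, which is the point of preferring this over an external script.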

A couple more tips:

  • If the table already exists and you are doing an INSERT...SELECT statement, use the /*+ APPEND */ hint on the insert so that you are doing a bulk operation. Note that CREATE TABLE ... AS SELECT does this by default (as long as it's possible; you cannot perform bulk ops under certain conditions, e.g. if the new table is an index-organized table, has triggers, etc.).
  • If you are on 10.2 or later, you should also consider using the LOG ERRORS INTO clause to log rejected records to an error table. That way, you won't lose the whole operation if one record has an error you didn't expect.
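
Putting the two tips together might look something like this. The table and column names are placeholders; DBMS_ERRLOG.CREATE_ERROR_LOG is the standard Oracle procedure for creating the error table, which by default is named ERR$_ followed by the target table name.

```sql
-- One-time setup: creates ERR$_NEWTABLE to receive rejected rows.
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'NEWTABLE');
END;
/

-- Direct-path insert that logs bad rows instead of failing outright.
INSERT /*+ APPEND */ INTO newtable (field1, field3)
SELECT o.field1,
       TRIM(o.oldfield2)
FROM   oldtable o
LOG ERRORS INTO err$_newtable ('migration run 1')
REJECT LIMIT UNLIMITED;

-- A direct-path insert must be committed before the same session
-- can query the table again.
COMMIT;

-- Review what was rejected and why.
SELECT ora_err_number$, ora_err_mesg$, ora_err_tag$
FROM   err$_newtable;
```

One caveat worth checking against the documentation for your version: unique constraint violations raised during a direct-path insert are not captured by error logging, so if duplicates are a concern it may be safer to drop the APPEND hint for that run.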
Licensed under: CC-BY-SA with attribution