Question

I have some very large tables (to me anyway), as in millions of rows. I am loading them from a legacy system and it is taking forever. Assuming the hardware is okay and fast, how can I speed this up? I have tried exporting from one system into CSV and using SQL*Loader, which was slow. I have also tried a direct link from one system to the other so there is no intermediate CSV file, just unloading from one and loading into the other.

One person mentioned something about pre-staging tables and how that could somehow make things faster. I don't know what that is or whether it could help. I was hoping for some input. Thank you.

Oracle 11g is what is being used.

Update: my database is clustered, so I don't know if I can do anything to speed things up.


Solution

What you can try:

  • disabling all constraints and only enabling them after the load process (see the sketch after this list)
  • CTAS (create table as select)
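
For the first bullet, a minimal sketch of the pattern, with placeholder table and constraint names (target_table, target_table_pk):

ALTER TABLE target_table DISABLE CONSTRAINT target_table_pk;
-- run the bulk load here (SQL*Loader, CTAS, insert over a DB link, ...)
ALTER TABLE target_table ENABLE CONSTRAINT target_table_pk;

Keep in mind that disabling a primary key or unique constraint may drop its underlying index, so re-enabling it rebuilds the index in one pass instead of maintaining it row by row during the load.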

What you really should do: understand what your bottleneck is. Is it the network, file I/O, constraint checking ...? Then fix that problem. For me, looking at the explain plan is most of the time the first step.
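
For example, if the load is an insert-select over a DB link, the plan can be checked like this (table and link names are placeholders):

EXPLAIN PLAN FOR
  INSERT /*+ APPEND */ INTO target_table
  SELECT * FROM source_table@legacy_link;

SELECT * FROM TABLE(dbms_xplan.display);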

OTHER TIPS

As Jens Schauder suggested, if you can connect to your source legacy system via a DB link, CTAS would be the best compromise between performance and simplicity, as long as you don't need any joins on the source side.
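
A sketch of that approach, with placeholder names for the link, credentials and tables; NOLOGGING only helps if the database is not forced to log everything (see the force_logging discussion below):

CREATE DATABASE LINK legacy_link
  CONNECT TO legacy_user IDENTIFIED BY legacy_password
  USING 'LEGACY_TNS_ALIAS';

CREATE TABLE target_table NOLOGGING PARALLEL 4
AS SELECT * FROM source_table@legacy_link;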

Otherwise, you should consider using SQL*Loader and tweaking some of its settings. Using direct path, I was able to load 100M records (~10 GB) in 12 minutes on a 6-year-old ProLiant.

EDIT: I used the data format defined for the Datamation sort benchmark. The generator for it is available in the Apache Hadoop distribution. It generates records with fixed-width fields, 99 bytes of data plus a newline character per line of the file. The SQL*Loader control file I used for the numbers quoted above was:

OPTIONS (SILENT=FEEDBACK, DIRECT=TRUE, ROWS=1000)
LOAD DATA
INFILE 'rec100M.txt' "FIX 99"
INTO TABLE BENCH (
BENCH_KEY POSITION(1:10),
BENCH_REC_NBR POSITION(13:44),
BENCH_FILLER POSITION(47:98))
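
With a control file like that, the invocation is just something along these lines (credentials and file names are placeholders; DIRECT=TRUE in the OPTIONS clause is what turns on the direct path):

sqlldr userid=scott/tiger control=bench.ctl log=bench.log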

What configuration are you using? Does the database into which the data is imported have something like a standby database coupled to it? If so, it very likely has force_logging enabled. You can check this using:

SELECT force_logging FROM v$database;

It can also be enabled at tablespace level:

SELECT tablespace_name, force_logging FROM dba_tablespaces;
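
If either of these shows force logging enabled and any standby coupled to the database can tolerate it, one option is to switch it off just for the load and re-enable it afterwards (NOLOGGING work is then not fully protected by redo, so coordinate this with whoever relies on the standby):

ALTER DATABASE NO FORCE LOGGING;
-- ... run the load ...
ALTER DATABASE FORCE LOGGING;

The same is possible per tablespace with ALTER TABLESPACE ... NO FORCE LOGGING.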

If your database is running with force_logging, or your tablespace has force_logging enabled, this will have an impact on the import speed. If that is not the case, check whether archivelog mode is enabled:

SELECT log_mode FROM v$database;

If so, it could be that the archive logs are not being written fast enough. In that case, increase the size of the online redo log files. If the database is not running in archivelog mode, it still has to write to the redo log files unless you are using direct-path inserts. In that case, check how quickly the redo can be written. Normally, 200 GB/h is very well possible when indexes are not playing a role.
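
As an illustration of growing the redo logs, larger groups can be added and the old ones dropped once they are no longer needed (group numbers, paths and sizes are placeholders):

ALTER DATABASE ADD LOGFILE GROUP 4 ('/u01/oradata/ORCL/redo04.log') SIZE 1G;
ALTER DATABASE ADD LOGFILE GROUP 5 ('/u01/oradata/ORCL/redo05.log') SIZE 1G;
-- check STATUS in v$log and drop the old groups only when they are INACTIVE
ALTER DATABASE DROP LOGFILE GROUP 1;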

The important thing is to find which link in the chain is causing the lack of performance. It could be the input, it could be the output. Here I focused on the output.

Licensed under: CC-BY-SA with attribution