Question

I've come up with a problem lately. We wanted to get a table of data from an Oracle DB in batches, where a batch is for example 4-5000. My previous "general" solution was to wrap the select into some kind of following thing:

SELECT * FROM (
    SELECT ROWNUM AS RN, INNERSELECT.* FROM (
        select THINGS1, THINGS2 from MYTABLE)
    )
INNERSELECT WHERE ROWNUM < 4001) WHERE RN >= 3001

Seemed to be slow as when the rownum borders are high (for example 4-500.000) then a select will take more than 10-20 seconds. And the time is increasing linearily as I increase the values. Of course I know why, so I wanted to find another solution to get data in batch but with a guarantied time frame. Well then I used this kind of select:

SELECT * FROM (
  SELECT INNERSELECT.*, ROW_NUMBER() OVER (ORDER BY INNERSELECT.ROWID) RN FROM 
    ( select THINGS1, THINGS2 from MYTABLE)
    INNERSELECT
) WHERE RN BETWEEN 3001 AND 4000;

Which is a bit slow (It took 3-4 seconds to get a batch of data for a test), but I could increase the batch size without noticable performance drop and it will be the same for high values. The current problem is that I select from several tables in multiple threads from the DB and the performance drops to hell if I just use 3 threads (and we are not talking about the 16 which is currently used for a project). The JDBC connections are pooled, the performance drop is on the server side. I think it is obvious that the select is eating the CPU time so I cannot do anything, but I hope that you all have some hints or tips how to optimize this.

Other problem is that the tool we use is not project specific, it has to be general as possible so I cannot use copy tables for example, etc.

Thanks for the answers in advance

Was it helpful?

Solution

Assuming that MYTABLE has a numeric primary key ID, one tactic would be to have each of the 16 threads execute this query:

SELECT THINGS1, THINGS2
FROM   MYTABLE
WHERE  MOD(ID,15) = :THREAD_NUMBER

Each thread would get a unique value for :THREAD_NUMBER from 0 to 15. That means, each thread will get (approximately) 1/16th of the rows.

OTHER TIPS

SELECT * FROM (
  SELECT INNERSELECT.*, ROW_NUMBER() OVER (ORDER BY INNERSELECT.ROWID) RN FROM 
    ( select THINGS1, THINGS2 from MYTABLE)
    INNERSELECT
) WHERE RN BETWEEN 3001 AND 4000;

( select THINGS1, THINGS2 from MYTABLE)

How many rows does this retrieve? If you can place some sort of restriction clause on this - it would avoid the system having to do a FULL TABLE SCAN or INDEX SCAN.

ORDER BY INNERSELECT.ROWID

Is it necessary to issue this sort? Especially since the sort criteria is a system generated value. A large sort can have huge impacts on the database.

The important thing to remember is in the extraction of records using ROWNUM in Oracle - the database applies ROWNUM to the records as one of the last things ~ but prior to an ORDER BY.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top