Question

What are some things I can do to improve query performance of an oracle query without creating indexes?

Here is the query I'm trying to run faster:

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a,
itempages b,
keygroupdata c
WHERE a.ItemType IN (112,115,189,241)
AND a.ItemNum = b.ItemNum
AND b.ItemNum = c.ItemNum
ORDER BY a.DateStored DESC

None of these columns are indexed and each of the tables contains millions of records. Needless to say, it takes over 3 and half minutes for the query to execute. This is a third party database in a production environment and I'm not allowed to create any indexes so any performance improvements would have to be made to the query itself.

Thanks!

Was it helpful?

Solution

First I'd rewrite the query to be ANSI standard:

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a
INNER JOIN itempages b ON b.ItemNum = a.ItemNum
INNER JOIN keygroupdata c ON c.ItemNum = b.ItemNum
WHERE a.ItemType IN (112,115,189,241)
ORDER BY a.DateStored DESC

This makes it easier to read and understand what is going on. It also helps you not make mistakes (i.e. Cross Joining)that might cause real big problems. Then I'd get the Explain plan to see what the DBMS is doing with that query. Is it trying to use some indexes? Is it joining the tables correctly?

Then I'd review the tables that I'm working with to see if there are any indexes that already exist that I could be using to make my query faster. Finally as everyone else has suggested I'd remove the Order By clause and just do that in code.

OTHER TIPS

Ask the third party to index its join columns, as they should have done in the first place! Without indexes, Oracle has nothing to go on other than brute force.

You may want to try creating a materialized view on any of those tables. You can then create an index on the materialized view that will help speed the query (which would then be querying the materialized view instead of the raw table).

Of course, if your underlying table is updated your view and indexes will need to be refreshed.

First, look at the execution plan. Does it accurately reflect the number of rows to be retrieved at each stage of the query execution? How selective is the predicate "a.ItemType IN (112,115,189,241)"? Does the execution plan show any use of temporary disk space for joins or sorts?

Actually, maybe you can modify the question to include the execution plan.

Also make sure you do not have hash joins disabled, which is sometimes the case in OLTP-tuned systems, as they are the most efficient way of equijoining bulk data in Oracle. They ought to show up in the execution plan.

You can try filtering on item type before you join your tables, as shown here.

If you are running on Oracle prior to 9i, this would sometimes give surprising benefits.

select 
  c.claimnumber,
  a.itemdate, 
  c.dtn,
  b.filepath
from 
  (
  select itemdate
  from items it
  where it.itemtype in(112,115,189,241)
  ) a
  itempages b,
  keygroupdata c
where a.itemnum = b.itemnum
  and b.itemnum = c.itemnum

You can also try adding the hints /+RULE/ or /+ORDERED/ to see what happens... again, particularly with older versions, these would sometimes give surprising results.

SELECT /*+RULE*/
  c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM
  items a,
  itempages b,
  keygroupdata c
WHERE a.ItemType IN (112,115,189,241)
  AND a.ItemNum = b.ItemNum
  AND b.ItemNum = c.ItemNum
ORDER BY a.DateStored DESC

If the query inputs are constant or predictable (the itemType IN (...)), then an alternative would be to run the query once or twice a day and store the results in a local table, with indices where appropriate.

You can then make the costly query 'offline' and have quicker/better results for an interactive query.

Is this a query that you run often? It seems like it would be in the DB owner's interest to create the indexes that you need to speed this query up. The 3.5 minutes you're spending running the query must have some impact on their production environment!

Also, have they been running update statistics on the tables? That might improve performance, as the join order is computed based on the statistics of the tables.

BTW, what are you allowed to do? Just read? If you can create temporary tables and put indexes on those, I might consider making temporary copies of the table, indexing those, and then do the index-assisted join with the temp copies.

I know this thread is very old, but for the search engines I still wanted to offer an alternative solution that will work on oracle and depending on the data might be much faster.

with a as (
  select 
    * 
  from 
    items 
  where 
    ItemType IN (112,115,189,241)
)
SELECT 
  c.ClaimNumber
  , a.ItemDate
  , c.DTN, b.FilePath
FROM 
  a,
  itempages b,
  keygroupdata c
WHERE 
  a.ItemNum = b.ItemNum
  AND b.ItemNum = c.ItemNum
ORDER BY 
  a.DateStored DESC

You can also try the /*+ MATERIALIZE */ hint in the WITH clause.

Actually I find oracle's old join syntax much easier to read than ansi sql ^^

Without indexing, that query is only going to get worse as the table size increases. With that said, try removing the order by clause and doing that sort on the client side.

Sometimes you can see a benefit by adding extra pathways for the optimizer to choose by adding what seems like redundant elements to the where clause.

For instance, you've got A.ItemNum = B.ItemNum AND B.ItemNum = C.ItemNum. Try adding A.ItemNum = C.ItemNum as well. I'm pretty sure, however, that the optimizer is intelligent enough to figure that out on its own - worth a try though.

Depending on the datatype of the ItemType column you may experience faster executing using the following if it's a varchar, Oracle will do implict conversions.

SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
FROM items a,
itempages b,
keygroupdata c
WHERE ((a.ItemType IN ('112','115','189','241'))
AND (a.ItemNum = b.ItemNum)
AND (b.ItemNum = c.ItemNum))
ORDER BY a.DateStored DESC

Are statistics gathered on these tables? If not, gathering statistics might change the execution plan, although it wouldn't necessarily be for the better.

Aside from that, look at the execution plan. You may see that it is joining the tables in a non-optimal order (e.g. it could be joining b and c before joining with a which has the filter condition).

You could use hints to try to affect the access paths, join order, or join method.

Update: Responding to the comment led me to this presentation, which might be helpful or at least interesting.

If you say that there are no indexes, then does this also mean that there are no primary or foreign keys defined? Obviously analysing the tables and gathering statistics are important but if metadata such as defining how the tables are supposed to be joined doesn't exist, then Oracle may well choose a poor execution path.

In that case using a hint such as /*+ ORDERED */ may well be the only option to make the optimizer reliably choose a good execution path. It may also be worth adding foreign keys and primary keys but define them as DISABLE and VALIDATE.

I guess the usefulness of this comment depends on how far the aversion to indexes goes so YMMV.

First create a view on this query and then generate a table from this view. Also create an index on date, make a job and schedule it in midnight time when the system is idle.

Well, since you can't create indexes, I'd make sure that statistics are all up-to-date then, I'd rewrite the query this way:

with a as (select /*+ MATERIALIZE */ ItemType, ItemNum, DateStored, ItemDate from items where ItemType in (112,115,189,241)) SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath FROM a, itempages b, keygroupdata c WHERE a.ItemNum = b.ItemNum AND b.ItemNum = c.ItemNum ORDER BY a.DateStored DESC

Remove the ORDER BY

perform the sort, after you pull the rows back to your application.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top