Question

Don't we hate when evil coding comes back to haunt?

Some time ago I needed to generate a string concatenating some fields for some more processing later. I thought it would be a good idea to do if straight in the query, and used SO's help to get it. It worked. For a while...

The table got to big and now that trick (which I know is super inefficient) is not exactly viable. This what I'm doing:

with my_tabe as
(
    select 'user1' as usrid, '1' as prodcode from dual union
    select 'user1' as usrid, '2' as prodcode from dual union
    select 'user1' as usrid, '3' as prodcode from dual union
    select 'user2' as usrid, '2' as prodcode from dual union
    select 'user2' as usrid, '3' as prodcode from dual union
    select 'user2' as usrid, '4' as prodcode from dual
)
select
    usrid,
    ltrim(sys_connect_by_path(prodcode, '|'), '|') as prodcode
from 
    (
    select distinct prodcode, usrid,count(1)
    over (partition by usrid) as cnt,
    row_number() over (partition by usrid order by prodcode) as rn
    from my_tabe 
    )
where
    rn = cnt 
start with rn = 1
connect by prior rn + 1 = rn
and prior usrid = usrid

Which nicely yields:

USRID   PRODCODE
user1   1|2|3
user2   2|3|4

The evil thing in here, as you might have noticed, is the where rn = cnt, which if you remove you'll see all the work (I suppose) Oracle is really doing:

USRID   PRODCODE
user1   1
user1   1|2
user1   1|2|3
user2   2
user2   2|3
user2   2|3|4

I'm actually using this in many places where I have not so many records. It is quite fine up to about a half million records.

Recently I tried the same in a table with ~15Mi records, and well... no good.

Question: is there a way to do this more efficiently on Oracle or is it time bring it down to the actual code? This is not actual core issue, so I can still afford kludging, as long as it's fast... Worth mentioning there's a index for the column "usrid" I'm using.

cheers,

Was it helpful?

Solution

Tom Kyte provides a very convenient way to do that, and it works from Oracle 9i, with a custom aggregation function. It aggregates with commas, but you can modify the function body for pipes.

Starting with Oracle 11g, you can do:

SELECT LISTAGG(column, separator) WITHIN GROUP (ORDER BY field)
  FROM dataSource
 GROUP BY grouping columns

This web page provides additional methods including the one that you listed and which is indeed not really efficient.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top