Question

I have a DB table in which each row has a randomly generated primary key, a message and a user. Each user has about 10-100 messages but there are 10k-50k users.

I write the messages daily for each user in one go. I want to throw away the old messages for each user before writing the new ones to keep the table as small as possible.

Right now I effectively do this:

delete from table where user='mk'

Then write all the messages for that user. I'm seeing a lot of contention because I have lots of threads doing this at the same time.

I do have an additional requirement to retain the most recent set of messages for each user.

I don't have direct access to the DB; I'm trying to guess at the problem based on some second-hand feedback. The reason I'm focusing on this scenario is that the delete query is showing a lot of wait time (again, to the best of my knowledge), plus it's a newly added bit of functionality.

Can anyone offer any advice?

Would it be better to:

select key from table where user='mk'

Then delete individual rows from there? I'm thinking that might lead to less brutal locking.


Solution

No, it is always better to perform a single SQL statement on a set of rows than a series of "row-by-row" (or what Tom Kyte calls "slow-by-slow") operations. When you say you are "seeing a lot of contention", what are you seeing exactly? An obvious question: is column USER indexed?

(Of course, the column name can't really be USER in an Oracle database, since it is a reserved word!)
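If someone with access can check while the job runs, one quick way to distinguish lock waits from other waits is to look at blocked sessions (a sketch using standard Oracle dynamic performance views):

select sid, event, blocking_session
  from v$session
 where blocking_session is not null;

Sessions waiting on "enq: TX - row lock contention" would point to locking; waits such as "db file scattered read" would instead suggest full table scans.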

EDIT: You have said that column USER is not indexed. This means that each delete will involve a full table scan of up to 50K*100 = 5 million rows (or at best 10K * 10 = 100,000 rows) to delete a mere 10-100 rows. Adding an index on USER may solve your problems.
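For example (a sketch, using MESSAGES and USERNAME as stand-ins for your real table and column names, since USER is a reserved word):

create index messages_username_ix on messages (username);

explain plan for
delete from messages where username = 'mk';

select * from table(dbms_xplan.display);

The plan should then show an index range scan on the new index rather than a full table scan of MESSAGES.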

OTHER TIPS

If you do this every day for every user, why not just delete every record from the table in a single statement? Or even:

truncate table whatever reuse storage
/

edit

The reason why I suggest this approach is that the process looks like a daily batch upload of user messages preceded by a clearing-out of the old messages. That is, the business rule seems to me to be "the table will hold only one day's worth of messages for any given user". If this process is done for every user then a single operation would be the most efficient.
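If that rule does hold, the nightly job collapses into a clear-and-reload along these lines (a sketch; STAGING_MESSAGES is an assumed staging table holding the incoming day's rows):

truncate table messages reuse storage;

insert /*+ append */ into messages
select * from staging_messages;

commit;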

However, if users do not get a fresh set of messages each day and there is a subsidiary rule which requires us to retain the most recent set of messages for each user then zapping the entire table would be wrong.

Are you sure you're seeing lock contention? It seems more likely that you're seeing disk contention due to too many concurrent (but unrelated) updates. The solution to that is simply to reduce the number of threads you're using: less disk contention will mean higher total throughput.

I think you need to define your requirements a bit more clearly...

For instance, if you know all of the users you want to write messages for, insert their IDs into a temp table, index it on ID, and batch-delete. The threads you are firing off then do two things: write the ID of the user to one temp table, and write the messages to another. When the threads have finished executing, the main thread should run:

DELETE FROM MESSAGES WHERE ID IN (SELECT TEMP_ID FROM TEMP_MEMBERS);

INSERT INTO MESSAGES SELECT * FROM TEMP_MESSAGES;

I'm not familiar with Oracle syntax, but that is the way I would approach it if the user's messages are all written in rapid succession.
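In Oracle, the two temp tables described above would typically be global temporary tables. A sketch of the full pattern (MESSAGES, ID and the temporary table names are assumed):

create global temporary table temp_members (temp_id varchar2(100))
on commit preserve rows;

create global temporary table temp_messages
on commit preserve rows
as select * from messages where 1 = 0;

-- threads fill temp_members and temp_messages, then the main thread runs:
delete from messages
 where id in (select temp_id from temp_members);

insert into messages
select * from temp_messages;

commit;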

Hope this helps

TALK TO YOUR DBA

He is there to help you. When we DBAs take access away from developers for something like this, it is assumed that we will provide support for that task. If your code is taking too long to complete and that time appears to be tied up in the database, your DBA will be able to look at exactly what is going on and offer suggestions, or possibly even solve the problem without you changing anything.

Just glancing over your problem statement, it doesn't appear you'd be looking at contention issues, but I don't know anything about your underlying structure.

Really, talk to your DBA. He will probably enjoy looking at something fun instead of planning the latest CPU deployment.

This might speed things up (I'll use MESSAGES and USERNAME as stand-ins for your real table and column names, since USER is a reserved word):

Create a lookup table:

create table rowid_table (row_id rowid, username varchar2(100));
create index rowid_table_ix1 on rowid_table (username);

Run a nightly job:

truncate table rowid_table;

insert /*+ append */ into rowid_table
select rowid row_id, username
  from messages;
commit;

exec dbms_stats.gather_table_stats('SCHEMAOWNER','ROWID_TABLE');

Then when deleting the records:

delete from messages
 where rowid in (select row_id
                   from rowid_table
                  where username = 'mk');

Your own suggestion seems very sensible. Locking in small batches has two advantages:

  • the transactions will be smaller
  • locking will be limited to only a few rows at a time

Locking in batches should be a big improvement.
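A minimal PL/SQL sketch of that batched delete (MESSAGES and USERNAME are assumed names, and the batch size of 1,000 is arbitrary):

begin
  loop
    delete from messages
     where username = 'mk'
       and rownum <= 1000;
    -- stop once the last batch deleted nothing
    exit when sql%rowcount = 0;
    commit;
  end loop;
  commit;
end;
/

Each iteration locks at most 1,000 rows and releases them at the commit, so concurrent sessions are blocked only briefly.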
