Question

I have two tables with a many-to-one relationship. (Oracle)

**Table: PARENT**
Field: A (PK)
Field: B
Field: C1
Field: C2
Field: C3
Field: C4
Field: C5

**Table CHILD**
Field: A (PK) (FK to PARENT.A)
Field: D (PK)
Field: E

Records inserted into the child and parent tables are inserted at the same time.

What I am wondering is what is the most efficient way to do this.

Currently there is one stored procedure that is called by the calling application for this whole insert step. The stored procedure currently has the following signiture:

Field: A
Field: B
Field: C (dilimited string)
Field: D (dilimited string)
Field: E (dilimited string)

The procedure loops through C from the input and stores each of the values in an array. Then uses that array along with A and B from the input to insert into table PARENT.

Then the uses A from the input and loops through D and E from the input and makes an insert into table CHILD for each item in the dilimited strings.

This will be called up to 3 million times per day. It needs to be as efficient as possible.

How much efficency is lost when making multiple SP calls rather than just one?

All the looping to take care of the dilimited strings seem like a lot of work!

I was thinking that the calling application could make seperate SP calls for each of the entries into CHILD. However, how can I ensure that somehow the insert into CHILD wont happen before the insert into PARENT... and that would be more stored procedure calls. (many times there is no child records to insert, when there is there is generally less than 10, but could be as many as 50)

I also open to another way to get the information in dilimted string C.

Is there something more efficent than a while loop to get the information from dilimited strings?

I didn't write the SP I was asked to make a small modification and make it more efficent if possible.

Any ideas?

Notes:

I simplified the tables, there are actually 10 items in dilitied string C not 5, there are also two more dilited strings just like C that get inserted into table PARENT. The tables also have several more fields than shown

Records are deleted after 30 days.

Was it helpful?

Solution

There are a couple of things here....

First, if you're looping through a delimited string to put similar items in similar columns, you likely need to refactor your tables to be more normalized. For example, if C is a delimited list of phone numbers, and columns C1 - C5 are phone1 - phone5, you should likely have a separate child phone table. This is dependant on use case, but strikes me as a potential future problem (ie knowledge of the delimiter). If the delimited string has dis-similar data in it (phone number, city, name, etc) - please make separate entry paramets for each separate data element. This is an even bigger potential problem (because if the order of the entries is important, you're screwed, period).

You're right, looping over delimited strings is a lot of work. There's not necessarily a better way to do it, unless possibly if your RDBMS has some sort of built-in split function or something (or you can call an external function). I'd rather avoid it if I could, and call a child stored procedure. It's going to have to be per-child, but that's actually better anyways - not necessarily performance wise, but for conceptualizing it, and future maintanence.

How to prevent a child row from being inserted without a parent? Use a foreign key constraint. If the constraint is violated, it's the caller's fault, and not the DB's.

Okay, some of these require some major changes to the SP, and some require changes to the underlying table structure (which, if it was initially set up correctly, should be mostly transparent to the users). But this is what I would have attempted...

Oh, and pease tell me this is all running under commitment control...

OTHER TIPS

The most efficient way to handle this is likely to use user-defined data types to past the lists of values to the stored procedure as arrays, rather than as delimited strings. Something like this:

CREATE TYPE r_child IS OBJECT
   (a NUMBER, d VARCHAR2(20), e VARCHAR2(20));

CREATE TYPE nt_child AS TABLE OF r_child;

CREATE TYPE nt_c AS TABLE OF VARCHAR2(20);

CREATE PROCEDURE insert_data(
   p_a NUMBER, p_b VARCHAR2, p_c nt_c,
   p_child nt_child
) AS
   v_parent   parent%ROWTYPE;
   i          NUMBER;
BEGIN
   v_parent.a   := p_a;
   v_parent.b   := p_a;

   FOR i IN p_c.FIRST .. p_c.LAST LOOP
      CASE i
         WHEN 1 THEN
            v_parent.c1   := p_c(i);
         WHEN 2 THEN
            v_parent.c2   := p_c(i);
         WHEN 3 THEN
            v_parent.c3   := p_c(i);
         WHEN 4 THEN
            v_parent.c4   := p_c(i);
         WHEN 5 THEN
            v_parent.c5   := p_c(i);
      END CASE;
   END LOOP;

   INSERT INTO parent(
                         a,
                         b,
                         c1,
                         c2,
                         c3,
                         c4,
                         c5
              )
   VALUES     v_parent;

   FORALL i IN p_child.FIRST .. p_child.LAST
      INSERT INTO child(
                           a, d, e
                 )
      VALUES     (
                     p_a, p_child(i).d, p_child(i).e
                 );
END insert_data;

Wrapping the inserts in a package won't add significant execution time if the package is designed well. That includes using bulk inserts (as shown where I used forall) and using data structures that the database can read naturally, rather than encoding and decoding the data (as you're currently doing with the delimited values).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top