Question

Hi, I have an Ab Initio graph that, after some data manipulation, loads the data into a table. I am looking for some sort of validation component that ends the process (before the data is loaded into the table) if it finds duplicate rows.

The duplicate rows will each have a unique ID, but maybe I could ignore that column (or that part of the record).


Solution

Pass the flow to a Dedup component.

In the Dedup component, select the unique option for the output. This gives you all the unique records.

If there are duplicate records, they go through the dup port. You can collect those records in an intermediate file (for auditing purposes) and then process the graph as per your requirement.

If you want to abort the process right after all the duplicates have been found, you can do so using phasing.

Also, if you don't want any records inserted into the database when the input contains duplicates, you can pass only the key fields to Dedup. This makes the processing faster.
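To make the out/dup split concrete, here is a minimal Python sketch of the idea (not Ab Initio code). It assumes that "unique" means a record whose key occurs exactly once, and that every record with a repeated key goes to the dup side. The function name `split_unique_and_dups`, the field names, and the sample rows are all made up for illustration. The ID column is left out of the key, as the question suggests.

```python
from collections import Counter

def split_unique_and_dups(records, key_fields):
    """Return (unique, dups): records whose key occurs once vs. more than once."""
    def key_of(rec):
        # Only the listed fields take part in the comparison; the ID is ignored.
        return tuple(rec[f] for f in key_fields)

    counts = Counter(key_of(rec) for rec in records)
    unique = [rec for rec in records if counts[key_of(rec)] == 1]
    dups = [rec for rec in records if counts[key_of(rec)] > 1]
    return unique, dups

# Hypothetical sample data: ids 1 and 3 carry the same payload.
rows = [
    {"id": 1, "name": "ann", "city": "Rome"},
    {"id": 2, "name": "bob", "city": "Oslo"},
    {"id": 3, "name": "ann", "city": "Rome"},
]
unique_rows, dup_rows = split_unique_and_dups(rows, key_fields=("name", "city"))
```

Here `dup_rows` plays the role of the intermediate audit file, and `unique_rows` is what would continue downstream.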
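The combination of "check only the key" and "abort before the load phase" could be sketched roughly as follows in Python. This is only an analogy for the phasing idea, not how Ab Initio implements it. The names `find_duplicate_keys` and `validate_then_load` and the `load` callback are hypothetical.

```python
def find_duplicate_keys(records, key_fields):
    """Return the set of keys that occur more than once, looking only at key fields."""
    seen, duplicated = set(), set()
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            duplicated.add(key)
        seen.add(key)
    return duplicated

def validate_then_load(records, key_fields, load):
    # "Phase 1": validation only, nothing is written yet.
    duplicated = find_duplicate_keys(records, key_fields)
    if duplicated:
        raise ValueError(f"{len(duplicated)} duplicated key(s) found; load skipped")
    # "Phase 2": reached only when phase 1 finished without error.
    load(records)

# Hypothetical usage: a list stands in for the target table.
loaded = []
clean_rows = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
validate_then_load(clean_rows, ("name",), loaded.extend)
```

Because the check finishes completely before `load` is called, a failure leaves the target untouched.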

Other suggestions

If you want to keep processing while also handling the error scenario, one of the best ways to do it is to use the write_to_log() function within Ab Initio. Use this function judiciously, though, as it is memory-hungry.

Create two graphs. Graph 1: Add a Dedup Sorted component and pass the records through it. Collect the duplicate records in a file, then check the record count of that file in the end script of the graph. If the count is 0, call graph 2; otherwise, fail the graph.

Graph 2: Update the table with the output of the Dedup component.
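As a rough illustration of "log the problem and keep going", here is a Python sketch using the standard `logging` module in place of write_to_log(). It assumes the first occurrence of each key is forwarded and later occurrences are logged and dropped. That policy, the function name `forward_first_occurrences`, and the logger name are assumptions, not something stated above.

```python
import logging

logger = logging.getLogger("dedup_audit")  # hypothetical logger name

def forward_first_occurrences(records, key_fields):
    """Forward the first record per key; log every later duplicate and continue."""
    seen = set()
    kept = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            # Log instead of failing, so processing carries on.
            logger.warning("duplicate key %r in record %r", key, rec)
            continue
        seen.add(key)
        kept.append(rec)
    return kept

rows = [
    {"id": 1, "name": "ann"},
    {"id": 2, "name": "ann"},  # duplicate payload, different ID
    {"id": 3, "name": "bob"},
]
kept = forward_first_occurrences(rows, ("name",))
```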
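The end-script check could look something like the following Python sketch. It assumes the duplicate file holds one record per line, and it uses a callback named `run_graph_2` as a stand-in for however graph 2 is actually launched. Both are assumptions made for illustration.

```python
from pathlib import Path

def count_records(path):
    """Count non-empty lines, assuming one record per line."""
    with Path(path).open() as fh:
        return sum(1 for line in fh if line.strip())

def end_script(dup_file, run_graph_2):
    """Start graph 2 only when graph 1 collected no duplicates."""
    dup_count = count_records(dup_file)
    if dup_count > 0:
        # Failing here means the table is never touched.
        raise RuntimeError(
            f"{dup_count} duplicate record(s) in {dup_file}; graph 2 not started"
        )
    return run_graph_2()
```

A call such as `end_script("dups.dat", run_graph_2=start_load)` would then either run the load or fail; the file name and `start_load` are placeholders.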

You can handle this scenario in two ways:

  1. At database level

    If your table has constraints, simply use the following properties of the TABLE component:

    a. ignoreDuplicates
    b. reject-threshold
    
  2. At graph level

    Take a Dedup component, attach its dup port to a Reformat, and use the force_error function within the Reformat.

    Important note for #2: It is best to put the table component in a later phase than the Reformat component (the one containing force_error), so that if a failure occurs you can be sure the table data is not affected.
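For the database-level option (#1), the general idea of letting a constraint reject duplicates, up to a threshold, can be sketched with Python's built-in `sqlite3` module. This is an analogy only: it does not show how the TABLE component's ignoreDuplicates or reject-threshold properties behave internally. The `customers` table, its columns, and the function names are hypothetical.

```python
import sqlite3

def make_table():
    """Create an in-memory table with a UNIQUE constraint that excludes the ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, city TEXT, UNIQUE (name, city))"
    )
    return conn

def load_with_constraint(conn, rows, reject_threshold):
    """Insert rows; count constraint violations and abort past the threshold."""
    rejected = 0
    with conn:  # commits on success, rolls back if an exception escapes
        for name, city in rows:
            try:
                conn.execute(
                    "INSERT INTO customers (name, city) VALUES (?, ?)", (name, city)
                )
            except sqlite3.IntegrityError:
                rejected += 1
                if rejected > reject_threshold:
                    raise RuntimeError(
                        f"more than {reject_threshold} rejected row(s); load rolled back"
                    )
    return rejected

rows = [("ann", "Rome"), ("bob", "Oslo"), ("ann", "Rome")]
```

With a threshold of 1 the duplicate row is skipped and the rest is committed; with a threshold of 0 the same input raises and the whole load is rolled back.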

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow