DFSORT selecting duplicates when looking for only the first duplicate
Question
The JCL below should select the first occurrence of each duplicated record, keeping the records in their original order because of "OPTION COPY". It should include only records with 'NETWORK' at byte 4 length 7 or '.' at byte 59 length 1, and exclude records with 'TOTAL' at byte 3 length 5 or 'GRAND' at byte 3 length 5.
Instead, the output shows every record with 'NETWORK' at byte 4 length 7, duplicates included.
//SORT EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DISP=SHR,DSN=INPUT.FILE
//T1 DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(TRK,(5,5))
//OUT DD SYSOUT=*
//OUTFIL DD SYSOUT=*
//TOOLIN DD *
* DROP EVERYTHING WE DON'T WANT
SELECT FROM(IN) TO(OUT) ON(1,134,CH) USING(CTL1) FIRST
/*
//CTL1CNTL DD *
OPTION COPY
INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
59,1,CH,EQ,C'.'),AND,
(3,5,CH,NE,C'TOTAL',AND,
3,5,CH,NE,C'GRAND'))
/*
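For readers less used to INCLUDE COND syntax, the condition above can be modelled as a small predicate. This is only a Python sketch of the logic, not something DFSORT runs; the helper names `field` and `include` are made up for illustration, and records are assumed to be plain fixed-length text.

```python
def field(record: str, start: int, length: int) -> str:
    """Return the field at a 1-based start position, as DFSORT counts bytes."""
    return record[start - 1:start - 1 + length]


def include(record: str) -> bool:
    """Model of the INCLUDE COND in CTL1 (hypothetical helper)."""
    # (4,7,CH,EQ,C'NETWORK',OR,59,1,CH,EQ,C'.')
    wanted = field(record, 4, 7) == "NETWORK" or field(record, 59, 1) == "."
    # (3,5,CH,NE,C'TOTAL',AND,3,5,CH,NE,C'GRAND')
    not_summary = field(record, 3, 5) != "TOTAL" and field(record, 3, 5) != "GRAND"
    return wanted and not_summary


network_line = "   NETWORK".ljust(133)
detail_line = (" " * 58 + ".").ljust(133)
total_line = ("  TOTAL".ljust(58) + ".").ljust(133)
```

Here `network_line` and `detail_line` pass the filter, while `total_line` is dropped even though it has '.' at byte 59.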
If I change the conditions to only 'NETWORK' at byte 4 length 7, it shows just 1 record, which is what I expect. The input is the same each time.
//CTL1CNTL DD *
OPTION COPY
INCLUDE COND=((4,7,CH,EQ,C'NETWORK'))
/*
I can't figure out why adding the other conditions causes duplicates to appear in the output.
Two of the comments suggested that the issue is with the INCLUDE conditions.
I tried the job below. The first SELECT does what I was doing originally, and the second SELECT has no INCLUDE conditions because they have already been applied by the first. There are still duplicate records with 'NETWORK' at byte 4 length 7. The rest of each NETWORK record is exactly the same, so there should be only one.
//TOOLIN DD *
* DROP EVERYTHING WE DON'T WANT
SELECT FROM(IN) TO(T1) ON(1,133,CH) USING(CTL1) FIRST
SELECT FROM(T1) TO(OUT) ON(1,133,CH) USING(CTL2) FIRST
/*
//CTL1CNTL DD *
OPTION COPY
INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
59,1,CH,EQ,C'.'),AND,
(3,5,CH,NE,C'TOTAL',AND,
3,5,CH,NE,C'GRAND'))
/*
//CTL2CNTL DD *
OPTION COPY
/*
Solution
The SELECT operator with FIRST expects its input to be sorted on the ON fields. ICETOOL does that sort itself before checking for duplicates, provided you don't specify "OPTION COPY" in the control statements.
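To see why skipping the sort matters, here is a rough Python model. It assumes duplicates are only recognised when equal records sit next to each other, which is what "expects the input to be sorted" implies; it is a simplification for illustration and not a description of ICETOOL internals. The record values and the name `first_of_adjacent_runs` are invented.

```python
from itertools import groupby


def first_of_adjacent_runs(records, key):
    """Keep the first record of each run of adjacent records with equal keys."""
    return [next(group) for _, group in groupby(records, key=key)]


def whole_record(record):
    """Stand-in for ON(1,133,CH): compare the entire record."""
    return record


# Invented sample: identical NETWORK lines separated by other included lines.
records = ["NETWORK A", "detail 1.", "NETWORK A", "detail 1."]

# With OPTION COPY the input order is kept, so equal records are never adjacent.
copied = first_of_adjacent_runs(records, whole_record)

# Without OPTION COPY the records are sorted on the ON fields first.
sorted_first = first_of_adjacent_runs(sorted(records, key=whole_record), whole_record)
```

In this model `copied` still holds all four records, while `sorted_first` holds one of each. It would also be consistent with the NETWORK-only INCLUDE giving a single record: once nothing else is included, the identical NETWORK records end up next to each other.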
I wanted to remove the duplicates and still keep the output in input order.
The job below does that by adding a sequence number to each record, so that the temporary file can be sorted back into input order after the duplicates are removed.
//TOOLIN DD *
* SELECT REMOVING THE DUPLICATES AND ONLY INCLUDING THE FIELDS WANTED
* TO TEMP DD T1
SELECT FROM(IN) TO(T1) ON(1,133,CH) USING(CTL1) FIRST
* COPY FROM TEMP DD T1 TO DD OUT USING CTL2 STATEMENTS
COPY FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
INCLUDE COND=((4,7,CH,EQ,C'NETWORK',OR,
59,1,CH,EQ,C'.'),AND,
(3,5,CH,NE,C'TOTAL',AND,
3,5,CH,NE,C'GRAND'))
* ADD AN 8-DIGIT ZONED DECIMAL (ZD) SEQUENCE NUMBER AT THE END OF
* EACH RECORD
INREC OVERLAY=(134:SEQNUM,8,ZD)
/*
//CTL2CNTL DD *
* SORT ON THE SEQUENCE NUMBER WHICH PUTS THE RECORDS BACK IN INPUT
* ORDER
SORT FIELDS=(134,8,CH,A)
/*
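The same idea, sketched in Python in case the three steps are easier to follow that way: number the records, sort on the key and keep the first of each group, then sort back on the number. This is an illustration only. The function name and the `key_len` parameter are hypothetical, and the INCLUDE filtering from CTL1 is left out to keep it short.

```python
def dedupe_keep_input_order(records, key_len=133):
    """Drop duplicate records (compared on the first key_len characters)
    and return the survivors in their original input order."""
    # Step 1: like INREC OVERLAY=(134:SEQNUM,8,ZD), tag each record
    # with its position in the input.
    numbered = list(enumerate(records, start=1))

    # Step 2: like SELECT ... ON(1,133,CH) FIRST, sort on the key and keep
    # the first record of each key. Python's sort is stable, so the earliest
    # input record of each group comes first.
    numbered.sort(key=lambda pair: pair[1][:key_len])
    kept = []
    previous_key = None
    for seq, rec in numbered:
        current_key = rec[:key_len]
        if current_key != previous_key:
            kept.append((seq, rec))
            previous_key = current_key

    # Step 3: like SORT FIELDS=(134,8,CH,A), put the survivors back
    # in input order using the sequence number.
    kept.sort(key=lambda pair: pair[0])
    return [rec for _, rec in kept]


result = dedupe_keep_input_order(["B", "A", "B", "C", "A"])
```

With the sample input, `result` keeps one "B", one "A" and one "C", in the order they first appeared.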