Merging Datasets and Removing Duplicates

Posted: Thu Dec 06, 2012 10:25 am
by Harshal_Chaudhari
Hi,

I have two datasets with the following data:

1st file:
EIC009B
EIC011B
EIC013B
EIC014B
EIC028B
EIC032B

2nd file:
APC065B
APC069B
EIC032B
CCC100B
CUC200B
EIC013B
EIC014B
CUC941B
DIC001B
DIC002B

I want to merge both of these files into a third dataset, and it should not contain any duplicates. Please help me with how to achieve this in one step.

Posted: Thu Dec 06, 2012 12:44 pm
by William Collins
What is your key? And what are the RECFM and LRECL of the files?

Posted: Thu Dec 06, 2012 4:04 pm
by Harshal_Chaudhari
The File parameters are (LRECL=10,BLKSIZE=23470,RECFM=FB)

Posted: Thu Dec 06, 2012 5:13 pm
by William Collins
And.... the... key... is...?

Posted: Thu Dec 06, 2012 8:16 pm
by Harshal_Chaudhari
Which key are you talking about? I didn't get you :roll:

Posted: Thu Dec 06, 2012 8:17 pm
by Harshal_Chaudhari
By the way, I am not using any key.

Posted: Thu Dec 06, 2012 9:37 pm
by William Collins
OK, you want to "merge" two different datasets with different numbers of records and drop the duplicates.

If you have "no key" it is either "impossible" or you use the whole record as a key.

Makes me wonder now what you mean by "merge".

You MERGE files which are already in sequence on your key (even if it is the whole record). As the first few bytes of your records were obviously not in sequence, I presumed (and certainly wasn't going to spend time checking) that you had some other key that was in sequence.

So, it really sounds like you want to SORT, on the entire record, with SUM FIELDS=NONE? Concatenate your input datasets on SORTIN. Give it a whirl.
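
A minimal sketch of that suggestion, assuming the LRECL=10 FB files mentioned earlier in the thread. The step name, dataset names, and space allocation are hypothetical placeholders. The whole 10-byte record is used as the key, so the output comes out in ascending record order, not in input order.

```jcl
//DEDUP    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Both input files are concatenated on SORTIN (names are placeholders)
//SORTIN   DD DSN=YOUR.INPUT.FILE1,DISP=SHR
//         DD DSN=YOUR.INPUT.FILE2,DISP=SHR
//SORTOUT  DD DSN=YOUR.OUTPUT.FILE,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,5),RLSE)
//SYSIN    DD *
* Sort on the entire 10-byte record
  SORT FIELDS=(1,10,CH,A)
* Keep only one record for each distinct key value
  SUM FIELDS=NONE
/*
```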

Posted: Fri Dec 07, 2012 1:56 am
by DikDude
I suspect what will work is a concatenated sort that removes duplicates?

As for the key, I suspect it is all of the data shown (which may or may not be all of the data in the records).
"by the way i am not using any key"
If you intend to remove duplicates, you will . . .

Posted: Fri Dec 07, 2012 2:21 am
by NicC
Whatever - why is this in the JCL forum? You cannot do this with JCL - except use the JCL to run a program (probably a sort program) to do what you want. If you think it can be done with JCL then look in the JCL manual.

Posted: Fri Dec 07, 2012 2:31 am
by DikDude
Because several forums support sort dialog in their JCL area?

Posted: Fri Dec 07, 2012 9:08 am
by Harshal_Chaudhari
For your understanding, my third dataset after merging and removing duplicates will be:

APC069B
CCC100B
CUC200B
EIC032B
EIC009B
EIC011B
EIC013B
EIC014B
EIC028B
EIC032B
CUC941B
DIC001B
DIC002B

I want this kind of data in my output dataset.
I hope that makes it clear.
Thanks in advance.

Regards,
Harshal

Posted: Fri Dec 07, 2012 2:19 pm
by William Collins
Can you look at all the input records on both your input files and the expected output you have shown, and state EXACTLY how you get from one to the other?

Posted: Fri Feb 08, 2013 5:16 pm
by Srilakshmi
Hi Harsh,

Use a sort card as below:

//SYSIN DD *
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE,XSUM
/*

If you want the duplicate records written to another file, add a //SORTXSUM DD DSN=FILENAME statement after the output DD statement.

Try this :)

Posted: Fri Feb 08, 2013 6:16 pm
by William Collins
Did you look at the output order required? What if TS has DFSORT?

Posted: Fri Feb 08, 2013 6:41 pm
by Srilakshmi
Oops, the output from that would be in alphabetical order. Sorry! In that case, I think it would help to add a SEQNUM at the end of each record, remove the duplicates, and then sort the records back into their original order using the SEQNUM.
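
One possible two-step sketch of the SEQNUM idea, written with DFSORT-style control statements. The step names, dataset names, and the temporary dataset &&TEMP are hypothetical, and the 8-byte sequence number in positions 11-18 is an assumption based on the LRECL=10 stated earlier. EQUALS is specified so that the first occurrence of each duplicate is the one kept.

```jcl
//STEP1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.FILE1,DISP=SHR
//         DD DSN=YOUR.INPUT.FILE2,DISP=SHR
//SORTOUT  DD DSN=&&TEMP,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(TRK,(5,5))
//SYSIN    DD *
* Tag each record with its input position in columns 11-18
  INREC OVERLAY=(11:SEQNUM,8,ZD)
* Sort on bytes 1-10; EQUALS keeps the first of each duplicate
  SORT FIELDS=(1,10,CH,A),EQUALS
  SUM FIELDS=NONE
/*
//STEP2    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&TEMP,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=YOUR.OUTPUT.FILE,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,5),RLSE)
//SYSIN    DD *
* Restore the input order, then drop the sequence number
  SORT FIELDS=(11,8,ZD,A)
  OUTREC BUILD=(1,10)
/*
```

This yields the records in first-occurrence order across the concatenated inputs. That may still differ from the expected output posted earlier, since the rule behind that ordering was never stated in the thread.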