Eliminate duplicate records in file
Moderators: Frank Yaeger, Moderator Group
-
- Member
- Posts: 16
- Joined: Fri Apr 17, 2009 5:10 pm
Hi,
I have eliminated duplicate records in a file using SUM FIELDS=NONE in SORT.
//SYSIN DD *
  SORT FIELDS=(1,23,CH,A)
  SUM FIELDS=NONE
But I don't want the records to be put into ascending order before the duplicates are eliminated. Is there any other way to remove duplicates without sorting the records in ascending or descending order?
E.g.,
Input records:
MOHANK 123456789012345 RAJES
ANTAKS 123456789012345 MIRAJ
MOHANK 123456789012345 NANAK
Output records (what I get now):
ANTAKS 123456789012345 MIRAJ
MOHANK 123456789012345 RAJES
I want MOHANK in first place, followed by ANTAKS. Please help.
Thanks,
BanuPriya B
-
- Moderator
- Posts: 1625
- Joined: Sat Aug 09, 2008 9:02 am
- Location: Mumbai, India
You probably mean, "I would like to keep the order of the input records when they are copied to the output file". If so, note that eliminating duplicates, whether with SUM FIELDS=NONE or with SELECT, requires sorting the records so that records with the same key are next to each other.
If you need to keep the records in their original order, then you can use the trick of adding a sequence number before you eliminate the duplicates, and then sorting on that sequence number to get the remaining records back in their original order.
Regards,
Anuj
-
- Moderator
- Posts: 1625
- Joined: Sat Aug 09, 2008 9:02 am
- Location: Mumbai, India
Another suggestion that comes to mind is to use EQUALS. EQUALS tells the sort to preserve the original order of records that have equal sort keys. Your site default is probably EQUALS. DFSORT is shipped with NOEQUALS as the default, but a site can change that to EQUALS.
If you're using DFSORT (and I'm not sure you are), you can see the value for EQUALS in message ICE128I ... it will have EQUALS=N or EQUALS=Y.
To turn off EQUALS and see what you get, you can try using:
Code: Select all
  OPTION NOEQUALS
Regards,
Anuj
- Frank Yaeger
- Moderator
- Posts: 812
- Joined: Sat Feb 18, 2006 5:45 am
- Location: San Jose, CA
BanuPriya,
Note that EQUALS will keep the duplicate records in their original order, but will not keep all of the records in their original order. For that, you need two passes over the data.
Here's a DFSORT/ICETOOL job that will do what you asked for. I assumed your input file has RECFM=FB and LRECL=80, but the job can be changed appropriately for other attributes.
Code: Select all
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=... input file (FB/80)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=... output file (FB/80)
//TOOLIN DD *
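  SORT FROM(IN) TO(T1) USING(CTL1)
  SORT FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
* Pass 1: add seqnum in 81-88, sort on key, drop duplicates
  INREC OVERLAY=(81:SEQNUM,8,ZD)
  SORT FIELDS=(1,23,CH,A),EQUALS
  SUM FIELDS=NONE
/*
//CTL2CNTL DD *
* Pass 2: sort back into input order, remove the seqnum
  SORT FIELDS=(81,8,ZD,A)
  OUTREC BUILD=(1,80)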
/*
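If your input is not FB/80, only the positions that depend on the record length should need to change. As a rough sketch, assuming a hypothetical FB input with LRECL=100 and the key still in positions 1-23, the two control data sets might look like this (the 100 and 101 values are illustrative assumptions, not taken from the job above):
Code: Select all
//CTL1CNTL DD *
* Assumed LRECL=100: sequence number goes in 101-108
  INREC OVERLAY=(101:SEQNUM,8,ZD)
  SORT FIELDS=(1,23,CH,A),EQUALS
  SUM FIELDS=NONE
/*
//CTL2CNTL DD *
* Restore input order, then cut back to the assumed 100 bytes
  SORT FIELDS=(101,8,ZD,A)
  OUTREC BUILD=(1,100)
/*
The general idea is that the sequence number sits just past the end of the original record (LRECL+1), and the final BUILD keeps positions 1 through LRECL so the sequence number is dropped again.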
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort
- Frank Yaeger
- Moderator
- Posts: 812
- Joined: Sat Feb 18, 2006 5:45 am
- Location: San Jose, CA
I don't know what you're asking for. If you have DFSORT, then you have ICETOOL. ICETOOL has been part of DFSORT since 1991! You said ICETOOL worked, so what is the problem?
If you don't want to use ICETOOL for some reason, then you can just use two DFSORT steps instead.
-
- Member
- Posts: 25
- Joined: Tue Apr 28, 2009 10:53 pm
- Location: USA
banupriyab wrote: ...dont we have any other means with normal SORT/SYNCSORT, because none of the jobs in our system uses ICETOOL.
SyncSort ships with ICETOOL as an alias to SYNCTOOL. If you prefer, as Frank suggested, you can code the following SyncSort job:
Code: Select all
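//STEP1 EXEC PGM=SORT
//* input.file/output.file are placeholders; add your own DISP etc.
//SORTIN DD DSN=input.file
//SORTOUT DD DSN=&&TEMP,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//SYSOUT DD SYSOUT=*
//SYSIN DD *
* Add seqnum in 81-88, sort on key, drop duplicates
  INREC OVERLAY=(81:SEQNUM,8,ZD)
  SORT FIELDS=(1,23,CH,A),EQUALS
  SUM FIELDS=NONE
/*
//STEP2 EXEC PGM=SORT
//SORTIN DD DSN=&&TEMP,DISP=(OLD,DELETE)
//SORTOUT DD DSN=output.file
//SYSOUT DD SYSOUT=*
//SYSIN DD *
* Sort back into input order, remove the seqnum
  SORT FIELDS=(81,8,ZD,A)
  OUTREC BUILD=(1,80)
/*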