Compare 2 files by writing all matching records to output

nachi
Member
Posts: 22
Joined: Wed Mar 25, 2009 5:16 pm

Post by nachi » Wed Mar 25, 2009 5:29 pm

I need to compare 2 files having data as follows
File 1
AAA1000
BBB2000
AAA1000

File 2
AAA1000
BBB2000
AAA1000

Both the files are FB with LRECL 80

Toolin and Control cards:

//TOOLIN DD *
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T2) USING(CTL2)
SPLICE FROM(CONCT) TO(OUT12) ON(1,9,CH) WITH(82,1) -
WITHALL USING(CTL3) KEEPNODUPS

//CTL1CNTL DD *
OUTREC FIELDS=(1,80,81:C'11')
/*
//CTL2CNTL DD *
OUTREC FIELDS=(1,80,81:C'22')
/*
//CTL3CNTL DD *
OUTFIL FNAMES=OUT12,INCLUDE=(81,2,CH,EQ,C'12'),OUTREC=(1,80)
OUTFIL FNAMES=OUT1,INCLUDE=(81,2,CH,EQ,C'11'),OUTREC=(1,80)
OUTFIL FNAMES=OUT2,INCLUDE=(81,2,CH,EQ,C'22'),OUTREC=(1,80)

I am getting output as
OUT12 - 4 records
OUT1 - 1 record
OUT2 - no records

I need the output to be OUT12 - 3 records and OUT1 and OUT2 as 0

How can this be done? Need help!

Thanks,
Nachi

Post by nachi » Wed Mar 25, 2009 5:34 pm

Correction:

OUT12 - 3 records
OUT1 - 1 record
OUT2 - no records

Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA

Post by Frank Yaeger » Wed Mar 25, 2009 9:36 pm

You need to explain what you want to do more clearly.

Are you comparing record 1 of file1 to record 1 of file2, record 2 of file1 to record 2 of file2, etc or are you comparing the records in the two files in some other way?

Which fields do you want to compare the records on (the entire record? positions 1-3? positions 4-7? or what?).
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

Post by nachi » Thu Mar 26, 2009 12:06 am

Frank,

I need to check whether input File1 and input File2 contain the same data (the records can be in a different order).

To be clearer, I have pasted my sample JCL:
//STEP050N EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD *
NACHI
FRANK
NACHI
//IN2 DD *
FRANK
NACHI
NACHI
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
// DISP=(MOD,PASS),DCB=(LRECL=357)
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
// DISP=(MOD,PASS),DCB=(LRECL=357)
//CONCT DD DSN=*.T1,VOL=REF=*.T1,DISP=(OLD,DELETE)
// DD DSN=*.T2,VOL=REF=*.T2,DISP=(OLD,DELETE)
//OUT1 DD DSN=<OUTFILE1>,
// DISP=(NEW,CATLG,DELETE),DCB=(LRECL=80),
// UNIT=SYSDA,SPACE=(CYL,(100,10),RLSE)
//OUT2 DD DSN=<OUTFILE2>,
// DISP=(NEW,CATLG,DELETE),DCB=(LRECL=80),
// UNIT=SYSDA,SPACE=(CYL,(100,10),RLSE)
//OUT12 DD DSN=<OUTFILE3>,
// DISP=(NEW,CATLG,DELETE),DCB=(LRECL=80),
// UNIT=SYSDA,SPACE=(CYL,(100,10),RLSE)
//TOOLIN DD *
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T2) USING(CTL2)
SPLICE FROM(CONCT) TO(OUT12) ON(1,10,CH) -
WITHALL WITH(12,1) USING(CTL3) KEEPNODUPS
/*
//CTL1CNTL DD *
INREC FIELDS=(1,10,11:C'11')
/*
//CTL2CNTL DD *
INREC FIELDS=(1,10,11:C'22')
/*
//CTL3CNTL DD *
OUTFIL FNAMES=OUT12,OUTREC=(1,80)
OUTFIL FNAMES=OUT1,INCLUDE=(11,2,CH,EQ,C'11'),OUTREC=(1,80)
OUTFIL FNAMES=OUT2,INCLUDE=(11,2,CH,EQ,C'22'),OUTREC=(1,80)
/*


In short, I want the output OUT12 to contain
NACHI 12
FRANK 12
NACHI 12

whereas I am getting
FRANK 12
NACHI 11
NACHI 12
NACHI 12

and I want OUT1 and OUT2 to contain 0 records, whereas I am getting one record in OUT1 and no records in OUT2.
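
For readers following along, the result above can be reproduced with a small Python model of the SPLICE step. This is only a sketch, not DFSORT syntax: the function name `splice_withall_keepnodups` is made up for illustration, and it assumes that records with equal keys keep their input order. Because the only ON field is the key, the first NACHI record from IN1 becomes the base for every other NACHI record, including the second NACHI record from IN1, which is where the stray 11 comes from.

```python
from itertools import groupby


def splice_withall_keepnodups(records):
    """Rough model of SPLICE ON(key) WITHALL WITH(second id byte) KEEPNODUPS.

    records: list of (key, id) tuples in concatenation order; id is '11' or '22'.
    """
    # SPLICE sorts on the ON field. Python's sort is stable, so records with
    # equal keys keep their input order (an assumption of this model).
    ordered = sorted(records, key=lambda r: r[0])
    out = []
    for key, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        base_id = group[0][1]
        if len(group) == 1:
            # KEEPNODUPS: a record with no duplicate is kept unchanged.
            out.append((key, base_id))
            continue
        # WITHALL: the first record is spliced with each later duplicate.
        for _, overlay_id in group[1:]:
            # Only the second id byte is taken from the overlay record.
            out.append((key, base_id[0] + overlay_id[1]))
    return out


in1 = [("NACHI", "11"), ("FRANK", "11"), ("NACHI", "11")]
in2 = [("FRANK", "22"), ("NACHI", "22"), ("NACHI", "22")]
result = splice_withall_keepnodups(in1 + in2)
for key, marker in result:
    print(key, marker)
```

The model yields FRANK 12, NACHI 11, NACHI 12, NACHI 12, the same four records reported above.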

Please let me know whether the info provided is sufficient enough to help me!

Thanks,
Nachi

Post by Frank Yaeger » Thu Mar 26, 2009 12:35 am

nachi wrote: "Please let me know whether the info provided is sufficient enough to help me!"
No, you basically repeated what you said the first time which didn't explain what you want. And you didn't answer the questions I asked in my first post.

You show the following for input:

//IN1 DD *
NACHI
FRANK
NACHI
//IN2 DD *
FRANK
NACHI
NACHI

If I go line by line in file1 and file2, the first and second lines are different and the third line is the same. Yet you say the expected output is:

NACHI 12
FRANK 12
NACHI 12

So how am I supposed to determine what you want from that? You need to explain the "rules" clearly. If you aren't comparing line by line, then explain what exactly you are comparing.

It doesn't help to show me YOUR JCL. What you need to do is explain to me what you're trying to do and answer my questions in my previous post.

Post by nachi » Thu Mar 26, 2009 12:52 am

Frank,

I am trying to check whether 2 files contain the same data. If they do, I want all the matching records, duplicates included, written to an output file. If there are 3 records in both input files, I need all 3 records in the output.

Post 1 reply:
1. Are you comparing record 1 of file1 to record 1 of file2, record 2 of file1 to record 2 of file2, etc or are you comparing the records in the two files in some other way?

No. I want the records to be sorted in both the files without eliminating the duplicates. And then compare record1 of file1 to record 1 of file 2.

2. Which fields do you want to compare the records on (the entire record? positions 1-3? positions 4-7? or what?).
I want to compare on positions 1 through 10.

Please let me know if you need more info

Thanks,

Post by Frank Yaeger » Thu Mar 26, 2009 1:44 am

nachi wrote: "I want the records to be sorted in both the files without eliminating the duplicates. And then compare record1 of file1 to record 1 of file 2."
Well, that's certainly an important piece of the puzzle that you didn't mention before. But I suspect your example/rules don't really cover all of the possibilities.

In your input examples, you show the exact same number of duplicates for each key in each file (e.g. two instances of AAA1000 and one instance of BBB2000 in both files for your first example, and two instances of NACHI and one instance of FRANK in both files for your second example). Is that always the case? Or can you have different numbers of duplicates for a key in each file, and/or keys that appear in only one file? For example, could you have something like this?

File1
AAA1000   1
AAA1000   2
CCC3000   3
CCC3000   4
AAA1000   5
BBB2000   6

File2
AAA1000   7
CCC3000   8
AAA1000   9
DDD2000   10
DDD2000   11
CCC3000   12
CCC3000   13

Note that there are 3 instances of AAA1000 in file1 and 2 in file2, one instance of BBB2000 in file1 and none in file2, two instances of CCC3000 in file1 and 3 in file2, and 2 instances of DDD2000 in file2 and none in file1.

If I sort each file by 1,10, I would get:

File1
AAA1000   1
AAA1000   2
AAA1000   5
BBB2000   6
CCC3000   3
CCC3000   4

File2
AAA1000   7
AAA1000   9
CCC3000   8
CCC3000   12
CCC3000   13
DDD2000   10
DDD2000   11

Now how would I apply your rule of "I want the records to be sorted in both the files without eliminating the duplicates. And then compare record1 of file1 to record 1 of file 2" to these records, and what would you expect for output and why?

Post by nachi » Thu Mar 26, 2009 2:28 am

Frank,

Yes, that is correct: the input files can have different numbers of duplicates and non-duplicates.

Sorted output from your example:
File1
AAA1000 1
AAA1000 2
AAA1000 5
BBB2000 6
CCC3000 3
CCC3000 4

File2
AAA1000 7
AAA1000 9
CCC3000 8
CCC3000 12
CCC3000 13
DDD2000 10
DDD2000 11

This scenario is possible, and in this case I want the output to look something like this:

OUT12 (match between both files including duplicates)
AAA1000
AAA1000
CCC3000
CCC3000
OUT1 (records present in file 1 that do not match file 2)
AAA1000
BBB2000
OUT2 (records present in file 2 that do not match file 1)
CCC3000
DDD2000
DDD2000

The reason I need it this way is that the matching-record file (OUT12) is fed as input to another job, which sums up money amounts and performs many other functions. In theory there should not be any differences between the input files; if there are any, they should be captured and investigated to determine whether they are valid differences.

Did I give you a clear picture? Please let me know if you need more info.

Thanks,
Nachi

Post by Frank Yaeger » Thu Mar 26, 2009 4:07 am

Ok, now it's clear. Here's a DFSORT/ICETOOL job that will do what you asked for:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/80)
//IN2 DD DSN=...  input file2 (FB/80)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5),RLSE),
// DISP=(MOD,PASS)
//OUT12 DD DSN=...  output file12 (FB/10)
//OUT1  DD DSN=...  output file1  (FB/10)
//OUT2  DD DSN=...  output file2  (FB/10)
//TOOLIN DD *
SORT FROM(IN1) TO(T1) USING(CTL1)
SORT FROM(IN2) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(OUT12) ON(1,10,CH) ON(13,8,ZD) -
  KEEPNODUPS WITHALL WITH(11,1) USING(CTL3)
/*
//CTL1CNTL DD *
  SORT FIELDS=(1,10,CH,A)
  OUTREC BUILD=(1,10,11:C'BB',13:SEQNUM,8,ZD,RESTART=(1,10))
/*
//CTL2CNTL DD *
  SORT FIELDS=(1,10,CH,A)
  OUTREC BUILD=(1,10,11:C'VV',13:SEQNUM,8,ZD,RESTART=(1,10))
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=OUT12,INCLUDE=(11,2,CH,EQ,C'VB'),BUILD=(1,10)
  OUTFIL FNAMES=OUT1,INCLUDE=(11,2,CH,EQ,C'BB'),BUILD=(1,10)
  OUTFIL FNAMES=OUT2,INCLUDE=(11,2,CH,EQ,C'VV'),BUILD=(1,10)
/*
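
The matching logic of this job can be sketched in Python as a rough model (not DFSORT syntax). The helper names `tag` and `compare` are hypothetical, records are reduced to their key, and the per-key counter stands in for SEQNUM with RESTART=(1,10). Pairing on (key, sequence number) means the n-th AAA1000 in file1 can only match the n-th AAA1000 in file2, so surplus duplicates fall out as unmatched.

```python
from collections import defaultdict


def tag(keys):
    """Sort the keys and number each duplicate within its key: 1, 2, 3, ..."""
    counts = defaultdict(int)
    tagged = set()
    for key in sorted(keys):
        counts[key] += 1                 # restart at 1 for every new key
        tagged.add((key, counts[key]))
    return tagged


def compare(file1, file2):
    """Return (out12, out1, out2) as sorted lists of keys."""
    t1, t2 = tag(file1), tag(file2)
    out12 = [k for k, _ in sorted(t1 & t2)]   # same key and same sequence number
    out1 = [k for k, _ in sorted(t1 - t2)]    # only in file1
    out2 = [k for k, _ in sorted(t2 - t1)]    # only in file2
    return out12, out1, out2


# Keys from the File1/File2 example earlier in this thread.
file1 = ["AAA1000", "AAA1000", "CCC3000", "CCC3000", "AAA1000", "BBB2000"]
file2 = ["AAA1000", "CCC3000", "AAA1000", "DDD2000", "DDD2000", "CCC3000", "CCC3000"]

out12, out1, out2 = compare(file1, file2)
print("OUT12:", out12)
print("OUT1: ", out1)
print("OUT2: ", out2)
```

With the sample keys from the earlier example, this gives two AAA1000 and two CCC3000 in OUT12, one AAA1000 and one BBB2000 in OUT1, and one CCC3000 and two DDD2000 in OUT2, which is the output requested above.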

Post by nachi » Thu Mar 26, 2009 7:30 pm

Frank,

That worked! Thanks a lot for your help!

RESTART is the key here! If you don't have restart, its the same way what I did earlier.

Thanks,
Nachi

Post by Frank Yaeger » Thu Mar 26, 2009 8:04 pm

nachi wrote: "If you don't have restart, its the same way what I did earlier."
Except that I also used SORTs instead of COPYs (SORT followed by OUTREC is needed so the sequence numbers are applied to the sorted records).
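
As a rough illustration of what RESTART contributes, here is a small Python model (not DFSORT syntax; the names `number` and `matched` are hypothetical). It numbers the sorted keys either per key or across the whole file, then keeps the (key, number) pairs common to both files. Without the per-key restart, a record only matches when the same key happens to sit at the same overall position in both sorted files.

```python
def number(keys, restart):
    """Return (key, seq) pairs for the sorted keys.

    restart=True  restarts the count at 1 for each new key.
    restart=False numbers the whole file 1..n.
    """
    pairs = []
    prev, seq = None, 0
    for key in sorted(keys):
        if restart and key != prev:
            seq = 0
        seq += 1
        pairs.append((key, seq))
        prev = key
    return pairs


def matched(file1, file2, restart):
    """Keys whose (key, seq) pair appears in both files."""
    common = set(number(file1, restart)) & set(number(file2, restart))
    return [key for key, _ in sorted(common)]


file1 = ["AAA1000", "AAA1000", "CCC3000", "CCC3000", "AAA1000", "BBB2000"]
file2 = ["AAA1000", "CCC3000", "AAA1000", "DDD2000", "DDD2000", "CCC3000", "CCC3000"]

with_restart = matched(file1, file2, restart=True)
without_restart = matched(file1, file2, restart=False)
print(with_restart)
print(without_restart)
```

In this model the per-key numbering matches two AAA1000 and two CCC3000, while whole-file numbering loses one of the CCC3000 matches because the CCC3000 records land at different overall positions in the two files.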

And I used one T1 MOD data set instead of T1 and T2 with concatenation. Note that using T1 and T2 with concatenation can result in data loss due to the system restriction described in the second bullet on this page:

http://publibz.boulder.ibm.com/cgi-bin/ ... 1007&CASE=

Post by nachi » Thu Mar 26, 2009 8:38 pm

Frank,

Very useful information. I agree with you that concatenated files may result in loss of data. I also see now that SORT is used instead of COPY.

Thanks again for all your help!

- Nachi
