comparing two flat files

In this Mainframe Forum - You can post your queries on DFSORT, ICETOOL , SyncSort & JCL Utilities

Moderators: Frank Yaeger, Moderator Group

Post Reply
abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

comparing two flat files

Post by abhishek.pradhan » Thu Nov 09, 2006 11:32 am

Hi,

I want to compare two files for which the senerio is:

The comapre field in the first file is at col.no 13 and in the other file its on the col.no 23.(both have same length and same format)

Now, I want to compare the two file based on these field and remove the duplicate records.

Is this possible with sort?

-Regards
Abhishek

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Thu Nov 09, 2006 9:49 pm

It's not clear what exactly you want to do. What is the RECFM and LRECL of each input file? What is the length and format of the field in each input file you want to compare? Does input file1 have duplicates within it? Does input file2 have duplicates within it? Please show an example of the records in each input file and what you expect for output.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

Post by abhishek.pradhan » Fri Nov 10, 2006 1:06 am

Frank,

please find the answeres to your queries.

1. What is the RECFM and LRECL of each input file?
LRECL for FILEA is 478 and the for the other is 435.RECFM is FB for both the files.

2.What is the length and format of the field in each input file you want to compare?
Format is ZD and the length is 14.

3. Does input file1 have duplicates within it? MAY HAVE
4. Does input file2 have duplicates within it? NO

5.Please show an example of the records in each input file and what you expect for output.


FILEA
---------1--------2---------3---------4---------5---------6---------7---
cols:1-12 PD
cols:13-26 ZD
cols:27-478(includes different fields)

FILEB
---------1--------2---------3---------4---------5---------6---------7---
cols:1-12 PD
cols:13-22 ZD
cols:23-37 ZD
and rest as above..

I won't be able to provide data at this time because m not having mainframe connectivity at home. will provide you by late night.

Comparision has to done on cols:13-26(FILEA) & cols:23-37(FILEB)

please let me know if you want any information.

-Abhi

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Fri Nov 10, 2006 1:35 am

Now, I want to compare the two file based on these field and remove the duplicate records.


This is a rather ambiguous statement. For the output, do you want records from file1 only, or records from file2 only, or records from both files? Do you want to keep one record for each key (from which file?), or do you want to eliminate all records that have a match (from both files), or what?

I wanted to see an example of your input records and output records to determine what you actually want to do. A conceptual example like this will do:

Code: Select all

File 1
F1R1  01
F1R2  01
F1R3  05
F1R4  10
F1R5  10
F1R6  10
F1R7  15
F1R8  20
F1R9  20

File2
F2R1  01
F2R2  10
F2R3  15
F2R4  18

Output
???
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

Post by abhishek.pradhan » Fri Nov 10, 2006 11:29 am

Code:

File 1

Code: Select all

a/c   loc  name age add
------------------------
F1R1  01 
F1R1  01 
F1R1  05 
F1R1  10 
F1R1  10 
F1R1  10 
F1R1  15 
F1R1  20 
F1R1  20 
File2

Code: Select all

a/c   status  loc
-------------
F1R1  S1      01 
F1R1  S2      10 
F1R1  S3      15 
F1R1  S4      18 
Output

Code: Select all

a/c   loc name age add
-----------------------
F1R1  05 
F1R1  20  
FILEA
=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
000001 ?? 01200607212014190418112064 Kincumber 124
000002 ?? 15200607212137080418112064 Chatswood 122
000003 ?? 01200607212138230418112064 NL Chatswood 947
000004 ?? 01200607212139060418112064 NL Chatswood 941
000005 ?? 01200607212139400418112064 NL Chatswood 941
000006 ?? 15200607212139590418112064 Chatswood 122

FILEB
=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
000001 ?? 2006082H O0120060721201419 041811206412488 Kincu
000002 ?? 2006082H O0120060721213823 04181120640294778400 Chats
000003 ?? 2006082H O0120060721213906 04181120640294118400 Chats
000004 ?? 2006082H O0120060721213940 04181120640294118400 Chats
000005 ?? 2006082H O0120060721214055 04181120640294118000 Chats
000006 ?? 2006082H O1520060721213708 04181120641223 Chats
000007 ?? 2006082H O1520060721213959 04181120641223 Chats


The data marked in red is the key field.
Now, I want that if the key from FILEB matches key in FILEA the it should be eliminated and if it does not matches, the record should be written to the output file(the output record should be in the format of FILEA)

NOTE:
FILEA LRECL=478, FILEB LRECL=435
RECFM is same for both files.

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Fri Nov 10, 2006 11:00 pm

There are two 20 records in file1, but you only show one 20 record in the output file, so I assumed you only want one record with each key in the output file. I also assumed that the key in file1 starts in position 13 and the key in file2 starts in position 23 as you said in your first post. Given that, here's a DFSORT/ICETOOL job that will do what you asked for:

Code: Select all

//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/478)
//IN2 DD DSN=...  input file2 (FB/435)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD DSN=...  output file (FB/478)
//TOOLIN DD *
COPY FROM(IN2) TO(T1) USING(CTL1)
COPY FROM(IN1) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(OUT) ON(13,14,CH) -
  KEEPNODUPS WITH(1,479) USING(CTL3)
/*
//CTL1CNTL DD *
  INREC BUILD=(13:23,14,479:C'BB')
/*
//CTL2CNTL DD *
  INREC OVERLAY=(479:C'VV')
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=OUT,INCLUDE=(479,2,CH,EQ,C'VV'),
    BUILD=(1,478)
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

Post by abhishek.pradhan » Tue Nov 14, 2006 5:22 pm

Hi Frank,

Thanks a ton for your help and precious time. My taks is done.

First I had refomatted the file in which the key was at col# 23 and brought it to 13 and then took help of your code and compared the file.
It worked fine..

Again I want to thank you..

Regards,
-Abhishek

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Tue Nov 14, 2006 9:18 pm

Glad I could help.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

Post by abhishek.pradhan » Thu Nov 23, 2006 10:22 am

Hi Frank,

I think we can close this session.


Regards,
-Abhi

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Thu Nov 23, 2006 9:31 pm

I think we can close this session.
We can leave it open. Somebody may have something relevant to add someday. I generally only close a thread when it's a duplicate.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

abhishek.pradhan
Member
Posts: 11
Joined: Wed Nov 08, 2006 4:57 pm

Post by abhishek.pradhan » Fri Nov 24, 2006 9:42 am

No issues Sir.. :)

academyindia4

Topic deleted by Admin

Post by academyindia4 » Mon Jan 25, 2016 11:07 pm

<< Content deleted By Admin >>

Post Reply

FREE TUTORIALS

Tutorials
Free tutorials from mainframegurukul
  • JCL Tutorial
    Covers all important JCL concepts.
  • Cobol Tutorial
    This tutorials covers all Cobol Topics from STRING to COMP-3.
  • DB2 Tutorial
    DB2 Tutorial focuses on DB2 COBOL Programming.
  • SORT Tutorial
    This Tutorial covers all important aspects of DFSORT with examples
  • CICS Tutorial
    This CICS tutorial covers CICS concepts and CICS Basics, CICS COBOL Programming.
Interview
Mainframe Interview questions



Other References
Mainframe Tools and others