|
|
| Author |
Message |
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Thu Nov 09, 2006 11:32 am Post subject: comparing two flat files |
|
|
Hi,
I want to compare two files for which the senerio is:
The comapre field in the first file is at col.no 13 and in the other file its on the col.no 23.(both have same length and same format)
Now, I want to compare the two file based on these field and remove the duplicate records.
Is this possible with sort?
-Regards
Abhishek |
|
| Back to top |
|
|
|
Frank Yaeger Moderator

Joined: 18 Feb 2006 Posts: 487 Location: San Jose, CA
|
Posted: Thu Nov 09, 2006 9:49 pm Post subject: |
|
|
It's not clear what exactly you want to do. What is the RECFM and LRECL of each input file? What is the length and format of the field in each input file you want to compare? Does input file1 have duplicates within it? Does input file2 have duplicates within it? Please show an example of the records in each input file and what you expect for output. _________________ Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/ |
|
| Back to top |
|
|
|
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Fri Nov 10, 2006 1:06 am Post subject: |
|
|
Frank,
please find the answeres to your queries.
1. What is the RECFM and LRECL of each input file?
LRECL for FILEA is 478 and the for the other is 435.RECFM is FB for both the files.
2.What is the length and format of the field in each input file you want to compare?
Format is ZD and the length is 14.
3. Does input file1 have duplicates within it? MAY HAVE
4. Does input file2 have duplicates within it? NO
5.Please show an example of the records in each input file and what you expect for output.
FILEA
---------1--------2---------3---------4---------5---------6---------7---
cols:1-12 PD
cols:13-26 ZD
cols:27-478(includes different fields)
FILEB
---------1--------2---------3---------4---------5---------6---------7---
cols:1-12 PD
cols:13-22 ZD
cols:23-37 ZD
and rest as above..
I won't be able to provide data at this time because m not having mainframe connectivity at home. will provide you by late night.
Comparision has to done on cols:13-26(FILEA) & cols:23-37(FILEB)
please let me know if you want any information.
-Abhi |
|
| Back to top |
|
|
|
Frank Yaeger Moderator

Joined: 18 Feb 2006 Posts: 487 Location: San Jose, CA
|
Posted: Fri Nov 10, 2006 1:35 am Post subject: |
|
|
| Quote: | | Now, I want to compare the two file based on these field and remove the duplicate records. |
This is a rather ambiguous statement. For the output, do you want records from file1 only, or records from file2 only, or records from both files? Do you want to keep one record for each key (from which file?), or do you want to eliminate all records that have a match (from both files), or what?
I wanted to see an example of your input records and output records to determine what you actually want to do. A conceptual example like this will do:
| Code: |
File 1
F1R1 01
F1R2 01
F1R3 05
F1R4 10
F1R5 10
F1R6 10
F1R7 15
F1R8 20
F1R9 20
File2
F2R1 01
F2R2 10
F2R3 15
F2R4 18
Output
???
|
_________________ Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/ |
|
| Back to top |
|
|
|
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Fri Nov 10, 2006 11:29 am Post subject: |
|
|
Code:
File 1
| Code: |
a/c loc name age add
------------------------
F1R1 01
F1R1 01
F1R1 05
F1R1 10
F1R1 10
F1R1 10
F1R1 15
F1R1 20
F1R1 20
|
File2
| Code: |
a/c status loc
-------------
F1R1 S1 01
F1R1 S2 10
F1R1 S3 15
F1R1 S4 18
|
Output
| Code: |
a/c loc name age add
-----------------------
F1R1 05
F1R1 20
|
FILEA
=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
000001 ?? 01200607212014190418112064 Kincumber 124
000002 ?? 15200607212137080418112064 Chatswood 122
000003 ?? 01200607212138230418112064 NL Chatswood 947
000004 ?? 01200607212139060418112064 NL Chatswood 941
000005 ?? 01200607212139400418112064 NL Chatswood 941
000006 ?? 15200607212139590418112064 Chatswood 122
FILEB
=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
000001 ?? 2006082H O0120060721201419 041811206412488 Kincu
000002 ?? 2006082H O0120060721213823 04181120640294778400 Chats
000003 ?? 2006082H O0120060721213906 04181120640294118400 Chats
000004 ?? 2006082H O0120060721213940 04181120640294118400 Chats
000005 ?? 2006082H O0120060721214055 04181120640294118000 Chats
000006 ?? 2006082H O1520060721213708 04181120641223 Chats
000007 ?? 2006082H O1520060721213959 04181120641223 Chats
The data marked in red is the key field.
Now, I want that if the key from FILEB matches key in FILEA the it should be eliminated and if it does not matches, the record should be written to the output file(the output record should be in the format of FILEA)
NOTE:
FILEA LRECL=478, FILEB LRECL=435
RECFM is same for both files. |
|
| Back to top |
|
|
|
Frank Yaeger Moderator

Joined: 18 Feb 2006 Posts: 487 Location: San Jose, CA
|
Posted: Fri Nov 10, 2006 11:00 pm Post subject: |
|
|
There are two 20 records in file1, but you only show one 20 record in the output file, so I assumed you only want one record with each key in the output file. I also assumed that the key in file1 starts in position 13 and the key in file2 starts in position 23 as you said in your first post. Given that, here's a DFSORT/ICETOOL job that will do what you asked for:
| Code: |
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD DSN=... input file1 (FB/478)
//IN2 DD DSN=... input file2 (FB/435)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD DSN=... output file (FB/478)
//TOOLIN DD *
COPY FROM(IN2) TO(T1) USING(CTL1)
COPY FROM(IN1) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(OUT) ON(13,14,CH) -
KEEPNODUPS WITH(1,479) USING(CTL3)
/*
//CTL1CNTL DD *
INREC BUILD=(13:23,14,479:C'BB')
/*
//CTL2CNTL DD *
INREC OVERLAY=(479:C'VV')
/*
//CTL3CNTL DD *
OUTFIL FNAMES=OUT,INCLUDE=(479,2,CH,EQ,C'VV'),
BUILD=(1,478)
/*
|
_________________ Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/ |
|
| Back to top |
|
|
|
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Tue Nov 14, 2006 5:22 pm Post subject: |
|
|
Hi Frank,
Thanks a ton for your help and precious time. My taks is done.
First I had refomatted the file in which the key was at col# 23 and brought it to 13 and then took help of your code and compared the file.
It worked fine..
Again I want to thank you..
Regards,
-Abhishek |
|
| Back to top |
|
|
|
Frank Yaeger Moderator

Joined: 18 Feb 2006 Posts: 487 Location: San Jose, CA
|
Posted: Tue Nov 14, 2006 9:18 pm Post subject: |
|
|
Glad I could help. _________________ Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/ |
|
| Back to top |
|
|
|
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Thu Nov 23, 2006 10:22 am Post subject: |
|
|
Hi Frank,
I think we can close this session.
Regards,
-Abhi |
|
| Back to top |
|
|
|
Frank Yaeger Moderator

Joined: 18 Feb 2006 Posts: 487 Location: San Jose, CA
|
Posted: Thu Nov 23, 2006 9:31 pm Post subject: |
|
|
| Quote: | | I think we can close this session. |
We can leave it open. Somebody may have something relevant to add someday. I generally only close a thread when it's a duplicate. _________________ Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort/ |
|
| Back to top |
|
|
|
abhishek.pradhan Member
Joined: 08 Nov 2006 Posts: 11
|
Posted: Fri Nov 24, 2006 9:42 am Post subject: |
|
|
No issues Sir..  |
|
| Back to top |
|
|
|
|
|
|