Merging two files into one based on certain key fields:

In this Mainframe Forum - You can post your queries on DFSORT, ICETOOL , SyncSort & JCL Utilities

Moderators: Frank Yaeger, Moderator Group

Post Reply
MVS_blob
Member
Posts: 6
Joined: Wed Oct 24, 2007 5:27 am

Merging two files into one based on certain key fields:

Post by MVS_blob » Wed Oct 24, 2007 5:38 am

Hi,

I've got a tricky one for those up for a challenge. :D

I have two input files, and want to create an output file. I'm adding a two digit number to a file containing a header and detail records. Note: the two digit number is dependent on the header and detail records.

INPUT1:

Code: Select all

AAA 7777777777 56
BBB 3333333333 47
CCC 1111111111 84
MMM 2222222222 35
PPP 2222222222 69
INPUT2:

Code: Select all

HEADER2222222222
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL MMM
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER1111111111
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL CCC
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
DETAIL CCC
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER3333333333
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL BBB
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER2222222222
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL PPP
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL PPP
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER7777777777
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL AAA
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
Output file:

Code: Select all

HEADER2222222222        35
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL MMM
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER1111111111        84
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL CCC
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
DETAIL CCC
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER3333333333        47
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL BBB
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER2222222222        69
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL PPP
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL PPP
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX

HEADER7777777777        56
DATAXXXXXXXXXXXXXXXXXXXXXX
DEATIL AAA
DATAXXXXXXXXXXXXXXXXXXXXXX
DATAXXXXXXXXXXXXXXXXXXXXXX
Constraints:
---------------

INPUT1:
column 1 and column 3 are unique
column 2 is not unique

INPUT2:
DATAXXXXXXXXXXXXXXXXXXXXXX is some data between the header and detail lines. These are specific to the header and detail lines, and should be kept as is.

OUTPUT:
column 3 of INPUT1 has been added to the header file of INPUT2.


I've tried to split up the task into multiple steps and work at each step with limited success. Been working on this one for a while now, and am getting no where, so any help would be greatly appreciated.

I know this can be done via COBOL/EASYTRIEVE, but if I can achieve this via JCL/DFSORT/ICETOOL, it'd be great.

MVS_blob
Member
Posts: 6
Joined: Wed Oct 24, 2007 5:27 am

Post by MVS_blob » Wed Oct 24, 2007 12:56 pm

Been working on this problem today, and have gotten somewhere with it.

My steps were: (excuse my noob-ness)

1. program: DFSORT
-extract all HEADER rows to FILEA & append a 6 digit sequence number to the end
-extract all DETAIL rows to FILEB & append a 6 digit sequence number to the end
2. program: DFSORT
- remove duplicates in FILEB (DEATIL rows)
3. program: DFSORT
- sort FILEB according to the 6 digit sequence number.
- re-number the 6 digit sequence number append to FILEA
4. program: ICETOOL
- SPLICE the FILEA to get FILEC which looks like

Code: Select all

MMM 2222222222    000001
CCC 1111111111    000002
BBB 3333333333    000003
PPP 2222222222    000004
AAA 7777777777    000005
5. program: SORTD
- append INPUT1 to FILEC
6. program: ICETOOL
- SPLICE FILEC
7. program: SORTD
- sort FILED according to the sequence number. don't output the sequence number:

Code: Select all

MMM 2222222222 35 
CCC 1111111111 84 
BBB 3333333333 47 
PPP 2222222222 69
AAA 7777777777 56 
which lists the headers in the order they appear, and includes the two digit field (column 3 of INPUT1).

This is probably the long winded way, and can probably be done in alot fewer steps. If you have a shorter way, please don't hesitate to post up a suggestion.

Also, as the actual files are quite large, performance isn't too great at the moment, so a short way would be MUCH appreciated.

Now to somehow merge the 2 digit field with my INPUT2 file. hmmm... any ideas?

... am I totally off track?

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Wed Oct 24, 2007 10:57 pm

Yes, this is tricky.

Here's a DFSORT/ICETOOL job that will do what I think you asked for. I assumed your input files have RECFM=FB and LRECL=80, but the job can be changed appropriately for other attributes.

I also assumed that:
- positions 5-14 of input file1 are to be matched with positions 7-16 of the HEADER record in input file2, AND
- positions 1-3 of input file1 are to be matched with positions 8-10 of the first DEATIL record after the HEADER record in input file2

Code: Select all

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN1 DD DSN=...  input file1 (FB/80)
//IN2 DD DSN=...  input file2 (FB/80)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//T3 DD DSN=&&T3,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD DSN=...  output file (FB/80)
//TOOLIN DD *
COPY FROM(IN2) TO(T2) USING(CTL1)
SPLICE FROM(T2) TO(T1) ON(81,8,ZD) -
 WITHEACH WITH(28,3)
COPY FROM(IN1) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(T3) ON(7,10,CH) ON(28,3,CH) -
 WITH(25,2)
SPLICE FROM(T3) TO(OUT) ON(81,8,ZD) ON(89,8,CH) KEEPNODUPS -
 WITH(25,2) USING(CTL3)
/*
//CTL1CNTL DD *
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:SEQNUM,8,ZD)),
        IFTHEN=(WHEN=(1,6,CH,EQ,C'HEADER'),
                OVERLAY=(81:SEQNUM,8,ZD)),
        IFTHEN=(WHEN=NONE,
                OVERLAY=(89:SEQNUM,8,ZD,
                         81:81,8,ZD,SUB,89,8,ZD,M11,LENGTH=8))
  OUTFIL FNAMES=T2,
     INCLUDE=(1,6,SS,EQ,C'HEADER,DEATIL'),
     IFTHEN=(WHEN=(1,6,CH,EQ,C'DEATIL'),
       BUILD=(28:8,3,81:81,16))
  OUTFIL FNAMES=T3
/*
//CTL2CNTL DD *
  INREC BUILD=(7:5,10,25:16,2,28:1,3,81:16X)
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=OUT,BUILD=(1,80)
/*
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

MVS_blob
Member
Posts: 6
Joined: Wed Oct 24, 2007 5:27 am

Post by MVS_blob » Thu Oct 25, 2007 7:30 am

Frank, you're a genius!!! :mrgreen:

Thanks soo much! It works perfectly.

My original code was 100+ lines, was slow, and only did 1/3 of the things yours does.

Great work!

User avatar
Frank Yaeger
Moderator
Posts: 812
Joined: Sat Feb 18, 2006 5:45 am
Location: San Jose, CA
Contact:

Post by Frank Yaeger » Thu Oct 25, 2007 9:06 pm

Thanks for the feedback and the kind words. Glad I could help.
Frank Yaeger - DFSORT Development Team (IBM) - yaeger@us.ibm.com
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
=> DFSORT/MVS is on the Web at http://www.ibm.com/storage/dfsort

academyindia4

Topic deleted by Admin

Post by academyindia4 » Mon Jan 25, 2016 11:04 pm

<< Content deleted By Admin >>

Post Reply

FREE TUTORIALS

Tutorials
Free tutorials from mainframegurukul
  • JCL Tutorial
    Covers all important JCL concepts.
  • Cobol Tutorial
    This tutorials covers all Cobol Topics from STRING to COMP-3.
  • DB2 Tutorial
    DB2 Tutorial focuses on DB2 COBOL Programming.
  • SORT Tutorial
    This Tutorial covers all important aspects of DFSORT with examples
  • CICS Tutorial
    This CICS tutorial covers CICS concepts and CICS Basics, CICS COBOL Programming.
Interview
Mainframe Interview questions



Other References
Mainframe Tools and others