Posted: Fri Jun 12, 2009 9:06 pm Post subject: Multiple spliced records for each match in 2 files
I want to write multiple records by matching 2 files that share the same key. For example,
I want the output to look like
NACHI MAT STG
NACHI MAT SAS
FRANK MAT STG
FRANK MAT SAS
Read the first record and check whether the key at position (7,3) appears at (1,3) in file 2. If FILE2 has multiple records for the same key, write the base record that many times, each time appending the characters at (5,3) of the overlay record.
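To make the requested matching concrete, here is a minimal Python sketch of the steps described above. It is only an illustration of the logic, not a DFSORT solution. The two input lists are assumptions inferred from the desired output (the original sample input is not shown), and the single blank between the base record and the appended characters is also assumed.

```python
def splice_matches(base_records, overlay_records):
    """Write each base record once per overlay record that shares its key."""
    output = []
    for base in base_records:
        key = base[6:9]  # positions 7-9 (1-based) of the base record
        for overlay in overlay_records:
            if overlay[0:3] == key:  # positions 1-3 of the overlay record
                # Append overlay positions 5-7 after an assumed separating blank
                output.append(base[:9] + " " + overlay[4:7])
    return output


# Hypothetical inputs, reconstructed from the desired output in the post
file1 = ["NACHI MAT", "FRANK MAT"]
file2 = ["MAT STG", "MAT SAS"]

for line in splice_matches(file1, file2):
    print(line)
```

With these assumed inputs, every base record is written twice because the key MAT occurs twice in the second list.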
Both of your input files contain duplicates, so matching them produces a Cartesian join. This task needs a different approach. The following DFSORT/ICETOOL job will give you the desired results.
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD *
//IN2 DD *
//T1 DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(CYL,(X,Y),RLSE)
//OUT DD SYSOUT=*
//TOOLIN DD *
SORT FROM(IN1) USING(CTL1)
SORT FROM(IN2) USING(CTL2)
SPLICE FROM(T1) TO(OUT) ON(15,11,CH) -
WITH(1,10) WITHALL USING(CTL3)
//CTL1CNTL DD *
In this same context, my input is 14 million rows and I have a maximum of 25 duplicate keys. This join multiplies 14 million by 25, which puts a huge number of rows in the temp dataset. I tried running it, but unsurprisingly my SORT job failed due to size limitations. Is this the only solution for this scenario?
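As a rough sizing aid, the sketch below counts how many rows the join would produce without writing any of them. It assumes the figure of 25 is the maximum number of second-file records per key, in which case 14 million times 25 (350 million) is an upper bound and the real count depends on how the keys are distributed. The function name and the sample keys are hypothetical.

```python
from collections import Counter


def estimate_output_rows(base_keys, overlay_keys):
    """Count join output rows: one row per (base record, matching overlay record)."""
    overlay_counts = Counter(overlay_keys)
    # A Counter returns 0 for keys it has never seen, so unmatched keys add nothing
    return sum(overlay_counts[key] for key in base_keys)


# Hypothetical keys: two base records match two overlay records, one matches none
base_keys = ["MAT", "MAT", "XYZ"]
overlay_keys = ["MAT", "MAT", "ABC"]
print(estimate_output_rows(base_keys, overlay_keys))

# Upper bound under the stated assumption of at most 25 matches per base record
worst_case_rows = 14_000_000 * 25
print(worst_case_rows)
```

Feeding the real key columns through a count like this would show whether the output is close to the worst case before any space is allocated.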
As a workaround, I have split the files into 14 parts of 1 million rows each, so each job works with 25 million rows. Each job takes at least 20 minutes to run.
The simplest alternative I can think of is writing a COBOL program with an internal table, but that would take at least 8-16 hours of development.
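The internal-table idea can be sketched in a few lines. The version below is in Python rather than COBOL, purely to show the shape of the logic. It assumes the second file is small enough to hold in memory, reuses the field positions from the original question, and assumes one blank before the appended characters. The function names and sample data are hypothetical.

```python
import io
from collections import defaultdict


def build_lookup(overlay_lines):
    """Load the overlay file into an in-memory table: key -> list of suffixes."""
    table = defaultdict(list)
    for line in overlay_lines:
        table[line[0:3]].append(line[4:7])  # key at 1-3, suffix at 5-7 (1-based)
    return table


def stream_join(base_lines, table):
    """Read the base file once and yield one output record per matching suffix."""
    for line in base_lines:
        base = line.rstrip("\n")
        # .get avoids adding empty entries to the table for unmatched keys
        for suffix in table.get(base[6:9], ()):
            yield base[:9] + " " + suffix


# Hypothetical data standing in for the two input files
overlay_file = io.StringIO("MAT STG\nMAT SAS\n")
base_file = io.StringIO("NACHI MAT\nFRANK MAT\n")

lookup = build_lookup(overlay_file)
joined = list(stream_join(base_file, lookup))
for record in joined:
    print(record)
```

Because the base file is read sequentially and output is produced record by record, no intermediate dataset holding the full Cartesian product is needed; only the lookup table stays in memory.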