Posted: Thu Aug 30, 2012 1:15 pm Post subject: Special format for last record in each group
Hi, in a project to optimize batch jobs, we have found the following case:
1) DFSORT that unifies two flat files sorting by a key (there may be duplicates).
2) COBOL program that adds a sequence number at a fixed position in the record for each set of identical keys, and a marker to indicate the last record of each key.
3) REPRO to copy the output from step 2 to VSAM.
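To make the requirement concrete, here is a minimal Python sketch of what step 2 appears to produce. It is not the COBOL program and not DFSORT code. The function name, the 3-byte key, the 3-digit sequence and the 'BBB' marker appended at the end of the record are all assumptions for illustration. It reads the sorted file once and holds back one record, which is why the last record of a group can only be recognised when the next key arrives.

```python
def number_and_mark(records, key_len=3, marker="BBB"):
    """Yield each record with a per-key sequence number and an end-of-group marker.

    Assumes `records` is already sorted by its leading `key_len` characters.
    The output layout (record + 3-digit sequence + marker) is hypothetical.
    """
    blank = " " * len(marker)
    prev = None   # the record being held back until the next key is known
    seq = 0       # sequence number of `prev` within its key group
    for rec in records:
        if prev is not None:
            same_key = prev[:key_len] == rec[:key_len]
            # `prev` is the last of its group only if the key changes here
            yield f"{prev}{seq:03d}{blank if same_key else marker}"
            if not same_key:
                seq = 0
        seq += 1
        prev = rec
    if prev is not None:
        # the final record of the file always ends its group
        yield f"{prev}{seq:03d}{marker}"


if __name__ == "__main__":
    for line in number_and_mark(["AAAx", "AAAy", "BBBx"]):
        print(line)
```

The one-record delay is the awkward part to reproduce in a sort step, since each record is normally reformatted without sight of the next one.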
The idea is to replace the above with a single DFSORT step, but the drawback I see is that I cannot find a way to identify the last record in each group in order to reformat it properly in a single pass (this is necessary because the files hold more than 50 million records and the objective is to optimize the execution).
We thought of a mixed solution: a COBOL program using DFSORT with FASTSRT, which would optimize the reading of the input file (via "USING file"), add the sequence using DFSORT statements, mark the last record of each group in the "OUTPUT PROCEDURE", and write directly to the VSAM file (this part would not be optimized).
The ideal scenario is to avoid this and use DFSORT only, but that requires something that I cannot find, so I humbly ask for help.
The following examples illustrate what I need to do:
Unfortunately, in my company, the tech people are quite reluctant to implement upgrades (even free ones). Believe me, I've asked many times, without favorable results so far.
I hope this can help convince them.
Fortunately, in a few months we will switch to a new mainframe, and I have been told that we will then be more up to date.
William C., yes, we have the WHEN=GROUP option available.
NicC, I would like to know the solution using JOINKEYS, so if possible, give me the link to the forum, please.
First thing you need is to concatenate a "dummy" record after your data. The contents of the record are irrelevant, but I put "DUMMYDUMMY" to make it clear.
A COPY operation.
In INREC the length is going to be extended to 63 bytes.
First up on the WHEN=INIT is the establishment of a sequence number, which is then reduced modulo 2 to give a value of either 0 or 1. The sequence number is no longer needed after that, but I left it there whilst developing.
Then a GROUP is established when a 0 is encountered, and the GROUP contains two RECORDS. The "key" is pushed, as is the whole record. This can be rationalised.
In a similar manner a second GROUP is established, when 1 is encountered. The PUSH of the key and the record are in different locations from the previous GROUP.
The next IFTHEN is more to get the "blank" key out of the way. The "blank" is caused by the first record not having had two keys PUSHed on to it. The IFTHEN is present really just to keep that record out of the following test.
In the next IFTHEN the PUSHed keys are compared. If they do not match, the record is marked as being the last. Note that at this stage the "wrong" record is marked: it is the record following the true last record of the group.
Then in OUTREC, to avoid the need for HIT=NEXT, the PUSHed entire record, appropriate to the 0/1 marker, is placed in position 1.
The only thing now is that the first record is blank. OUTFIL OMIT takes care of that. The dummy record has disappeared by being overwritten by the final PUSHed whole record.
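For anyone who finds the two alternating GROUPs hard to picture, the steps above can be simulated in a few lines of Python. This is only a sketch of the logic as described, not the control cards: the function name, the 3-byte key and the 'BBB' marker are assumptions, and the two dictionary slots stand in for the two PUSH areas on the extended record.

```python
def mark_last_by_shifting(records, key_len=3, marker="BBB"):
    """Simulate the 0/1 two-record GROUP technique described above.

    Every record carries its own PUSHed copy in one slot and the previous
    record's PUSHed copy in the other slot. All names are illustrative.
    """
    blank = " " * len(marker)
    data = list(records) + ["DUMMYDUMMY"]   # the concatenated dummy record
    slots = {0: None, 1: None}              # PUSHed whole record per 0/1 GROUP
    out = []
    for seqnum, rec in enumerate(data, start=1):
        flag = seqnum % 2                   # WHEN=INIT: sequence number modulo 2
        slots[flag] = rec                   # GROUP beginning on this flag PUSHes this record
        previous = slots[1 - flag]          # the other GROUP still carries the previous record
        if previous is None:
            # first record of the file: only one PUSH has reached it,
            # so it becomes the "blank" record that OUTFIL OMIT drops
            continue
        mismatch = previous[:key_len] != rec[:key_len]
        # OUTREC: the previous record goes to position 1, and the marker
        # set on the current ("wrong") record now sits on the right data
        out.append(previous + (marker if mismatch else blank))
    return out


if __name__ == "__main__":
    sample = ["AAA1", "AAA2", "BBB1", "CCC1", "CCC2", "CCC3"]
    for line in mark_last_by_shifting(sample):
        print(line)
```

The dummy record never reaches the output: it only exists so that the final real record has a following record to be compared against and copied onto. The sketch assumes no real key starts with the same characters as the dummy.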
Note, I "enhanced" your sample data to get a mix of odd/even records in the key groups.
Test it well. There is room for improvement/rationalisation, as it was easier to develop whilst keeping redundant data.
Using symbols (SYMNAMES) allows you to leave the Sort Control Cards "untouched".
These you need to change to your values:
INPUT-RECORD,1,20,CH /* 20 to your length
INPUT-KEY,=,3,CH /* 3 to your key length
INPUT-DATA,=,20,CH /* 20 to your length
OUTPUT-RECORD,=,23,CH /* 23 to your output length
OUTPUT-KEY,=,3,CH /* 3 to your key length
OUTPUT-ORIG-DATA,=,20,CH /* 20 to your original length
OUTPUT-MARKER,*,3,CH /* to position and length of your new data, * for appending
END-OF-GROUP-MARKER,C'BBB' /* the value to mark end of group
Note, if you can have more than 999 records with the same key, you'll need to extend the size of that SEQ as well as the length of TEMP-GROUP-SEQ in SYMNAMES.
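If you are not sure whether three digits are enough, a quick check on a sorted extract can tell you. The following Python sketch is only a suggestion: the function name and the 3-byte key are assumptions, and it expects the records to be sorted by key already.

```python
from itertools import groupby


def seq_digits_needed(records, key_len=3):
    """Return (largest group size, digits needed to hold that sequence number).

    Assumes `records` is sorted by its leading `key_len` characters.
    """
    largest = max(
        (sum(1 for _ in grp) for _, grp in groupby(records, key=lambda r: r[:key_len])),
        default=0,
    )
    return largest, len(str(largest))


if __name__ == "__main__":
    extract = ["AAA"] * 1000 + ["BBB"] * 2
    print(seq_digits_needed(extract))
```

If the second value comes back greater than 3, the sequence field and TEMP-GROUP-SEQ both need widening to that many digits.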
The solution is based on this: It is easy to mark the first record in a group; if the records can be "moved down" whilst the marker stays where it is, then the last record of the previous group will have been marked.
The logic has changed from the previous version.
The INIT now has a sequence number for the file. This is because it will be more reliable to OMIT the first record of the file later rather than a "blank record", just in case a blank record genuinely exists in the file.
Also on the INIT is the sequence number for the key group. If KEYBEGIN is available, this can be removed and the GROUP with sequence equal to 1 amended to use KEYBEGIN (thanks to sk for suggesting KEYBEGIN).
Finally on the INIT is the marker value, copied from a Symbol. Once it exists on the record it can be PUSHed later; a constant cannot be PUSHed.
Rather than using MOD to set up a 0/1 value, a GROUP with two RECORDS is now used to set up a 1/2 value. (thanks to sk).
The GROUP with test for FIRST-RECORD-OF-GROUP replaces the comparison of the keys.
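The revised logic can be sketched in Python as two passes over the data, purely to show the idea. None of this is DFSORT syntax. The function name, the 3-byte key, the 'BBB' marker and the continued use of a trailing dummy record are assumptions of mine. The first loop plays the part of WHEN=INIT (file sequence and key-group sequence), the second moves each record down by one while the first-of-group marker stays put.

```python
def mark_last_revised(records, key_len=3, marker="BBB", dummy="DUMMYDUMMY"):
    """Simulate the revised approach: mark the first record of each group,
    then shift the data down one record so the marker lands on the last
    record of the previous group. All names are illustrative.
    """
    blank = " " * len(marker)

    # "WHEN=INIT": file sequence number and sequence number within key group
    decorated = []
    group_seq, prev_key = 0, None
    for file_seq, rec in enumerate(list(records) + [dummy], start=1):
        key = rec[:key_len]
        group_seq = group_seq + 1 if key == prev_key else 1
        prev_key = key
        decorated.append((file_seq, group_seq, rec))

    # Shift: each output slot shows the previous record's data, while the
    # marker belongs to the slot (set when group_seq is 1, first of a group)
    out = []
    carried = None
    for file_seq, group_seq, rec in decorated:
        if file_seq != 1:   # OMIT by file sequence, not by "blank record"
            out.append(carried + (marker if group_seq == 1 else blank))
        carried = rec
    return out


if __name__ == "__main__":
    # a genuinely blank-keyed record survives, because the OMIT tests file_seq
    for line in mark_last_revised(["   x", "AAA1", "AAA2"]):
        print(line)
```

Omitting on the file sequence number rather than on blank content is what keeps a genuinely blank record in the output.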
I have tested with a different length record, key and marker value.