Create files with matching and non-matching records - SPLICE?
Posted: Mon Oct 17, 2011 10:50 pm
I've been reading up on the use of the SPLICE operator in ICETOOL and, to be honest, I'm struggling a little with it. I have 2 input datasets. I want to compare one dataset with the other and write all records whose key string appears in both datasets to a 'Matching' dataset, and the records that do not match to a 'Not Matching' dataset.
Here's a sample of the INPUT datasets (we'll call them input1 and input2 for now).
INPUT1
02:02:41 * ESS26000,YPAT,G03794
02:02:41 * ESS25000,NEAT,G22857
02:04:41 * ESS26000,YPCS,G07829
02:04:47 * ESS26000,YPCT,G07076
02:32:41 * ESS26000,YPAT,G03795
02:41:41 * ESS26000,GKAT,G32147
02:42:41 * ESS26000,YPAT,G03796
INPUT2
DOWNLOAD: ESS26000,YPAT,G03794
DOWNLOAD: ESS25000,NEAT,G22857
DOWNLOAD: ESS26000,YPCS,G07829
DOWNLOAD: ESS26000,YPCT,G07076
DOWNLOAD: ESS26000,YPAT,G03795
DOWNLOAD: ESS26000,GKAT,G32147
DOWNLOAD: ESS26000,YPAT,G03796
DOWNLOAD: ESS25000,NEAT,G22858
DOWNLOAD: ESS25000,NEAT,G22859
DOWNLOAD: ESS25000,NECS,G22093
Both datasets have the same record length (266), and the output datasets also need to be 266. The fields I need to compare on start in column 1 for 20 in the input1 dataset, and column 61 for 20 in the input2 dataset (the 'ESS26000,YPAT,G03794' string, for example).
If a string within these column boundaries appears in both input datasets, I want to output columns 1 to 33 from the input1 dataset and columns 61 to 266 from the input2 dataset into the 'Matching' dataset.
If the string within these column boundaries does not appear in both datasets, then I want to output columns 1 to 266 from the input dataset where the unmatched string resides.
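To make the requirement concrete, here is a rough Python sketch of the logic I'm after. It is not ICETOOL, just the matching rules written out. The names `field` and `split_matches` are made up for illustration, and it assumes each key appears at most once per dataset and that the unused tail of a matched record is padded with blanks (neither of which I've stated above).

```python
LRECL = 266  # record length stated in the post


def field(record, start, length):
    """Return `length` characters starting at 1-based column `start`."""
    return record[start - 1:start - 1 + length]


def split_matches(input1, input2):
    """Split records into (matching, not_matching) lists.

    Assumption: a key occurs at most once in each input.
    """
    # Pad every record to the fixed length so column slicing is safe.
    recs1 = [r.ljust(LRECL) for r in input1]
    recs2 = [r.ljust(LRECL) for r in input2]

    # Key is columns 1-20 in input1 and columns 61-80 in input2.
    by_key2 = {field(r, 61, 20): r for r in recs2}
    keys1 = {field(r, 1, 20) for r in recs1}

    matching, not_matching = [], []
    for r1 in recs1:
        r2 = by_key2.get(field(r1, 1, 20))
        if r2 is None:
            # Key only in input1: keep the whole input1 record.
            not_matching.append(r1)
        else:
            # Columns 1-33 of input1 followed by columns 61-266 of input2,
            # padded with blanks back to 266 (padding is an assumption).
            merged = field(r1, 1, 33) + field(r2, 61, 206)
            matching.append(merged.ljust(LRECL))

    # Key only in input2: keep the whole input2 record.
    for r2 in recs2:
        if field(r2, 61, 20) not in keys1:
            not_matching.append(r2)

    return matching, not_matching
```

Calling `split_matches(input1_lines, input2_lines)` with the records of each dataset as lists of strings gives the two lists I'd want written to the 'Matching' and 'Not Matching' datasets.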
Maybe a fresh start tomorrow will help me sort this out myself, but right now I feel like I've gone cross-eyed!
Help would be much appreciated.
Thanks in advance.