COLLATE stage v1.1 |
Pipelines v2.1 |
Syntax |
             ┌─NOPAD────┐
>>──COLLATE──┼──────────┼──┤ Group ├─────────────────────────────────────────────────><
             └─PAD─char─┘
Group:
   ┌─1-*─────────────1-*────────────┐  ┌─MASTER DETAILs─┐
├──┼────────────────────────────────┼──┼────────────────┼─────────────────────────────┤
   └─columnrange1──┬──────────────┬─┘  ├─MASTER─────────┤
                   └─columnrange2─┘    ├─DETAILs────────┤
                                       └─DETAILs MASTER─┘
Notes:
(1) columnrange1 and columnrange2 are unsigned.
Purpose |
Use the COLLATE stage to match records in its primary input
stream with records in its secondary input stream and write the matched and unmatched
records to different output streams. The records in each input stream must be
in ascending order based on the contents of a key field.
The records in the primary input stream are referred to as master
records. Each master record has a key field: a specific range of columns within
the record whose contents are unique and identify the record. Two master records
cannot have the same contents in their key field.
The records in the secondary input stream are referred to as the
detail records. The detail records have key fields as well, and both the master
and detail records must be sorted in ascending order by their key fields. A
detail record matches a master record when the key field in both records
contains the same data. Two or more detail records can have the same data in
their key field.
COLLATE writes records to up to three output streams, depending on which of
them are connected:
● |
The primary output stream contains matching records. The operands
for COLLATE let you specify the sequence of the master and detail records in
the primary output stream and whether COLLATE writes both the master and
detail records, only the master records, or only the detail records to the
primary output stream. |
● |
The secondary output stream contains master records that do not have any matching detail records. |
● |
The tertiary output stream contains detail records that do not have a matching master record. |
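The matching rules above can be modelled in a few lines. The following is a minimal Python sketch, not the COLLATE implementation: the function name collate, its key parameter, and the sample records in the usage note are all hypothetical. It assumes both inputs are already in ascending key order, that master keys are unique, and that the default MASTER DETAILs order applies.

```python
def collate(masters, details, key=lambda record: record):
    """Model of COLLATE with default operands (hypothetical helper).

    Both lists must already be in ascending key order and master keys
    must be unique. Returns (matched, unmatched_masters, unmatched_details),
    which stand in for the primary, secondary and tertiary output streams.
    """
    matched, unmatched_masters, unmatched_details = [], [], []
    i = 0  # index of the next detail record to examine
    for master in masters:
        master_key = key(master)
        # Details that sort before this master have no master at all.
        while i < len(details) and key(details[i]) < master_key:
            unmatched_details.append(details[i])
            i += 1
        # Collect every detail record that carries the master's key.
        group = []
        while i < len(details) and key(details[i]) == master_key:
            group.append(details[i])
            i += 1
        if group:
            matched.append(master)  # master first, then its details
            matched.extend(group)
        else:
            unmatched_masters.append(master)
    # Any remaining details sort after the last master.
    unmatched_details.extend(details[i:])
    return matched, unmatched_masters, unmatched_details
```

For instance, calling collate(["001 Alice", "003 Carol"], ["001 pay", "002 pay"], key=lambda r: r[:3]) pairs "001 Alice" with "001 pay", reports "003 Carol" as a master without details, and reports "002 pay" as a detail without a master.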
Operands |
● |
NOPAD specifies that shorter key fields are not extended with a pad character before they are compared with longer key fields in other records. This is the default. |
||
● |
PAD char specifies that shorter key fields are extended with the pad character char before they are compared with longer key fields in other records. |
||
|
|
||
● |
columnrange1 is an unsigned integer column range that defines
a key field for the master records. If columnrange1 is not specified, the key
field is the entire record for both the primary and secondary input streams.
The format of columnrange1 is identical to the format specification
for columnrange2. |
||
|
|
||
● |
MASTER DETAILs specifies that each master record, followed by its matching detail
records, is written to the primary output stream. This is the default. |
||
● |
MASTER specifies that only the master records are written to the primary
output stream. The matched detail records are discarded. |
||
● |
DETAILs specifies that only the detail records are written to the primary
output stream. The matched master records are discarded. |
||
● |
DETAILs MASTER specifies that the matched detail records followed by their master record are written to the primary output stream. |
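As an illustration of how the operands interact, the following Python sketch models key extraction, the NOPAD and PAD comparison, and the four output orders. It is a sketch under stated assumptions, not the stage's code: the helper names key_of, keys_match and arrange are hypothetical, columns are assumed to be 1-based and inclusive, and the pad character is assumed to be a single character.

```python
def key_of(record, first=1, last=None):
    """Return the key field of a record (hypothetical helper).

    Columns are 1-based and inclusive, as in 1-19.
    last=None models '*', that is, through the end of the record.
    """
    return record[first - 1:last]


def keys_match(a, b, pad=None):
    """Compare two key fields; pad=None models NOPAD."""
    if pad is not None:
        # PAD: extend the shorter key with the pad character first.
        width = max(len(a), len(b))
        a, b = a.ljust(width, pad), b.ljust(width, pad)
    return a == b


def arrange(master, details, order="MASTER DETAILS"):
    """Return the records written to the primary output for one match."""
    if order == "MASTER DETAILS":
        return [master] + list(details)
    if order == "MASTER":
        return [master]
    if order == "DETAILS":
        return list(details)
    if order == "DETAILS MASTER":
        return list(details) + [master]
    raise ValueError(f"unknown order: {order}")
```

In this model, keys_match("ab", "ab  ") is false, while keys_match("ab", "ab  ", pad=" ") is true because the shorter key is extended with blanks before the comparison.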
Streams |
The following streams are used by the COLLATE stage:
|
Usage |
1. |
COLLATE does not delay the records. |
2. |
If the COLLATE stage discovers
that none of its output streams is connected, the COLLATE stage ends. |
3. |
The following diagram shows the
input and output streams for the COLLATE stage using the default operands: |
4. |
Although the COLLATE and LOOKUP stages
both process master and detail records read from their input streams, COLLATE
reads master records from its primary input stream and LOOKUP reads master
records from its secondary input stream. COLLATE reads detail records from
its secondary input stream and LOOKUP reads detail records from its primary
input stream. |
5. |
Unlike the LOOKUP stage, COLLATE
requires that input records be sorted in ascending order by their key field. |
6. |
COLLATE verifies that its tertiary
input stream is not connected and then begins execution. |
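Because COLLATE depends on sorted input, it can be useful to check a file before feeding it to the stage. This page does not say how COLLATE reacts to records that are out of order, so the following Python sketch only shows one way to test the precondition. The function name check_collate_input and its parameters are hypothetical.

```python
def check_collate_input(records, key=lambda record: record, unique=False):
    """Return the index of the first record that breaks the ordering, or None.

    Set unique=True for master records, whose keys must not repeat.
    Leave it False for detail records, where repeated keys are allowed.
    """
    for index in range(1, len(records)):
        previous = key(records[index - 1])
        current = key(records[index])
        if current < previous or (unique and current == previous):
            return index
    return None
```

A result of None means the records are in ascending key order (and, with unique=True, that no key repeats).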
Examples |
1. |
The pipeline in the following
example shows how to specify a COLLATE stage as shown in the preceding
diagram. The COLLATE stage reads records from its primary and secondary input
streams and writes the contents of its primary, secondary and tertiary output
streams to separate files.
   Address Rxpipe

   'pipe (endchar ?)',
   '< master.txt',             /* Read master records.                 */
   '| c: collate',             /* Find matches.                        */
   '| > matchingrecords.txt',  /* Write matching masters and details.  */
   '?',
   '< details.txt',            /* Read detail records.                 */
   '| c:',                     /* Define secondary stream for COLLATE. */
   '| > unrefmasters.txt',     /* Write masters without details.       */
   '?',
   'c:',                       /* Define tertiary stream for COLLATE.  */
   '| > unrefdetails.txt'      /* Write details without masters.       */

   Exit 0
|
||
2. |
In this example, COLLATE matches records from two files. The records from the
file account.txt are the master records for the COLLATE stage, and the records
from the file accounttype.txt are the detail records. Note that account.txt and
accounttype.txt are both in ascending order by their key field, columns 1-19.
The following pipeline reads the two input files, account.txt and
accounttype.txt, and collates them, matching account types against the master
account index to build the output file allaccounts.txt.

   Address Rxpipe

   'pipe (endchar ?)',
   '< account.txt',                    /* Read account.txt.                   */
   '| c: collate 1-19 master detail',  /* Match the records.                  */
   '| > allaccounts.txt',              /* Write matching master and detail..  */
   ,                                   /* ..records to allaccounts.txt.       */
   '?',                                /* Start of the second pipeline.       */
   '< accounttype.txt',                /* Read accounttype.txt.               */
   '| c:'                              /* Define secondary input for COLLATE. */

The resulting output file, allaccounts.txt, is shown below.
allaccounts.txt (output)

   Alfred, John        Account Number: 22222
   Alfred, John        Checking        £  350.00
   Alfred, John        Savings         £1,300.00
   Alfred, John        Money Market    £9,000.00
   Conners, Steve      Account Number: 98989
   Conners, Steve      Savings         £   50.00
   Smith, Andrew       Account Number: 54545
   Smith, Andrew       Savings         £1,999.00
   Smith, Andrew       Money Market    £9,999.00
   Smith, Justin       Account Number: 77777
   Smith, Justin       Checking        £     .50
|
||
3. |
This example uses the same input files as Example 2, account.txt and
accounttype.txt. Here, however, the COLLATE stage also connects its secondary
output stream in order to write the master (key) records that do not have an
account type record.

   'pipe (endchar ?)',
   '< account.txt',             /* read BANK INFO                       */
   '| c: collate 1-19 detail',  /* match records                        */
   '| > balance.txt',           /* write matching detail records        */
   '?',                         /* start of second pipeline             */
   '< accounttype.txt',         /* read ACCOUNT INFO                    */
   '| c:',                      /* define secondary streams for COLLATE */
   '| > noaccounttype.txt'      /* write unmatched master records       */

The resulting output files, balance.txt and noaccounttype.txt, are shown below.
|
||
4. |
In this example, the file stopwords.txt contains a list of words to suppress,
sorted in ascending order. This user-written stage removes all occurrences of
those words from the caller's input stream.

   'pipe (endchar ?)',
   'in',                /* Connection from CALLPIPE                        */
   '| split',           /* Split records at blanks and discard blanks      */
   '| sort unique',     /* Sort records and discard duplicates             */
   '| c: collate',      /* Match records                                   */
   '?',                 /* Start of second pipeline                        */
   '< stopwords.txt',   /* Read stopwords.txt.                             */
   '| c:',              /* Define secondary streams for COLLATE and..      */
   ,                    /* ..write unmatched master records to secondary.. */
   ,                    /* ..output of COLLATE                             */
   '| out'              /* Connection back to CALLPIPE.                    */
|
||
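The effect of the stop-word pipeline can be modelled outside the pipeline environment. The following Python sketch is only an approximation of what the stages above produce. The function name remove_stopwords is hypothetical, and it uses a set lookup instead of COLLATE's sorted merge, which gives the same result for this case.

```python
def remove_stopwords(lines, stopwords):
    """Model of the stop-word example: unique non-stop-words, ascending."""
    # 'split' then 'sort unique': one word per record, sorted, no duplicates.
    words = sorted({word for line in lines for word in line.split()})
    stop = set(stopwords)
    # The unmatched master records are the words absent from the stop list.
    return [word for word in words if word not in stop]
```

Note that, like the pipeline, this model returns each remaining word once and in sorted order rather than in its original position in the input.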
Related |
History |
Version | Date | Action | Description | Pipelines |
1.1 | 23.12.2021 | changed | Application-wide rewrite. | |
1.0 | 06.09.2007 | created | First version. | |