Split

SPLIT stage v1.2

Pipelines v2.1

Purpose, Operands, Streams, Usage, Examples, Related

Syntax

           ┌─AT───────────────────────────────────────────┐
>>──SPLIT──┼──────────────────────────────────────────────┼──────────────────────────><
           │ ┌─AT─────┐                                   │
           └─┼────────┼──┬─────────┬──┬─────────────────┬─┘
             ├─BEFORE─┤  └─ANYCase─┘  ├─charrange───────┤
             └─AFTER──┘               ├─STRing─┬─string─┘
                                      ├─REGexp─┤
                                      └─ANYof──┘

Purpose

Use the SPLIT stage to divide input records into multiple output records. SPLIT reads records from its primary input stream, splits them, and writes the resulting records to its primary output stream. If no operands are specified, SPLIT divides records at whitespace characters and discards them; both space (X'20') and tab (X'09') characters are considered to be whitespace. If you specify only the AT, BEFORE or AFTER operand, records are split at, before or after whitespace characters, respectively. With additional operands, records are split relative to occurrences of a specified target.
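The whitespace behaviour described above can be modelled outside the pipeline. The following Python sketch is an illustration only, not the stage's implementation; the function name split_on_whitespace is hypothetical, and it assumes that runs of whitespace never produce null output records (see the Usage notes).

```python
import re

def split_on_whitespace(record, mode="AT"):
    """Model SPLIT with no target: split at, before or after each space or tab."""
    if record == "":
        return [""]                      # null input records are copied unchanged
    patterns = {
        "AT": r"[ \t]",                  # the whitespace character is discarded
        "BEFORE": r"(?=[ \t])",          # the whitespace starts the next record
        "AFTER": r"(?<=[ \t])",          # the whitespace ends the current record
    }
    pieces = re.split(patterns[mode.upper()], record)
    # Assumption: null pieces (for example between two adjacent blanks) are dropped.
    return [piece for piece in pieces if piece != ""]

for mode in ("AT", "BEFORE", "AFTER"):
    print(mode, split_on_whitespace("one two\tthree", mode))
```

With AT the blanks and tabs disappear; with BEFORE and AFTER every input character is still present in exactly one output record.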

Operands

●

AT

causes input records to be split at the specified target. The target characters are discarded. AT is the default.

●

BEFORE

causes input records to be split before the specified target. The target characters are retained.

●

AFTER

causes input records to be split after the specified target. The target characters are retained.

If you specify AFTER with STRing, each record is split after the column that contains the last character of each occurrence of the string.

●

ANYCase

specifies that charrange or string is compared with the input record in uppercase. In effect, the comparison is not case-sensitive when selecting the characters that cause records to be split.

●

charrange

is a character range. A split occurs when any one of the characters in the range is matched.

●

STRing

specifies that the string operand is a literal string of characters to locate. A split occurs only when the entire string is matched.

●

REGexp

specifies that the string operand is a regular expression of characters to locate. A split occurs when the expression is matched.

●

ANYof

specifies that the string operand is a list of characters to locate. A split occurs when any one of the characters in the list is matched.

●

string

is the string to locate. It is interpreted according to the preceding STRing, REGexp or ANYof keyword.
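To make the interplay of the operands concrete, the sketch below models the STRing, ANYof and REGexp targets together with AT, BEFORE, AFTER and ANYCase. It is a hedged approximation in Python: the names target_pattern and split_record are hypothetical, Python's re dialect stands in for whatever regular-expression syntax the stage accepts, re.IGNORECASE stands in for the uppercase comparison, occurrences are assumed not to overlap, and charrange is not modelled because its syntax is not given here.

```python
import re

def target_pattern(kind, value, anycase=False):
    """Compile a regex that stands in for a SPLIT target (illustrative only)."""
    if not value:
        raise ValueError("the target string must not be empty")
    kind = kind.upper()
    if kind == "STRING":
        source = re.escape(value)                          # the entire string must match
    elif kind == "ANYOF":
        source = "|".join(re.escape(ch) for ch in value)   # any single listed character
    elif kind == "REGEXP":
        source = value                                     # assumption: Python regex syntax
    else:
        raise ValueError("unknown target kind: " + kind)
    return re.compile(source, re.IGNORECASE if anycase else 0)

def split_record(record, pattern, mode="AT"):
    """Split one record at, before or after every match of pattern."""
    if record == "":
        return [""]                                        # null records pass through
    mode = mode.upper()
    pieces, start = [], 0
    for match in pattern.finditer(record):
        if match.start() == match.end():
            continue                                       # ignore empty regex matches
        if mode == "AFTER":
            cut = next_start = match.end()                 # target stays in current piece
        elif mode == "BEFORE":
            cut = next_start = match.start()               # target begins the next piece
        else:
            cut, next_start = match.start(), match.end()   # AT: target is discarded
        pieces.append(record[start:cut])
        start = next_start
    pieces.append(record[start:])
    return [piece for piece in pieces if piece]            # assumption: no null output

print(split_record("key=value", target_pattern("STRING", "="), "BEFORE"))
print(split_record("a,b;c", target_pattern("ANYOF", ",;")))
```

The two printed lines show the difference between retaining the target (BEFORE keeps the equals sign at the start of the second record) and discarding it (AT removes the comma and semicolon).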

Streams

The following streams are used by the SPLIT stage:

Stream	Action

Primary input stream	SPLIT reads records from its primary input stream.
Primary output stream	After splitting the input records into multiple records, SPLIT writes the resulting records to its primary output stream.

Usage

1.	SPLIT does not delay the records.
2.	If the SPLIT stage discovers that its primary input stream or primary output stream is not connected, the SPLIT stage ends.
3.	SPLIT copies null input records to its primary output stream. It does not generate null output records.
4.	SPLIT verifies that its secondary input and output streams are not connected and then begins execution.

Examples

The following four examples demonstrate the BEFORE and AFTER operands of the SPLIT stage, using the input file input.txt shown below.

input.txt (input)

...|...+....1....+....2....+....3....

   **** Top of file ****

 1 1234512345

 2 5432154321

   **** End of file ****

1.	To split input records before the column that is 1 column to the left of each occurrence of the character 4, use the following:

   'pipe < input.txt | split 1 before string /4/ | console'

   Output:

   12
   34512
   345
   54321
   54321

2.	To split input records after the column that is 2 columns to the right of each occurrence of the last character in string 12, use the following:

   'pipe < input.txt | split 2 after string /12/ | console'

   Output:

   1234
   51234
   5
   5432154321

3.	To split input records between the characters 5 and 4 in occurrences of the string 54, use the following:

   'pipe < input.txt | split -1 before string /54/ | console'

   Output:

   1234512345
   5
   43215
   4321

4.	To split input records between the characters 2 and 3 in occurrences of the string 123, use the following:

   'pipe < input.txt | split -1 after string /123/ | console'

   Output:

   12
   34512
   345
   5432154321
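The column arithmetic in the four examples above can be checked with a small model. This Python sketch is inferred from those examples only and is not the stage's implementation: the name split_offset is hypothetical, the target is treated as a literal string, and a split point that falls outside the record is assumed to be ignored. With BEFORE, a positive offset moves the split point to the left of the target; with AFTER, it moves the split point to the right.

```python
def split_offset(record, target, offset, mode):
    """Split record relative to each occurrence of the literal string target."""
    if not target:
        raise ValueError("the target string must not be empty")
    cuts = set()
    pos = record.find(target)
    while pos != -1:
        if mode.upper() == "BEFORE":
            cut = pos - offset                   # offset columns left of the target start
        else:                                    # AFTER
            cut = pos + len(target) + offset     # offset columns right of the target end
        if 0 < cut < len(record):                # assumption: out-of-range cuts are ignored
            cuts.add(cut)
        pos = record.find(target, pos + 1)
    bounds = [0] + sorted(cuts) + [len(record)]
    return [record[a:b] for a, b in zip(bounds, bounds[1:])]

records = ["1234512345", "5432154321"]           # the two records of input.txt
for args in (("4", 1, "BEFORE"), ("12", 2, "AFTER"),
             ("54", -1, "BEFORE"), ("123", -1, "AFTER")):
    print(args, [piece for rec in records for piece in split_offset(rec, *args)])
```

Each printed list reproduces the output records of the corresponding example, in order.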

Other miscellaneous examples:

The following example utilises the SPLIT stage command in a pipeline which determines the number of bytes that could be saved by removing trailing whitespace.

   **** Top of file ****

   Address Rxpipe

   'pipe < myfile.txt ',
      '| locate',                         /* Discard blank-lines. */
      '| xlate w-1;* x20 @ x09 @',        /* Change spaces/tabs to at(@) chars. */
      '| split before str /@/',           /* Split at each at(@), start new record. */
      '| strip trailing anyof /@/',       /* Reduce records to length zero. */
      '| nlocate 1',                      /* Select only null/empty records. */
      '| count',                          /* Count the records. */
      '| specs /The number of bytes which could be saved is:/ 1 1-* nw',
      '| cons'                            /* Display the result. */

   Exit 0

   **** End of file ****
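As a cross-check on what this pipeline counts, the same figure can be computed directly. The Python sketch below is an independent model, not a translation of the individual stages; the name trailing_whitespace_bytes is hypothetical, and the handling of lines that consist only of whitespace is an assumption, since it depends on how the xlate word range behaves for such records.

```python
def trailing_whitespace_bytes(lines):
    """Count the spaces and tabs that follow the last word of each line."""
    total = 0
    for line in lines:
        if line == "":
            continue                             # null lines contribute nothing
        stripped = line.rstrip(" \t")            # drop trailing X'20' and X'09'
        total += len(line) - len(stripped)       # one byte per removed character
    return total

sample = ["alpha  ", "", "beta\t", "gamma"]      # hypothetical file contents
print("The number of bytes which could be saved is:",
      trailing_whitespace_bytes(sample))
```

In the pipeline, each trailing whitespace character becomes its own record after SPLIT, so counting the null records that remain gives the same total as the length difference computed here.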

Related

CHOP, JOIN, PAD, STRIP

History

Version	Date	Action	Description	Pipelines
1.2	27.12.2021	changed	Application-wide rewrite.	2.1
1.1	22.03.2008	added	Support for the REGEXP operand, which specifies that the string operand is interpreted as a regular expression.	1.4
1.0	06.09.2007	created	First version.	1.0