LOCATE stage v1.3 |
Pipelines v2.1 |
Syntax |
                         ┌─1-*────────────────┐
>>──LOCate──┬─────────┬──┼────────────────────┼──┤ Set ├──┤ String ├─────────────────><
            └─ANYCase─┘  ├─inputrange─────────┤
                         │   ┌─<──────────┐   │
                         └─(─┴─inputrange─┴─)─┘
Set:
├──┬────────────────────────┬─────────────────────────────────────────────────────────┤
   │             ┌─BOTH───┐ │
   └─TAKE─count──┼────────┼─┘
                 ├─BEFORE─┤
                 └─AFTER──┘
String:
├──┬────────────────────┬─────────────────────────────────────────────────────────────┤
   │ ┌─STRing──┐        │
   └─┼─────────┼─string─┘
     ├─REGex───┤
     ├─PATtern─┤
     └─ANYof───┘
Notes:
(1) If you specify ANYCASE, multiple inputrange operands, or any of the operands that
qualify string (STRING, REGEX, PATTERN or ANYOF), you must provide a string to locate.
(2) You cannot specify a Set of records without specifying other operands.
Purpose |
Use the LOCATE stage to select records that contain a specified target string of characters. LOCATE writes the primary input stream records that contain the specified string to its primary output stream. If its secondary output stream is connected, LOCATE writes the unselected input records to its secondary output stream; otherwise they are discarded.
LOCATE searches for the string within one or more locations of the input record. If you do not specify an inputrange, LOCATE searches the entire input record. When a single inputrange location is specified and you either omit the string operand or specify a null string, LOCATE writes to its primary output stream only those input records of length inputrange or greater.
LOCATE can also select a Set of records that precede and/or follow the target record, writing each Set of records to the primary output stream. Records that are not selected are written to the secondary output stream if it is connected; otherwise they are discarded. Each record selected as part of a Set is prefixed with a record offset number: a ten-digit, right-aligned number that gives the position of the record relative to the record containing the target string. Offsets of records that precede the target are negative, offsets of records that follow it are positive, and the target record itself has offset 0.
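The selection rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the product's implementation: the function name `locate` and its `column` parameter (a stand-in for a single-column inputrange) are assumptions made for this example, and the two returned lists stand for the primary and secondary output streams.

```python
def locate(records, target="", column=None, anycase=False):
    """Model of LOCATE: return (primary, secondary) output lists.

    `column` is a hypothetical stand-in for a single-column inputrange;
    None means that the entire record is searched.
    """
    primary, secondary = [], []
    needle = target.lower() if anycase else target
    for record in records:
        if target == "":
            # Null string: select on length only (at least `column`
            # characters, or at least 1 when no inputrange is given).
            hit = len(record) >= (column or 1)
        else:
            text = record if column is None else record[column - 1:column]
            if anycase:
                text = text.lower()
            hit = needle in text
        (primary if hit else secondary).append(record)
    return primary, secondary


# Records that contain the target go to the primary list, the rest to the
# secondary list.
selected, unselected = locate(["apple pie", "", "Banana"], "AN", anycase=True)
print(selected, unselected)
```

Under the same assumptions, chaining two length tests reproduces the exact-length technique from the usage notes: take the primary output of `locate(records, column=20)`, then the secondary output of `locate(..., column=21)`, which here plays the role of NOTLOCATE 21.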
Operands |
● |
ANYCase specifies that when LOCATE compares the contents of an input record specified by inputrange with the target string, the comparison is not case-sensitive. |
||
● |
inputrange is an integer column, word or field range on which to operate. |
||
● |
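TAKE count specifies that LOCATE selects a Set of records around each record that contains the target string, where count is the number of records to take on each selected side of the target record. |
BOTH selects the records that precede and the records that follow the target record. This is the default. |
BEFORE selects only the records that precede the target record. |
AFTER selects only the records that follow the target record. |
||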
● |
STRing specifies that the string
operand is a literal string of characters to locate. |
||
● |
REGex specifies that the string operand is a regular expression of characters to locate. |
||
● |
PATtern specifies that the string
operand is a pattern of characters to locate. |
||
● |
ANYof specifies that
the string operand is a list of
characters, any of which are to be located. |
||
|
|
Streams used |
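The following streams are used by the LOCATE stage:
Primary input: LOCATE reads records from its primary input stream.
Primary output: LOCATE writes the selected records to its primary output stream.
Secondary output: if it is connected, LOCATE writes the unselected records to its secondary output stream.
Secondary input: must not be connected.
|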
Usage notes |
1. |
LOCATE without the count operand does not delay the records. LOCATE used with the count operand and either BOTH or BEFORE delays count records. |
2. |
If the
LOCATE stage discovers that its primary input stream is not connected, the
LOCATE stage ends. |
3. |
If no operands are specified before string, and string consists only of decimal digits (0-9), you cannot use a left parenthesis or a digit as the delimiting character. For example:
LOCATE /5/
is not equivalent to
LOCATE (5(
The first LOCATE stage selects records that contain the string 5. The second stage results in an error message, because (5( is processed as a number range rather than as a delimited string. |
4. |
If no inputrange is specified, you cannot use an asterisk (*) as the delimiting character for string when string consists of only a hyphen (-). |
5. |
If you specify the TAKE count operands, LOCATE always finishes selecting a Set of records before it begins searching for a new occurrence of the target string. For example:
Address Rxpipe

'pipe lit /A B C D D D E F G/',
'| split',
'| locate anycase take 2 both str /d/',
'| cons'

Exit 0
output:
-2 B
-1 C
0 D
1 D
2 D
In the example above, the second and third occurrences of D are treated as the first and second records in the Set of records that follow the target record, and not as the start of another Set. |
6. |
You can use the LOCATE stage followed by a NOTLOCATE stage to select records of a particular length. For example, the following pipeline displays only those records of the file myfile.txt that are exactly 20 characters long:
'pipe < myfile.txt | locate 20 | notlocate 21 | console' |
7. |
If you want to remove blank lines from a file, use the STRIP stage to remove leading and trailing spaces, and then use LOCATE with no operands to select records of length 1 or greater. For example:
'pipe < myfile.txt | strip | locate | > myfile.txt' |
8. |
You may want to locate records, or sections of a record, that comprise only the specified characters. The following example demonstrates how you might use the regular expression capabilities of the LOCATE stage to verify that a record contains only numeric digits.
Address Rxpipe

'pipe lit /12345 12T45/',
'| split',
'| locate reg /^[0-9]+$/',
'| cons'

Exit 0
output:
12345
This approach to selecting records based on a restricted set or sets of characters may be of particular use in, for example, selecting records that comprise only numeric data, alphanumeric characters or a specific mixture of characters, or that contain a valid email address or a properly formatted telephone number. |
9. |
LOCATE verifies that its secondary
input stream is not connected and then begins execution. |
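The Set behaviour described in usage note 5 can be modelled as follows. This is a sketch under stated assumptions rather than the product's algorithm: the function name `locate_take`, its parameters, and the rule that a record already written as part of one Set is never reused as a preceding record of the next Set are all assumptions made for this illustration. Offsets are returned as plain integers; formatting them as ten-digit, right-aligned prefixes is left out.

```python
from collections import deque


def locate_take(records, target, count, mode="both", anycase=False):
    """Return (offset, record) pairs modelling LOCATE TAKE count.

    Assumption: a preceding record is only taken if it was not already
    written as part of an earlier Set.
    """
    fold = str.lower if anycase else (lambda s: s)
    needle = fold(target)
    before = deque(maxlen=count)  # most recent unselected records
    out = []
    following = 0                 # records still owed to the current Set
    offset = 0
    for record in records:
        if following > 0:
            # The current Set is completed before a new search starts.
            offset += 1
            out.append((offset, record))
            following -= 1
            continue
        if needle in fold(record):
            if mode in ("both", "before"):
                n = len(before)
                out.extend((i - n, r) for i, r in enumerate(before))
            out.append((0, record))
            before.clear()
            if mode in ("both", "after"):
                following, offset = count, 0
        else:
            before.append(record)
    return out


for off, rec in locate_take("A B C D D D E F G".split(), "d", 2, anycase=True):
    print(off, rec)
```

With the input from usage note 5, this model yields the offsets -2, -1, 0, 1 and 2 for the records B, C, D, D and D, which agrees with the output shown in that note.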
Examples |
1. |
'pipe literal /a-b-c/ | literal /d-e-f/ | locate wordsep /-/ w3 /c/ | console'
output:
a-b-c |
2. |
'pipe literal /a?b?/ | literal /e??f/ | locate fieldsep /?/ f2-3 /f/ | console'
output:
e??f |
3. |
'pipe literal /?ab?c??a ab?c?a/ | split | locate (ws /?/ w1 w3) /a/ | console'
output:
?ab?c??a
ab?c?a |
4. |
'pipe literal /afbc adef ghfi fjkl/ | split | locate -2;-1 /f/ | console'
output:
adef
ghfi |
5. |
In this example, the LOCATE stage selects only those records for students who received a grade of 100 on exam 2 or exam 3.
studentgrades.txt (input)
|...+....1....+....2....+....3....+....4....+....5....+....
STUDENT         EXAM 1    EXAM 2    EXAM 3    EXAM 4

CAROLYN             78        87       100        95
EVELYN              85        84        82        89
JACK               100        89        89       100
KEN                 88        79        79        93
KAREN               90       100       100        95
MICHAEL             87       100        99       100
The following pipeline inspects the entries for exam 2 and exam 3 only.
'pipe < studentgrades.txt | locate (30-32 40-42) /100/ | console'
The resulting console output is shown below.
output:
CAROLYN             78        87       100        95
KAREN               90       100       100        95
MICHAEL             87       100        99       100
The same result can also be achieved by utilising the WORD (WORDSEPARATOR) capabilities of the inputrange specification:
'pipe < studentgrades.txt | locate (w3 w4) /100/ | console' |
6. |
The following pipeline finds all records in the file fruit.txt that contain the character string apple, in any case, anywhere within the last 10 characters of the record.
fruit.txt (input)
|...+....1....+....2....+....3....+....4....+....5....
Blueberry Strawberry Raspberry
Pear Apple
Peach Nectarine Plum
Orange Tangerine
Watermelon Cantaloupe Honeydew
Pineapple
'pipe < fruit.txt | locate anycase -10;-1 /apple/ | console'
output:
Pear Apple
Pineapple |
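Examples 4 and 6 rely on column ranges that are counted from the end of the record. The following Python sketch models that idea. The helper names `column_range` and `locate_range` are hypothetical, and clipping a range that reaches past the start of a short record is an assumption, inferred from example 6, where the 9-character record Pineapple is still selected by the range -10;-1.

```python
def column_range(record, first, last):
    """Slice by 1-based inclusive columns; negative values count from the end.

    Assumption: a range that extends past the start of a short record is
    clipped to the record instead of being rejected.
    """
    n = len(record)
    lo = first if first > 0 else n + first + 1
    hi = last if last > 0 else n + last + 1
    lo = max(lo, 1)
    hi = max(hi, 0)
    return record[lo - 1:hi]


def locate_range(records, first, last, target, anycase=False):
    """Keep the records whose column range contains the target string."""
    fold = str.lower if anycase else (lambda s: s)
    return [r for r in records
            if fold(target) in fold(column_range(r, first, last))]


fruit = [
    "Blueberry Strawberry Raspberry",
    "Pear Apple",
    "Peach Nectarine Plum",
    "Orange Tangerine",
    "Watermelon Cantaloupe Honeydew",
    "Pineapple",
]
print(locate_range(fruit, -10, -1, "apple", anycase=True))
print(locate_range(["afbc", "adef", "ghfi", "fjkl"], -2, -1, "f"))
```

The two calls mirror examples 6 and 4 respectively; word and field ranges are not modelled here.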
Related |
Version | Date | Action | Description | Pipelines |
1.3 | ??.??.2025 | changed | Application-wide rewrite. | |
 | 04.02.2012 | added | Support for the numeric comparison operators: <, <=, ==, !=, >= and >. | |
1.1 | 09.05.2010 | added | Support for the REGEXP operand, which specifies that the string operand is interpreted as a regular expression. | |
1.0 | 06.09.2007 | created | First version. | |