LOCATE stage v1.3

Pipelines v2.1

 

 

Syntax

 

                         ┌─1-*────────────────┐
>>──LOCate──┬─────────┬──┼────────────────────┼──┤ Set ├──┤ String ├──><
            └─ANYCase─┘  ├─inputrange─────────┤
                         │    ┌─<──────────┐  │
                         └─(──▼─inputrange─┴─)┘
 

Set:

 

├──┬────────────────────────────┬──┤
   │              ┌─BOTH───┐    │
   └─TAKE──count──┼────────┼────┘
                  ├─BEFORE─┤
                  └─AFTER──┘

 

String:

 

├──┬────────────────────┬──┤
   │ ┌─STRing──┐        │
   └─┼─────────┼─string─┘
     ├─REGex───┤
     ├─PATtern─┤
     └─ANYof───┘

 

Notes:
(1) If you specify ANYCASE, multiple inputrange operands, or any of the String operands
    (STRING, REGEX, PATTERN or ANYOF), you must provide a string to locate.
(2) You cannot specify a Set of records without specifying other operands.

 

Purpose

 

Use the LOCATE stage to select records that contain a specified target string of characters. LOCATE writes the primary input stream records that contain the specified string to its primary output stream. If its secondary output stream is connected, LOCATE writes the unselected input records to its secondary output stream; otherwise they are discarded.

 

LOCATE searches for the string within one or more locations of the input record. If you do not specify an inputrange, LOCATE searches the entire input record. When a single inputrange is specified and you either omit the string operand or specify a null string, LOCATE writes to its primary output stream only those input records that are long enough to contain the inputrange. For example, LOCATE 20 selects records that are 20 or more characters long.

 

LOCATE can also select a Set of records that precede and/or follow the target record, writing each Set to the primary output stream. Records that are not selected are written to the secondary output stream if it is connected; otherwise they are discarded. Each record selected as part of a Set is prefixed with a record offset number: a ten-digit, right-aligned number that gives the position of the record relative to the record containing the target string. Records that precede the target have negative offsets, the target record itself has offset 0, and records that follow it have positive offsets.

 

Operands

 

ANYCase

specifies that the comparison between the target string and the part of the input record specified by inputrange is not case-sensitive.

 

inputrange

is an integer column, word or field range on which to operate.

 

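TAKE

specifies that LOCATE also selects a Set of records that precede and/or follow each target record. TAKE must be followed by count.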

 

count

is an unsigned integer which specifies the number of records to select before and/or after a target record.

 

BOTH

specifies that count number of records which precede and follow a record that contains the target string are also selected. This is the default.

 

BEFORE

specifies that count number of records which precede a record that contains the target string are also selected.

 

AFTER

specifies that count number of records which follow a record that contains the target string are also selected.

 

STRing

specifies that the string operand is a literal string of characters to locate.

 

REGex

specifies that the string operand is a regular expression of characters to locate.

 

PATtern

specifies that the string operand is a pattern of characters to locate.

 

ANYof

specifies that the string operand is a list of characters, any of which are to be located.

 

 

string

is a string to locate.

 

Streams used

 

The following streams are used by the LOCATE stage:

 

Stream                   Action

Primary input stream     LOCATE reads records from its primary input stream.
Primary output stream    LOCATE writes the records which are selected to its primary output stream.
Secondary output stream  If it is connected, LOCATE writes the records which are not selected to the secondary output stream.

 

Usage notes

 

1.

LOCATE without the TAKE count operands does not delay the records. When LOCATE is used with TAKE count and either BOTH or BEFORE, it delays count records.

 

2.

If the LOCATE stage discovers that its primary input stream is not connected, the LOCATE stage ends.

 

3.

If no operands are specified before string, and string consists only of decimal digits (0-9), you cannot use a left parenthesis or a digit as the delimiting character. For example:

 

LOCATE /5/
 

is not equivalent to

 

LOCATE (5(
 

The first LOCATE stage selects records that contain the string 5. The second stage results in an error message, because (5( is processed as a number range rather than a delimited string.

 

4.

If no inputrange is specified, you cannot use an asterisk (*) as the delimiting character for string when string consists only of a hyphen (-).

 

5.

If you specify the TAKE count operands, LOCATE always finishes selecting a Set of records before it begins searching for a new occurrence of the target string. For example:

 
Address Rxpipe

'pipe lit /A B C D D D E F G/',
   '| split',
   '| locate anycase take 2 both str /d/',
   '| cons'

Exit 0
 
output:
        -2 B
        -1 C
         0 D
         1 D
         2 D
 

In the example above, the second and third occurrences of D are treated as the first and second records in the Set of records that follow the target record, not as the start of another Set.
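
The selection rule described in this note can be modelled outside the pipeline. The following Python sketch is an illustration only and is not the LOCATE implementation. The function name locate_take, the single space between the ten-character offset and the record, and the way records are held back until a target is found are all assumptions inferred from the sample output above. It models TAKE count BOTH only.

```python
from collections import deque

def locate_take(records, target, count, anycase=True):
    """Hypothetical model of LOCATE TAKE count BOTH.

    Returns (selected, unselected), standing in for the primary and
    secondary output streams.
    """
    fold = str.lower if anycase else (lambda s: s)
    needle = fold(target)
    selected, unselected = [], []
    pending = deque()      # held back as possible "before" records
    remaining_after = 0    # records still owed to the current Set
    offset = 0
    for rec in records:
        if remaining_after > 0:
            # Still completing a Set: take the record without searching it.
            offset += 1
            selected.append(f"{offset:>10} {rec}")
            remaining_after -= 1
        elif needle in fold(rec):
            # New target: release the held-back records with negative offsets.
            for i, before in enumerate(pending):
                selected.append(f"{i - len(pending):>10} {before}")
            pending.clear()
            selected.append(f"{0:>10} {rec}")
            offset, remaining_after = 0, count
        else:
            pending.append(rec)
            if len(pending) > count:
                unselected.append(pending.popleft())
    unselected.extend(pending)  # held-back records that no Set needed
    return selected, unselected

selected, unselected = locate_take("A B C D D D E F G".split(), "d", 2)
print("\n".join(selected))
```

With the input from the example, the sketch yields the same five prefixed records as the pipeline output shown above, and A, E, F and G are left unselected.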

 

6.

You can use the LOCATE stage followed by a NOTLOCATE stage to select records of a particular length. For example, the following pipeline displays only those records of the file myfile.txt that are exactly 20 characters long:

 

'pipe < myfile.txt | locate 20 | notlocate 21 | console'
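
The reasoning behind this pipeline is that LOCATE 20 keeps records that are at least 20 characters long, and NOTLOCATE 21 then discards those that are 21 characters or longer. A minimal Python sketch of the same two-step filter follows. The function name exactly_n is hypothetical, and the sketch assumes that NOTLOCATE selects exactly the records that LOCATE would reject.

```python
def exactly_n(records, n):
    """Keep records of length n, mirroring 'locate n | notlocate n+1'."""
    # locate n: keep records that extend at least to column n
    at_least_n = [r for r in records if len(r) >= n]
    # notlocate n+1: drop records that extend to column n+1 or beyond
    return [r for r in at_least_n if not len(r) >= n + 1]

lines = ["x" * 19, "y" * 20, "z" * 21]
print(exactly_n(lines, 20))  # only the 20-character record remains
```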
 

7.

If you want to remove blank lines from a file, use the STRIP stage to remove leading and trailing spaces, and then use LOCATE with no operands to select records of length 1 or greater. For example:

 

'pipe < myfile.txt | strip | locate | > myfile.txt'
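
As an illustration of what this pipeline does to the data, the following Python sketch applies the same two steps to a list of records. The function name drop_blank_lines is hypothetical, and the sketch assumes that STRIP removes only space characters.

```python
def drop_blank_lines(records):
    """Mirror 'strip | locate': strip spaces, then keep non-empty records."""
    # STRIP: remove leading and trailing spaces
    stripped = [r.strip(" ") for r in records]
    # LOCATE with no operands: keep records of length 1 or greater
    return [r for r in stripped if len(r) >= 1]

print(drop_blank_lines(["  a  ", "", "   ", "b"]))
```

Note that, under these assumptions, the records that survive are the stripped ones, so leading and trailing spaces are removed from every line that is written back.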
 

8.

You may want to identify records, or sections of a record, that consist only of specified characters. The following example shows how you might use the regular expression capabilities of the LOCATE stage to verify that a record contains only numeric digits.

 

Address Rxpipe

'pipe lit /12345 12T45/',
     '| split',
     '| locate reg /^[0-9]+$/',
     '| cons'

Exit 0
 
output:
12345

 

This approach to selecting records based on a restricted set of characters may be particularly useful for selecting, for example, records that contain only numeric data, only alphanumeric characters or only a specific mixture of characters, or records that hold a valid email address or a properly formatted telephone number.
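
The same whole-record check can be tried with any regular expression engine. The Python sketch below uses the standard re module and the pattern from the example. The function name only_digits is hypothetical, and the regular expression dialect accepted by LOCATE may differ from Python's, so treat this as an approximation.

```python
import re

# Anchored at both ends, so every character of the record must be a digit.
DIGITS_ONLY = re.compile(r"^[0-9]+$")

def only_digits(records):
    """Keep records that consist solely of the digits 0-9."""
    return [r for r in records if DIGITS_ONLY.search(r)]

print(only_digits("12345 12T45".split()))
```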

 

9.

LOCATE verifies that its secondary input stream is not connected and then begins execution.

 

Examples

 

1.

'pipe literal /a-b-c/ | literal /d-e-f/ | locate wordsep /-/ w3 /c/ | console'
 
output:

a-b-c

 

2.

'pipe literal /a?b?/ | literal /e??f/ | locate fieldsep /?/ f2-3 /f/ | console'
 
output:

e??f

 

3.

'pipe literal /?ab?c??a ab?c?a/ | split | locate (ws /?/ w1 w3) /a/ | console'
 
output:
?ab?c??a
ab?c?a

 

4.

'pipe literal /afbc adef ghfi fjkl/ | split | locate -2;-1 /f/ | console'
 
output:
adef
ghfi

 

5.

In this example, the LOCATE stage selects only those records for students who received a grade of 100 on exam 2 or exam 3.

 

studentgrades.txt (input)

 

...|...+....1....+....2....+....3....+....4....+....5....+....
   **** Top of file ****
 1 STUDENT           EXAM 1    EXAM 2    EXAM 3    EXAM 4
 2
 3 CAROLYN             78        87       100        95
 4 EVELYN              85        84        82        89
 5 JACK               100        89        89       100
 6 KEN                 88        79        79        93
 7 KAREN               90       100       100        95
 8 MICHAEL             87       100        99       100
   **** End of file ****

 

The following pipeline inspects the entries for exam 2 and exam 3 only.

 

'pipe < studentgrades.txt | locate (30-32 40-42) /100/ | console'

 
The resulting console output is shown below.
 
output:

CAROLYN             78        87       100        95
KAREN               90       100       100        95
MICHAEL             87       100        99       100

 

The same result can also be achieved by utilising the WORD (WORDSEPARATOR) capabilities of the inputrange specification:

 

'pipe < studentgrades.txt | locate (w3 w4) /100/ | console'
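
To see why the column ranges and the word numbers select the same students, the following Python sketch applies both tests to the student records. It is an illustration, not the LOCATE implementation. The helper names make_record, locate_columns and locate_words are hypothetical, and make_record rebuilds the records with a layout that is assumed to match the sample file (scores ending in columns 22, 32, 42 and 52).

```python
def make_record(name, e1, e2, e3, e4):
    # Assumed layout: name padded to 18 columns, scores right-aligned
    # so that they end in columns 22, 32, 42 and 52.
    return f"{name:<18}{e1:>4}{e2:>10}{e3:>10}{e4:>10}"

GRADES = [
    make_record("CAROLYN", 78, 87, 100, 95),
    make_record("EVELYN", 85, 84, 82, 89),
    make_record("JACK", 100, 89, 89, 100),
    make_record("KEN", 88, 79, 79, 93),
    make_record("KAREN", 90, 100, 100, 95),
    make_record("MICHAEL", 87, 100, 99, 100),
]

def locate_columns(records, ranges, target):
    """ranges holds 1-based inclusive column pairs, as in (30-32 40-42)."""
    return [r for r in records
            if any(target in r[first - 1:last] for first, last in ranges)]

def locate_words(records, words, target):
    """words holds 1-based blank-delimited word numbers, as in (w3 w4)."""
    selected = []
    for r in records:
        parts = r.split()
        if any(w <= len(parts) and target in parts[w - 1] for w in words):
            selected.append(r)
    return selected

by_column = locate_columns(GRADES, [(30, 32), (40, 42)], "100")
by_word = locate_words(GRADES, [3, 4], "100")
print([r.split()[0] for r in by_column])
print(by_column == by_word)
```

JACK is not selected by either test, because his scores of 100 are for exam 1 and exam 4, which lie outside the inspected columns and words.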

 

6.

This example finds all records in the file fruit.txt that contain the character string apple, in any case, anywhere within the last 10 characters of the record.

 

fruit.txt (input)

 

...|...+....1....+....2....+....3....+....4....+....5....
   **** Top of file ****
 1 Blueberry Strawberry Raspberry
 2 Pear Apple
 3 Peach Nectarine Plum
 4 Orange Tangerine
 5 Watermelon Cantaloupe Honeydew
 6 Pineapple
   **** End of file ****

 

'pipe < fruit.txt | locate (anycase) -10;-1 /apple/ | console'

 

output:

Pear Apple
Pineapple
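
The effect of the negative range -10;-1 combined with a case-insensitive comparison can be sketched in Python as follows. The function name locate_tail is hypothetical. The sketch assumes that a record shorter than the range is searched in full, which is consistent with Pineapple (9 characters) being selected in the output above.

```python
FRUIT = [
    "Blueberry Strawberry Raspberry",
    "Pear Apple",
    "Peach Nectarine Plum",
    "Orange Tangerine",
    "Watermelon Cantaloupe Honeydew",
    "Pineapple",
]

def locate_tail(records, width, target):
    """Keep records whose last `width` characters contain target, ignoring case.

    width must be a positive integer.
    """
    needle = target.lower()
    return [r for r in records if needle in r[-width:].lower()]

print(locate_tail(FRUIT, 10, "apple"))
```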

 

Related

 

NOTLOCATE

 

History

 

Version  Date        Action   Pipelines  Description

1.3      ??.??.2025  changed  2.1        Application-wide rewrite.
1.2      04.02.2012  added    2.0        Support for the numeric comparison operators: <, <=, ==, !=, >= and >.
1.1      09.05.2010  added    1.7        Support for the REGEXP operand, which specifies that the string operand is interpreted as a regular expression.
1.0      06.09.2007  created  1.0        First version.