Pipelines by TenFiftyTwo

v2.0

 
Usage

Introduction
Stages
Pre-process commands
Subcommands
The pipe command
    Substitution placeholders
    Specifying pipe options
Multiple pipelines
Multi-stream pipelines
    Connecting streams
        Connecting to a secondary output stream
        Connecting to a secondary input stream
        Connecting to both the secondary input and output streams
        Connecting several secondary streams
    How to predict relative record order
    Pipeline stalls
        Stall detection monitor
Tracing pipelines

Reference

Syntax diagram structure
Terms
ASCII character set
Design Notes
History of change
Examples
Stage Command API
    StageManager
    Stage Command
Messages
Regular expression

Introduction

 

Pipelines is a utility that allows you to modify the contents of a text/data file or files, quickly and easily. You can specify that only certain sections of a file are to be changed; you can confine those changes to a column, word or field range, translate words and phrases, discard or insert new lines of text. You can perform a whole range of operations on a file or files, using only a simple set of commands.

 

Pipelines builds on the concept of directing the output of one process to the input of another, commonly known as pipelining. However, Pipelines takes an extra step, allowing you to build multi-stream pipelines, where the topology is no longer horizontal and linear but two-dimensional, and where the records travel up and down the pipeline chain through intersecting joints which control the flow of data. With standard linear pipelines the data flows through each filter or stage, passing into the next and so on until it reaches a sink. Multi-stream pipelines, on the other hand, allow you to select and operate on specific sets of records, routing unselected records through a joint into and out of other sections of the pipeline. This allows you to join multiple pipelines together in configurations that address a whole range of transformation problems.

 

Pipelines comprises a set of input, output, selection and transformation stages which cover a broad range of manipulation functions: splitting records, stripping characters, joining records, collating, sorting and more. On the whole, similar operations are performed by a single stage, which means that you do not have to remember an unnecessarily lengthy list of stage names. For example, to strip characters from a record, Pipelines provides a single stage called STRIP, which removes characters from the beginning and/or the end of a record.
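
For instance, the following minimal sketch (the file names are assumptions, used purely for illustration) removes trailing blanks from every record of a file:

pipe < notes.txt | strip trailing | > notes.stripped.txt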

 

The Pipelines syntax is very simple; it does not employ lists of terse /switches, but rather an English-like syntax which is straightforward to read, for example:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe literal /hello there/',
04      '| translate uppercase',
05      '| duplicate',
06      '| >> myfile.txt'
07
08 Exit 0
**** End of file ****
 

which reads as: the literal hello there is translated to uppercase, duplicated, and the result appended to the file myfile.txt.

 

With Pipelines, the pipeline can be specified on the system command-line, in a batch file or PowerShell script, and in a Pipelines file with extension .REX. You design the pipeline in your favourite editor and save it; to execute the pipeline you simply double-click the file icon and Pipelines will launch it. You can specify pipelines which accept arguments that substitute stage operands, and coupled with the capability to connect pipelines together, this allows you to build a range of utility pipelines that can be called upon whenever you need them.

 

You may find Pipelines of use in many cases where you might otherwise have to write a program to solve the problem and it may well save you some time and effort that could be better spent on other tasks.

 

Stages

 

Stage              Description

ARRAY              Gets or sets an ooRexx variable with the specified array.
< (FILEIN)         Reads records from a disk file.
> (FILEOUT)        Creates or replaces a disk file.
>> (FILEAPPend)    Creates or appends records to a disk file.
BETWEEN            Selects records between two specified target strings, including the records
                   containing the target.
BUFFER             Accumulates all records in a single stage, not passing any on until all have
                   been received.
CALLpipe           Connects to another pipeline.
CHANGE             Replaces a string of characters with another string of characters.
CHOP               Selectively truncates records.
COLLATE            Matches records from two input streams and writes matched and unmatched
                   records to different output streams.
CONSole            Reads from or writes to the console.
COUNT              Counts bytes, whitespace-delimited character strings, or records.
DEAL               Writes a primary input stream record to one of its connected output streams,
                   either in sequential order starting with the primary output stream, or in
                   some other order specified on the secondary input stream.
DELAY              Waits until a particular time of day or until a specified interval of time
                   has passed to copy a primary input stream record to its primary output stream.
DROP               Discards one or more records.
DUPlicate          Writes each input record in addition to the specified number of copies of
                   each input record.
ELASTIC            Puts a sufficient number of input records into a buffer to prevent a pipeline
                   stall.
FANIN              Combines multiple input streams into a single stream in a specified order.
FANINANY           Combines multiple input streams into a single stream.
FANOUT             Copies primary input stream records to multiple output streams.
FILELIST           Searches for a disk file or files.
FROMLABel          Selects records that follow a specified target, including the target record.
HOLE               Discards records.
IN                 Reads records from STDIN or the CALLPIPE stage of the calling pipeline.
INSIDE             Selects records between two specified targets, not including the records
                   containing the target.
JOIN               Concatenates groups of records.
LITeral            Writes the specified data to the primary output stream and then copies
                   primary input stream records to the primary output stream.
LOCate             Selects records that contain a specified string of characters.
LOOKUP             Finds records in a reference.
NLOCate            Does not select records that contain a specified string of characters.
OUT                Writes records to STDOUT or the CALLPIPE stage of the calling pipeline.
OUTSIDE            Selects records not located between two specified targets. The records
                   containing the targets are not selected.
OVERlay            Reads a record from each input stream and merges the records read into a
                   single record.
PAD                Extends records with one or more occurrences of a specified character.
PROGRESS           Produces an output record after a specified number of input records have
                   been read.
RUNpipe            Runs another pipeline.
SHELLexecute       Executes a DOS/system shell or PowerShell command or the specified
                   executable process.
SNAKE              Builds a multicolumn page layout.
SORT               Arranges records in ascending or descending order.
SPECs              Rearranges the contents of records.
SPLIT              Splits records into multiple output records.
STEM               Gets or sets an ooRexx variable with the specified stem.
STRIP              Removes leading and/or trailing characters from records.
TAKE               Selects one or more records from the beginning or end of the primary input
                   stream.
TOLABel            Selects records that precede a specified target, not including the target
                   record.
TRANSlate          Translates characters based on the specified translation table.
UNIQue             Compares the contents of adjacent records and discards or retains the
                   duplicate records.
VARiable           Gets or sets an ooRexx variable.

 

Pre-process commands

 

Command            Description

CASEI              Perform a non-case-sensitive selection comparison.
ZONE               Restrict data selection to a specific column range.

 

Subcommands

 

Command            Description

OUTPUT             Writes a record to the specified output stream.
PEEKTO             Reads a record without removing the record from the input stream.
READTO             Reads a record.

 

The pipe command

 

A pipeline is loaded and launched by the pipe command. The command is specified in an ooRexx program which may use Pipelines to perform any number of input, output and transformation operations on input files, output files and ooRexx VARIABLES, STEMs and ARRAYs.
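
For example, the following sketch exchanges records with ooRexx stem variables. It assumes that the STEM stage takes the stem name (including the trailing period) as its operand and that the output stem's .0 element is set to the record count; check the STEM stage reference for the exact syntax.

**** Top of file ****
01 Address Rxpipe
02
03 in.0 = 2                          /* An ooRexx stem holding two records. */
04 in.1 = 'hello there'
05 in.2 = 'goodbye for now'
06
07 'pipe stem in.',                  /* Read records from the stem in. (assumed operand form). */
08      '| translate uppercase',     /* Translate them to upper case. */
09      '| stem out.'                /* Store the results in the stem out. */
10
11 Say out.0 'record(s) returned'    /* Assumes out.0 holds the record count. */
12
13 Exit 0
**** End of file ****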

 

To create an ooRexx program, simply write the ooRexx/pipeline in your favourite text editor and save it with a file type of .REX; you can then run it time and again. You can also use the ‘New->Pipelines file’ entry in the Windows right-click popup menu to create an initial ooRexx/Pipelines template. You may specify the actual pipeline in a style that suits you; the following pipeline is laid out in a vertical fashion:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe < myfile.txt',
04      '| take 10',
05      '| strip trailing',
06      '| locate',
07      '| > myfile.txt'
08
09 Exit 0
**** End of file ****
 

Once you begin to write multi-stream pipelines, specifying the pipeline in a vertical fashion is a must.

 

You can launch an ooRexx/pipeline program by simply double-clicking on the file icon. You do not have to register a pipeline or update path definitions each time you create one; you decide how you want to organise your files and how you want to access them. Pipelines does not tie you into an interface that you may not enjoy using.

 

Substitution placeholders

 

Pipelines provides the following substitution place-holders, which may be specified in any position where a literal value can be specified:

&cwd             The fully qualified path of the current working directory.
&date            The current system date, in the format: yyyy/mm/dd
&installdrive    The disk-drive where Pipelines is installed.
&installpath     The directory path where Pipelines is installed.
&sysdrive        The Windows system boot/root disk-drive.
&time            The current system time, in the format: hh:mm:ss
&username        The name of the current log-on User Account.
&version         The current Pipelines version, in the form: TenFiftyTwo(c). Pipelines [q.r for .NET x.y]
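
For example, the following sketch appends a time-stamped line to a log file in the current working directory (the file name pipe.log and the placement of the place-holders are illustrative assumptions):

pipe literal /Run by &username on &date at &time/
     | >> &cwd\pipe.log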

 

Specifying pipe options

 

Pipelines allows you to specify options which control the way the pipeline is interpreted and the way the pipeline operates.

 

By default, the Pipelines parser recognises the (|) and (.*) characters as reserved values which define the stage delimiter (stagesep) and the start of a comment (comment). These values are pre-defined defaults. In addition, you can specify characters and values which identify a range of options and settings.

 

The following table describes the control option values:

 

Control option   Default value      Chars   Description

COMment          .*                 2       A comment starts with the specified characters and
                                            ends at the end of the line.

ENDchar          (none)             1       Defines the end of one pipeline and the start of
                                            another. There is no default; you must specify a
                                            value to create a multi-stream pipeline.

ESCape           (none)             1       Defines a character that escapes the character which
                                            immediately follows. There is no default; the escape
                                            character allows you to specify, as a normal
                                            character value, a character that would otherwise be
                                            interpreted as a control character.

MONitor          ON                 2-3     Specifies whether the stall detection MONITOR is ON
                                            or OFF. By default, stall detection is automatically
                                            activated (for multi-stream pipelines) when Pipelines
                                            begins. However, you may wish to disable the feature
                                            when you know that a pipeline does not or should not
                                            cause a pipeline stall. See: Pipeline stalls for an
                                            explanation.

 

The following table shows the MONITOR settings and their meaning:

 

Setting   Meaning                    Comments

ON        Stall detection is on.     Stall detection is enabled (for multi-stream pipelines);
                                     this is the default.
OFF       Stall detection is off.    Stall detection is disabled. For untested multi-stream
                                     pipelines this may have unpredictable results; the
                                     StageManager will not report a stall when one exists and
                                     the pipeline will hang without explanation.

 

You cannot specify more than one MONITOR option in any one set of pipelines.

NAMe             (none)             1-n     Specifies the name of the pipeline. The name cannot
                                            contain any characters that are reserved or already
                                            assigned.

PRIority         4                  1       Specifies the thread priority of a pipeline; by
                                            default the operating system assigns a priority which
                                            corresponds to the Pipelines default level of 4.

 

The following table shows the PRIORITY levels and their meaning:

 

Level   Meaning         Comments

1       Real-time       Use this level with care (for multi-stream pipelines); a pipeline
                        running with a real-time thread priority may cause the pipeline to
                        stall; the pipeline may consume all of the available CPU time,
                        interfering with the normal operation of the operating system,
                        delaying the dispatch of other called pipelines or causing disk
                        caches to not flush, the mouse to ‘hang’ and so on.
2       High            This level is the highest that should be specified on a pipeline
                        which contains a CALLPIPE or RUNPIPE stage command. PRIORITY level 1
                        is not suitable for pipelines which call or launch external pipelines.
3       Above normal    Specifies a level which is greater than the normal, default level.
4       Normal          Specifies the normal operating system level for applications; this is
                        the default.
5       Below normal    Specifies a level which is less than the normal, default level.
6       Low             Specifies the lowest possible level. Use this level with care (for
                        multi-stream pipelines); a pipeline running with an idle thread
                        priority may cause the pipeline to stall, as system service is
                        intermittent.

 

You cannot specify more than one PRIORITY option in any one set of pipelines.

REGex            3                  1       Specifies the regular expression mode; by default
                                            Pipelines assigns a mode of 3, which corresponds to
                                            the ECMA expression syntax.

 

The following table shows the REGEX modes and their meaning:

 

Mode   Type    Comments

1      BRE     Equivalent to the POSIX basic expression syntax.
2      ERE     Equivalent to the POSIX extended expression syntax.
3      ECMA    Equivalent to the regular expression syntax of the ECMA-262 standard, the
               implementation commonly known as JavaScript.
4      grep    Equivalent to the traditional UNIX grep command syntax.
5      egrep   Equivalent to the traditional UNIX egrep command syntax.
6      awk     Equivalent to the awk expression syntax, the pattern matching syntax of
               UNIX/LINUX system expressions.

 

For a detailed description of the exact syntax features of each flavour, see the Regular expression section of the Reference.

 

stageSEP         | (vertical-bar)   1       Defines the end of one stage and the start of
                                            another.

TRaCe            OFF                2-3     Specifies whether pipeline tracing is ON or OFF. By
                                            default, tracing is set to OFF. However, you can
                                            specify tracing on a stage-by-stage basis. See
                                            Tracing pipelines for a detailed description.

 

The following table shows the TRACE settings and their meaning:

 

Setting   Meaning                   Comments

ON        Tracing is turned on.     TRACE is turned on for all the pipelines and stages in the
                                    set of pipelines.
OFF       Tracing is turned off.    TRACE is turned off for all pipelines and stages in the
                                    set of pipelines.

 

You cannot specify more than one TRACE option in any one set of pipelines.

 

To define or re-define a control option value (except the PRIORITY, MONITOR and TRACE options, which can only be defined once), you simply enclose the definition(s) in parentheses at the beginning of each pipeline that you specify. (Note: you can only use an argument place-holder to substitute for a stage name or stage argument; you cannot substitute a control option definition.)

 

The following two sets of pipelines perform identical tasks; however, the first pipeline of the second set defines the endchar and stagesep characters and the second pipeline in the set re-defines the stagesep character again. (A control character remains in effect until a re-definition is encountered; this includes multiple pipelines in the same set.)

 

**** Top of file ****
01 Address Rxpipe
02
03 /* The first and second pipelines use the default comment and stagesep control values. */
04
05 'pipe (endchar ?)',             /* Define the endchar value. */
06      '< in1.txt',               /* Input file. */
07      '| > out1.txt',            /* Output file. */
08      '?',                       /* endchar (start of second pipeline). */
09      '< in2.txt',               /* Input file. */
10      '| locate',                /* Discard blank lines. */
11      '| > out2.txt'             /* Output file. */
12
13 Exit 0
**** End of file ****
 
 
**** Top of file ****
01 Address Rxpipe
02
03 /* The first pipeline defines the endchar and stagesep values.
04    The second pipeline inherits the comment value, but re-defines the stagesep value.  */
05
06 'pipe (endchar ? stagesep @)',           /* Define the endchar and stagesep values. */
07      '< in1.txt',                        /* Input file. */
08      '@ > out1.txt',                     /* Output file. */
09      '?',                                /* endchar (start of second pipeline). */
10      '(stagesep +)',                     /* Re-define the stagesep value. */
11      '< in2.txt',                        /* Input file. */
12      '+ locate',                         /* Discard blank lines. */
13      '+ > out2.txt'                      /* Output file. */
14
15 Exit 0
**** End of file ****

 

The escape control option can be used when you need to specify a character that is the same as one of the currently defined control characters. For example, you may be designing a pipeline when you find that you need to specify the (|) vertical-bar character in a stage argument, but the | character has already been assigned. To specify the character without having to change the control definition, you simply prefix the | with the escape character, for example:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe (stagesep | escape %)',
04      '< myfile.txt',
05      '| locate /%|text%|/',
06      '| > cons'
07
08 Say 'Hit Enter to close..'
09 Parse Pull
10
11 Exit 0
**** End of file ****
 

The pipeline parser treats the character which immediately follows the escape character as an ordinary ASCII value and not a control character. If you specify an escape character and you need to use that same character in a stage argument; simply use two of them together, for example: %%.
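
For example, the following sketch (the file name is an assumption) selects records containing the text 100% complete while % is defined as the escape character; the doubled %% yields a single, literal % character:

pipe (escape %)
     < report.txt
     | locate /100%% complete/        .* %% is treated as one ordinary % character.
     | > cons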

 

Multiple pipelines

 

The term pipeline refers to the chain of stages that make up the path through which a record flows; usually starting from an input file and ending with an output file or files. The pipeline can contain as many stages or pipelines as is needed to perform the transformation. With Pipelines; you can specify multiple pipelines in the same set, that is to say; you can specify any number of pipelines which may operate together or independently. Consider the following example; which comprises two pipelines operating independently; the first copies in1.txt to out1.txt and the second copies in2.txt to out2.txt. Both pipelines are specified in the same Pipelines file, but they are launched and serviced by the StageManager as separate entities; this is the reason Pipelines provides the control option definitions.

 

**** Top of file ****
01 /* Both pipelines copy files. */
02
03 Address Rxpipe
04
05 'pipe (endchar ?)',
06      '< in1.txt',               /* Input file. */
07      '| > out1.txt',            /* Output file. */
08      '?',                       /* endchar character (end of first.. */
09       ,                         /* ..pipeline, start of second). */
10      '< in2.txt',               /* Input file. */
11      '| locate',                /* Discard blank lines. */
12      '| > out2.txt'             /* Output file. */
13
14 Exit 0
**** End of file ****
 

In the example, the two pipelines are separated by the endchar character (?) and they perform no exchange of records at all; there is no intersection between them. Both pipelines are executed, but they have no dependency on each other (unless a runtime error occurs, in which case the StageManager issues a quiesce and both pipelines are terminated). You can specify any number of pipelines in the same set and run them as a job lot; you simply stack them one on top of another, separated by the endchar character.


 

Multi-stream pipelines

 

By default, the StageManager connects adjacent stages together through a primary input stream and a primary output stream (a primary input stream connects to the primary output stream of the preceding stage). However, many of the built-in stages also allow you to connect to a secondary input stream or from a secondary output stream and, in a few cases, a tertiary input or output stream. These additional streams define a multi-stream pipeline; they allow you to operate on data in ways that would otherwise require multiple passes over the data (the incremental refinement of the output of one pipeline after another), or require isolating the set of target records and then sorting the entire output in order to restore the relative record order. With a multi-stream pipeline, the data is operated on once and the relative record order is maintained. This capability allows you to handle large amounts of data without having to acquire the enormous quantity of memory-storage needed to sort it.

 

For example; you might want to change the word hello to goodbye only in records that contain the word friend. The following multi-stream pipeline does this in a single pass:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe (endchar ?)',                /* Define the endchar. */
04      '< myfile.txt',               /* Read input file and select records.. */
05      '| a: locate /friend/',       /* ..that contain 'friend', unselected records are.. */
06       ,                            /* ..routed to the second pipeline through label a:  */
07      '| change /hello/ /goodbye/', /* Make the change. */
08      '| b: faninany',              /* Read primary and secondary streams (through label b:). */
09      '| > myfile.txt',             /* Write to output file. */
10      '?',
11      'a:',
12      '| take *',                   /* Read records through label a: */
13      '| b:'                        /* Route them back to first pipeline through label b: */
14
15 Exit 0
**** End of file ****

 

Connecting streams

 

When more than one input or output stream is used, we no longer have a map in which all the stages are arranged in a straight line; instead, we have a multi-stream pipeline, where the topology is no longer horizontal and linear but two-dimensional, and where the records travel forwards and backwards through intersecting joints which control the flow of data, as in the previous example. To use more than one input or one output stream, or a combination, we need to write multiple pipelines in a single pipe command. To connect to a stage's multiple streams, first put a label in front of the stage command whose secondary input or output stream you want to use; this defines the label. Then put a matching label elsewhere in the pipeline, in a stage by itself (see the following figure). This is called the label reference. A label must be defined in the pipeline before any reference can be made to it; in the following example, the label a: connects the two pipelines.

 
pipe (endchar ?) < test.txt | a: locate /BOB/
     | > bob.txt
     ?
     a: | > notbob.txt
 

The location of matching labels determines what connections are made between the stages. The matching label can be in one of three positions:

 

1)

At the beginning of a pipeline.
In this case, Pipelines makes connections to the secondary output stream of the stage which defines the label.

 

2)

At the end of a pipeline.
In this case, Pipelines makes connections to the secondary input stream of the stage which defines the label.

 

3)

In the middle of a pipeline.
In this case, Pipelines makes connections to the secondary input and secondary output of the stage which defines the label.

 

Connecting to a secondary output stream

 

 

The diagram shows that the secondary output of stage-B is connected to the primary input of stage-D. To write a pipeline that performs this connection:

 

First, write two pipelines in a single pipe command:

 

pipe (endchar ?)
     stage-A | stage-B | stage-C           .* First pipeline.
     ?                                     .* Endchar.
     stage-D                               .* Second pipeline.

 

Then define the label a: by putting it in front of stage-B:

 

pipe (endchar ?)
     stage-A | a: stage-B | stage-C        .* First pipeline.
     ?                                     .* Endchar.
     stage-D                               .* Second pipeline.

 

Finally, connect the secondary output of stage-B to the primary input of stage-D by putting the matching label at the beginning of the second pipeline.

 

pipe (endchar ?)
     stage-A | a: stage-B | stage-C        .* First pipeline.
     ?                                     .* Endchar.
     a: stage-D                            .* Second pipeline.

 

Connecting to a secondary input stream

 

 

The diagram shows the primary output stream of stage-D connected to the secondary input of stage-B. The following example shows how to make the connection:

 

pipe (endchar ?)
     stage-A | a: stage-B | stage-C
     ?
     stage-D | a:

 

The label a: is defined by stage-B. Because the matching label is used at the end of the second pipeline, the primary output stream of stage-D is connected to the secondary input stream of stage-B. Therefore, any records that stage-D writes will flow into the secondary input of stage-B.

 

Connecting to both the secondary input and output streams

 

 

The diagram shows how stage-B connects to both its secondary input and secondary output streams. The following example shows how to make the connections:

 

pipe (endchar ?)
     stage-A | a: stage-B | stage-C
     ?
     stage-D | a: | stage-E

 

The records from stage-D flow into the secondary input of stage-B, while the records from the secondary output of stage-B flow into stage-E. Records do not flow from stage-D to stage-E.

 

Connecting several secondary streams

 

When multiple stages are used that write to or read from secondary streams, you simply use a different label for each.

 

 

The diagram shows how you would write all the records containing the string BOB to one file; of the remaining records, those containing SUE to another file; and all other records to a third file. The following pipeline performs that task:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe (endchar ?)',
04      '< test.txt',
05      '| a: locate /BOB/',
06      '| > bob.txt',
07      '?',
08      'a:',
09      '| b: locate /SUE/',
10      '| > sue.txt',
11      '?',
12      'b:',
13      '| > other.txt'
14
15 Exit 0
**** End of file ****

 

How to predict relative record order

 

Multi-stream pipelines introduce an issue that does not occur with the linear pipeline model: relative record order.

 

Although the order in which the StageManager dispatches the stages is unpredictable, in certain situations the relative output order of the records in a multi-stream pipeline is predictable. A stage does not run from start to finish once the StageManager dispatches it. When a stage writes a record to its output stream, the stage becomes suspended, which means that it cannot run again until the stage connected to its output stream has consumed the record. A stage consumes a record when it reads that record from its input stream and removes it from that stream. Consider the following:

 

pipe A | B | C | . . .

 

When stage A writes a record to its output stream; stage A stops running until the record is consumed by stage B. Therefore, stage B determines when stage A can continue to run. Stage B reads the record from its input stream and writes it to its output stream; stage B stops running until the record is consumed by stage C. Stage C now determines when stage B can continue to run, and so on. When the record finally reaches a stage which consumes the record from its input stream; the pipeline chain unravels and the stages are able to run again. This mechanism ensures that records that travel through a multi-stream pipeline (which may involve any number of joints in the pipeline chain) are delivered in the correct relative record order. A stage may process its records in the following two ways:

 

A stage may read a record and remove it from the input stream and write this record to its output stream; the stage delays the records.

A stage may read a record without removing the record from its input stream and write this record to its output stream; the stage does not delay the records.

 

In order to maintain the relative order of records in a multi-stream pipeline, the pipeline must:

 

Start at one common stage.

Be split into multiple pipelines using only stage commands that do not delay the records.

Contain only stages that do not delay the records.

Be combined into a single stream using a stage that combines multiple streams as records arrive (for example, FANINANY)

 

If a pipeline follows these simple rules (refer to the Stages documentation to determine whether a stage delays the records or not), the output order of the records in a multi-stream pipeline is predictable.
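
As a sketch of these rules in practice (the file names are assumptions, and LOCATE, CHANGE, TAKE and FANINANY are assumed here not to delay the records; confirm this in the Stages documentation), the following multi-stream pipeline starts at a single input stage, is split and recombined without delaying stages, and therefore writes its records in the same relative order in which they were read:

pipe (endchar ?)
     < data.txt
     | a: locate /old/             .* Split: selected records continue on this path.
     | change /old/ /new/          .* Transform the selected records.
     | b: faninany                 .* Recombine the streams as the records arrive.
     | > data.new.txt
     ?
     a:                            .* Unselected records from LOCATE..
     | take *                      .* ..pass through (as in the earlier example)..
     | b:                          .* ..and rejoin through the secondary input of FANINANY.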

 

Pipeline stalls

 

With multi-stream pipelines a stall may occur. A stall occurs when the StageManager cannot run any of the stages because every stage is waiting for some other stage to perform some function. Usually, stalls are caused by stages that read multiple input streams in a particular order or that need records to be available on more than one stream at the same time, because the preceding stages do not deliver records in the order needed or do not provide multiple records concurrently. As soon as a stall occurs, Pipelines issues an error message and the current status of each stage is written to a stall-file; Pipelines then terminates.

 

The stall-file is written to the current working directory.

 

You can inspect the stall-file to see which stages are waiting to read or write records and which streams might be causing the deadlock. First, look at any stages that have multiple input streams. Of these, identify any stage that needs records in a particular order (such as FANIN) or that needs more than one record at a time (such as SPECS and OVERLAY). Then look for earlier stages that have secondary output streams; these stages often deliver records in a particular order. A common stall is shown in the next diagram.

 

 

FANOUT writes a record to output stream 0. FANIN reads this record. Then FANOUT tries to write a copy of the record to output stream 1 and waits for FANIN to read it, but FANIN is waiting for FANOUT to write another record on stream 0. FANIN will not read from input stream 1 until input stream 0 is disconnected. The pipeline is stalled.

 

The BUFFER and ELASTIC stages can be used to fix this situation. BUFFER reads all of the records on its input stream to a buffer, then, when its input is empty, it writes these records to its output stream. You can use a BUFFER stage to prevent a pipeline stall; as in the following diagram:

 

 

The BUFFER stage allows the FANOUT stage to continue writing records; as it consumes all of the input records before it writes any.
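
The following sketch shows the repair described above in pipeline form (the file names are assumptions): a BUFFER stage placed on FANOUT's secondary path consumes records as soon as they are written, so FANOUT is never left waiting on FANIN:

pipe (endchar ?)
     < data.txt
     | a: fanout                   .* Copy every record to both output streams.
     | f: fanin                    .* Reads stream 0 to end, then stream 1.
     | > both.txt
     ?
     a:                            .* Records from FANOUT's secondary output..
     | buffer                      .* ..are held here until end-of-input, preventing the stall.
     | f: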

 

 

Stall detection monitor

 

Pipelines employs a stall detection mechanism (referred to here as the stall detection monitor) which is responsible for identifying when a pipeline has become stalled. The monitor employs a pacing strategy which inspects the record throughput of a pipeline in relation to the pipeline thread priority setting (which can range from idle to real-time); however, this pacing strategy cannot anticipate the moment-by-moment change in system load and usage in a multi-threaded, multi-user Windows environment (even Windows occasionally flags an application as ‘not responding’ when it is simply waiting to be serviced). As such, there may be times when Pipelines detects a stall when clearly there should not be one: the pipeline is correctly formed and there is no contention in the pipeline stream configuration, and yet the stall is raised and the pipeline is terminated. This should only occur when running with an idle or real-time thread priority setting or when the system load is extremely high. To avoid the stall you can turn the stall detection mechanism off with the ‘MONITOR OFF’ option.

 

As a rule, stall detection is only necessary the first time a multi-stream pipeline is run; Pipelines activates the stall detection mechanism automatically (for multi-stream pipelines only), but it is not required once a pipeline has proved that it is well formed and runs successfully from beginning to end.

 

Be mindful that with stall detection turned off you may see a slight increase in performance; however, you should not specify the ‘MONITOR OFF’ option for a pipeline that is untested; the result will be unpredictable!
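
For example, once a multi-stream pipeline such as the following sketch has run cleanly from beginning to end, stall detection can be turned off for it (the file names are assumptions; the option is written here in the same keyword-value form as the other control options):

pipe (endchar ? monitor off)      .* Disable stall detection for this set of pipelines.
     < in.txt
     | a: locate /keep/
     | > kept.txt
     ?
     a:
     | > discarded.txt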

 

Tracing pipelines

 

Often, it can be quite challenging to get to grips with a pipeline that does not generate output records in a format that you might expect or indeed to simply study a pipeline in order to understand how it processes its input and output records. In addition; a pipeline that processes records that contain extended or non-displayable ASCII characters is especially difficult to debug. Tracing a pipeline can help enormously when you want to understand the structure of the records that flow into and out of all or some of the stages that define the pipeline. The TRACE option may go some way to making this a little easier.

 

When you specify the TRACE ON option, the first pipeline in the set of pipelines begins the trace by issuing the TRACE Usage message, followed by a request for input; for example, consider the following:

 

C:\mypipe.rex
 
Usage:
 
  <enter>       - run the next input/output record request.
  End           - stop tracing and quiesce the pipeline.
  HeLP          - display this help list.
  Off           - stop tracing and allow the pipeline to run to completion.
  SouRCe        - display the source pipeline.
  Run <count|*> - run <count> number or all of the next input/output record requests.
 
trace?: _

 

You can now enter any of the commands listed in the Usage message, above.

 

If you intend to trace a pipeline which comprises a number of pipelines, or one which calls another pipeline, it can make following the trace message output much easier if you ensure that, when you develop a pipeline, you assign it a unique name (as specified by the NAME option keyword) that identifies the ooRexx program and the pipeline. Each time Pipelines receives a read or write record request, it will display a trace message detailing the pipeline/stage and the input/output record.
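
For example, the following sketch names the pipeline and turns tracing on for it (the name report, the file name and the assumption that both options may appear in the same option list are illustrative):

pipe (name report trace on)
     < report.txt
     | locate /TOTAL/
     | > cons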

 

Consider the following example TRACE message:

 

0000000001 Pipe:(0,mypipe),Stage:(0),Type:(IN): Peek Record: 1, Stream: STDIN, Length: 32 bytes.
Data: > Volume in drive C has no label.<
      >| - - - + - - - - | - - - - + - - - - | - - - - + - - - - | - -
         V o l u m e   i n   d r i v e   C   h a s   n o   l a b e l .
       20566F6C756D6520696E206472697665204320686173206E6F206C6162656C2E

 

In the example above, the input record contains only displayable ASCII values, and the ‘Data:’ section of the message displays the text as-is. However, when the data contains extended or non-displayable ASCII character values, then in order to align the ASCII and hexadecimal representations, non-displayable values (including a horizontal TAB (x09)) are replaced by a SPACE (x20).

 

While the TRACE option keyword is useful when you want to trace an entire pipeline or set of pipelines, sometimes you may want to trace only one or two specific stages. In this case, you can specify the TRACE option on a stage-by-stage basis. Consider the following example:

 

**** Top of file ****
01 Address Rxpipe
02
03 'pipe filelist',
04      '| trace locate /DIR/',
05      '| console'
06
07 Say 'Hit Enter to close..'
08 Parse Pull
09
10 Exit 0
**** End of file ****

 

In this case; only the LOCATE stage will be traced.

 

In order to understand the way in which TRACE inheritance and precedence works; the following notes apply:

 

If pipeline A (and this includes multiple pipelines in the same set) specifies the TRACE ON option; all the stages in that set will be included in the trace.

If pipeline A - which specifies the TRACE ON option - calls pipeline B, and pipeline B does not specify the TRACE OFF option; pipeline B will be included in the trace; the inherited TRACE ON option is adopted by pipeline B.

If Pipeline A specifies the TRACE option on any one of its stage commands, and pipeline A calls pipeline B; pipeline B will not be included in the trace – the TRACE definition is at the stage level and is not inherited by pipeline B.

The TRACE ON or TRACE OFF option overrides any inherited TRACE setting.

A pipeline launched by the RUNPIPE or SHELLEXECUTE stage command does not inherit its caller’s TRACE setting.