The Stack
In this chapter, the stack and operations manipulating the stack are discussed. Since the stack is external to the REXX language, there are large differences between implementations with respect to the stack. These differences are attempted described in the latter part of this chapter.
Another goal of this chapter is to try to describe both the "real" standards and some of the most commonly used de facto standards related to stack operation. Where something is not a part of any defined standard, this is clearly labeled. Also, some liberties have been taken in order to create a coherent vocabulary on a field where very little standardization has taken place.
In the various definitions of REXX, there are numerous references to the "stack" (often called the "external data queue", or just the "queue"). It is a structure capable of storing information, but it is not a part of the REXX language itself. Rather, it is a part of the external environment supporting a REXX implementation.
Originally, the references to the stack was introduced into REXX because of the strong binding between REXX and IBM mainframes in the early history of REXX [BMARKS]. Most (all?) of the operating systems for these machines support a stack, and many of their script programming idioms involve the stack. Therefore, it was quite natural to introduce an interface to the stack into REXX, and consequently today many of the programming paradigms of REXX involve a stack.
Unfortunately, this introduced an element of incompatibility into REXX, as the stack is not in general supported for other operating systems. Consequently, REXX implementors often must implement a stack as well of the core REXX interpreter. Since no authoritative definition of the stack exists, considerable differences between various implementations. Ironically, although the stack was introduced to help communication between separate programs, the interpreter-specific implementations of stacks may actually be a hindrance against compatibility between different interpreters.
The stack may have "seemed like a good idea at the time", but in hindsight, it was probably a bad move, since it made REXX more dependent on the host operating system and its interfaces.
This section describes the functionality generally available in implementations of stacks. The basic functionality described here will be complemented with information on specific implementations later. Unless explicitly labeled otherwise, this functionality is available in all standards treated in this documentation.
Below is listed the general functionality of the stack, in order of decreasing compatibility. I.e. the functionality listed first is more likely to be a part of all implementations than the ones listed at the end of the list.
The stack is a data structure, which strings can either be inserted into or extracted from. The strings in the stack are stored in a linear order. Extraction and insertion works at a granularity of a complete string, i.e. it is not possible to insert or extract parts of string.
The stack has two ends: a top and a bottom. New strings can be inserted into the stack in both ends, but strings can only be extracted from the top of the stack.
There exists a way of counting the number of strings currently stored in the stack.
A stack is often compared with the pile of plates you often find in cantinas. It allows you to either add new plates at the top of the pile or take old plates from the top. When a plate is taken from the pile, it will be the most recently plate (that is still present) added to the pile. Stack operating in REXX work the same way, although there also allow "plates" to be added to the bottom of the pile.
There might be an implementation-specific limit on the length and number of strings stored in the stack. Ideally, the maximum length will be fairly large, at least 2**16, although some implementations are likely to enforce shorter limits. Similarly, there might be a limit on the number of strings that can be simultaneously stored in the stack. Ideally, there should be no such limit.
It is natural that there are limits imposed on the amount of memory occupied by the strings in the stack. Some implementations are likely to reserve a fixed (but perhaps configurable) amount of memory for this purpose while others can dynamically re-size the stack as long as enough memory is available.
Some implementations might restrict the set of characters allowed in strings in the stack, although ideally, all characters should be allowed, even characters normally used for end-of-line or end-of-string.
This documentation use the term "string", while "line" is in common use elsewhere. The term is used because the strings in the stack are not inherently interpreted as lines (having an implied end-of-line), only as a string.
Note that the stack itself is not a part of REXX, only the parts which interface to the stack.
Example: Using the stack to transfer parameters
This is a common REXX idiom used in several situations for special parameter passing. The following code illustrates its use:
do i=1 to 10 /* for each parameter string */
queue string.1 /* put the string on the stack */
end
call subrout 10 /* call the subroutine */
exit
subrout: procedure /* the definition of the subroutine */
do j=1 to arg(1) /* for each parameter passed */
parse pull line.j /* retrieve the parameter */
end
... /*do something with the parameters*/
return
In this example, ten parameter strings are transferred to the subroutine SUBROUT. The parameters are stored in the stack, and only the number of parameters are transferred as a "real" argument.
There are several advantages: first, one avoids problems related to exposing variable names. Since the data is stored on the stack, there is no need to refer to the variable names and bind the variables in the subroutine to variables in the caller routine. In [TRL1], indirect references to variables in PROCEDURE EXPOSE is illegal, and this method circumvent the problem.
Two other ways around this problem is to use INTERPRET for the PROCEDURE EXPOSE instruction in order to dynamically determine which variables to expose; or to use the VALUE() built-in function (with its two first parameters). The former is incompatible with TRL2, while the latter is incompatible with TRL1. Using the stack can solve the problem in a fashion compatible with both standards. Anyway, if the called routine is an external routine, then exposing does not work, so using the stack to transfer values may be the only solution.
Another advantage of this idiom; TRL only requires implementations to support 10 parameters for subroutines. Although there are no reasons why an implementation should set a limit for the number of parameters a routine can get, you should use another mechanism than arguments when the number of strings is greater than 10. Using the stack fixes this.
As already mentioned, the stack is a linear list of strings. Obviously, this list has two ends. Strings can only be extracted from one end, while strings can be added to both ends.
If a set of new strings are added to the same end as they are later extracted from, the strings will be extracted in the reversed order with respect to the order in which they were added. This is called stacking "LIFO", which means "last-in-first-out", meaning that the last string stacked, will be the first string extracted, i.e. reversal of the order.
Similarly, when a set of strings are stacked in the end opposite to the end which they are later extracted from, they will be extracted in the same order in which they were stacked. This is referred to as "FIFO" stacking, meaning "first-in-first-out".
The FIFO method of stacking is also sometimes referred to as "queueing", while the LIFO method is sometimes referred to as "stacking" or "pushing".
The concept of buffers and everything directly related to buffers lay without the domain of standard REXX. Thus, this section describes a de facto standard.
Note that Regina supports multiple buffers only in internalstacks.
Some implementations support "buffers", which are a means of focusing on a part of the stack. When creating a new buffer, the old contents of the stack is somewhat insulated from the effects of stack operations. When the buffer is removed, the state of the old buffer i restored, to some extent: Whenever a string is read from the stack, and the topmost buffer on the stack is empty, then that buffer will be destroyed. Consequently, if this situation has arisen, dropping buffers will not restore the state of the stack before the buffer was created.
The functionality of buffers, and their effect on other stack operations may differ considerably between implementations.
Whenever a queuing operations is performed (e.g. by the QUEUE instruction), then the new string is inserted into the bottom of the topmost buffer, not the bottom of the stack. This is the same if the stack has no buffers, but else, the outcome of the queuing operation can be very different.
With IBM mainframe operating systems like CMS, buffers can be inserted on the top of the stack. To perform buffer operations, operating system commands are used. It may be instructional to list the buffer operations of CMS:
[DESBUF]
Removes all strings and buffers from the stack, and leaves the stack clean and empty. It is often used instead of repeated calls to DROPBUF. It always returns the value zero.
[DROPBUF]
Removes zero or more buffers from the stack. It takes one parameter which can be omitted, and which must be an integer position if specified, and is the assigned number of the bottom-most buffer to be removed, i.e. that buffer and all buffers above it (and of course, all the strings in these buffers) are to be removed. If the parameter is not specified, only the topmost buffer is removed. The return valued is always zero, unless an error occurred.
[MAKEBUF]
Makes a new buffer on the stack, starting at the current top of the stack. The return code (as stored in the special variable RC) is the number of buffers currently on the stack after the new buffer has been added. Obviously, this will be a positive integer. This program takes no parameters.
One might regard a buffer as a sort of bookmark, which is inserted into the stack, so that a subsequent DROPBUF command can remove the stack down to a particular such bookmark.
When such a mark is located on the top of the stack, and a PULL instruction is executed, the buffer mark is implicitly destroyed when the PULL instruction reads the string below the buffer mark. This is to say that a buffer can be destroyed by either a DESBUF command, a DROPBUF command, or a read from the stack (by either the PULL or PARSE PULL instructions).
Normally, data pushed on the stack is added to the top of the stack. When a stack contains only one buffer, the strings in that buffer are the strings stored above that buffer-mark. The strings below it are not part of the first buffer; instead, they are said to belong to the zeroth buffer.
Thus, all strings from the bottom of the stack, up till the first buffer mark (or the top of the stack if no buffers exist) is said to be the strings in the zeroth buffer. However, note that the zeroth buffer is only defined implicitly. Thus, it can not really be removed by calling DROP; only the strings in the zeroth buffer are removed. Afterwards, the zeroth buffer will still contain all strings at the bottom of the stack, up till the first buffer mark (if existing).
Example: Process all strings in the stack
This is a common REXX idiom, where a loop iterates over all the strings currently in the stack, but otherwise leave the stack untouched. Supposing the routine PROCESS() exists, and do to processing with its parameter and return the processed string:
do i=1 to 5 /* just to fill the stack */
push 'line #' i
end
do queued() /* foreach line in the stack */
parse pull line /* fetch the line */
queue process(line) /* put back the processed line */
end
Here, it is important to use QUEUE to put the strings back into the stack, not PUSH, else the loop will iterate the correct number of times, but only operate on the same data string. It is also important that the stack does not contain any buffers. Since QUEUE will insert into the bottom of the topmost buffer, the loop would iterate the correct number of times, but only on a part of the stack. Thus, the topmost part of the strings in the stack would be processed multiple times.
Example: How to empty the stack
The following short example shows how you can most easily empty the stack:
do i=1 to 5 /* Just to fill the stack */
push 'line #' i
end
do queued() /* For each line in the stack */
pull /* Remove the line from the stack */
end
This is trivially simple, but there are several interesting and subtle notes to make about this example. First, if the number of strings in the stack is likely to change, due to some external process, then the DO clause should perhaps better be written as:
do i=1 to 5 /* Just to file the stack */
push 'line #' i
end
do while queued()>0 /* While the stack is not empty */
pull /* Remove a line from the stack */
end
This will in general mean more work for the interpreter, as it is now required to check the number of strings in the stack for each iteration, while for the previous code fragment, the number of strings is only checked once. Another point is that this might not remove all buffers from the stack. Suppose the zeroth buffer is empty, i.e. there exists an buffer which was put on the stack when the stack was empty. This buffer is removed in any of the following situations: calling DESBUF, calling DROPBUF (sometimes), or reading a string below the buffer mark. Since there are no strings below the buffer mark, pulling a string from the stack would make the interpreter read from the keyboard, and hang the interpreter.
Thus, the only "safe" way to remove the string and buffers from the stack, without side effects, is to call DESBUF or DROPBUF. On the other hand, if you only want to make sure that there are no strings in the buffer, the method described here is more suitable, since it is far more compatible (although possibly not so efficient). But anyway, buffers are not a compatible construct, so it does not matter so much.
The description of multiple stack operations in this section, is not part of standard REXX, nor is it implemented in Regina. Thus, this section describes a de facto standard and you may find that few implementations support these operations.
Just as the operations described above let the REXX programmer use multiple buffers within one stack, there exists another set of operations which let the programmer create multiple stacks. There is really nothing fancy about this, except that a command will swap the stack the interpreter correctly uses with another stack.
To the interpreter this is really equivalent to a situation where a command empties the current stack, and sets up a new stack. When one stack is empty, and the REXX program tries to read from the stack, the request will not "overflow" to the previous stack (as requests to an empty buffer "overflows" to the previous buffer). Thus, the use of multiple stacks has even less direct impact on REXX interpreters than multiple buffers.
Here, it is instructive to list the commands operating multiple stacks that exists. This list has been taken from the MVS environment, according to [REXXSAA].
[DELSTACK]
Is used to remove the most currently stack, and make the most recent of the saved stacks the current stack. When there are no saved stacks, the current stack is emptied.
[NEWSTACK]
Creates a new stack, which becomes the current stack. The old current stack is put on the top of the list of saved stacks, and can be retrieved as the current stack by a subsequent DELSTACK.
[QBUF]
Counts the number of buffers in the current stack, and returns that number as the return value. A REXX program starting this command can retrieve this value as the special variable RC.
[QELEM]
Counts the number of strings (i.e. elements) in the current stack, and returns that value as the return value of the command. This value can be retrieved in REXX as the special variable RC. This operation is equivalent to the QUEUED() built-in function in REXX; it has been probably included for the benefit of other script languages that have less functionality than REXX.
[QSTACK]
Counts the number of stacks (including the current stack) and returns the value as the return value from the command. This number can be retrieved in REXX as the special variable RC.
One can regard multiple buffers and stacks as two ways of insulating the stack; where multiple stacks are a deeper and more insulating method than buffers. Note that each stack can contain multiple buffers, while a buffer can not contain any stacks. The term "hard buffers" has been used about multiple stacks, as opposed to normal buffers, which are sometimes called "soft buffers".
Also note that neither multiple stacks nor buffers are part of standard REXX, so you might come across implementations that support only multiple stacks, only buffers, or even none of them.
Example: Counting the number of buffers
In order to count the number of buffers on the stack, the following method can be used (Regina syntax has been used for buffer handling). This method is equivalent to the QBUF command described above.
buffers = makebuf() - 1
call dropbuf
This will store the number of buffers in the stack in the variable buffers. However, just as for the other examples using buffers, this example also suffers from the fact that buffer handling is fairly non-standard. Thus, you will have to adapt the code to whatever system you want to use.
As defined in TRL, the interface to the stack consists of the PARSE PULL, PULL, PUSH, and QUEUE instructions; and the QUEUED() built-in function.
There exists a binary interface to the stack in SAA, see the chapter on the SAA API interface. This interface consists of the RXMSQ exit handler and the QUENAME value of the RXSHV_PRIV request of the RexxVariablePool() function of the variable pool interface.
As mentioned, stacks are rarely a part of the operating system. Therefore, under most operating systems, REXX interpreters have to implement their own stacks. There are several strategies for doing this, some which are listed below.
[In the operating system.]
This is of course "the right way" to do it. However, it requires that the definition of the operating system is such that stacks are supported. Currently, only IBM mainframe-based systems support stack, together with a few other systems that have included stacks as a consequence of making REXX a main scripting language (Amiga and OS/2 come to mind).
[As a device driver.]
This is really just a variation of making the stack a part of the operating system. However, in some systems, drivers can be added very easily to the system. Drivers are often filesystem-based, in which case driver-based stack operations must operate on a file or pseudo-file. But for some systems, adding a driver requires much more profound changes, reconfiguration, and often system privileges. In all cases, drivers are likely to be very system specific.
[As a daemon.]
A "daemon" is background process that does some housekeeping service, e.g. handling mail from remote systems. Implementing a stack as a daemon is only slightly simpler than using a driver, but the main idea is the same for both approaches.
[In the interpreter.]
Using this approach, the stack is built into the interpreter as a sort of extension. This is often the simplest way, since it require very little coordination with other programs during run-time. The main problem is that the stack becomes private to the interpreter, so two interpreters can not use the same stack; not even if they are two invocations of the same interpreter.
These items are listed in the order of how closely they are coupled to the operating system: the first items are very closely, while the last items are loosely coupled. The more closely coupled the implementation of a stack is coupled to the operating system, the better is the chance that several interpreters on the same system can communicate in a compatible way, using the stack.
There is room for several hybrid solutions, based on the four fundamental approaches. For instance, a built-in stack can also act as a daemon.
Regina supports the stack as both a daemon and internal to the interpreter.
Example: Commands takes input from the stack
In the example above, the routine that is called takes its arguments from the stack. Similarly, commands to an external environment can get their arguments in the same way. Here is an example of how to do it:
queue 'anonymous' /* the username */
queue 'user@node' /* the password */
queue 'dir' /* first command */
queue 'exit' /* second command */
address command 'FTP flipper.pvv.unit.no'
Although this is very convenient in some situations, there is also considerable disadvantages with this method: There is no real interactive communication between the interpreter and the command; i.e. all input meant for the command must be set up before the command itself is invoked. Consequently, if one of the input lines to the command provokes an error, there is very little error handling facility. Commonly, such an error might start a cascade of errors, as the remaining input lines are likely to be invalid, or even be interpreted in a context different from what they were intended.
As with all commands involving the stack, it is important to push or queue the correct order.
Using this technique, a program can "fool" a command to do almost anything, by storing the correct input on the stack. However, there is a big disadvantage: Since the stack is implementation-dependent, it is not certain that a command will take its input from the stack. For some systems, this is the default, while for other systems, this is only possible through some explicit action. Some systems might not even allow commands to take their input from the stack at all.
Example: "Execing" commands
Many script programming languages can only execute commands while still running, or at most start a new command immediately after the termination (like the exec() system call in Unix). However, the stack can be used on some systems to set up the system to execute one or more commands after the current script terminates. Here is an example:
push 'ls' /* finally execute 'ls' */
push 'who' /* then execute 'who' */
push 'pwd' /* first execute 'pwd' */
exit 0
Supposing that the system reads its commands from the stack if the stack is not empty, then this script will terminate after having set up the stack so that the three commands pwd, who and ls will be run in that sequence. Note the order, if QUEUE had been used, the order would be the opposite, which is perhaps more intuitive (assuming the topmost buffer is empty).
As with the example above, this too is only relevant for some systems, thus is not very compatible, and you should be careful when using it. It also suffers from the lack of interactivity, error handling, and the importance of the order in which the strings are pushed or queued. For all practical reasons, this is just a special case.
Using the stack to "leave behind" command names and input only works for systems where command interpreters and commands reads their input from the stack. This is in general true for IBM mainframe systems, but very few other systems.
In Regina, the stack is implemented as both an integral, private part of the interpreter and as a cross-platform external stack able to be used by multiple clients on multiple machines. Internal stacks provide the obviousadvantage of speed at the expense of data sharing. External stacks are considerably slower, but do enable data sharing between instances of Regina and/or other programs.
Regina supports the standard TRL (and ANSI) REXX stack interface functionality, like PARSE PULL, PULL, QUEUE, PUSH, the QUEUED() built-in function, and in future versions, support the SAA API stack interface. These commands and functions operate on both the internal and external stacks.
Whenever the REXX programmer wants to execute a command and let that command either flush the output to the internal stack, or read its input from the internal stack, this has to be arranged by the interpreter itself. In Regina this is normally done by prepending or appending certain terms to the command to be executed.
Consider the following command clauses for Regina:
'ls >LIFO'
'who >FIFO'
'LIFO> wc'
'LIFO> sort >FIFO'
For all these commands, the "piping" terms are stripped off the command string before the command is sent to the command interpreter of the operating system. Thus, the command interpreter only sees the commands ls, who, wc, and sort. The terms stripped off, are used as indicators of how the input and output is to be coupled with the stack. The use of input/output redirection as above is only available with the internal stack.
Note that it is important not to confuse the redirection of output to the stack and input from the stack in Regina with the redirection of the Unix shells. The two can be mixed in command lines, but are still two different concepts.
The first command will execute the ls command, and redirect the output from it to the stack in a LIFO fashion. The second executes the command who and redirects the output to the stack to, but in a FIFO fashion. The third command executes the wc, but lets the standard input of that command come from the stack. Actually, it is irrelevant whether FIFO> or LIFO> is used for input; the strings are read from the top of the stack in both cases. The fourth command is a plain ps command without any redirection to or from the stack. The last command executes the sort program and lets it read its input from the stack, and redirect the output to the stack.
Regina allows a command to take both an input and an output "redirection" to a stack, as showed in the last example above. However, it also guarantees that the output is not available in the stack before the command has terminated. The output from the command is stored in a temporary stack, and flushed to the ordinary stack after the command is terminated. Thus, the command will not start to read its own output.
Note that this temporary buffering of command output is the default behavior, which might be set up to something different at your site.
In addition, you can change it through the OPTIONS instruction, by using either FLUSHSTACK or BUFFERSTACK as "parameters".
Note the difference between Regina's redirection and Unix redirection. In Regina, only the term LIFO> (when first in the command string), and the terms >LIFO and >FIFO (when last in the command string), will be interpreted as redirection directives. These terms will be stripped off the command string. All other redirection directives will be left untouched. If you should happen to need to redirect output from a Unix command to the file FIFO or LIFO, then you can append a space at the end or specify the file as ./FIFO of ./LIFO. That will make Regina ignore the redirection term.
Note that this particular form of redirection of command input and output will most probably disappear in future versions of Regina, where it will probably be replaced by an extended ADDRESS instruction.
In addition to the ANSI standard, there are a few extra built-in functions, which are supposed to provide compatibility with other REXX implementations, principally CMS REXX. These are BUFTYPE, DESBUF, DROPBUF and MAKEBUF. See the descriptions of these function in the built-in functions section above.
The implementation of the external stack follows the model used by OS/2 REXX, but is implemented as an operating system daemon. This daemon is rxstack.
Under most operating systems, rxstack is started from the operating system's startup process and terminates when the machine is shutdown. Under Windows NT/2000, it runs as a Service.
Communication between rxstack and Regina is done via TCP/IP sockets. Using sockets as the IPC mechanism on a local machine is somewhat slow compared to other mechanisms such as shared memory or named pipes. It does however enable operation between machines on different operating systems to function seamlessly.
The full syntax of the rxstack command is:
rxstack [switch]
switch is one of the following switches
-install installs the NT Service; Rexx Stack - Windows NT/2000 only
-remove removes the NT Service; Rexx -Stack - Windows NT/2000 only
-run runs rxstack in a command prompt - Windows NT/2000 only
-d run rxstack as a daemon - Unix only
-k kills (stops) rxstack - subject to being a valid killer - see Security of External Queues
To stop rxstack, the process can be killed with a SIGINT or SIGTERM or by running rxstack with the -k switch.
To allow non-REXX program to interface to the rxstack daemon, a companion program; rxqueue, is provided. rxqueue communicates with non-REXX programs via its stdin and stdout.
Consider the following equivalents for Regina's internal and external stack
'ls >LIFO' 'ls | rxqueue /lifo'
'who >FIFO' 'who | rxqueue /fifo'
'LIFO> wc' rxqueue /pull | wc'
'LIFO> sort >FIFO' rxqueue /pull | sort | rxqueue /fifo'
The full syntax of the rxqueue command is:
rxqueue [queue] [switch]
queue is a Regina external queue name – see the next section for structure. If no queue is specified, rxqueue uses the queue name; SESSION
switch is one of the following switches – as per OS/2 REXX
/fifo queue lines from stdin LIFO onto the queue
/lifo queue lines from stdin FIFO onto the queue
/clear remove all lines from the queue
the following switches are Regina extensions
/queued return the number of lines on the queue
/pull pull all lines from the queue and display on stdout
REXX programs communicate with rxstack via the normal queueing mechanisms of QUEUE, PUSH, PULL and (). These commands operate on the current queue and have no mechanism for changing the queue to use. This is where() is used. Its primary purpose is to control the queue that the remainder of the REXX program operates on.
To enable the use of the REXX stack as a cross-platform, multi-machine IPC, the naming conventions adopted by OS/2 REXX has been modified. As OS/2 REXX queues are local to a single machine, queue names have no structure. To enable identification of queues on different machines, some structure must be built into external queue names on Regina. An external queue name on Regina has the following format:
[queue][@machine[:port]]
The components of the queue name are:
queue the name of the queue. The only criteria for the name is that it contains none of the following characters: @, . or :. The queue component can be blank, when specifying the default queue on a specified machine.
machine the machine that hosts the specified queue. This can either be a standard IPv4 IP address or a machine name that can be resolved to a standard IPv4 IP address. The machine name is optional, and defaults to 127.0.0.1
port The port number that rxstack on machine is listening to. The default port number for rxstack is 5757.
When referring to queues on the local machine, the machine and port components need not be specified. The behaviour of the external stack is then the same as for OS/2 REXX, with the exception that the queues on the local machine can still be manipulated by Regina on another machine.
Some examples may make this clearer. TBD
(Not implemented yet)
A daemon process like rxstack, waiting on a TCP/IP socket for anyone to connect to and use is open to abuse. To reduce the openness of rxstack, it uses a security mechanism much like the Unix hosts.allow and hosts.deny files is used to control access to rxstack.
RXQUEUE
RXSTACK
Table of Contents
The Stack 1
1Background and history 1
2General functionality of the stack 1
2.1Basic functionality 1
2.2LIFO and FIFO stack operations 3
2.3Using multiple buffers in the stack 4
2.4The zeroth buffer 5
2.5Creating new stacks 6
3The interface between REXX and the stack 7
4Strategies for implementing stacks 8
5Implementations of the stack in Regina 10
5.1Implementation of the internal stack in Regina 2.2 10
5.2Implementation of the external stack in Regina 2.2 11