Chapter 22: Multiple Word Lists

Without the Search-Order word set, all words except for locals and environment query strings belong to the same word list. FORTH-WORDLIST returns the identifier of this word list. As a consequence, SEARCH-ALL searches words only in this word list. For locals and environment query strings, the dedicated search words SEARCH-LOCAL and ENVIRONMENT? can be used, respectively. All search words are based on SEARCH, which searches a word in a given word list. SEARCH is StrongForth's version of the ANS Forth word SEARCH-WORDLIST (see chapter 8).

Search Order

If multiple word lists are available, a search order specifies which word lists are to be searched and in which order. In StrongForth, the search order is determined by a list of word list identifiers, which has been allocated in the data space:

9 CONSTANT #VOCS
1 VARIABLE #ORDER

FORTH-WORDLIST VARIABLE CONTEXT #VOCS 1- CELLS DATA-SPACE ALLOT

#VOCS is the maximum size of the list. No more than 9 word list identifiers can be stored in the list. #ORDER is a variable containing the actual number of word lists in the search order, which may be less than the maximum.

The existence of multiple word sets makes it necessary to update the word WORDS from the Programming-Tools word set, because so far this word always displays the words in the FORTH-WORDLIST word list. With the updated definition, WORDS displays the words in the word list on top of the search order:

: WORDS ( -- )
  CONTEXT @ (WORDS) ;

ANS Forth specifies the word GET-ORDER, which returns the search order in the following format:

( -- widn ... wid1 n )

This is an ambiguous stack diagram, because the number of word lists cannot be determined at compile time. Since ambiguous stack diagrams are not allowed in StrongForth, an alternative solution has to be found. And it's pretty obvious how this solution looks like, because a very similar problem has already been solved in connection with the implementation of SAVE-INPUT and RESTORE-INPUT (see chapter 11). Instead of a variable number of word lists and a count, StrongForth's version of GET-ORDER simply returns a tuple:

: GET-ORDER ( -- TUPLE -> WID )
  NEW-TUPLE -> WID #ORDER @
  IF CONTEXT DUP #ORDER @ + 1- DO I @ >T -1 +LOOP
  THEN ;

Note that the loop in the definition of GET-ORDER iterates from the last word list identifier to the first one. The first word list in the search order, which is stored at the first location of CONTEXT, has to be added as the last component of the tuple, because accessing the items of a tuple works like accessing the items on a stack. In contrast to GET-ORDER, the loop in the definition of SET-ORDER iterates in forward direction:

: SET-ORDER ( TUPLE -> WID -- )
  SIZE #VOCS >
  IF -49 THROW
  ELSE SIZE DUP #ORDER ! CONTEXT SWAP + CONTEXT ?DO T> I ! LOOP
  THEN DROP ;

SET-ORDER throws an exception if an attempt is made to store a search order from a tuple whose size is greater that the size of CONTEXT.

ANS Forth specifies that SET-ORDER shall restore the default search order if it is executed with -1 as the count. Of course, this is not possible with the StrongForth version of SET-ORDER, because tuples always have a non-negative size. For this special case, StrongForth provides an overloaded version of SET-ORDER that does not expect any input parameter at all. It establishes a search order consisting of only the FORTH-WORDLIST word list. Note that this special version has to be defined before the general version, because otherwise it would hide the general version in the dictionary.

: SET-ORDER ( -- )
  FORTH-WORDLIST CONTEXT ! 1 #ORDER ! ;

With GET-ORDER, the updated version of SEARCH-ALL can easily be implemented. The new version searches all word lists in the search order instead of just searching the FORTH-WORDLIST word list. Since SEARCH-ALL is a deferred definition, all words using it will immediately incorporate the extended semantics.

:NONAME ( CDATA -> CHARACTER UNSIGNED SINGLE CODE -- DEFINITION SIGNED )
  LOCALS| C S U A | GET-ORDER
  BEGIN SIZE
  WHILE T> >R A U S C R> SEARCH DUP 0=
  WHILE DROP DROP
  REPEAT >R >R DROP R> R>
  ELSE DROP NULL DEFINITION +0
  THEN ; IS SEARCH-ALL

Vocabularies

Since each word is linked to the previous one, the complete word list can be uniquely identified by a pointer to the latest word. If the pointer is stored as a double-cell variable in the data space, the word list identifier could be the address of this variable. That's a feasible solution. But since each word list is completely located either in the name space or in the local name space, it is not necessary to store the address segment in the data space and thus waste RAM space, which is pretty precious in embedded systems. It is sufficient to store only the address offset in the data space. This means, the word list identifier becomes a pointer to a data structure in the constant data space, which consists of a pointer to the offset and the memory space. Each memory space is associated with a unique segment. So here's the memory image of a word list, without the actual words in the name space or the local name space:

Constant data space:
pointer to data space
memory space

Data space:
pointer to name field of latest word

With the following word, which is included in the StrongForth Programming-Tools word set, you can determine the most recently created definition of a specific word list:

: LAST ( WID -- DEFINITION )
  CAST CONST -> DATA -> ADDRESS DUP @ @ DUP
  IF SWAP 1+ CAST CONST -> CODE -> ADDRESS @ 1+ @ MERGE
     CAST CFAR-ADDRESS -> UNSIGNED NAME>DEFINITION
  ELSE DROP DROP NULL DEFINITION
  THEN ;

The definition of LAST looks pretty awkward, because it needs three type casts. First, it calculates the address offset of the definition's name field from the word list identifier. If this offset is zero, the word list is empty and LAST simply returns a null definition. Otherwise, LAST fetches the address segment of the memory space the word list is stored in. Then, segment and offset are merged, and the resulting name field address is converted to a definition.

LAST is included in the Programming-Tools word set, because it is used by WORDS, which iterates through all the definitions of a word list starting at the most recently created word. Another word that uses LAST is also included in the Programming-Tools word set. FIRST returns the first definition of a word list by moving through the complete word list until it finds the word whose link field contains zero:

: FIRST ( WID -- DEFINITION )
  LAST DUP 0= INVERT
  IF BEGIN DUP PREV DUP 0= IF DROP EXIT THEN NIP
     AGAIN
  THEN ;

WORDLIST creates a new word list in the name space by compiling the memory image in the constant data space and in the data space. The address of the latest word is initialized with zero, because the new word list is still empty:

: WORDLIST ( -- WID )
  CONST-HERE SPACE@ DATA-SPACE HERE CONST, NULL ADDRESS ,
  NAME-SPACE SPACE@ CONST, SPACE! CAST WID ;

ANS Forth specifies the word FORTH as a Search-Order extension word for replacing the first word list in the search order with the FORTH-WORDLIST word list. With

: DO-VOCABULARY ( -- )
  DOES> ( CONST -> WID -- )
  #ORDER @ 0= IF -50 THROW THEN @ CONTEXT ! ;

and

NULL DEFINITION VARIABLE VOC-LINK

FORTH and two additional words for the other two predefined word lists, LOCAL-WORDLIST and ENVIRONMENT-WORDLIST, can easily be defined:

CONST-SPACE

CREATE FORTH FORTH-WORDLIST , VOC-LINK @ ,
DO-VOCABULARY LATEST VOC-LINK !

CREATE LOCAL LOCAL-WORDLIST , VOC-LINK @ ,
DO-VOCABULARY LATEST VOC-LINK !

CREATE ENVIRONMENT ENVIRONMENT-WORDLIST , VOC-LINK @ ,
DO-VOCABULARY LATEST VOC-LINK !

FORTH, LOCAL and ENVIRONMENT are called vocabularies. The data field of each vocabulary contains the identifier of the respective word list, plus a pointer to the previous vocabulary. The pointer of the most recently defined vocabulary is stored in the variable VOC-LINK. Linking all vocabularies makes it possible to find the vocabulary that belongs to a given word list. This is required for displaying the name of a vocabulary. With SEARCH and SEARCH-ALL, a vocabulary can only be found if the word list the vocabulary belongs to is included in the search order.

Additional word lists and corresponding vocabularies can be defined with VOCABULARY:

: VOCABULARY ( -- )
  WORDLIST CREATE CONST, VOC-LINK @ CONST, LATEST VOC-LINK !
  DO-VOCABULARY ;

Note that VOCABULARY does not define a word that returns the word list itself, like FORTH-WORDLIST, LOCAL-WORDLIST, and ENVIRONMENT-WORDLIST.

The Search-Order word set includes the definitions of two additional empty vocabularies:

VOCABULARY ASSEMBLER
VOCABULARY EDITOR

In the ANS Forth specification, these two vocabularies are included in the Programming-Tools word set, because the assembler and the editor are tools. However, since the definitions of the two vocabularies are meaningless without the Search-Order word set being available, StrongForth includes them in the Search-Order word set. If you want the assembler and editor definitions to reside in their own vocabularies, you have to load the Search-Order word set prior to loading the assembler and the editor.

Finally, here's the definition of the ANS Forth word DEFINITIONS, which makes the first word list of the search order the current compilation word list:

: DEFINITIONS ( -- )
  CONTEXT @ SET-CURRENT ;

Extension Words

The words presented so far provide a low-level frame work for handling the search order. ANS Forth specifies a proposal for the actual user interface as a collection of extension words. These are the corresponding definitions in StrongForth:

: ALSO ( -- )
  GET-ORDER T> DUP >R >T R> >T SET-ORDER ;

: ONLY ( -- )
  SET-ORDER ;

: PREVIOUS ( -- )
  GET-ORDER T> DROP SET-ORDER ;

: ORDER ( -- )
  ." CURRENT: " GET-CURRENT . CR
  ." CONTEXT: " GET-ORDER BEGIN SIZE WHILE T> . REPEAT DROP ;

Remember that GET-ORDER returns the search order as a tuple of word lists, and SET-ORDER stores a given tuple of word lists as the new search order. SET-ORDER without a parameter initializes the search order to its default.

The definition of ORDER uses an overloaded version of . for items of data type WID. . in turn uses an overloaded version of ?DEFINITION to find the vocabulary belonging to a word list by iterating through the linked list of vocabularies. If ?DEFINITION does not find a vocabulary that is associated with the given word list, it returns a null definition. In this case, . displays three question marks instead of a vocabulary name:

: ?DEFINITION ( WID -- DEFINITION )
  VOC-LINK @
  BEGIN OVER OVER >BODY -> WID @ <>
  WHILE >BODY 1 CELLS + -> DEFINITION @ DUP 0=
  UNTIL
  THEN NIP ;

: . ( WID -- )
  ?DEFINITION DUP 0=
  IF DROP ." ??? " ELSE NAME TYPE SPACE THEN ;

Updating MARKER

With the introduction of multiple word lists and a search order, the definition of MARKER needs to be extended. It is no longer sufficient just to save and restore the data space pointers, the current memory space, the pointer to the latest definition, and the pointers to the latest definitions in the three predefined word lists. The case has to be considered that a new word list and a new vocabulary are being created after the definition of a marker. The word list and the vocabulary will vanish from the dictionary when the marker is executed. This means that all references to them in the search order, the current compilation word list and the vocabulary link have to be removed as well. Furthermore, the pointers to the latest definitions of all word list that existed prior to the marker have to be restored. The structure to be saved by MARKER has grown:

constant data space pointer
data space pointer
code space pointer
name space pointer
local name space pointer
memory space
LATEST
#ORDER

CONTEXT

CURRENT
VOC-LINK

word list pointers

This structure is being created in the constant data space by a new word called (MARKER,):

: (MARKER,) ( -- )
  NULL DATA -> SINGLE DUP 8 + SWAP
  DO I @ CONST, LOOP
  #ORDER @ CONST,
  CONTEXT DUP #ORDER @ + SWAP
  DO I @ CONST, LOOP
  GET-CURRENT CONST,
  VOC-LINK @ DUP CONST,
  BEGIN >BODY -> CONST -> DATA -> ADDRESS DUP @ @ @ CONST,
     1+ CAST CONST -> DEFINITION @ DUP 0=
  UNTIL DROP ;
  
: MARKER ( -- )
  CONST-HERE (MARKER,)
  CREATE CONST, DOES> ( CONST -> CONST -> SINGLE -- )
  @ DUP NULL DATA -> SINGLE 8 MOVE 8 +
  CAST CONST -> UNSIGNED DUP @ #ORDER ! 1+
  CAST CONST -> WID DUP CONTEXT #ORDER @ MOVE #ORDER @ +
  DUP @ DUP GET-CURRENT = IF DROP ELSE SET-CURRENT THEN 1+
  CAST CONST -> DEFINITION DUP @ VOC-LINK ! 1+
  CAST CONST -> ADDRESS VOC-LINK @
  BEGIN >BODY -> CONST -> DATA -> ADDRESS DUP @ @ ROT DUP @
     ROT ! 1+ SWAP 1+ CAST CONST -> DEFINITION @ DUP 0=
  UNTIL DROP DROP ;

The extended version of MARKER uses (MARKER,) to save the structure during the definition of a marker. In the runtime part of MARKER, the various pointers are restored from the structure. Since the size of the structure is not constant, a pointer to the beginning of the structure has to be stored in the data field of the marker. Note that the structure itself has to be created prior to the definition of the marker, because the marker shall restore the state as is was before its own definition, i. e., it shall destroy itself.


Dr. Stephan Becher - February 8th, 2008