Chapter 4: Memory Spaces

What's in a Memory Space?

The ANS Forth specification mentions three different memory spaces: name space, code space and data space. In order to allow various existing Forth systems to be ANS Forth compliant, the semantics of these memory spaces are supposed to be implementation dependent.

Memory spaces should be clearly distinguished from memory areas, which were introduced in chapter 2. Memory areas are defined by the hardware of the system and the addressing modes of the underlying system. For example, the DATA memory area is always located in RAM, while the CODE and CONST memory areas may also be in ROM. CODE and CONST can be distinguished by the access mode of the processor, because CODE is used for fetching machine instructions and CONST for reading constant data.

Memory spaces, on the other hand, are defined by the software. Each memory space is a part of a specific memory area, with no overlapping. StrongForth provides five memory spaces. The following table shows in which memory area each of these five memory spaces is located.

Memory space         Memory area  Contents
Name space           -            Name and stack diagram of words
Code space           CODE         Executable machine code
Data space           DATA         Application and system variables, stacks, buffers
Constant data space  CONST        Application and system constants, virtual machine code
Local name space     DATA         Locals and other data structures used during compilation

Name Space

Each word contained in the StrongForth dictionary consists of several components that are distributed over different memory spaces. The name space contains those components that are required by the StrongForth interpreter to find and execute or compile a word, in particular its name and its stack diagram.

Note that the stack diagrams have to be stored in the dictionary, because otherwise it would be impossible to distinguish overloaded words sharing the same name. Details about the structure of words will be given in chapter 8.

The name space is not located in one of the predefined memory areas DATA, CONST or CODE. You can assume that it is located in a kind of unnamed memory area. This means that, on systems with banking or segmentation, the contents of the name space can only be accessed through full addresses of data type FAR-ADDRESS or CFAR-ADDRESS.

Code Space

As expected, the code space is located in the CODE memory area. It mostly contains executable machine code, which is used by the inner interpreter and by words that are explicitly defined as machine code words, like DUP, @, and +.

Data Space

The data space contains all read/write data required by the system and by applications. Since the DATA memory area is the only one that is always located in RAM, the data space must be located in this area. For example, variables and values are always allocated in the data space.

Constant Data Space

Since constant data structures do not require read/write access, they may be stored in the CONST memory area. The constant data space is the preferred location for data that remains unchanged by applications. This includes application and system constants as well as virtual machine code.

Something like the constant data space is not specified in ANS Forth. In StrongForth, it provides a means to store data in ROM instead of RAM. In embedded systems, the amount of RAM is often very limited, so it normally becomes a bottleneck. Therefore, it's a good idea to store constant data in ROM, which is usually not as scarce as RAM. Virtual machine code, for example, is usually generated at compilation time, and will not be modified during runtime.

Local Name Space

Name space, code space, data space and constant data space are organised as heaps that grow at compile time. However, some data structures that are created during compilation of a word become obsolete after the compilation is finished. These data structures include definitions of locals and status information for program loops and conditional clauses. The local name space is the preferred location for data structures that may be discarded after the compilation of a word. It is located in the DATA memory area.

Selecting a Memory Space

One of the five memory spaces is always the current memory space. The following words, which are described in detail in the next section, are always applied to the current memory space:

HERE
ALIGN
,
C,
ALLOT
UNUSED

To select one of the five memory spaces as the current memory space, you may use one of these words:

NAME-SPACE ( -- )
CODE-SPACE ( -- )
DATA-SPACE ( -- )
CONST-SPACE ( -- )
LOCAL-SPACE ( -- )

Let's try it out by printing the unused memory in the data space and in the constant data space:

DATA-SPACE UNUSED . CONST-SPACE UNUSED .
61800 49106  OK

Now, the constant data space is the current memory space. Any of the above-mentioned words will be applied to the constant data space until another memory space is selected.

Sometimes, it's necessary to change the memory space only temporarily to perform a certain operation, and then switch back to the previously selected memory space. For this purpose, the current memory space can be saved and restored by two special words:

SPACE@ ( -- MEMORY-SPACE )
SPACE! ( MEMORY-SPACE -- )

SPACE@ returns an abstract identifier of the current memory space, which has the data type MEMORY-SPACE. Using this identifier, SPACE! can restore the current memory space to what it was when SPACE@ was executed. Here's a typical application:

: CONST-UNUSED ( -- UNSIGNED )
  SPACE@ CONST-SPACE UNUSED SWAP SPACE! ;
 OK
DATA-SPACE CONST-UNUSED .
49092  OK

CONST-UNUSED returns the amount of unused space in the constant data space, independently of the current memory space. Furthermore, CONST-UNUSED does not change the current memory space, because it restores the current memory space after executing UNUSED.
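
The same save-and-restore pattern carries over to the other memory spaces. As a minimal sketch, assuming the words behave as described above, a data space counterpart could look like this (the name DATA-UNUSED is hypothetical and not a predefined StrongForth word):

```forth
\ Hypothetical helper, modelled on CONST-UNUSED above.
: DATA-UNUSED ( -- UNSIGNED )
  SPACE@          \ remember the current memory space
  DATA-SPACE      \ temporarily select the data space
  UNUSED          \ free address units in the data space
  SWAP SPACE! ;   \ restore the previously selected memory space
```

Like CONST-UNUSED, this word can be called no matter which memory space is currently selected, and it leaves that selection untouched.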

Operations on Memory Spaces

ANS Forth specifies a set of words that perform operations on data space. In StrongForth, these operations are always applied to the current memory space:

HERE ( -- ADDRESS )
ALIGN ( -- )
, ( DOUBLE -- )
, ( SINGLE -- )
C, ( SINGLE -- )
ALLOT ( INTEGER -- )
UNUSED ( -- UNSIGNED )

HERE returns the first unused address of the current memory space, which is usually referred to as the dictionary pointer. The data type of this address is just ADDRESS, because the current memory space is generally not known at compile time. The compiler cannot determine whether it is a DATA, a CODE or a CONST address. You as the programmer, however, should know which one is the current memory space at runtime. In most cases, HERE will be immediately followed by a type cast to an address within a specific memory area.

StrongForth also provides three special versions of HERE, which return the first unused address of the data space, the constant data space and the code space, respectively:

DATA-HERE ( -- DATA )
CONST-HERE ( -- CONST )
CODE-HERE ( -- CODE )

On systems with banking or segmentation, it's often necessary to have a version of HERE that returns the full address:

FAR-HERE ( -- FAR-ADDRESS )

FAR-HERE must be used if the name space is the current memory space, because the name space is not located in one of the predefined memory areas and can only be accessed with full addresses. Furthermore, FAR-HERE has to be used in all cases where the address must stay independent of the current memory space, for example because the current memory space is unknown at compile time even to the programmer.

ALIGN works exactly the same way as specified in ANS Forth. Note that ALIGN always applies to the current memory space.

Although ALIGNED has nothing to do with memory spaces, it is presented here because of its close relationship to ALIGN. ALIGNED is available for all items of data types ADDRESS and FAR-ADDRESS, and their respective subtypes:

ALIGNED ( ADDRESS -- 1ST )
ALIGNED ( FAR-ADDRESS -- 1ST )

, and C, also work exactly like the respective ANS Forth versions. They can both be applied to items of data type SINGLE and its subtypes. For items of data type DOUBLE and its subtypes, an additional overloaded version of , is provided. C, cannot be applied to double-cell items. , and C, both apply to the current memory space. To compile single-cell and double-cell items directly into the constant data space, without the need to temporarily change the current memory space, the following two words can be used:

CONST, ( SINGLE -- )
CONST, ( DOUBLE -- )
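
To illustrate what CONST, saves you, here is a sketch of how the single-cell version could be expressed with the words introduced so far. This is an assumption about equivalent behaviour, not StrongForth's actual definition, and the name MY-CONST, is hypothetical:

```forth
\ Hypothetical equivalent of the single-cell CONST, (illustration only).
: MY-CONST, ( SINGLE -- )
  SPACE@ SWAP     \ save the current memory space below the item
  CONST-SPACE ,   \ compile the item into the constant data space
  SPACE! ;        \ restore the previously selected memory space
```

With CONST, available, none of this bookkeeping is needed in application code.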

Just like HERE, ALLOT is a low-level word in the sense that it doesn't consider StrongForth's type system. Since ALLOT allocates address units, it is up to you to take care of the size of the items you want to reserve space for. The preferred way to do this is by using CHARS and CELLS.

7 CHARS ALLOT ALIGN

allocates space for 7 character-size items in the current memory space and then realigns the memory space.

5 2* CELLS ALLOT

allocates space for 5 double-cell items, without the need to realign the memory space.

As specified by ANS Forth, CHARS and CELLS accept any single number and convert it to the number of address units as required for ALLOT:

CHARS ( INTEGER -- 1ST )
CELLS ( INTEGER -- 1ST )

Finally, UNUSED returns the amount of free memory in the current memory space in address units. This is an unsigned number, because the amount of free memory can never be negative.
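
A short interactive session can make the relationship between ALLOT and UNUSED visible. This is only a sketch built from the words described in this chapter; the printed numbers depend on the system and are therefore not shown:

```forth
DATA-SPACE      \ make the data space the current memory space
UNUSED .        \ free address units before the allocation
10 CELLS ALLOT  \ reserve space for 10 single-cell items
UNUSED .        \ free address units after the allocation
```

The second number should be smaller than the first by the number of address units that 10 CELLS returns. Keep in mind that the space reserved by ALLOT stays allocated after the experiment.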


Dr. Stephan Becher - February 4th, 2008