STM8 eForth Word List Extensions
In the talk "Forth WORDLISTs im Flash" at the 2018 convention of the German Forth Society, Manfred Mahlow presented a lightweight word list implementation using wid-tags in Flash memory for the embedded µC Forth systems Mecrisp-Stellaris and STM8 eForth. According to Mahlow, word lists in ROM were inspired by noForth, an MSP430-oriented µC Forth.
This wiki page provides an overview of the implementation for STM8 eForth. From STM8 eForth release 2.2.21 on, the Word List feature is delivered as a set of words in the `lib` folder.
The most important of these words is `CURRENT`. With the help of the e4thcom (or codeload.py) dependency management feature `#require`, `CURRENT` patches and extends an existing STM8 eForth binary.
Traditionally, contexts / word lists / namespaces are implemented in Forth as separate linked lists. This is memory efficient and easy to do for traditional Forth systems that keep the dictionary in RAM. For embedded Forth systems that maintain a single unified dictionary list in Flash, this isn't a good option.
Like traditional Forth systems, the new implementation still has a default word list called `FORTH`. The words of the `FORTH` word list are not tagged. All other words are tagged with the word list identifier (wid) of the word list they belong to.
Any unique non-zero number can be used as a tag (wid-tag). Zero is reserved for the `FORTH` word list.
This is easy to implement in Flash-based embedded Forth systems since adding and forgetting words doesn't require any internal garbage handling.
The Word List feature extends the dictionary with a wid (Word List Identifier) and a tag-bit in the length-encoding header byte in the following way:
tag-bit t=0 (untagged)
|Link|i,c,t,Length|Name|Body|
tag-bit t=1 (tagged)
|wid|Link|i,c,t,Length|Name|Body|
i: IMEDD, c: COMPO, t: TAGGE
The lexicon mask constants, used e.g. in `find`, now have the following values:
ID | Mask | Comment |
---|---|---|
IMEDD | 0x80 | lexicon immediate bit |
COMPO | 0x40 | lexicon compile only bit |
TAGGE | 0x20 | lexicon tag bit (previously unused in STM8 eForth) |
MASKK | 0x1F7F | lexicon bit mask |
Technically, "tagging a word" means "compile the wid in front of the word's link field and set the lexicon tag bit (`TAGGE = 0x20`) in the count byte at the word's name address".
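The flag layout of the count byte can be illustrated with a small Python model. It is only a sketch and not part of STM8 eForth: the function names `decode_count_byte` and `tag` are made up here, and the length mask `0x1F` is an assumption (flags in bits 7..5, name length in bits 4..0, as the `0x1F` in `MASKK` suggests).

```python
# Lexicon bits as listed in the table above.
IMEDD = 0x80     # immediate
COMPO = 0x40     # compile only
TAGGE = 0x20     # tagged: a wid field precedes the link field
LEN_MASK = 0x1F  # assumption: name length lives in the low five bits


def decode_count_byte(b):
    """Split a header count byte into its flags and the name length."""
    return {
        "immediate": bool(b & IMEDD),
        "compile_only": bool(b & COMPO),
        "tagged": bool(b & TAGGE),
        "length": b & LEN_MASK,
    }


def tag(b):
    """Set the tag bit, leaving the other flags and the length untouched."""
    return b | TAGGE


plain = 0x02          # a two-character name such as .S, no flags set
tagged = tag(plain)   # same length, tag bit set
print(decode_count_byte(plain))
print(decode_count_byte(tagged))
```

Because the tag occupies a bit of its own, tagging a word never changes the length or the other two flags.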
Loading the `CURRENT` library file with `#require` does the following:
- define the variable `CURRENT` that holds the wid for new definitions ("0" for word list `FORTH`)
- extend `$,n` to add a wid field and set the tag-bit of a new dictionary entry, unless `CURRENT` equals 0
- redefine `CONTEXT` as an array
- extend `find` and `WORDS` with tag-bit evaluation
- make `NAME?` respect a search order of tagged and untagged words
- make `UNIQUE?` search only in compilation context
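What the extended `$,n` does to a new dictionary entry can be sketched in Python. This is a model of the header layout shown earlier, not the actual STM8 eForth code: `make_header` is a hypothetical helper, the link and wid values are arbitrary, and the 16-bit big-endian width of the wid and link fields is an assumption.

```python
import struct

TAGGE = 0x20  # lexicon tag bit


def make_header(name, link, current=0):
    """Build |wid|Link|Length|Name| for a new word; the wid only if current != 0."""
    count = len(name)
    if not 0 < count < 0x20:
        raise ValueError("name must be 1..31 characters")
    header = b""
    if current != 0:
        header += struct.pack(">H", current)  # wid field in front of the link
        count |= TAGGE                        # mark the entry as tagged
    header += struct.pack(">H", link)         # link to the previous word
    header += bytes([count]) + name.encode("ascii")
    return header


# CURRENT = 0: plain FORTH entry, no wid field, tag bit clear
print(make_header(".S", link=0x8123).hex())
# CURRENT = some non-zero wid: wid field added, tag bit set
print(make_header(".S", link=0x8123, current=0x9425).hex())
```

In this model, untagged `FORTH` words keep their original layout, so existing dictionary entries need no changes.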
This extension adds 310 bytes to the Flash size and requires 6 bytes of RAM. Loading `CURRENT` patches the existing binary and then applies `PERSIST` to protect the changes (otherwise a `RESET` would render the patched Forth kernel unusable).
The optional library word `FORTH` sets the variable `CONTEXT` to 0 and thus provides a functionality well known from many other Forth systems.
Manfred Mahlow implemented three different flavors for using word lists in STM8 eForth: basic `WORDLIST`, traditional Forth-83 style `VOCABULARY`, and the novelty `VOC`.
`WORDLIST` implements the creation of word list identifiers (wid) to be used for the direct manipulation of `CURRENT` and `CONTEXT`.
The following example session demonstrates this feature:
#require WORDLIST
WORDLIST CONSTANT wid0 ok
wid0 CURRENT ! ok
CURRENT ? -27611 ok
: .S ( -- ) ." This is .S in the word list wid0 " .S ; ok
words
wid0 WORDLIST ?RAM CURRENT CONTEXT find IRET SAVEC ...
.S
<sp ok
The new word `WORDLIST` creates a unique ID (wid) for a word list (i.e. the address of a byte in the Flash dictionary). It is stored as the constant `wid0`, which is then assigned to `CURRENT`. The following redefinition of `.S` is tagged with that wid, so the new definition remains invisible to `WORDS` and can't be found (`.S` still runs the original code).
Assigning `wid0` to `CONTEXT` changes that:
wid0 CONTEXT ! ok
.S This is .S in the word list wid0
<sp ok
WORDS
.S ok
The new word list is now on top of the dictionary search order, so the new `.S` is visible, active, and hides the `.S` of the FORTH core. `WORDS` only ever shows the words of the `CONTEXT` word list, so the new `.S` is the only thing it shows here.
By resetting `CONTEXT` to 0 (that's the `FORTH` context), the original search order is restored.
0 CONTEXT !
WORDS
wid0 WORDLIST ?RAM CURRENT CONTEXT find IRET ...
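The behaviour seen in this session can be reproduced with a small Python model of a tagged dictionary. It is a sketch under assumptions drawn from the session output, not the real `find`: a lookup first tries the words tagged with the wid in `CONTEXT`, then falls back to the untagged `FORTH` words, and `words` lists only entries whose wid equals `CONTEXT`. The class and method names and the string stand-ins for word bodies are invented for illustration.

```python
FORTH = 0  # wid reserved for the untagged default word list


class Dictionary:
    def __init__(self):
        self.entries = []      # (name, wid, body), newest last
        self.current = FORTH   # wid given to new definitions
        self.context = FORTH   # wid searched first

    def define(self, name, body):
        self.entries.append((name, self.current, body))

    def find(self, name):
        """Search the CONTEXT word list first, then FORTH; newest entry wins."""
        order = [FORTH] if self.context == FORTH else [self.context, FORTH]
        for wid in order:
            for n, w, body in reversed(self.entries):
                if n == name and w == wid:
                    return body
        return None

    def words(self):
        """Names in the CONTEXT word list only, newest first."""
        return [n for n, w, _ in reversed(self.entries) if w == self.context]


d = Dictionary()
d.define(".S", "core .S")
d.define("WORDS", "core WORDS")

WID0 = 0x9425                 # printed as -27611 by CURRENT ? in the session
d.current = WID0
d.define(".S", "wid0 .S")

print(d.find(".S"))           # CONTEXT is FORTH: the tagged .S stays hidden
d.context = WID0
print(d.find(".S"), d.words())
print(d.find("WORDS"))        # untagged words are still reachable
d.context = FORTH
print(d.find(".S"))
```

All words share one list in this model, as in the unified Flash dictionary. Only the wid comparison decides which entries a lookup can see.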
The traditional method emulates the Forth-79 words `VOCABULARY`, `DEFINITIONS` and `ORDER`.
Installation of `VOCABULARY` also brings some helper words:
#require CURRENT
#require VOCABULARY
\ ...
WORDS
ORDER .VOC NVM VOCABULARY DEFINITIONS FORTH WIPE CURRENT CONTEXT find IRET ...
- `NVM` prevents compiling code to Flash while a vocabulary in RAM is active (`CURRENT` > 0)
- `ORDER` presents the search order for vocabularies defined by `CONTEXT`
- `.VOC` shows the name of a vocabulary, e.g. `CURRENT @ .VOC`
The following session demonstrates the VOCABULARY feature:
VOCABULARY myvoc ok
myvoc ok
words
ok
ORDER myvoc FORTH ok
DEFINITIONS
: .S ( -- ) ." This is .S in myvoc " .S ; ok
words
.S ok
.S This is .S in myvoc
<sp ok
FORTH ok
.S
<sp ok
ORDER FORTH FORTH ok
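The interplay of the vocabulary name, `DEFINITIONS`, `FORTH` and `ORDER` in this session can be mimicked with a toy Python state machine. It is a hedged sketch inferred from the output above, not the library source: it assumes the search order is always the `CONTEXT` vocabulary followed by `FORTH`, and all class and method names are hypothetical.

```python
class SearchState:
    """Toy model of CONTEXT and CURRENT as driven by the vocabulary words."""

    def __init__(self):
        self.names = {0: "FORTH"}   # wid -> vocabulary name
        self.context = 0            # top of the search order
        self.current = 0            # word list that receives new definitions

    def vocabulary(self, name):     # VOCABULARY <name>
        wid = len(self.names)       # any unique non-zero number will do
        self.names[wid] = name
        return wid

    def select(self, wid):          # executing a vocabulary's name
        self.context = wid

    def definitions(self):          # DEFINITIONS
        self.current = self.context

    def forth(self):                # FORTH
        self.context = 0

    def order(self):                # ORDER
        return f"{self.names[self.context]} FORTH"


s = SearchState()
myvoc = s.vocabulary("myvoc")
s.select(myvoc)
print(s.order())             # myvoc FORTH
s.definitions()
s.forth()
print(s.order())             # FORTH FORTH
print(s.names[s.current])    # myvoc
```

Note that `FORTH` only changes the search order in this model. New definitions would still go to `myvoc` until `DEFINITIONS` is executed again.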
Note that `VOCABULARY`, unlike the basic `WORDLIST` feature, also works for temporary vocabularies in RAM.
`VOC` brings the idea of "namespace prefixes" to embedded Forth. The rationale is to keep the vocabulary uncluttered and the source code readable, while using short and targeted words (e.g. `C!` for writing to a serial EEPROM).
It's even possible to create a hierarchy of namespaces (i.e. by defining a `VOC` prefix within another "namespace"):
#require i2c
i2c DEFINITIONS
VOC eeprom i2c eeprom DEFINITIONS
$50 CONSTANT sid
: C@ ( a -- c ) 1 i2c eeprom sid i2c read ;
: C! ( c a -- ) 2 i2c eeprom sid i2c write ;
FORTH DEFINITIONS
The prefixes `i2c` and `eeprom` are `IMMEDIATE` (executed at compile time). In the definition of `C!`, only the literals `2` and `sid` and the word `write` get compiled to the NVM. In a similar way, an I2C library might have "branches" and "leaves" for different types of I2C devices and their features (e.g. `i2c rtc`, `i2c port`, etc.).
The new words C@ and C! can now be used like this:
$AA 200 i2c eeprom C!
200 i2c eeprom C@ .
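How a chain of prefixes narrows the lookup can be pictured with nested Python dictionaries. This is a loose analogy, not the way `VOC` is implemented: it assumes each prefix only switches the search context for the token that follows it, and the `resolve` helper and the string stand-ins for word bodies are invented for the example.

```python
# Namespaces as nested dicts: a dict value is a prefix, a string is a word.
FORTH = {
    "C!": "core C!",
    "C@": "core C@",
    "i2c": {
        "read": "i2c read",
        "write": "i2c write",
        "eeprom": {
            "sid": "eeprom sid",
            "C!": "eeprom C!",
            "C@": "eeprom C@",
        },
    },
}


def resolve(source, root=FORTH):
    """Return what each word of a line resolves to, applying prefixes in turn."""
    found = []
    scope = root
    for token in source.split():
        entry = scope.get(token, token)  # unknown tokens stand for number literals
        if isinstance(entry, dict):
            scope = entry                # prefix: only affects the next token
        else:
            found.append(entry)
            scope = root                 # context reverts after one word
    return found


print(resolve("$AA 200 i2c eeprom C!"))
print(resolve("200 C@"))
print(resolve("1 i2c eeprom sid i2c read"))
```

The short names `C!` and `C@` exist twice without clashing, since the prefix chain decides which one a given use refers to.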
The `CURRENT` and the `VOC` front-end require a total of 599 bytes of Flash and 8 bytes of RAM. That's often too much for a Low Density device, but on an STM8 Medium Density or High Density device (16K .. 32K Flash ROM) it's really worth it!