STM8 eForth Word List Extensions
At the Convention of the German Forth Society 2018, Manfred Mahlow gave the talk "Forth WORDLISTs im Flash", where he presented a lightweight Word List implementation using wid-tags in Flash memory for Mecrisp-Stellaris and STM8 eForth. According to the author, "word lists" in ROM were inspired by noForth, an MSP430 oriented µC-Forth.
This wiki page provides an overview of the implementation for STM8 eForth. From STM8 eForth release 2.2.21 on, the Word List feature is delivered as a set of words in the `lib` folder that can be loaded with e4thcom or with codeload.py.
The word `CURRENT` builds the infrastructure for "word list" extensions. With the help of the e4thcom (or codeload.py) dependency management feature `#require`, `CURRENT` patches and extends an existing STM8 eForth binary (note that patch-points are exported by the STM8 eForth kernel and test automation in STM8 eForth ensures the integrity of the patch).
Traditionally, contexts / word lists / namespaces are implemented in Forth as separate linked lists. This is memory efficient and easy to do for traditional Forth systems that keep the dictionary in RAM. For small embedded Forth systems that keep a unified dictionary in Flash this isn't the best option: a single linked list is easier to implement in a Flash-based embedded Forth system, since adding and forgetting words doesn't require freeing up memory inside a list.
Instead, the `CURRENT` patch adds a "word list identifier" (the `wid`-tag) that indicates the word list a dictionary entry belongs to.
In order to keep compatibility with the dictionary in the ROM, a flag in the word header (akin to the `IMMEDIATE` or the `COMPONLY` flags) indicates that a word has a `wid` field. Like traditional Forth systems, the new "word list" implementation has a default word list called `FORTH`. Words without a `wid` field belong to the `FORTH` word list. Likewise, the `wid` field value 0 (zero) stands for the `FORTH` word list. Any other unique non-zero number can be used as a `wid`-tag.
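The idea can be modelled outside of Forth. The following Python sketch is an illustration only, not the STM8 eForth code: the `Entry` class, the `find` helper and the wid value `0x9425` are assumptions made for this example. It shows a single linked list in which an entry may carry a wid, and a lookup that only accepts entries of one word list.

```python
from dataclasses import dataclass
from typing import Optional

FORTH_WID = 0  # wid 0 stands for the default FORTH word list


@dataclass
class Entry:
    """One dictionary entry in a single linked list (hypothetical model)."""
    name: str
    link: Optional["Entry"]      # previous entry, None at the end of the chain
    wid: Optional[int] = None    # None models an untagged entry (no wid field)

    def wordlist(self) -> int:
        # Untagged entries and entries tagged with 0 both belong to FORTH.
        return FORTH_WID if self.wid is None else self.wid


def find(last: Optional[Entry], name: str, wid: int) -> Optional[Entry]:
    """Walk the chain from the newest entry; return the first match in `wid`."""
    entry = last
    while entry is not None:
        if entry.name == name and entry.wordlist() == wid:
            return entry
        entry = entry.link
    return None


# Two untagged kernel words, then one word tagged with an example wid.
MY_WID = 0x9425  # arbitrary non-zero example value
dup = Entry("DUP", None)
dot_s = Entry(".S", dup)
my_dot_s = Entry(".S", dot_s, wid=MY_WID)
last = my_dot_s

print(find(last, ".S", FORTH_WID) is dot_s)   # True: kernel .S is in FORTH
print(find(last, ".S", MY_WID) is my_dot_s)   # True: tagged .S is in its own list
print(find(last, "DUP", MY_WID))              # None: DUP is not part of MY_WID
```

All three entries live in one chain. Which word list an entry belongs to is decided by the tag alone, so adding or forgetting a word never requires relinking a second list.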
The Word List feature extends the dictionary with a wid (Word List Identifier) and a tag-bit in the length-encoding header byte in the following way:
tag-bit t=0 (untagged)
|Link|i,c,t,Length|Name|Body|
tag-bit t=1 (tagged)
|wid|Link|i,c,t,Length|Name|Body|
i: IMEDD, c: COMPO, t: TAGGE
The lexicon mask constants, used e.g. in `find`, now have the following values:
ID | Mask | Comment |
---|---|---|
IMEDD | 0x80 | lexicon immediate bit |
COMPO | 0x40 | lexicon compile only bit |
TAGGE | 0x20 | lexicon tag bit (previously unused in STM8 eForth) |
MASKK | 0x1F7F | lexicon bit mask |
Technically "tagging a word" means "compile the `wid-id` field in front of the word's link-field and set the lexicon tag bit (`TAGGE = 0x20`) in the count byte at the word's name address".
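The header layout and the flag bits can be exercised in a small model. The Python sketch below is not taken from the STM8 eForth sources: `make_header` and `decode_count` are hypothetical helpers, and it assumes that wid and link are 16-bit big-endian cells and that the five bits below `TAGGE` hold the name length.

```python
import struct

IMEDD = 0x80        # lexicon immediate bit
COMPO = 0x40        # lexicon compile only bit
TAGGE = 0x20        # lexicon tag bit
LENGTH_MASK = 0x1F  # assumption: the remaining five bits hold the name length


def make_header(name, link, wid=None, immediate=False, compile_only=False):
    """Return the header bytes |wid|Link|count|Name| (wid only when tagged).

    Assumption for this sketch: wid and link are 16-bit big-endian cells.
    """
    raw = name.encode("ascii")
    if not 0 < len(raw) <= LENGTH_MASK:
        raise ValueError("name length must be 1..31")
    count = len(raw)
    if immediate:
        count |= IMEDD
    if compile_only:
        count |= COMPO
    header = b""
    if wid is not None:
        count |= TAGGE                     # mark the entry as tagged
        header += struct.pack(">H", wid)   # wid sits in front of the link field
    header += struct.pack(">H", link) + bytes([count]) + raw
    return header


def decode_count(count):
    """Split a count byte into its flags and the name length."""
    return {
        "immediate": bool(count & IMEDD),
        "compile_only": bool(count & COMPO),
        "tagged": bool(count & TAGGE),
        "length": count & LENGTH_MASK,
    }


untagged = make_header("DUP", link=0x8123)
tagged = make_header(".S", link=0x9000, wid=0x9425, immediate=True)

print(untagged.hex())            # 812303445550
print(tagged.hex())              # 94259000a22e53
print(decode_count(tagged[4]))   # the count byte follows wid and link
```

In the tagged header the count byte is `0xA2`: length 2 combined with the immediate bit and the tag bit. The untagged header is two bytes shorter because it has no wid field.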
Loading the `CURRENT` library file with `#require` does the following:
- define the variable `CURRENT` that holds the wid for new definitions ("0" for word list `FORTH`)
- extend `$,n` to add a wid field and set the tag-bit of a new dictionary entry, unless `CURRENT` equals 0
- redefine `CONTEXT` as an array
- extend `find` and `WORDS` with tag-bit evaluation
- make `NAME?` respect a search order of tagged and untagged words
- make `UNIQUE?` search only in compilation context
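How these pieces interact can be sketched in Python. This is a toy model under stated assumptions, not the patched kernel: the `Dictionary` class and its method names are invented for the example, and the search order is represented as a plain Python list because the layout of the `CONTEXT` array is not described here.

```python
FORTH = 0  # wid of the default word list


class Dictionary:
    """Toy model of a tagged dictionary (not the STM8 eForth code)."""

    def __init__(self):
        self.entries = []        # (name, wid) pairs, oldest first
        self.current = FORTH     # word list that receives new definitions
        self.context = [FORTH]   # search order, searched left to right

    def define(self, name):
        """Like the extended `$,n`: the entry carries the wid in CURRENT.

        In the real header the wid field is omitted when CURRENT is 0;
        here 0 is simply stored as the wid of the FORTH word list.
        """
        self.entries.append((name, self.current))

    def find_in(self, name, wid):
        """Newest-first search restricted to one word list."""
        for index in range(len(self.entries) - 1, -1, -1):
            if self.entries[index] == (name, wid):
                return index
        return None

    def name_lookup(self, name):
        """Like `NAME?`: try the word lists of the search order in turn."""
        for wid in self.context:
            index = self.find_in(name, wid)
            if index is not None:
                return index
        return None

    def is_unique(self, name):
        """Like `UNIQUE?`: only the compilation word list is checked."""
        return self.find_in(name, self.current) is None


d = Dictionary()
d.define("DUP")
d.define(".S")                        # entries 0 and 1 belong to FORTH
d.current = 0x9425                    # arbitrary non-zero wid for new definitions
unique_before = d.is_unique(".S")     # no .S in word list 0x9425 yet
d.define(".S")                        # entry 2, tagged with 0x9425
found_default = d.name_lookup(".S")   # only FORTH is searched
d.context = [0x9425, FORTH]
found_tagged = d.name_lookup(".S")    # the tagged word list comes first
found_dup = d.name_lookup("DUP")      # falls through to FORTH
print(unique_before, found_default, found_tagged, found_dup)  # True 1 2 0
```

Restricting the uniqueness check to the compilation word list is what allows a word such as `.S` to be redefined in another word list without colliding with the kernel word of the same name.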
This extension adds 310 bytes to the Flash size, and requires 6 bytes of RAM. Applying `CURRENT` patches the existing binary and then applies `PERSIST` to protect the changes (otherwise a `RESET` would render the patched Forth kernel unusable).
The optional library word `FORTH` sets the variable `CONTEXT` to 0 and thus provides a functionality well known from traditional Forth systems.
Manfred Mahlow implemented three different flavors of using word lists in STM8 eForth: basic `WORDLIST`, traditional Forth-83 style `VOCABULARY`, and the novelty `VOC`.
`WORDLIST` applies word list identifiers by implementing a direct manipulation of `CURRENT` and `CONTEXT`.
The following example session demonstrates this feature:
#require WORDLIST
WORDLIST CONSTANT wid0 ok
wid0 CURRENT ! ok
CURRENT ? -27611 ok
: .S ( -- ) ." This is .S in the word list wid0 " .S ; ok
words
wid0 WORDLIST ?RAM CURRENT CONTEXT find IRET SAVEC ...
.S
<sp ok
The new word `WORDLIST` creates a unique ID (wid) for a word list (i.e. the address of a byte in the Flash dictionary). It is stored as the constant `wid0`, which gets assigned to `CURRENT`. The following re-definition of `.S` uses that wid, with the effect that the new definition remains invisible (`WORDS` doesn't list it) and can't be found (`.S` still executes the original code).
Assigning `wid0` to `CONTEXT` changes that:
wid0 CONTEXT ! ok
.S This is .S in the word list wid0
<sp ok
WORDS
.S ok
The new word list is now on top of the dictionary search order: the new `.S` is visible and active, and it hides the word `.S` in the FORTH core. `WORDS` only ever presents the words in the `CONTEXT` word list, so the new `.S` is the only thing it shows here.
By resetting `CONTEXT` to 0 (that's the `FORTH` context), the original search order is restored.
0 CONTEXT !
WORDS
wid0 WORDLIST ?RAM CURRENT CONTEXT find IRET ...
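The behaviour seen in this session can be replayed against a small model. The Python sketch below is illustrative only: the entry list, the value of `WID0` and the helpers `words` and `lookup` are assumptions. It also assumes that lookups fall back to the FORTH word list after the `CONTEXT` word list, which matches the session, where core words stay usable while `CONTEXT` holds `wid0`.

```python
FORTH = 0
WID0 = 0x9425  # stand-in for the value returned by WORDLIST (assumption)

# (name, wid) pairs, oldest first; wid 0 marks words of the FORTH word list.
entries = [
    (".S", FORTH), ("IRET", FORTH), ("find", FORTH), ("CONTEXT", FORTH),
    ("CURRENT", FORTH), ("WORDLIST", FORTH), ("wid0", FORTH),
    (".S", WID0),  # the re-definition compiled while CURRENT held wid0
]


def words(context):
    """Names of the CONTEXT word list only, newest first (model of WORDS)."""
    return [name for name, wid in reversed(entries) if wid == context]


def lookup(name, context):
    """Search the CONTEXT word list first, then fall back to FORTH."""
    for wid in (context, FORTH):
        for index in range(len(entries) - 1, -1, -1):
            if entries[index] == (name, wid):
                return index
    return None


print(words(FORTH))          # the new .S is not listed
print(lookup(".S", FORTH))   # 0: the original .S
print(words(WID0))           # ['.S']
print(lookup(".S", WID0))    # 7: the new .S hides the original
print(lookup("find", WID0))  # 2: core words are still found
```

Switching the context value back to `FORTH` restores both the listing and the lookup result, mirroring `0 CONTEXT !`.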
The traditional method emulates the Forth-79 words `VOCABULARY`, `DEFINITIONS` and `ORDER`.
Installation of `VOCABULARY` on a fresh STM8 eForth binary also brings some helper words:
#require VOCABULARY
\ ...
WORDS
ORDER .VOC NVM VOCABULARY DEFINITIONS FORTH WIPE CURRENT CONTEXT find IRET ...
The following words are provided:
- a new `NVM` prevents compiling code to flash while a vocabulary in RAM is active (`CURRENT` > 0)
- `ORDER` presents the search order for vocabularies defined by `CONTEXT`
- `.VOC` shows the name of a vocabulary, e.g. `CURRENT @ .VOC`.
The following session demonstrates the VOCABULARY feature:
VOCABULARY myvoc ok
myvoc ok
words
ok
ORDER myvoc FORTH ok
DEFINITIONS
: .S ( -- ) ." This is .S in myvoc " .S ; ok
words
.S ok
.S This is .S in myvoc
<sp ok
FORTH ok
.S
<sp ok
ORDER FORTH FORTH ok
Note that `VOCABULARY`, unlike the basic `WORDLIST` feature, can also be used with temporary vocabularies in RAM.
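The interplay of a vocabulary name, `DEFINITIONS`, `FORTH` and `ORDER` in the session above can be summarised in a short model. This Python sketch is an interpretation, not the library code: the `Vocabularies` class and its method names are invented, and the two-slot search order is inferred from the `ORDER` output shown in the session (`myvoc FORTH`, later `FORTH FORTH`).

```python
FORTH = 0  # wid of the default vocabulary


class Vocabularies:
    """Toy model of the VOCABULARY front-end (names and layout are assumptions)."""

    def __init__(self):
        self.names = {FORTH: "FORTH"}
        self.context = [FORTH, FORTH]  # two slots, as printed by ORDER in the session
        self.current = FORTH

    def vocabulary(self, name, wid):
        """Register a vocabulary; `wid` stands in for its dictionary address."""
        self.names[wid] = name
        return wid

    def select(self, wid):
        """Executing a vocabulary name puts it on top of the search order."""
        self.context[0] = wid

    def definitions(self):
        """DEFINITIONS: new words go to the vocabulary on top of the order."""
        self.current = self.context[0]

    def order(self):
        """ORDER: names of the vocabularies in the search order."""
        return " ".join(self.names[wid] for wid in self.context)

    def dot_voc(self, wid):
        """.VOC: name of a vocabulary."""
        return self.names[wid]


v = Vocabularies()
myvoc = v.vocabulary("myvoc", 0x9425)   # VOCABULARY myvoc
v.select(myvoc)                         # myvoc
print(v.order())                        # myvoc FORTH
v.definitions()                         # DEFINITIONS
print(v.dot_voc(v.current))             # myvoc
v.select(FORTH)                         # FORTH
print(v.order())                        # FORTH FORTH
```

In this model `FORTH` only changes the search order. The compilation vocabulary stays `myvoc` until `DEFINITIONS` is executed again, which is why the `VOC` example further down ends with `FORTH DEFINITIONS`.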
`VOC` brings the idea of "namespace prefixes" to embedded Forth. The rationale is to keep the vocabulary uncluttered and the source code readable, while using short and targeted words (e.g. keep using `C!` for writing a byte to a serial EEPROM).
It's even possible to create a hierarchy of namespaces by defining a `VOC` prefix within another "namespace":
#require i2c
i2c DEFINITIONS
VOC eeprom i2c eeprom DEFINITIONS
$50 CONSTANT sid
: C@ ( a -- c ) 1 i2c eeprom sid i2c read ;
: C! ( c a -- ) 2 i2c eeprom sid i2c write ;
FORTH DEFINITIONS
The prefixes `i2c` and `eeprom` are `IMMEDIATE` (i.e. executed while compiling). In the definition of `C!` above, the literals `2` and `sid`, and the word `write` get compiled to the Flash ROM (NVM). By defining different prefixes an I2C library can have "branches" and "leaves" for different I2C devices (e.g. `i2c rtc`, `i2c port`, etc).
The new words C@ and C! can now be used like this:
$AA 200 i2c eeprom C!
200 i2c eeprom C@ .
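How a chain of prefixes selects a word can be illustrated with a small resolver. The Python sketch below is a model under assumptions, not the `VOC` implementation: the `NAMESPACES` table and the `resolve` function are invented, and it assumes that a prefix changes the lookup context only for the token that follows it and that unknown tokens are retried in FORTH before being treated as numbers.

```python
# Namespaces as nested lookup tables; a ("voc", name) value marks a prefix.
NAMESPACES = {
    "FORTH": {"C@": "FORTH C@", "C!": "FORTH C!", "i2c": ("voc", "i2c")},
    "i2c": {"read": "i2c read", "write": "i2c write",
            "eeprom": ("voc", "i2c eeprom")},
    "i2c eeprom": {"sid": "eeprom sid", "C@": "eeprom C@", "C!": "eeprom C!"},
}


def resolve(source):
    """Resolve a line of tokens; a prefix only affects the next token."""
    resolved = []
    context = "FORTH"
    for token in source.split():
        target = NAMESPACES[context].get(token)
        if target is None:
            # Assumption: unknown tokens are retried in FORTH, then taken as numbers.
            target = NAMESPACES["FORTH"].get(token, "literal " + token)
        if isinstance(target, tuple):
            context = target[1]      # prefix: switch the context, emit nothing
        else:
            resolved.append(target)
            context = "FORTH"        # back to the default after one word
    return resolved


print(resolve("$AA 200 i2c eeprom C!"))
# ['literal $AA', 'literal 200', 'eeprom C!']
print(resolve("$AA 200 C!"))
# ['literal $AA', 'literal 200', 'FORTH C!']
print(resolve("1 i2c eeprom sid i2c read"))
# ['literal 1', 'eeprom sid', 'i2c read']
```

The same short name `C!` resolves to two different words depending on the prefix in front of it, which is the point of keeping device-specific words out of the default vocabulary.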
The `VOC` front-end and the generic `CURRENT` infrastructure require a total of 599 bytes of Flash and 8 bytes of RAM. For applications in STM8 "Low density" devices this overhead might be too high, but for STM8 "Medium" or "High density" devices (e.g. with 32K Flash ROM) please go ahead and run `#require VOC` in e4thcom after flashing the image - it's really worth it!