
STM8 eForth Word List Extensions

Thomas edited this page Dec 22, 2018 · 12 revisions

Word List Extensions for STM8 eForth

In the talk "Forth WORDLISTs im Flash" at the 2018 convention of the German Forth Society, Manfred Mahlow presented a lightweight Word List implementation using wid-tags in Flash memory for the embedded µC Forth systems Mecrisp-Stellaris and STM8 eForth. According to Manfred Mahlow, word lists in ROM were inspired by noForth, an MSP430-oriented µC Forth.

This wiki page provides an overview of the implementation for STM8 eForth. From STM8 eForth release 2.2.21 on, the Word List feature is delivered as a set of words in the lib folder.

Word List Kernel Changes

The most important of these words is CURRENT. With the help of the e4thcom (or codeload.py) dependency management feature #require, CURRENT patches and extends an existing STM8 eForth binary.

Traditionally, contexts / word lists / namespaces are implemented in Forth as linked lists. This is memory efficient and easy to do for traditional Forth systems that keep the dictionary in RAM. For embedded Forth systems that maintain a unified dictionary list in Flash, this isn't a good option.

Like traditional Forth systems, the new implementation still has a default word list called FORTH. The words of the FORTH word list are not tagged. All other words are tagged with the word list identifier (wid) of the word list they belong to.

Any unique non-zero number can be used as a tag (wid-tag). Zero is reserved for the FORTH word list.

This is easy to implement in Flash based embedded Forth systems since adding and forgetting words doesn't require any internal garbage handling.

The Word List feature extends the dictionary entry with a wid (Word List Identifier) field and a tag-bit in the length-encoding header byte in the following way:

tag-bit t=0 (untagged)
               |Link|i,c,t,Length|Name|Body|

tag-bit t=1 (tagged)
           |wid|Link|i,c,t,Length|Name|Body|
i: IMEDD, c: COMPO, t: TAGGE

The lexicon mask constants, used e.g. in find, now have the following values:

ID     Mask    Comment
IMEDD  0x80    lexicon immediate bit
COMPO  0x40    lexicon compile only bit
TAGGE  0x20    lexicon tag bit (previously unused in STM8 eForth)
MASKK  0x1F7F  lexicon bit mask

Technically, "tagging a word" means "compile the wid in front of the word's link-field and set the lexicon tag bit (TAGGE = 0x20) in the count byte at the word's name address".
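The tagging rule can be illustrated with a small Python model. This is only a sketch, not the STM8 eForth code: the helper names (make_entry, wid_of, name_length), the dict layout and the example addresses are hypothetical, and the length mask 0x1F is assumed to be the count-byte half of MASKK = 0x1F7F.

```python
# Lexicon bits in the count byte, values taken from the table above.
IMEDD = 0x80     # immediate
COMPO = 0x40     # compile only
TAGGE = 0x20     # word belongs to a tagged word list
LEN_MASK = 0x1F  # assumption: low five bits of the count byte hold the name length


def make_entry(name, link, wid=0, immediate=False, compile_only=False):
    """Model one dictionary header as a dict (hypothetical helper)."""
    if not 1 <= len(name) <= LEN_MASK:
        raise ValueError("name length must fit in five bits")
    count = len(name)
    if immediate:
        count |= IMEDD
    if compile_only:
        count |= COMPO
    entry = {"link": link, "count": count, "name": name}
    if wid != 0:
        # Tagging: set the tag bit and store the wid in front of the link field.
        entry["count"] |= TAGGE
        entry["wid"] = wid
    return entry


def wid_of(entry):
    """Untagged words belong to FORTH, whose wid is 0."""
    return entry["wid"] if entry["count"] & TAGGE else 0


def name_length(entry):
    """Strip the lexicon bits to recover the name length."""
    return entry["count"] & LEN_MASK


plain = make_entry("DUP", link=0x8000)               # addresses are arbitrary examples
tagged = make_entry(".S", link=0x8100, wid=0x9425)
print(hex(plain["count"]), wid_of(plain))            # 0x3 0
print(hex(tagged["count"]), hex(wid_of(tagged)))     # 0x22 0x9425
```

In this model an untagged entry carries no wid field at all, which mirrors the point that words in the FORTH word list cost no extra memory.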

Loading the CURRENT library file with #require does the following:

  • define the variable CURRENT that holds the wid for new definitions ("0" for word list FORTH)
  • extend $,n to add a wid field and set the tag-bit of a new dictionary entry, unless CURRENT equals 0
  • redefine CONTEXT as an array
  • extend find and WORDS with tag-bit evaluation
  • make NAME? respect a search order of tagged and untagged words
  • make UNIQUE? search only in compilation context
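The effect of these changes on dictionary lookup can be sketched in Python. The model below is an assumption-laden illustration, not the actual kernel patch: CONTEXT is represented as a plain list of wids searched front to back, the dictionary is a list of dicts, and the function names (find, name_query, is_unique, words) are hypothetical stand-ins for find, NAME?, UNIQUE? and WORDS.

```python
FORTH_WID = 0  # zero is reserved for the FORTH word list


def find(dictionary, name, wid):
    """Return the newest entry called `name` in word list `wid`, else None."""
    for entry in reversed(dictionary):  # newest definitions first
        if entry["name"] == name and entry["wid"] == wid:
            return entry
    return None


def name_query(dictionary, name, context):
    """Model of NAME?: try each word list of the search order in turn."""
    for wid in context:
        entry = find(dictionary, name, wid)
        if entry is not None:
            return entry
    return None


def is_unique(dictionary, name, current):
    """Model of UNIQUE?: look only in the compilation word list."""
    return find(dictionary, name, current) is None


def words(dictionary, context):
    """Model of WORDS: list only the first word list of the search order."""
    return [e["name"] for e in reversed(dictionary) if e["wid"] == context[0]]


WID0 = 0x9425  # arbitrary example wid
dictionary = [
    {"name": ".S", "wid": FORTH_WID, "body": "core .S"},
    {"name": "DUP", "wid": FORTH_WID, "body": "core DUP"},
    {"name": ".S", "wid": WID0, "body": "wid0 .S"},
]

print(name_query(dictionary, ".S", [FORTH_WID, FORTH_WID])["body"])  # core .S
print(name_query(dictionary, ".S", [WID0, FORTH_WID])["body"])       # wid0 .S
print(name_query(dictionary, "DUP", [WID0, FORTH_WID])["body"])      # core DUP
print(words(dictionary, [WID0, FORTH_WID]))                          # ['.S']
print(is_unique(dictionary, "DUP", WID0))                            # True
```

Because all entries live in one list and are only filtered by wid, adding or forgetting words needs no per-list bookkeeping, which is the property that makes the scheme attractive for a Flash dictionary.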

This extension adds 310 bytes to the Flash size and requires 6 bytes of RAM. Loading CURRENT patches the existing binary and then applies PERSIST to protect the changes (otherwise a RESET would render the patched Forth kernel unusable).

The optional library word FORTH sets the variable CONTEXT to 0 and thus provides a functionality well known from many other Forth systems.

Using the Word List Feature

Manfred Mahlow implemented three different flavors for using word lists in STM8 eForth: basic WORDLIST, traditional Forth-83 style VOCABULARY, and the novelty VOC.

Basics with WORDLIST

WORDLIST implements the creation of word list identifiers (wid) to be used for the direct manipulation of CURRENT and CONTEXT.

The following example session demonstrates this feature:

#require WORDLIST
WORDLIST CONSTANT wid0 ok
wid0 CURRENT ! ok
CURRENT ? -27611 ok
: .S ( -- ) ."  This is .S in the word list wid0 " .S ; ok
words
 wid0  WORDLIST ?RAM CURRENT CONTEXT find IRET SAVEC ... 
.S
 <sp  ok

The new word WORDLIST creates a unique ID (wid) for a word list (i.e. the address of a byte in the Flash dictionary). Here it is stored as the constant wid0 and then assigned to CURRENT. The following redefinition of .S is tagged with that wid, so the new definition stays invisible to WORDS and can't be found: .S still executes the original code.
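The negative number printed by CURRENT ? in the session above is the wid shown as a signed 16-bit value. A short Python check, assuming a 16-bit cell, converts it back to the unsigned address:

```python
signed = -27611              # value printed by "CURRENT ?" in the session
address = signed & 0xFFFF    # reinterpret as an unsigned 16-bit cell
print(hex(address))          # 0x9425
```

The result lies above 0x8000, where the STM8 Flash program memory begins, which fits the description of the wid as an address in the Flash dictionary.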

Assigning wid0 to CONTEXT changes that:

wid0 CONTEXT ! ok
.S This is .S in the word list wid0 
 <sp  ok
WORDS
 .S ok

The new word list is now at the top of the dictionary search order: the new .S is visible and active, and it hides the .S of the FORTH core. WORDS only ever shows the words of the CONTEXT word list, so the new .S is the only word listed here.

By resetting CONTEXT to 0 (that's the FORTH context), the original search order is restored.

0 CONTEXT !
WORDS         
 wid0  WORDLIST ?RAM CURRENT CONTEXT find IRET ...

Old School VOCABULARY

The traditional method emulates the Forth-83 words VOCABULARY, DEFINITIONS and ORDER.

Installation of VOCABULARY also brings some helper words:

#require CURRENT
#require VOCABULARY
\ ...
WORDS
  ORDER .VOC NVM VOCABULARY DEFINITIONS FORTH WIPE CURRENT CONTEXT find IRET ...
  • NVM prevents compiling code to Flash while a vocabulary in RAM is active (CURRENT > 0)
  • ORDER presents the search order for vocabularies defined by CONTEXT
  • .VOC shows the name of a vocabulary, e.g. CURRENT @ .VOC.

The following session demonstrates the VOCABULARY feature:

VOCABULARY myvoc ok
myvoc ok
words
 ok
ORDER myvoc FORTH ok
DEFINITIONS
: .S ( -- ) ."  This is .S in myvoc " .S ; ok
words
 .S ok
.S This is .S in myvoc 
 <sp  ok
FORTH ok
.S
 <sp  ok
ORDER FORTH FORTH ok

Note that VOCABULARY, unlike the basic WORDLIST feature, also works for temporary vocabularies in RAM.

Namespace Prefixes with VOC

VOC brings the idea of "namespace prefixes" to embedded Forth. The rationale is to keep the vocabulary uncluttered and the source code readable, while using short and targeted words (e.g. C! for writing to a serial EEPROM).

It's even possible to create a hierarchy of namespaces (i.e. by defining a VOC prefix within another "namespace"):

#require i2c
i2c DEFINITIONS
VOC eeprom  i2c eeprom DEFINITIONS
  $50 CONSTANT sid
  : C@ ( a -- c )  1 i2c eeprom sid i2c read ;
  : C! ( c a -- )  2 i2c eeprom sid i2c write ;
FORTH DEFINITIONS

The prefixes i2c and eeprom are IMMEDIATE (they execute at compile time). In the definition of C!, only the literals 2 and sid and the word write get compiled to the NVM. In a similar way, an I2C library might have "branches" and "leaves" for different types of I2C devices and their features (e.g. i2c rtc, i2c port, etc.).
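How prefixes vanish from the compiled code can be modelled in a few lines of Python. This is a sketch under assumptions, not the VOC implementation: it assumes that a prefix switches the lookup context for the next non-prefix word only, and the function compile_tokens, the namespaces mapping and the dotted result names are all hypothetical.

```python
def compile_tokens(tokens, namespaces):
    """Resolve prefix words at compile time and return what would be compiled.

    `namespaces` maps a namespace path (a tuple of prefixes) to the set of
    names defined in that namespace.
    """
    compiled = []
    path = ()                                # no prefix active
    for tok in tokens:
        if path + (tok,) in namespaces:      # tok is a prefix visible from here
            path = path + (tok,)             # switch context, compile nothing
            continue
        if path and tok in namespaces[path]:
            compiled.append(".".join(path + (tok,)))  # word found via prefixes
        else:
            compiled.append(tok)             # literal or ordinary word
        path = ()                            # assumed: prefix applies to one word only
    return compiled


namespaces = {
    ("i2c",): {"read", "write", "eeprom"},
    ("i2c", "eeprom"): {"sid", "C@", "C!"},
}

body_of_c_store = "2 i2c eeprom sid i2c write".split()
print(compile_tokens(body_of_c_store, namespaces))
# ['2', 'i2c.eeprom.sid', 'i2c.write']
```

Six source tokens shrink to three compiled items in this model, matching the statement that only 2, sid and write end up in the definition of C!.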

The new words C@ and C! can now be used like this:

$AA 200 i2c eeprom C!
200 i2c eeprom C@ .

The CURRENT and the VOC front-end together require 599 bytes of Flash and 8 bytes of RAM. That's often too much for a Low Density device, but on an STM8 Medium Density or High Density device (16K .. 32K Flash ROM) it's really worth it!
