[hemmerling] FORTH 5/7

Implementation of your own FORTH System & Applications

FORTH - The Good, the Bad and the Ugly

Is this a Forth at all? :-)

The Good - Valuable OpenSource FORTHs

⇒ See FORTH 3/4.

The Bad - Commercial FORTH Systems

  1. No source code. This means you can't study the implementation of FORTH words & libraries for a better understanding. The application developers are lost!
  2. Compatibility problems, and nobody takes care of it. The application developers are lost!

The Ugly - Toy FORTH

  1. Not being developed by
    1. Senior developers with professional skills in compiler design, embedded systems,.. with 5+y experience ( e.g. college / university professors, engineers with much professional industrial experience,... ).
    2. Senior “assembly language artists” :-) with 5+y of experience.
  2. Without free and immediate / fast / quick support on professional level.
  3. Without comprehensive paper / PDF / HTML,... documentation.
  4. Without comprehensive comments in the source code.
  5. Without much sample code & sample projects.
  6. Which might be “not in active development anymore”.
  7. Where the developer gets angry if his / her FORTH is considered to be a “Toy FORTH” :-(.
  8. Do most of the 1000 FORTHs have “more than one user” ? Or is it typical, that either
    1. The FORTH has no killer-application at all ( the only developer develops for all the years just FORTH, not a software solution for a real given application / problem ),
    2. Or the only developer has ONE software / hardware application for which he/she is using the self-developed FORTH with his self-developed software application code on top of the bare FORTH implementation?

Valuable FORTH vs. Toy FORTH

  • User experience and feedback: “If you have a good FORTH it is great fun to talk to the computer. Otherwise it is also torture :-(”.
  • User opinions:
    • “You can't write your own FORTH as rookie in compiler design & assembly language coding. That's all stupid what these 'Toy FORTH' developers tell you, that they can do it. Of course you can write a FORTH of 35 words yourself, but what's the benefit?”.
    • “Maybe there are '10000 FORTHs' on GitHub, but as you can see I can't name more than 10 that are really useful. So for me there are 9990 Toy FORTHs ... :-)”.
      • User #1 opinion: “99 % of all Forths I know are simple toys, meaning unfinished projects. They implement the basic VMachine and then basta. They do not offer a simple editor embedded in their Forths (which in the past was taken for granted), and they do not offer an assembler or disassembler :-(. Old-style Forths, such as the ones I learned from in the past, had all those tools and much more. They offered multitasking right from the start, vocabularies, block-file or OS-file interfaces, direct video access, etc. :-)”.
      • User #2 opinion: “Forth has @ and !. You've missed the OP's point; it's the underlying architecture (for instance, Java and the JVM) that lacks direct memory access through pointers. That makes it hard implementing Forth on these types of systems”.
      • User #3 opinion: “Java has ByteBuffers, which can either be based on a Java byte array or on a directly-mapped file. That's everything a Forth needs”... “But I thought the JVM was sandboxed?”, “It is. ByteBuffers can only reach into the memory allocated for that buffer, but that's all that a Forth implementation needs. You could put the entire dictionary into a ByteBuffer. That'd be nice from the point of view of saving a precompiled image”.
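The buffer idea from the last two opinions can be sketched in a few lines. This is only an illustration in Python, not Java: a `bytearray` stands in for the ByteBuffer, and the names `memory`, `store`, `fetch` and `save_image`, the 4-byte cell and the 64 KiB size are all assumptions made for this sketch.

```python
CELL = 4                          # assumed cell size in bytes
memory = bytearray(64 * 1024)     # assumed size; holds dictionary and data

def _check(addr):
    # Like a sandboxed buffer, refuse any access outside the allocation.
    if not 0 <= addr <= len(memory) - CELL:
        raise IndexError(f"address {addr:#x} outside Forth memory")

def store(value, addr):
    """Forth '!': write one cell at addr (little-endian, wrapped to 32 bits)."""
    _check(addr)
    memory[addr:addr + CELL] = (value & 0xFFFFFFFF).to_bytes(CELL, "little")

def fetch(addr):
    """Forth '@': read one cell from addr."""
    _check(addr)
    return int.from_bytes(memory[addr:addr + CELL], "little")

def save_image(path):
    """Dump the whole buffer to a file, e.g. as a precompiled image."""
    with open(path, "wb") as f:
        f.write(memory)

store(1234, 0x100)
print(fetch(0x100))
```

Because every address is just an offset into one buffer, `@` and `!` work without real pointers, and saving an image is a single write of that buffer.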

Write your own FORTH Compiler

Slogan "Roll your own programming language - Roll your own Forth"

  • Inspired by the articles in Dr. Dobb's Journal of Software Tools “Roll your own DOS extender” ( Volume 15, Number 10, October 1990 + Number 11, November, 1990 ) and “Roll your own object-oriented language” ( Volume 15, Number 11, November, 1990 ), and online articles like InfoWorld "Learn Linux by rolling your own distro", 2017, I imagined the slogan “Roll your own programming language - Roll your own Forth” :-).

Implementation Types

Threaded Code
Tethered FORTH
  • Interactivity is done only on the host. The tethered Forth is just the compiled code.
  • High-level languages ( C, Lua,.. ) might be involved, for rapid development.
Native FORTH with Threaded Code
  • 100% pure assembly language. No high-level language involved ( C, Lua,.. ). This is how FORTH started, in the mid-1970s, with the first implementations for 8-bit microprocessor boards.
Native tethered FORTH
SRT / NCI FORTH ( Subroutine threaded / native Code inlining FORTH )
  • SRT/NCI means “subroutine threaded / native code inlining”.
    • “Subroutine threaded” specifically means that the code overall consists of subroutine calls to other words written as native code.
    • “Native code inlining” means that smaller called words that can be inlined are incorporated as native code into the body of the code that calls them, eliminating the need for subroutine calls in those cases.
  • Example: zeptoforth.
FORTH which may create Application Executables
  • Examples: “Win32Forth”, “calForth”.
Resources
  • BYTE Magazine.
    • Harris, K.: “FORTH Extensibility: Or How to Write a Compiler in 25 Words or Less”, BYTE (August 1980), pp. 164-184.
  • Narkive Archive "comp.lang.forth", Thread "TTC v DTC" - “token-threaded code versus direct-threaded code”.
  • Suggested reading: “Moving Forth: a series on writing Forth kernels”.
  • An expert told me:
    • “Basic forth is all real mode. Mixed code and data”.
    • “That's not the only way to do it though. I've started doing Forth in web assembly which is part of the JavaScript engine of all browsers. This is a protected memory model which means you can't execute data directly. In those cases your pointers are really just offsets into a table of primitive functions. Forth runs completely fine in this model”.
    1. “ALForth” ⇒ See Forth 4/6.
    2. “f2u” ⇒ See Forth 4/6.
    3. “My Vectrex has a 6809 and 1KiB of RAM. Vectrex Forth (aka CamelForth 6809 with extensions) uses 1/2 of the RAM for buffers, TIB, etc. and after the BIOS reserved RAM is avoided, there are 256 bytes dictionary space available. Given Vectrex Forth is cross compiled and the whole compile+send process takes 1-2 seconds, 256 bytes is plenty for ongoing development”.
    4. “With very small systems, one is better off using a tethered Forth such as Mecrisp-Across (for programming the MSP430 with one of the TI Tiva boards, I don't remember which) rather than a hosted Forth, which will inevitably take up some space. Tethered Forths can operate in very small spaces yet provide interactivity, and thus provide the best of both worlds”.

Implementation Suggestions "How to start"

  • A FORTH expert told me his impression of how hobbyist FORTH implementers from the maker scene have proceeded over the last 40 years: “Most developers implement FORTH-79 and then pick and choose what they want from later versions”.

Implementation Ideas

  • Several FORTHs on a single computer may share data through a common ( extra? ) stack. Either the FORTHs run concurrently, or just one FORTH is active at a time and a semaphore passes a “baton” to the next active FORTH. This way, parts of an application might be written in different FORTH dialects ( proprietary, fig-FORTH, FORTH-79, FORTH-83, ANS94-FORTH,..).

The Input, Input Processing & Compile Process ( Experts told me so )

  • Delimiter - not just “SPACE”.
    • “Generally all white space is treated as a delimiter. You can write words that parse from the input stream so this can be changed. The parser is usually about skipping all whitespace before the word starts. It will then read in every non whitespace character. The parser usually also recognises \ as the start of a comment and skips to the end of the line and starts reading from there”.
    • Comments as delimiters: “Not with the way they decided to add line comments, but I don't think this was the original comment system of Forth; it was added in the 80s. The original comment was the ( word, which is an immediate word (a macro) which reads the input stream until it encounters a ) character. These kinds of comments these days are restricted to stack effect diagrams but there is nothing special about them, they are just comments”.
  • Input parsing:
    • The parser processes a whole line of input, not character-by-character as the characters are sent: “The parser can work on the line after the return has been entered. There are some subtle alternative approaches but this way is the simplest”.
    • “The input stream will go into a buffer and you read each word and execute the word. If you encounter a special word : it will go into compile mode and the rules change, macros (immediate words) like ; will be executed immediately, other words will be compiled into the heap (compiled into a definition)”.
    • “Forth usually processes input a line at a time and waits until the line is entered before processing it. This gives the user a chance to edit the line before executing it. Forth processes the text stored in the input buffer, which is usually TIB but can be changed. Each word is parsed and looked up; if it is marked immediate in its header it is executed immediately, otherwise it depends on the compilation state variable. If it is in compilation state then the xt of the looked up word is compiled into the current definition. If it is in interpret state then it is executed”.
    • “You can be more immediate if you want to implement your own line editor. This is usually a bit crude unless it handles arrow keys and insertion and deletion. However it could end up more powerful than relying on the OS readline functionality”.
  • Word processing.
    • “The words are processed as a stream”.
  • Stack.
    • “The stack is only used for interpreting and also for calculating loops and branches”.
  • Heap Allocator.
    • “The heap pointer (bump allocator) is where you can allot new data objects. It can also allocate new words”.
    • “I think you can learn the majority of this flow from this article which I think it's really good. It's about implementing a Forth using Python” ⇒ OpenBookProject, Chris Meyers and Fred Obermann "Python for Fun", Chapter "FORTH - A simple stack oriented language".
    • Execution of FORTH words immediately, before end of line is reached:
      • “Peter Jakacki does exactly this with his Forth. I am very interested in his approach which has many advantages over traditional Forth”.
      • “The main disadvantage of traditional Forth is that it has two modes: interpret mode and compile mode. This creates complications like no loops or branches on the command line. Peter Jakacki's TAQOZ forth ( = Tachyon Forth ) compiles everything as it comes in. He watches the return and executes the compiled code. This changes a few things about Forth but I think it makes things simpler”.
  • Word list:
    • “The normal scheme is to store the name of the word in a Pascal-style string, with a length byte followed by the characters. But you never need more than 31 characters in a name, so the upper bits of the length byte are used for flags. There's no reason not to allow names of any length; it was a practical limitation based on implementation. So you'll never see a common Forth word longer than 31 characters in any spec. The only real reason was that Pascal strings (i.e. “counted” strings) are faster to compare than null-terminated strings. But remember you need to mask the bottom five bits to get the length. The upper bits are used to mark HIDDEN, IMMEDIATE and whatever other flag (compile only?)”.
    • HIDDEN words: “A word that is currently being defined is marked hidden so it doesn't appear in searches. This means that normally a word cannot call itself. This also means that if a word is redefined it can refer to the older version of the word in its body. When the defining is complete at ; the word is marked as visible. Recursion is not common in classic Forth (but is in colorForth). To make self referential code you use the RECURSE word. It works by making the currently being defined word not hidden. But you have the ability to define new words. They happen in RAM. ROM based words that are in the dictionary are always not hidden”.
    • “The dictionary needs to span ROM and RAM so a linked list seems the right way to go. Or an array needs to exist in RAM and be initialised with items in ROM but also able to take new items defined in RAM”.
    • Facebook "FORTH PROGRAMMING LANGUAGE 21st CENTURY", Thread "WORDS" - Discussion about how to create own WORDS word, about “SEE WORDS”,...
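The name-field layout and the HIDDEN behaviour described in the word-list comments above can be sketched as follows. This is a minimal illustration, assuming that bit 7 marks IMMEDIATE and bit 6 marks HIDDEN (real systems assign the flag bits differently); a Python list with the newest entry last stands in for the linked list, and all function names are made up for this sketch.

```python
F_IMMEDIATE = 0x80    # assumed flag bit, not from any particular Forth
F_HIDDEN = 0x40       # assumed flag bit, not from any particular Forth
LEN_MASK = 0x1F       # low five bits: name length 0..31

def make_name_field(name, flags=0):
    """Build a counted string whose first byte holds flags plus length."""
    raw = name.encode("ascii")
    if len(raw) > LEN_MASK:
        raise ValueError("name longer than 31 characters")
    return bytes([flags | len(raw)]) + raw

def name_of(field):
    """Mask off the flag bits to recover the length, then slice the name."""
    length = field[0] & LEN_MASK
    return field[1:1 + length].decode("ascii")

def find(dictionary, name):
    """Search newest-first and skip hidden entries."""
    for field in reversed(dictionary):
        if field[0] & F_HIDDEN:
            continue
        if name_of(field) == name:
            return field
    return None

def reveal(field):
    """Clear the HIDDEN bit, as ';' would do at the end of a definition."""
    return bytes([field[0] & ~F_HIDDEN & 0xFF]) + field[1:]

old_dup = make_name_field("DUP")
new_dup = make_name_field("DUP", F_HIDDEN)     # redefinition in progress
dictionary = [old_dup, make_name_field(";", F_IMMEDIATE), new_dup]
print(find(dictionary, "DUP") == old_dup)      # hidden entry is skipped
dictionary[2] = reveal(new_dup)
print(find(dictionary, "DUP") == dictionary[2])
```

While the new `DUP` is hidden, a search still finds the older definition, which is why a redefinition can refer to its predecessor in its own body.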

Inner Interpreter, Outer Parser, Compiler

Word Compilation

How to handle numbers with compiled words?
    • Expert:
      • “Your executing compiled forth code makes up a run-stream. Most forths, when a number is compiled, either explicitly with LITERAL or implied during compilation, will compile a call to code that takes the compiled binary value from the run stream and put it on the stack, followed immediately by said binary value. If this code has a header it’s usually '(lit)' or '(literal)' or the like. I’ve never seen a typical forth store the number as text”.
      • “As a practical example, most forths will compile : forty-two 42 ; as something like ENTER (literal) 42 EXIT, where ENTER and EXIT (maybe as other names) handle nesting (representing : and ; ). When (literal) executes, it grabs the 42 right after itself, puts it on the stack, and makes the interpreter skip over to the next item in the run-stream.”.
    • Another expert:
      • “There are minimalistic token based Forths (I know because I work on one *g* and have seen some) that store numbers < 128 directly as value and every word (limited to 127 unless you extend this) has a dedicated value above 128 (bit7 set). All that only makes sense when you are on an 8-bit system, however”.
    • Thread.
      • Elizabeth:
        • “The classical approach is to preface the number (which is binary) with the command LITERAL (or 2LITERAL, etc.) which will push the following value onto the stack”.
        • “The sequence in the text interpreter should be, roughly:
          1. Ooh, this is a number! It's converted to binary and put on the stack.
          2. Am I in compile mode or interpret mode?
          3. If compile mode, compile the number in the next place in the dictionary, preceded by the command LITERAL, 2LITERAL, or whatever is appropriate.
          4. Otherwise, we're done, and the number is on the stack.
        • “Note that the dictionary is searched *before* number conversion is attempted.”.
        • “The text interpreter always behaves the same at the outset, and the effect of being in interactive vs. compile mode takes place in step 2 above. Step 1 is always the same”.
      • Expert:
        • “2LITERAL can simply be defined as:
          : 2LITERAL SWAP POSTPONE LITERAL POSTPONE LITERAL ; IMMEDIATE
          
      • Elizabeth:
        • “Yes, that'll work, but is slower and one cell longer. Since 2LITERAL has both values on the stack it's easy to just compile the address of 2LITERAL and do a 2! (with 2@ as the run-time)”.
      • Expert:
        • “I also came up with further goodies that never were meant to be included in the original (EEPROM based) implementation. Yet, these help me running the Forth 2012 double test suite and that's what really matters:
          : 2constant SWAP CREATE , ,
          DOES>
          DUP @ SWAP CELL+ @ ;
          : 2variable CREATE 2 CELLS ALLOT ;
          
    • Expert:
      • “FIGnition Forth is a byte-coded Forth, so you might want to see how it's done there (but basically it's the same as the above). FIGnition Forth also supports inline strings, e.g.
        create myStr "Hello World!"
        : str. ". ;
        myStr str.
        Hello World!
        
      • “FIGnition Forth also has a number of 'C' style string operators”.
      • “In FIGnition forth, there’s a special token for zero as it’s used so often; a token for 8-bit unsigned literals (which are followed by a single inline byte); a token for 16-bit literals and a token for 32-bit literals (dliteral) which are used for integers and floating-point”.
    • Expert:
      • “If you can spare some bytecodes, and asm code, do combo.... 0 to 10 compile to one of 11 specialized byte codes. Numbers from 11..255 to bytecode meaning “8 bit unsigned int follows”. Another means 16 bit unsigned int, and one for 32-bit. I suspect prevalence of 0..10 outweighs all other constants by tremendous ratio. If you have a 'negate' bytecode, use it too, else change above to read 'signed'”.
    • Expert:
      • “My 6502 and 65816 Forths have a similar feature: if code will not run in the zero page or bank 0, 'fast literals' can be enabled for values less than 256 or less than 65536, respectively”.
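The size-dependent literal tokens described for byte-coded Forths above can be sketched like this. It is a hedged illustration only: the token values `TOK_ZERO`, `TOK_LIT8`, `TOK_LIT16` and `TOK_LIT32` are invented here, and the little-endian byte order and the 32-bit two's-complement handling of negative numbers are assumptions, not FIGnition's actual encoding.

```python
# Invented token values; a real byte-coded Forth assigns its own.
TOK_ZERO, TOK_LIT8, TOK_LIT16, TOK_LIT32 = 0x01, 0x02, 0x03, 0x04

def compile_literal(n):
    """Pick the shortest encoding: token byte plus 0, 1, 2 or 4 inline bytes."""
    if n == 0:
        return bytes([TOK_ZERO])
    if 0 < n < 0x100:
        return bytes([TOK_LIT8, n])
    if 0 < n < 0x10000:
        return bytes([TOK_LIT16]) + n.to_bytes(2, "little")
    return bytes([TOK_LIT32]) + (n & 0xFFFFFFFF).to_bytes(4, "little")

def run_literal(code, ip):
    """Decode the literal at code[ip]; return (value, ip after the literal)."""
    tok = code[ip]
    if tok == TOK_ZERO:
        return 0, ip + 1
    if tok == TOK_LIT8:
        return code[ip + 1], ip + 2
    if tok == TOK_LIT16:
        return int.from_bytes(code[ip + 1:ip + 3], "little"), ip + 3
    if tok == TOK_LIT32:
        value = int.from_bytes(code[ip + 1:ip + 5], "little")
        if value >= 0x80000000:        # treat the cell as signed 32-bit
            value -= 0x100000000
        return value, ip + 5
    raise ValueError(f"not a literal token: {tok:#x}")

for n in (0, 42, 300, 70000, -1):
    code = compile_literal(n)
    print(n, len(code), run_literal(code, 0)[0])
```

The run-time side mirrors the compile-time side: each token knows how many inline bytes follow it, so the inner interpreter can skip over them.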
    • Expert:
How to handle parser & interpreter recursion when dealing with compiled Forth words?
    • User comment: “mmap Linux syscall ... allows you to allocate a chunk of memory you can write to and execute in”.
    • User comment: “Some of the earliest Forths by the original creator were done the same way: they interpreted, and actually looked at the textual description of a non-primitive word. It parsed out the word up to the space, then hashed it and looked it up in a hash table, so interpretation was quite slow, but it still worked in a multi-user system!”.
    • User comment: “If you are going for the ANSI package... the only reason for a recursive call of the parser is EVALUATE or LOAD/-->/CONTINUED, in a block context”.
    • User comment: GitHub "agsb / immu" - “An implementation of Forth with inner interpreter using extended indirect thread code and a dictionary made of machine independent vocabularies”.
    • User comment: “Study how QUIT works (outer text interpreter) and how compiled colon definitions are executed (inner interpreter), that is, unless you are set on making a native code compiler. Read §3.4 of the Forth Standard”.
    • User comment: “You should compile your definitions into some type of threaded code - subroutine/direct/indirect. After that you'll get a plain array with the instructions that can be executed by your interpreter. For example, for direct threaded code you need yet another primitive to invoke the colon definition and to exit from it”.
      • “So, for DTC your compiled code might look like this:”
        :a0 aPrimitve00 aPrimitive01 EXIT
        :a1 aPrimitive10 aPrimitive11 EXIT
        :a2 CALL a1 CALL a0 EXIT
        :a3 CALL a2 EXIT
        
      • “For ITC:”
        :a0 CALL &aPrimitve00 &aPrimitive01 &EXIT
        :a1 CALL &aPrimitive10 &aPrimitive11 &EXIT
        :a2 CALL &a1 &a0 &EXIT
        :a3 CALL &a2 &EXIT
        
      • “For STC (generating machine object code is required):”
        :a0 call aPrimitve00 call aPrimitive01 ret
        :a1 call aPrimitive10 call aPrimitive11 ret
        :a2 call a1 call a0 ret
        :a3 call a2 ret
        
    • GitHub "adumont / hb6502 / forth/doc/examples.md" - “AlexFORTH Examples”, “Debugging & Tracing”.
    • User comment
      • “I think you don't need to have recursion in the parser (actually in the outer interpreter). In your example: User type first line and hit return:
        : a0 aPrimitve00 aPrimitive01 ;
        
      • “Upon getting the return, you parse the line. (I don't know whether FB trimmed the spaces, but you need a space between : and a0. I assume you want to create a new word called a0; “:” is a word in itself.)”.
      • “Your parser reads the first word, it finds ':'. It will look it up in the dictionary, and run it (at this point the outer interpreter is in execution mode/state). Upon execution, colon will create a new word header (taking the next token “a0” to add its name to the dictionary), and switch to compiling mode (or state). The outer interpreter will continue parsing, but now as it is in compiling state, it will add every word token to a0 definition (it will compile a0). Until it parses ';'. ';' is an immediate word, so even in Compiling state, it will execute it: this will reveal the new compiled word 'a0' in the dictionary, compile a call to 'EXIT' at the end of the definition of a0 and switch the state back to execution state”.
      • “Second line is then entered by the user:
        : a1 aPrimitive10 aPrimitive11 ; 
        
      • “When you hit return, it will repeat the same process. Then we get to the 3rd line:”
        : a2 a1 a0 ;
        
      • “Same process repeats. ':' creates a new header for word a2 in the dictionary. Then a1 is parsed. As we are now in compile mode, the token of a1 is added to the definition of a2. Next, similarly, a0 is parsed and added to the definition of a2. Then ';' is parsed (and executed, as explained above). So the definition of a2 should look like this (in the internal compiled form):
        [token of ENTER] [token of a1] [token of a0] [token of EXIT]
        
      • “As you see, the outer interpreter switches state between compile/execute modes. : and ; are executed and allow to switch mode. In compile mode, new definitions are created (leading to a list of tokens that your 'inner interpreter' should be able to 'follow')”.
    • User comment: “There seem to be some people who have a very fixed notion about Forth: that it has to be very standard-compliant. And this often implies that there's only one way to do it and the memory layout has to be exactly that way, because all the compilation words assume certain things. But as I pointed out, your current interpretive approach to finding words is totally fine; this is how Moore did it in his first version and published variant of Forth. Now when you start using it and you start looking at efficiency, you might look more into ways of doing optimizations etc., and you learn new ways of doing things, maybe getting closer to the 'typical' implementations of Forth. The Forth REPL ( = “read–evaluate–print loop” ), as you say, just stands still and only reacts to input, giving output if any, and then comes back to waiting for more input”.
    • User comment: “Go and study CamelForth ( GitHub "wa1tnr / camelforth-rp2040-aU" )!”.
    • User comment: “Getting the two interpreters right is indeed key. Maybe I intuitively escaped that trap by choosing to make my first Forth subroutine-threaded, i.e. no inner interpreter; compiled code is just a sequence of JSRs. It took me a while to understand the point of direct or indirect threaded code where you save the JSR and just string addresses together. In any case, conceptually I find it easier to wrap my mind around these concepts when I leave C out of the equation for the time being and look at native Forths where you just have memory and addresses - that's what Forth was initially designed for. Shameless plug: The MSDOS and C64 VolksForth sources are easily readable on github these days”.
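How the inner interpreter follows the compiled token lists from the a0..a3 example above, without any recursion in the parser, can be sketched as follows. This is a simplified model under stated assumptions: a colon definition is a Python list ending in an `EXIT` marker, a primitive is a Python function, and the explicit ENTER token of the quoted layout is replaced by a type check. The names `VM`, `run`, `prim` and the labels `p00`..`p11` are hypothetical.

```python
class VM:
    def __init__(self):
        self.rstack = []    # return stack: saved (thread, ip) pairs
        self.trace = []     # records which primitives ran, in order

EXIT = object()             # marker that ends every colon definition

def prim(label):
    """Make a dummy primitive that only logs its own label."""
    return lambda vm: vm.trace.append(label)

def run(vm, word):
    """Execute one word by walking its thread with an explicit return stack."""
    thread, ip = [word, EXIT], 0             # tiny bootstrap thread
    while True:
        xt = thread[ip]                      # NEXT: fetch the token at IP
        ip += 1                              # and advance IP
        if xt is EXIT:
            if not vm.rstack:
                return                       # bootstrap thread is done
            thread, ip = vm.rstack.pop()     # unnest to the caller
        elif isinstance(xt, list):           # colon definition: nest (ENTER)
            vm.rstack.append((thread, ip))
            thread, ip = xt, 0
        else:
            xt(vm)                           # primitive: just call it

a0 = [prim("p00"), prim("p01"), EXIT]
a1 = [prim("p10"), prim("p11"), EXIT]
a2 = [a1, a0, EXIT]
a3 = [a2, EXIT]

vm = VM()
run(vm, a3)
print(vm.trace)   # ['p10', 'p11', 'p00', 'p01']
```

Nesting depth is handled entirely by the return stack, so the host-language call stack stays flat no matter how deeply colon definitions call each other.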
In Forth interpret mode, what is the proper reaction to a Forth word to be used in Forth compilation mode?
    • User comments:
      • “As for LIT, it could be coded as”
        : LIT R> DUP 4 + >R @ ;
        
      • “So you understand that it would read the 32-bits following the LIT that was encountered and in the process return to 4 bytes after that. No problem, especially since I don't 'interpret' but always compile. But now imagine you typed in LIT - in fact I will try this in Mecrisp on the Pico and see what happens.
        R00# LIT 1234 ---
        R01# .S ---
        0 : $000004D2 1,234
        
      • “So Mecrisp must have invalidated LIT or something but in gforth”:
        LIT 1234
        :1: Invalid memory address
        >>>LIT<<< 1234
        Backtrace:
        
      • “So gforth didn't like it. In fact LIT should really have a compile-only flag like IF and DO etc.”.
    • Expert comment:
      • “The Standard defines the exception code -14: 'interpreting a compile-only word'. This would probably be the most correct behaviour”.
      • “ANS Forth and Forth 2012 have a bad exception model because it relies on a relatively fixed set of numbers and is hard to extend”
    • User comments:
      • “IP and W are used to progress along the 'thread'. IP is a pointer in the thread. The address pointed to by IP is stored into W, then IP is incremented by 2 ( pointing to the next address in the thread ), and finally the CPU jumps to the address we have just stored in W with a JMP (W). These two variables are like internal forth registers used by the inner interpreter ( although in my implementation, they are not CPU registers per se, as the 6502 doesn't have enough registers )”.
      • “The difference is IP moves along the thread, W points to actual code (at least in my implementation; remember my forth is DTC, that is direct threaded code). Say IP is for example $8000, and content at $8000 is $7AB0, address for the code of the primitive word LIT. NEXT reads the two bytes at IP (that is $7AB0), and stores them in W. Now it advances IP by 2, so IP now is $8002. Now NEXT will jump to (W), that is it jumps to $7AB0, which is the address of the code for LIT. So the CPU then continues to run from there, in LIT. LIT is an interesting word, it will read the two bytes at IP, and pushes them to the stack, then advance IP again by two (so now IP is $8004). It will then jump to NEXT”.
      • “In Forth 77 (fig semantics) it will complain that you're not in compile mode”.
      • “In Polyforth the outer interpreter ALWAYS compiled, and then executed the code when you hit return. Yes, you couldn't enter multi-line code except in a screen”.
      • “Forth 83 dropped state because Forth Incorporated were XXX. It wouldn't have hurt anything to keep it, in their implementation it would have always been true... Other Forth-83 implementations simply didn't expose state in a user variable and immediate words had to assume they were compiling”.
    • User comment:
      • “One approach (not saying it was common, I don't think I ever saw it on a 'standard' Forth under any given standard) was to have the 'compile-only' flag be bit5 of the leading status/length byte, and mask the top three bits when searching the dictionary in compile mode but only mask the top two bits when searching the dictionary in interpret mode, so 'compile-only' words are LITerally invisible in interpret mode”.
      • “If you had a compiler-optimized word, like a subroutine threaded Forth compiling the opcode(s) for DROP directly rather than a JSR DROP, first the Interpret-mode version of the word would be defined, then the compiler-only version”.
    • User comment: “See the example of S"”:
      : S" \ comp: ( -<string">- ) run: ( -- addr len )
      \ *G Compiletime: s" parses the input stream until it finds the next " and
      \ ** compiles it into the current definition. Runtime: s" leaves the address
      \ ** and the length of the compiled string on the stack.
      STATE @
      IF COMPILE (S") ," \ see also ." and .(
      ELSE ((S"))
      THEN ; IMMEDIATE
      
    • real-Forth & fastForth.
        • “This is real-Forth, a 16 bit descendant of fig-Forth, for the 8086 et al. It runs on top of MS-DOS or an MS-DOS clone such as Free-DOS. Like fig-Forth, it is a traditional indirect threaded Forth. It uses screen files to emulate Forth's traditional raw block access to floppy or hard drives”.
        • “It includes fastForth, a 32 bit direct JSR/BSR threaded Forth for the 68000 and Atari ST”.
        • “My 68000 Forth, fastForth, is direct threaded, so LIT doesn't exist. Instead, LITERAL compiles one of several opcodes and a literal value”.
        • “Well, on my 8086 real-Forth, LIT causes exception 06, which forces aborting the program and returning to the OS, FreeDOS”.
        • “Another solution to the problem for systems that have LIT, make it headerless. My cross-compiler allows making individual words headerless. I believe that the poly-Forth cross compiler does also. This would certainly save ROM space”.
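The IP/W walkthrough with LIT quoted above can be replayed in a small simulation. This is only a sketch: the addresses $8000 and $7AB0 come from the quoted example, while the `HALT` primitive at $7AC0, the little-endian cell order, the `code_at` table that maps code addresses to Python functions, and the value 1234 are assumptions added to make the example self-contained.

```python
memory = bytearray(0x10000)          # 64 KiB, 16-bit cells
stack = []
regs = {"IP": 0, "W": 0, "running": True}

def cell_at(addr):
    return int.from_bytes(memory[addr:addr + 2], "little")

def put_cell(addr, value):
    memory[addr:addr + 2] = value.to_bytes(2, "little")

def lit():
    """LIT: push the cell that follows it in the thread, then skip it."""
    stack.append(cell_at(regs["IP"]))
    regs["IP"] += 2

def halt():
    """Hypothetical primitive that stops the simulation loop."""
    regs["running"] = False

LIT_ADDR, HALT_ADDR = 0x7AB0, 0x7AC0     # HALT_ADDR is an assumption
code_at = {LIT_ADDR: lit, HALT_ADDR: halt}

def next_step():
    """NEXT: W := cell at IP, IP += 2, then jump to the code at W."""
    regs["W"] = cell_at(regs["IP"])
    regs["IP"] += 2
    code_at[regs["W"]]()

# Thread at $8000: LIT 1234 HALT
put_cell(0x8000, LIT_ADDR)
put_cell(0x8002, 1234)
put_cell(0x8004, HALT_ADDR)

regs["IP"] = 0x8000
while regs["running"]:
    next_step()
print(stack, hex(regs["IP"]))
```

The sketch also hints at why LIT misbehaves when typed at the interpreter prompt: it takes its operand from wherever IP happens to point, and in interpret mode no compiled literal follows it in a thread.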
Where is the "thread" to which the instruction pointer is incremented in the so-called "next()" operation of an inner interpreter?
    • User comment: “The example you mentioned uses Indirect Threaded Code. Because of this, even simple primitives must have a thread, but this 'primitive' thread has only 1 cell, which contains the address of a C function. The thread of a colon definition is the array of cells where the addresses of 'primitive' threads or other colon threads are stored. It looks a bit tricky, but that is the nature of indirect threading. '...but if a FORTH word is just extracted from the input stream...' This is done by the outer interpreter, which is typically implemented in pure Forth. So, your ForthVM should be executing the word INTERPRET to perform the following actions:
      - accept input from user
      - find word in dictionary
      -- if found - check immediate flag and execute the word if so;
      --- if word is not immediate - check VM state:
      ---- if state is execution - execute the word
      ---- if state is compilation - compile the word into threaded code
      -- if word is not found
      --- try to interpret it as a number
      ---- if it is a number - check VM state
      ----- if state is execution - put the number to the top of the stack
      ----- if state is compilation - compile number as LITERAL
      ---- if the word can't be interpreted as number - display error message
      - repeat all above forever
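The INTERPRET steps listed above can be sketched as a toy outer interpreter. This is a minimal illustration under assumptions, not the commenter's ForthVM: words are Python functions stored in a dict, a compiled definition is a list of such functions, and the class name `Forth` and its method names are made up for this sketch.

```python
class Forth:
    def __init__(self):
        self.stack = []              # data stack
        self.words = {}              # name -> (function, immediate flag)
        self.compiling = False       # VM state: execution or compilation
        self.tokens = iter(())       # rest of the current input line
        self.new_name, self.body = None, []
        self.define("dup", lambda f: f.stack.append(f.stack[-1]))
        self.define("+", lambda f: f.stack.append(f.stack.pop() + f.stack.pop()))
        self.define("*", lambda f: f.stack.append(f.stack.pop() * f.stack.pop()))
        self.define(":", Forth.colon)
        self.define(";", Forth.semicolon, immediate=True)

    def define(self, name, func, immediate=False):
        self.words[name] = (func, immediate)

    def colon(self):
        """':' takes the next token as the new name and enters compile state."""
        self.new_name, self.body = next(self.tokens), []
        self.compiling = True

    def semicolon(self):
        """';' is immediate: it finishes the definition and leaves compile state."""
        body = self.body
        def run_body(f):
            for step in body:
                step(f)
        self.define(self.new_name, run_body)
        self.compiling = False

    def interpret(self, line):
        self.tokens = iter(line.split())
        for token in self.tokens:
            entry = self.words.get(token)        # dictionary search comes first
            if entry is not None:
                func, immediate = entry
                if immediate or not self.compiling:
                    func(self)                   # execute now
                else:
                    self.body.append(func)       # compile into the definition
                continue
            try:
                number = int(token)              # not found: try a number
            except ValueError:
                raise LookupError(f"{token} ?") from None
            if self.compiling:
                self.body.append(lambda f, n=number: f.stack.append(n))
            else:
                self.stack.append(number)

forth = Forth()
forth.interpret(": square dup * ;")
forth.interpret("7 square 1 +")
print(forth.stack)   # [50]
```

Calling `interpret` once per input line corresponds to the "repeat all above forever" step; a real system would wrap it in a loop that reads lines from the user.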
      
    • User comment “Let's see... indirect threading is kind of having a list of subroutines to call... You can use the addresses... or a token representing each subroutine. As with lists, there are a couple ways to say how long it is. You could have the length somewhere, like at the start of the list, or use a list terminator value like -1. I think it's common in indirect threaded forths to use a token for a special subroutine called NEXT as the list terminator. Then they add in other special subroutines for doing branching... But you don't have to do it that way. I think there are many solutions to this problem”.
    • User comments:
      • “A thread is when executing a compiled word, so that each compiled execution token in turn is executed .... if it is a primitive, it is executed (however that may be) and then returns to the execution of the next execution token, if it is a compiled word, it is executed by pushing the position in the current word's execution onto the rack (the “R” stack), and then starting to execute each execution token in THAT word in turn”.
      • The execution token could be
        1. the address of executable code, which for a compiled definition is an executable stub in front of a list of execution tokens – this is “direct” threading
        2. A pointer to an address that CONTAINS an address of executable code, where for a primitive it just points to the following address, while if it is a compiled definition, it points to the routine that starts interpretation of the list of execution tokens – this is “indirect” threading
        3. A bytecode that refers to executable code, which for a compiled routine is FOLLOWED BY the address of the compiled list of execution tokens – this is “token threading”
        4. Either a reference to the address of executable code OR a reference to the address of a list of execution tokens, with some bit (typically either the high bit or low bit) in the reference letting the interpreter tell which is which – this is “bit” threading
        5. Even either a bytecode or the first byte of a two byte reference to the address of a list of execution tokens, with some bit letting the interpreter tell the difference between the two – this is “token/bit” threading.
      • In the C-based Forths that I have seen, they use an indirect threading model, so that the primitives are functions, the entry that is pointed to is a pointer to a function, and the dictionary entry for a primitive word is just the word entry followed by the pointer to the function. So the compiled word is mostly a vector of pointers to functions.
      • If you have a language that executes Forth words in the outer interpreter but doesn't have the : and ; words to define new functions, that's not really a Forth yet ... it's more a 'pre-Forth'.
      • “One reason for the strategy of bringing up a minimal Forth that supports programming IN Forth is that it supports use of writing small words and then testing them thoroughly on the command line to confirm that they work under their boundary conditions”.
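      • Sketch in C: the “vector of pointers to functions” model described above can be tried out with a few lines of code. This is only a minimal illustration, not taken from any particular Forth. All names ( VM, Cell, p_call, run_thread, ... ) are made up for this example, and strictly speaking it is a simplified call-threaded variant: a nested compiled word is entered through a helper primitive p_call followed by the address of the callee's thread.

```c
#include <assert.h>
#include <stddef.h>

typedef struct VM VM;
typedef void (*Prim)(VM *);            /* a primitive is a plain C function */

typedef union Cell {                   /* one slot of a compiled word */
    Prim prim;                         /* execution token: pointer to a function */
    const union Cell *thread;          /* operand of p_call: the word to enter */
} Cell;

struct VM {
    long stack[16];  int sp;           /* data stack */
    const Cell *rstack[16];  int rp;   /* return stack (the "R" stack) */
    const Cell *ip;                    /* position in the current thread */
    int running;
};

static void p_dup(VM *vm)  { assert(vm->sp > 0 && vm->sp < 16);
                             vm->stack[vm->sp] = vm->stack[vm->sp - 1]; vm->sp++; }
static void p_mul(VM *vm)  { assert(vm->sp > 1);
                             vm->stack[vm->sp - 2] *= vm->stack[vm->sp - 1]; vm->sp--; }
/* Enter a compiled word: save our position on R, continue in the callee. */
static void p_call(VM *vm) { const Cell *callee = (vm->ip++)->thread;
                             assert(vm->rp < 16);
                             vm->rstack[vm->rp++] = vm->ip;  vm->ip = callee; }
/* Leave a compiled word: resume the caller, or stop at the outermost level. */
static void p_exit(VM *vm) { if (vm->rp == 0) vm->running = 0;
                             else vm->ip = vm->rstack[--vm->rp]; }

/* : SQUARE  DUP * ;   and   : FOURTH  SQUARE SQUARE ; */
static const Cell square[] = { {.prim = p_dup}, {.prim = p_mul}, {.prim = p_exit} };
static const Cell fourth[] = { {.prim = p_call}, {.thread = square},
                               {.prim = p_call}, {.thread = square},
                               {.prim = p_exit} };

long run_thread(const Cell *thread, long arg)
{
    VM vm = { {0}, 0, {0}, 0, NULL, 1 };
    vm.stack[vm.sp++] = arg;
    vm.ip = thread;
    while (vm.running) {
        Prim xt = (vm.ip++)->prim;     /* NEXT: fetch the token, advance IP, */
        xt(&vm);                       /* then execute it */
    }
    return vm.stack[vm.sp - 1];
}

long fourth_power(long n) { return run_thread(fourth, n); }
```

      • Usage: fourth_power(3) runs SQUARE twice through the return stack and leaves 81 on the data stack.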
    • User comments:
      • Reading suggestion: “R. G. Loeliger: Threaded Interpretive Languages: Their Design and Implementation”.
      • “In my case the main thread is the outer interpreter. When “booting”, the inner interpreter starts with the outer interpreter thread (code) and stays 'in its loop'”.
    • User comments:
      • “It seems to me that your Forth so far doesn't contain a concept of compiled code, in the form of either arrays of execution tokens (unless you are looking at native subroutine-threaded code). Your IP is then pointing into one of these arrays of execution tokens, and NEXT fetches the execution token that IP points to, increments the IP, and executes the execution token just fetched”.
      • “The one thing that I stumble over is that in your earlier description of how compiled words are executed, you mention searching. You wrote: 'Now to execute the C code of +, my not yet implemented code must search again all the wordlists again, to find each forthWordID stored in the list of forthWordIDs for this Forth word defined at runtime, and then there to execute the corresponding C function'. My feeling is that execution of compiled Forth code should not involve searching through word lists. My guess is that the solution to your 'where is the thread' question lies buried somewhere around this search through wordlists at runtime, or rather, around getting rid of this search at runtime. I think then the concept of the thread will emerge rather naturally”.
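      • Sketch in C: the point of the two comments above is that the wordlist is searched once, at compile time, and that execution only follows stored execution tokens. The following minimal example, under the assumption of a tiny three-word wordlist, makes this visible with a counter. The names ( Xt, find, compile, execute, searches ) are hypothetical, and the words do no underflow checking.

```c
#include <assert.h>
#include <string.h>

typedef void (*Xt)(long *stack, int *sp);   /* execution token = function pointer */

static void w_dup(long *s, int *sp)  { s[*sp] = s[*sp - 1]; (*sp)++; }
static void w_plus(long *s, int *sp) { s[*sp - 2] += s[*sp - 1]; (*sp)--; }
static void w_mul(long *s, int *sp)  { s[*sp - 2] *= s[*sp - 1]; (*sp)--; }

static const struct { const char *name; Xt xt; } wordlist[] = {
    { "DUP", w_dup }, { "+", w_plus }, { "*", w_mul }
};

int searches = 0;   /* counts wordlist searches, for demonstration only */

static Xt find(const char *name)
{
    searches++;
    for (size_t i = 0; i < sizeof wordlist / sizeof wordlist[0]; i++)
        if (strcmp(wordlist[i].name, name) == 0)
            return wordlist[i].xt;
    return NULL;
}

/* Compile time: search for each name once and store its execution token. */
int compile(const char **names, int n, Xt *thread)
{
    for (int i = 0; i < n; i++)
        if ((thread[i] = find(names[i])) == NULL)
            return -1;               /* unknown word */
    return n;
}

/* Run time: walk the array of tokens; no name is looked up here. */
long execute(const Xt *thread, int n, long arg)
{
    long stack[16];
    int sp = 0;
    assert(n <= 8);                  /* keeps this tiny stack from overflowing */
    stack[sp++] = arg;
    for (int ip = 0; ip < n; ip++)
        thread[ip](stack, &sp);      /* fetch token at IP, advance, execute */
    return stack[sp - 1];
}
```

      • Usage: compiling the four names DUP * DUP + costs four searches. Executing the resulting thread any number of times adds none, and the array index ip plays the role of the thread's instruction pointer.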
  • User comment: Reading suggestion:
General

Data output according to current BASE Setting

Stack Organisation & Stack Checking

    • User comment “How would you implement such a construct? With #defines or different typedefs that define the CELL size? I suppose you might be able to have a return stack that is 32-bit or 64-bit (for the longer addresses the system requires), but a 16-bit data stack for the calculations. But that would make it hard for the data stack to have an address that is executed (ie - EXECUTE), NONAME, or things like jump tables. Or have a 32/64 bit stack, with a setting that controls the “bit-ness” that the operations would look at?”.
  • Idea to avoid stack checking with FORTH written in C:
    • Switch ( #define ) to compile for developer and production version.
      • Developer version:
        • Full error checking with the developer version.
        • Stack checking of start-of-stack and end-of-stack, no cyclic stack.
      • Production version:
        • Distinguish between:
          • Compiled code & primitives:
            • Let's assume that the code is well-tested.
            • Less error checking.
            • Cyclic stack, so that at least there is no memory corruption.
          • Words entered interactively at FORTH prompt:
            • Full error checking.
            • Optionally no access to the full interactive vocabulary, but just to a limited interactive wordset.
          • Words generated by reading a FORTH text file ( “program.f” ) at start of execution of FORTH:
            • Let's assume that the code is well-tested.
            • Less error checking.
            • Access to the full interactive vocabulary.
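  • Sketch in C: the developer / production switch from the idea above could look roughly like this. It is a minimal illustration under assumptions: the macro name FORTH_DEVELOPER, the stack size and all function names are invented here. Both variants are compiled so they can be compared side by side, and the #define only selects which one the rest of the system uses.

```c
#include <assert.h>

#define STACK_CELLS 8u                 /* power of two, so the index can be masked */

typedef struct {
    long cells[STACK_CELLS];
    unsigned sp;                       /* count of pushed items; free-running when cyclic */
    int error;                         /* 0 = ok, 1 = overflow, 2 = underflow */
} Stack;

/* Developer version: check start-of-stack and end-of-stack, never wrap. */
void push_checked(Stack *s, long x)
{
    if (s->sp >= STACK_CELLS) { s->error = 1; return; }
    s->cells[s->sp++] = x;
}

long pop_checked(Stack *s)
{
    if (s->sp == 0) { s->error = 2; return 0; }
    return s->cells[--s->sp];
}

/* Production version: no checks. The masked index wraps inside the array,
   so an overflow or underflow yields wrong values but cannot corrupt memory. */
void push_cyclic(Stack *s, long x) { s->cells[s->sp++ & (STACK_CELLS - 1u)] = x; }
long pop_cyclic(Stack *s)          { return s->cells[--s->sp & (STACK_CELLS - 1u)]; }

#ifdef FORTH_DEVELOPER                 /* hypothetical build switch */
#define PUSH push_checked
#define POP  pop_checked
#else
#define PUSH push_cyclic
#define POP  pop_cyclic
#endif

/* Demonstration: the ninth push overwrites the oldest cell
   instead of writing beyond the end of the array. */
void demo_wraparound(void)
{
    Stack s = { {0}, 0, 0 };
    for (long i = 0; i < 9; i++)
        push_cyclic(&s, i);
    assert(pop_cyclic(&s) == 8);
}
```

  • Design note: the interactive prompt could always call the checked pair, while compiled code uses PUSH / POP, which matches the split between interactive and compiled words listed above.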

Java VM Bytecode => Forth VM Bytecode Mapping by Chen-hanson Ting

The minimal FORTH System "Minimal Forth Machine" by Mikael Patel

Primitives
Primitive Stack effects Description
>r ( x -- )
r> ( -- x )
1+ ( x -- y )
0= ( x -- flag )
nand ( x y -- z )
@ ( addr -- x )
dup! ( x addr -- x )
execute ( addr -- )
exit ( -- )
May be included because of hardware considerations
drop ( x -- )
dup ( x -- x x )
swap ( x y -- y x )
Resources

The minimal FORTH System "sectorforth"

Primitives
Primitive Stack effects Description
@ ( addr -- x ) Fetch memory contents at addr
! ( x addr -- ) Store x at addr
sp@ ( -- sp ) Get pointer to top of data stack
rp@ ( -- rp ) Get pointer to top of return stack
0= ( x -- flag ) -1 if top of stack is 0, 0 otherwise
+ ( x y -- z ) Sum the two numbers at the top of the stack
nand ( x y -- z ) NAND the two numbers at the top of the stack
exit ( r:addr -- ) Pop return stack and resume execution at addr
key ( -- x ) Read key stroke as ASCII character
emit ( x -- ) Print low byte of x as an ASCII character
Variables
Variable Description
state 0: execute words; 1: compile word addresses to the dictionary
tib Terminal input buffer, where input is parsed from
>in Current parsing offset into terminal input buffer
here Pointer to next free position in the dictionary
latest Pointer to most recent dictionary entry
Compiler
  • : and ; provided.
  • NO immediate, [, or ].
What can I do?
  • “hello, world” of course :-).
  • BEGIN, WHILE, REPEAT, UNTIL, DO, LOOP.
  • Variables.
  • Stack debugging (.s, etc. )
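  • Sketch in C: with only “+” and “nand” as arithmetic / logic primitives, the usual logic and arithmetic words have to be derived. The following minimal example mirrors such derivations with C functions. It assumes 16-bit cells, and the Forth definitions in the comments are one possible way to write them, not quoted from the sectorforth sources.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t cell;                 /* assumption: 16-bit cells */

/* The two primitives. */
cell f_nand(cell x, cell y) { return (cell)~(x & y); }
cell f_plus(cell x, cell y) { return (cell)(x + y); }

/* Everything below uses only the two primitives above. */
cell f_invert(cell x)        { return f_nand(x, x); }                     /* : invert dup nand ; */
cell f_and(cell x, cell y)   { return f_invert(f_nand(x, y)); }           /* : and nand invert ; */
cell f_or(cell x, cell y)    { return f_nand(f_invert(x), f_invert(y)); } /* : or invert swap invert nand ; */
cell f_negate(cell x)        { return f_plus(f_invert(x), 1); }           /* : negate invert 1 + ; */
cell f_minus(cell x, cell y) { return f_plus(x, f_negate(y)); }           /* : - negate + ; */

/* A few spot checks of the derived words. */
void check_derived_words(void)
{
    assert(f_invert(0x0000) == 0xFFFF);
    assert(f_and(0x0F0F, 0x00FF) == 0x000F);
    assert(f_or(0x0F00, 0x00F0) == 0x0FF0);
    assert(f_minus(10, 7) == 3);
}
```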

The minimal FORTH System of GreenArrays' GA144

#(oct) Primitive FORTH Word #1 FORTH Word #2 Stack effects Description
Jump Instructions
00 Ret ”;” ”;” Jump thru R (destructive)
01 Exec “ex” ” ;:” Jump thru R, save P in R
02 Jmp “jump” “jump” Jump thru I ( The opcode is not used explicitly. The compiler generates it )
03 Call “call” “call” Jump thru I, push current address to R ( The opcode is not used explicitly. Referencing any defined word generates a call )
04 Unext “unext” “unext” Jump to slot 0 ( Discards the address left by for )
05 Next “next” “next” If R is non-zero, jump thru I and decrement R. Otherwise pop R
06 If “if” “if” Jump thru I if T is zero
07 MinusIf ”-if” ”-if” Jump thru I if T is non-negative
Memory Instructions
10 FetchP ”@p” ”@p+” Fetch thru P, increment P
11 FetchPlus ”@+” ”@+” Fetch thru A, increment A
12 FetchB ”@b” ”@b” Fetch thru B
13 Fetch ”@” ”@” Fetch thru A
14 StoreP ”!p” ”!p+” Store thru P, increment P
15 StorePlus ”!+” ”!+” Store thru A, increment A
16 StoreB ”!b” ”!b” Store thru B
17 Store ”!” ”!” Store thru A
ALU Instructions
20 MultiplyStep ”+*” ”+*” Multiply step: add S to T if A0=1 then shift T and A right
21 Times2 “2*” “2*” Shift T left
22 Div2 “2/” “2/” Shift T right; sign fill
23 Not ”-” ”-” One's complement T
24 Plus ”+” ”+” Add S to T (discard S)
24 Add S to T with carry ( Requires bit 9 of P be set )
25 And “and” “and” Bit-wise and of S and T
26 Or “or” “or” Bit-wise exclusive-or of S and T
27 Drop “drop” “drop” Discard T
Stack Instructions
30 Dup “dup” “dup” Create a working copy of T
31 Pop “pop” “pop” Fetch R (destructive)
32 Over “over” “over” Fetch S (non-destructive)
33 ReadA “a” “a” Fetch A (non-destructive)
34 Nop ”.” ”.” Do nothing
35 Push “push” “push” Push T into R
36 SetB “b!” “b!” Store into B ( Be careful to distinguish b! and !b )
37 SetA “a!” “a!” Store into A

The minimal FORTH System Proposal & Comparison by Peter Knaggs and Paul E. Bennett

The minimal FORTH System Comparison by Paul E. Bennett
Type Ting Brinkhoff Plichota GA-F18 Stack effect Comments
Memory Access C@ C@
C@ C@
@ @ @ @
a
@a
@p
@b
@+
!p
!+
!b
a!
b!
! ! ! !
Math + + + +
- -
+*
2*
* *
/
2/
MOD
Logic AND NAND NAND and
OR or
XOR XOR
INVERT
Stack DUP DUP dup
SWAP SWAP
OVER over
DROP DROP drop
>R >R push
R> R> pop
R@
Control IF 0BRANCH IF if
-if
ELSE
THEN
BEGIN
WHILE
REPEAT
AGAIN
DO
LOOP
EXECUTE ex
<name>;
<name>
unext
next
EXIT EXIT
Defining : :
; ; ;
CONSTANT
VARIABLE
I/O & Comms IN IN i/o
OUT OUT
Other DODOES LIT
data
- - -u
- -l-
–lu
-d- -
-d-u
-dl-
-dlu
r- - -
r-l-
r-lu
rd- -
rdl-
rdlu
The minimal FORTH System Proposal by Peter Knaggs and Paul E. Bennett
Type Word Name Stack effect Comments
1 Memory Access ! store
, comma
@ fetch
ALIGN
ALIGNED
CELL+ cell-plus
CELLS
C! c-store
C, c-comma
C@ c-fetch
CALIGN c-align
CALIGNED c-aligned
CHAR+ char-plus
CHARS chars
2 Arithmetic + plus
* star
2* two-star
*/MOD star-slash-mod
- minus
/ slash
2/ two-slash
MOD
3 Logic 0= zero-equals
< less-than
AND
INVERT
TRUE
LSHIFT l-shift
= equals
> greater-than
OR
XOR x-or
FALSE
RSHIFT r-shift
4 Stack DUP dupe
SWAP
>R to-r
R@ r-fetch
DROP
OVER
R> r-from
ROT rote
5 Flow Control IF
THEN
WHILE
REPEAT
DO
I
' tick
ELSE
BEGIN
AGAIN
UNTIL
LOOP
J
EXECUTE
6 Definitions : colon
CONSTANT
CREATE
; semicolon
VARIABLE
DOES> does
7 Device KEY
EMIT
KEY? key-question
CR c-r
8 Tools ( paren
.S dot-s
\ backslash
The Core Definitions of FIGnition FORTH ( Bytecodes implemented in Assembly Language )
Word Name Stack effect Comments
(lit)
execute
(branch)
(0branch)
(loop)
(+loop)
(do)
i
leave
and
or
xor
<<
>>
;s
(does)
r>
>r
r
0=
0<
+
d+
minus
dminus
over
drop
swap
dup
@
c@
!
ic!
vram
clock
at
cls
.hex
edit
list
trace
plot
blk>
>blk
>port>
spi
Resources
      • Peter Knaggs and Paul E. Bennett: “Minimal Forth” ( PDF ).
        • “A Proposed minimal word set”.
        • “Comparison of minimal word sets of Ting, Brinkhoff, Plichota, GA-F18”.
  • GitHub "uho / minimal" - “Minimal Forth Workbench” by Ulrich Hoffmann.
    • “Their aim is to define a Standard Forth subset suitable for educational purposes. For this they propose to cut down the number of Forth words initially explained to Forth newcomers and only stepwise introduce new concepts such as number output, compiling words, strings, file access, exceptions, ...”.
    • “This package - the Minimal Forth Workbench - allows to experiment with different sets of primitive (i.e. predefined) definitions in order to further elaborate on Paul's and Peter's ideas”.

General Compiler Building Resources

FORTH System Building Resources

Build Tools for FORTH Virtual Machines ( FVMs )
FORTH in LUA
Some other minimal FORTH
Resources

Excursus: C Data Types

Excursus: Use of SQL Databases to store FORTH Words

Write your own FORTH Code - FORTH Code Style

Traditional FORTH

"!", "C!" ( of fig-FORTH ) versus "=" and "\=" ( of pre fig-FORTH )

  • ATARIWiki.org "CoinOp FORTH" - “Coinop forth was developed in Atari coinop division ..., based on DECUS forth for the PDP-11.. Later, as Forth became popular and fig forth came out, we did a port of it to the 800 as well. A major visible difference is that fig forth uses the new operators for stores ( ! and C! ) rather than the original ( = and \= ) which DECUS, coinop, Colleen forth does”

Readability

Stack-Orientation

  • Traditional FORTH = Stack-oriented, like with RPN language of HP pocket calculators.

Style Conventions

  • Still the reference #1: Leo Brodie's “Thinking Forth”, appendix E, “Summary of Style Conventions”.
  • Additionally, experts told me about FORTH style conventions:
    • Words in parenthesis, i.e. (NAME) ⇒ Please don´t use these words in your application code!

Modern FORTH

1Kbyte Screens / Blocks

  • See FORTH 2/4 ⇒ “The FORTH 64×16 Screen Paradigma”.

Array Words

  • For 50 years, array words were not delivered with standard traditional FORTH implementations :-(.

Batch Processing

  • Batch processing example, output is “1 3 5 2 4 6”:
    vocabulary voc1
    vocabulary voc2
    voc1
    definitions
    : txt1 ." 1 " ;
    : txt2 ." 2 " ;
    voc2
    definitions
    : txt1 ." 3 " ;
    : txt2 ." 4 " ;
    forth
    definitions
    : txt1 ." 5 " ;
    : txt2 ." 6 " ;
    voc1 txt1
    voc2 txt1
    forth txt1
    voc1 txt2
    voc2 txt2
    forth txt2 
    

Built-in Web Server

  1. Forth2020 will be shipped with a little Web Server.
  2. Server written with RETRO FORTH.
    • The OpenSource FORTH ForthWorks by Charles Childers "RETRO" ( RETRO Forth ).
      • “A clean, elegant, and pragmatic dialect of Forth. It provides a simple alternative for those willing to make a break from legacy systems”.
      • “You can view the glossary via Gopher or HTTP. This is served off the latest documentation in the repository by a server written in RETRO”.

Capitalisation, or not?

  • Experts told me: “This is a legacy issue. Classical FORTH is supposed to be case insensitive. In practice it's easier to be case sensitive but some Forths use DUP and others use dup”. The experts suggest that new FORTH implementations be case sensitive and expect uppercase word names.
  • LMI WinFORTH is case sensitive.

Clean Code

  • Experts told me: “Most forth could be better written for clarity. My view remains that it is by far clearer to read than assembly language (which was the dominant language on microcomputers in the early 80s) but was beaten by the ALGOL languages as 16 bit architectures took hold. Algol based languages permits a much simpler making of algorithms to code because of its built-in control and data structure. Forth has no data structures, requires composition of small code snippets. It discourages multiple parameter passing. It discourages intermediate values. It discourages nesting. The end result is a language which makes machine code much easier but is lower level than C or Pascal. However being good at Forth has positive impacts on programming in other languages I believe. Using the Forth approach to writing C leads to shorter functions that are quicker and easier to test and debug. C spoils the programmer while Forth does not. Forth influenced C is more likely to be correct on its first run”.

Code Quality

Compilation of FORTH Words, Immediate Words

CREATE DOES> ;
Resources
  • Expert suggestion how to implement the compilation and immediate words: “You will be challenged by just that and implementing IF THEN and BEGIN AGAIN and DO LOOP. These require immediately executed words to compile branches in your code. Focus on that. Then look at CREATE DOES>”.

Data Formats

BASE
Input of Values with a specific Data Type ( Single Precision Integer, Double Precision Integer, Floats,.. )
Norms
Implementations
    • Jupiter Ace Forth (modified Forth-79) uses the data stack and then has words to process that data, i.e F+ F. etc.
    • “Jupiter Ace 4000 FORTH Programming” by Steven Vickers, page 90
      • “A quirk is the way it works with negative numbers: it adds on 65536 before floating them... As a consequence of this, the result will never be negative. This explains the 'U' in UFLOAT ( it stands for 'unsigned' )” ⇒ Jupiter FORTH was written in Assembly Language. In a FORTH implementation in “C”, the same effect appears automatically when the signed value is converted to an unsigned type before floating: a ”-1” becomes UINT_MAX, e.g. 65535 ( 16-bit ) or 4294967295 ( 32-bit ) :-):
        int iValue = -1;
        unsigned int uValue = (unsigned int)iValue; /* wraps around to UINT_MAX */
        float fValue = (float)uValue;               /* never negative */
      • “Floating point number takes up the space of two integers”.
Float versus Fixed Integer
Floating Point Arithmetics

Debugging

Implementations
Resources
  • Expert advice: “After the execution of each Forth word you need to call an operation called NEXT which is a routine to advance the interpreter to the next word. You can intercept at this point and take input from the user. Make NEXT into a pointer and change the behaviour of the interpreter any way you like”.
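  • Sketch in C: “make NEXT into a pointer” can be illustrated as follows. This is a minimal example under assumptions, not the expert's code. The struct layout and the names next_plain, next_debug and run_demo are invented here, and the debugging NEXT merely counts steps where a real debugger would print the stack or wait for a key.

```c
#include <assert.h>
#include <stddef.h>

typedef struct VM VM;
typedef void (*Word)(VM *);

struct VM {
    long stack[16];  int sp;
    const Word *ip;                 /* next word in a NULL-terminated thread */
    void (*next)(VM *);             /* NEXT, reachable through a pointer */
    int traced;                     /* how often the debugging NEXT fired */
};

static void w_two(VM *vm)  { assert(vm->sp < 16); vm->stack[vm->sp++] = 2; }
static void w_dup(VM *vm)  { assert(vm->sp > 0 && vm->sp < 16);
                             vm->stack[vm->sp] = vm->stack[vm->sp - 1]; vm->sp++; }
static void w_plus(VM *vm) { assert(vm->sp > 1);
                             vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1]; vm->sp--; }

/* Normal NEXT: fetch the word at IP, advance IP, execute the word. */
void next_plain(VM *vm) { Word w = *vm->ip++; w(vm); }

/* Debugging NEXT: the interception point before every word. */
void next_debug(VM *vm) { vm->traced++; next_plain(vm); }

/* Runs  2 DUP +  with the given NEXT and returns the top of stack. */
long run_demo(VM *vm, void (*next)(VM *))
{
    static const Word thread[] = { w_two, w_dup, w_plus, NULL };
    vm->sp = 0;
    vm->traced = 0;
    vm->ip = thread;
    vm->next = next;                /* swap the interpreter's behaviour here */
    while (*vm->ip != NULL)
        vm->next(vm);
    return vm->stack[vm->sp - 1];
}
```

  • Usage: run_demo(&vm, next_plain) and run_demo(&vm, next_debug) compute the same result. Only the second one passes through the hook, once per executed word.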

Deferred Words ( DEFER, IS )

Exception Handling

ABORT, ABORT", CATCH, THROW, QUIT
Examples of exception handling for FORTH, in Assembly Language
Homebrewn Exception Handling for FORTH, in C
Homebrewn Exception Handling for FORTH, in C++
Homebrewn Exception Handling for FORTH, in Java
Homebrewn Exception Handling for FORTH, in Lua
Resources
  • Facebook.
      • “Suppose you have CATCH and THROW: the error condition will be THROWn, and the surrounding CATCH will handle it. In case that CATCH is in QUIT, an error message will be shown, and the system goes back to QUIT's read-evaluate loop. Have a look at the reference implementation of QUIT here, which shows what a typical standard system will look like. In the current standard, many errors are ambiguous conditions, but we (the Forth standard committee) plan to reduce those when there is a rough consensus on how to implement these”.
      • “I believe the standard Forth way of dealing with errors is to put 'behavior undefined' into the documentation. What I tried to do with DiaperGlu is define behavior for these behavior undefined situations. I also did not use the exception model because of the memory allocation problem. From what I understand, some Forth systems get around this by allocating the memory for exceptions up front and limiting how many there can be. The model I used in DiaperGlu to handle errors is the return code model along with an error stack. The memory for the error stack is allocated during initialization which limits how many errors can be on the stack at once. DiaperGlu Forth words return errors on the error stack, which is often the initial error, like data stack underflow, plus the name of each subroutine which exited due to the error. To use this model in the user's code, the programmer is supposed to check for errors after each word that can cause an error and push its own error code (the name of the routine) and exit. This is a bit of a pain and a lot of typing, but I think I can automate it so it's compiled automatically. The pattern looks like this:”
        : DUPDUP DUP ?ERRORIF NAME>E EXIT THEN DUP ?ERRORIF NAME>E EXIT THEN ;
      • “Exceptions make a code readable, clean and not-error-prone. Without exceptions, if you forget to handle an error code returned by a word, it probably crashes your FORTH or it reports “invalid memory address”, if you handle OS exceptions or signals. Imagine a simple code that reads some data from a file and stores it in a buffer. With exceptions it can look like this:
        : FOO " foo.txt" R/O OPEN >R BUF B/BUF R@ READ #BUF ! R> CLOSE ;
        
        • assuming we have words:
          OPEN ( addr u -- file-id )
          READ ( addr u1 file-id -- u2 )
          CLOSE ( file-id -- )
          BUF \ buffer address
          B/BUF \ length of buffer in bytes
          #BUF \ length of data read from a file
          
        • OPEN, READ and CLOSE raise exception on error. A shortest code without exceptions that I can imagine is something like this:
          : ?ERROR ( err# -- ) \ handles error; has to be defined before FOO uses it
          ?DUP IF
          ( print error message or do whatever )
          QUIT
          THEN ;
          : FOO " foo.txt" R/O OPEN-FILE ?ERROR >R BUF B/BUF R@ READ-FILE ?ERROR #BUF ! R> CLOSE-FILE ?ERROR ;
          
        • But try to realize what happens if you forget the word ?ERROR after OPEN-FILE. Despite the fact that you rarely need to solve the error right at the place of its origin. And if you wanted the word FOO to return the error code as well, the result would be quite a mess.
        • With exceptions, if I want to handle an error explicitly at the place of its origin, I can type:
          TRY OPEN ( error handling code )
          or ANS standard way:
          ' OPEN CATCH ( error handling code )
          If I want to explicitly handle the error of the whole word FOO, I can use:
          TRY FOO ( error handling code )
          
        • IMHO returning error codes nowadays is just obsolete”.
      • “When the program exits to the OK prompt, if there are any errors on the stack, it tells the user how many errors, and what to type to see the errors on the stack. This helps a lot with debugging”.
      • “Just use a status value that is returned by each function and based on that status make decisions”.
      • “Call the error handler which by default prints a diagnostic, resets the stacks, and falls back to the outer interpreter loop”.
      • “C++ was originally written as a C preprocessor, and it used setjmp()/longjmp() to implement try/catch”. “When one backs out that way from deep within some nested functions, one might have to clean up anything created on the heap. The stack gets fixed implicitly (that's what setjmp()/longjmp() do well), the heap not so much”.
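  • Sketch in C: a homebrew CATCH / THROW for a FORTH written in C can be built on setjmp() / longjmp(), as hinted at in the last comment. The following is a minimal illustration under assumptions. The VM layout and the names forth_catch, forth_throw and w_divide are invented here, only the data stack depth is restored, and anything allocated on the heap by the aborted word would still have to be cleaned up separately. The code -10 is the standard THROW code for division by zero.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    long stack[16];  int sp;
    jmp_buf *handler;               /* innermost active CATCH frame, or NULL */
    int thrown;                     /* code passed to the last THROW */
} VM;

typedef void (*Word)(VM *);

/* THROW: a non-zero code unwinds to the innermost CATCH. */
void forth_throw(VM *vm, int code)
{
    if (code == 0) return;              /* 0 THROW does nothing */
    if (vm->handler == NULL) abort();   /* a real system would return to QUIT */
    vm->thrown = code;
    longjmp(*vm->handler, 1);
}

/* CATCH: run w; return 0 on success, otherwise the thrown code,
   with the data stack depth restored to its value before the call. */
int forth_catch(VM *vm, Word w)
{
    jmp_buf frame;
    jmp_buf *outer = vm->handler;       /* allows nested CATCH frames */
    int saved_sp = vm->sp;
    int code = 0;

    if (setjmp(frame) == 0) {
        vm->handler = &frame;
        w(vm);
    } else {                            /* we arrive here via longjmp() */
        code = vm->thrown;
        vm->sp = saved_sp;
    }
    vm->handler = outer;
    return code;
}

/* Example word ( a b -- a/b ) that throws -10 on division by zero. */
void w_divide(VM *vm)
{
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    if (b == 0) forth_throw(vm, -10);
    vm->stack[vm->sp - 1] /= b;
}
```

  • Usage: with 6 and 3 on the stack, forth_catch(&vm, w_divide) returns 0 and leaves 2. With a 0 on top it returns -10, and the stack depth is the same as before the call.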

Functional Programming with FORTH?

Concept for an in-FORTH Help

  • Experts suggested: “Use ' to read the next word and get its xt. You can then look up info related to it. There's also WORD. It reads the input stream and returns a pointer”.

Input / Output - CPU / OS specific Memory I/O & Port I/O Access

Standard FORTH Memory I/O
  • “C@” ( read an 8-bit value / byte / character from a location ).
  • “C!” ( write an 8-bit value / byte / character to a location ).
  • “@” ( read a number / pointer, e.g. 16/32/64-bit on 8&16/32/64-bit systems, from a location ).
  • “!” ( write a number / pointer, e.g. 16/32/64-bit on 8&16/32/64-bit systems, to a location ).
  • “L@” ( read a 32-bit / 64-bit value / long number / long word, e.g. 32-bit on a 8&16/32-bit systems, but 64-bit on a 64-bit system from a location ).
  • “L!” ( write a 32-bit / 64-bit value / long number / long word, e.g. 32-bit on a 8&16/32-bit systems, but 64-bit on a 64-bit system to a location ).
  • “D@” ( read a 32-bit value / double word from a location ).
  • “D!” ( write a 32-bit value / double word to a location ).
  • Infos:
ESP32forth
  • “C@” ( 8-bit value / byte read access ).
  • “C!” ( 8-bit value / byte write access ).
  • ”@” = “L@” ( 32-bit value / number / pointer read access ).
  • ”!” = “L!” ( 32-bit value / number / pointer write access ).
  • Infos:
    • Forth2020 "Esp32forth" - “ESP32forth digital R/W from a reg.” - “You can access any address on the memory space of the ESP32 with the words L@ ( to retrieve a value ) and L! ( to store a value )”.
Open Firmware provides "Register" Access Words meant to access Hardware Devices
  • “RB@”.
  • “RB!”.
Port I/O for Intel 80x86, 808x & Z80
  • “PC@” ( 808X, Z80 and 80x86 ).
  • “PC!” ( 808X, Z80 and 80x86 ).
  • “P@” ( 808X, Z80, 80x86, though 808X and Z80 have just 8-bit I/O address space ).
  • “P!” ( 808X, Z80, 80x86, though 808X and Z80 have just 8-bit I/O address space ).
  • “PL@” ( 80x86 only ).
  • “PL!” ( 80x86 only ).
  • Demo implementation:
    \G Read the 16 bit port x1.
    CODE P@ ( x1 -- x2 ) \ EXTRA "p-fetch"
    MOV DX, BX
    IN AX, DX
    MOV BX, AX
    NEXT
    END-CODE
    
    \G Read the 8 bit port x.
    CODE PC@ ( x -- char ) \ EXTRA "p-c-fetch"
    MOV DX, BX
    IN AL, DX
    $IF386
    MOVZX BX, AL
    $ELSE
    XOR AH, AH
    MOV BX, AX
    $THEN
    NEXT
    END-CODE
    
    \G Write x1 to 16 bit port x2.
    CODE P! ( x1 x2 -- ) \ EXTRA "p-store"
    MOV DX, BX
    POP AX
    OUT DX, AX
    POP BX
    NEXT
    END-CODE
    
    \G Write char to 8 bit port x.
    CODE PC! ( char x -- ) \ EXTRA "p-c-store"
    MOV DX, BX
    POP AX
    OUT DX, AL
    POP BX
    NEXT
    END-CODE
    
  • Infos:
    • Usermanual Wiki "FT-86C and FT-86C/FP USER'S MANUAL" - “3.5 INTERRUPT CONTROL. The 8259A programmable interrupt controller is in local I/O space at Hex address 0008. It can be programmed using the P@ and P! commands in the same manner as the serial communications ports”.
    • Implemented by “CHForth”,...
BoardForth
  • “C@” ( 8-bit value / byte read access ).
  • “C!” ( 8-bit value / byte write access ).
  • “W@” ( 16-bit value / number / pointer read access ).
  • “W!” ( 16-bit value / number / pointer write access ).
  • ”@” ( 32-bit value / number / pointer read access ).
  • ”!” ( 32-bit value / number / pointer write access ).
Memory I/O for highly-optimizing FORTH Compilers to avoid Memory Access Optimisation
  • “P@”.
  • “P!”
  • “PC!”.
  • “PC@”.
  • “PW!”.
  • “PW@”.
  • “PL!”.
  • “PL@”.
  • Infos:
Port I/O, alternative Syntax
  • “IO@”.
  • “IO!”.
Jupiter Ace ( "FORTH Programming" Manual, Chapter 26 )
  • “IN”.
  • “OUT”.
Excursus: Intel 8086 Port I/O with MSDOS C Compilers
Excursus: ARM Memory I/O with C Compilers
Resources

Interoperability

Locals

Loops ( DO ... LOOP )

  • Gforth Manual "5.8.3 Counted Loops" - “Unfortunately, +DO, U+DO, -DO, U-DO and -LOOP are not defined in ANS Forth” :-(. “However, an implementation for these words that uses only standard words is provided in compat/loops.fs” :-).

Mocking for Top-Down Development

  • Mocking example:
    : I_dont_now_jet ;
    variable #I_dont_now_jet ' I_dont_now_jet #I_dont_now_jet !
    : bike #I_dont_now_jet @ execute ;
    : two_bike ." four wheels " ;
    : one_bike ." two wheels " ;
    : one ['] one_bike #I_dont_now_jet ! ; \ ['] compiles the xt; a plain ' would parse the input at run time
    : two ['] two_bike #I_dont_now_jet ! ;
    one bike
    two bike
    
  • Mocking example #2:
    forth            \ use the vocabulary forth, use ORDER to see the vocabulary stack
    : bicycle ( -- ) \ make a word bicycle, ( -- ) that does not change number of items on the stack
    2 wheels !       \ store 2 in the variable wheels, but gives an error because it is not defined yet
    vehicle          \ call the word vehicle, but gives an error because it is not defined yet
    ;                \ so bicycle is not made
    
  • Mocking example #3:
    variable wheels
    5 wheels !
    wheels @ .
    \ prints 5: it works
    
  • Mocking example #4:
    \ "variable wheels" is a pointer
    variable wheels
    : vehicle cr ." my vehicle has " wheels @ . ." wheels " ;
    : bicycle 2 wheels ! vehicle ;
    bicycle
    

Multitasking & Interrupt Handling

Concept for Interrupt Handling
  • Experts suggested keeping the interrupt routine very small: it just sets a flag ( = sets the value of a FORTH variable ), which the FORTH engine checks promptly, so that the FORTH engine ( or even FORTH code ) can react to the interrupt.
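  • Sketch in C: the flag concept can be illustrated with a flag that the inner loop polls after every word. This is a minimal example under assumptions. There is no real hardware here, so one word of the thread simulates the interrupt by calling the handler directly, and all names ( irq_flag, irq_handler, w_service, run ) are invented for this example.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* The flag shared between the interrupt routine and the FORTH engine. */
volatile sig_atomic_t irq_flag = 0;

/* The interrupt routine stays tiny: it only sets the flag. */
void irq_handler(void) { irq_flag = 1; }

typedef void (*Word)(long *acc);

void w_inc(long *acc)      { (*acc)++; }
void w_fake_irq(long *acc) { (void)acc; irq_handler(); }  /* simulated interrupt */
void w_service(long *acc)  { *acc += 100; }               /* the reaction, as a normal word */

/* Inner loop: after every word, poll the flag and react to it.
   Returns how many interrupts were serviced. */
int run(const Word *thread, long *acc)
{
    int serviced = 0;
    assert(thread != NULL);
    for (const Word *ip = thread; *ip != NULL; ip++) {
        (*ip)(acc);
        if (irq_flag) {
            irq_flag = 0;
            w_service(acc);
            serviced++;
        }
    }
    return serviced;
}
```

  • Design note: because the service word runs between two ordinary words, it never sees a half-executed primitive. The price is a latency of up to one word.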
Multitasking Implementations
Multitasking Documentation
Resources

Object-Orientation, Object-Oriented Forth

Implementations
  • See FORTH 4/7:
    • “ceForth in truly object-oriented C++”.
    • “ooForth in Java”.
    • “Oforth”.
Resources

Stack

Stack Juggling vs. Use of Global Variables & Registers

Stack Juggling
  • Experts told me: Chen-Hanson Ting tends to do more stack juggling. He is more of a pure Forth guy and an educator.
Use of Global Variables & Registers
  • Experts told me: Charles “Chuck” Moore is not afraid to use global variables to get around needing excessive stack use. He also uses dedicated registers in colorForth to act as pointers. He is still more like an assembly programmer.

The 3 States - Interpret, Immediate, Compile

  • Experts told me
    • “Topic: Attributes on words. colorForth gets rid of the STATE variable which adds so much confusion and complexity. However he replaces this with the concept of “colours” which I think can be imagined as prefixes. So the same word can be executed as a compile colour or an interpret colour. The complexity of using the right colour for the situation is pushed back to the programmer”.
    • “In classic forth some words can be used in compile mode and interpret mode ( DUP, + etc. ), some are compile only ( IF, THEN ) or interpret only ( [IF], [THEN] )”.
    • “POSTPONE ends up being a monstrously complicated word to implement for this reason. It has to handle every permutation. This is the sign that classic Forth has a code smell”.
    • “There is an alternative to the STATE variable, to prefixes or colours. Peter Jakacki talks about ALWAYS compiling. Even his interpreted lines at the REPL are compiled first. He does this by compiling to HERE and throwing these compiled sequences away after running them once”.
    • “Immediate words can be understood as 'macros' in other languages. They work at compile time and the STATE variable switches between compile and interpret modes”.
    • “An interpret attribute is not really needed, except as an error check. Some words like IF cannot be used on the command line except inside a definition. The reason why IF THEN have to be immediate is because they compile branches into the definition. These branches mean nothing in an interpreter. [IF] [THEN] can be used instead. These are very slow in comparison, especially [DO][LOOP] because they are interpreted. You can see how these words are kind of like different 'coloured' words. So IF performs no action but compiles a branch. That's why it can be called a macro. (words) generally represent internal words, not meant to be used directly. [words] refer to the STATE of Forth. The word [ switches the state to interpret. The word ] switches the state to compile”.
    • “Example:
      ' DUP EXECUTE

      gets the xt of DUP and executes it. You can't use ' inside a definition. To do that you use ['] :

      : X ['] DUP EXECUTE ;

      Some Forthers find these words very ugly and annoying. They want Forth to always use the same words”.

    • “[”: “It gives the definition of : as”
      : : CREATE ] ;
    • “This means create a dictionary entry by reading the next word. Then switch into compile mode. The next word is the name. The definition of ; is something like”
      : ; [ ; IMMEDIATE
    • “In other words make the last immediate and turn the compile mode off”.
    • ?? “Not sure about [,] You can do ['] though” ??.
    • “Some experts find camelForth to be the most sane reference. Can't vouch for those weird British micro Forths from the 1980s. They had to cut corners. It's also more standard than eForth. camelForth dates from the 1990s”.
  • HectorForth ( based on figForth ) , CamelForth, Jupiter Forth, the special 'Python for Fun' Forth have ”[”, BBCMicro Forth ( based on FORTH-79 ) does not have ”[” in the vocabulary.
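  • Sketch in C: the interplay of STATE, immediate words and the words [ and ] can be shown with a toy text interpreter. This is a minimal illustration under assumptions. The dictionary holds just four words, numbers are not parsed, and the names Interp, Entry, interpret and run_body are invented here.

```c
#include <assert.h>
#include <string.h>

typedef struct Interp Interp;
typedef struct { const char *name; void (*code)(Interp *); int immediate; } Entry;

struct Interp {
    long stack[16];  int sp;
    int state;                         /* 0 = interpret, 1 = compile */
    const Entry *body[16];  int len;   /* the definition being compiled */
};

static void w_dup(Interp *in)  { assert(in->sp > 0 && in->sp < 16);
                                 in->stack[in->sp] = in->stack[in->sp - 1]; in->sp++; }
static void w_plus(Interp *in) { assert(in->sp > 1);
                                 in->stack[in->sp - 2] += in->stack[in->sp - 1]; in->sp--; }
static void w_lbracket(Interp *in) { in->state = 0; }   /* [ : switch to interpret */
static void w_rbracket(Interp *in) { in->state = 1; }   /* ] : switch to compile */

static const Entry dict[] = {
    { "DUP", w_dup, 0 }, { "+", w_plus, 0 },
    { "[", w_lbracket, 1 },            /* immediate: runs even while compiling */
    { "]", w_rbracket, 0 }
};

/* Handle one token the way the text interpreter does.
   Returns 0, or -1 for an unknown word. */
int interpret(Interp *in, const char *token)
{
    for (size_t i = 0; i < sizeof dict / sizeof dict[0]; i++) {
        if (strcmp(dict[i].name, token) != 0)
            continue;
        if (in->state == 1 && !dict[i].immediate) {
            assert(in->len < 16);
            in->body[in->len++] = &dict[i];   /* compile: append to the definition */
        } else {
            dict[i].code(in);                 /* interpret, or immediate word */
        }
        return 0;
    }
    return -1;
}

/* Execute what was compiled so far. */
void run_body(Interp *in)
{
    for (int i = 0; i < in->len; i++)
        in->body[i]->code(in);
}
```

  • Usage: with 3 on the stack, feeding the tokens DUP ] DUP + [ + executes the first DUP and the last +, which leaves 6, and compiles the DUP + in between. run_body() then turns the 6 into 12.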

Testing - Provision of a Unit Test Framework

  • As support for the Test-Driven-Development approach ⇒ See FORTH 3/4.

Warnings

Word List by VLIST, WORDS

  • Experts told me:
    • The order of words is usually just the order they appear in the dictionary. The order will reflect the dependencies of one word on another earlier word. They tend to be listed in reverse order of the definition order.
    • Order in the word list only matters for colon words!
    • The experts suggested to reverse the word list order in the dictionary. “DUP” should appear long before “.S”, for example.
    • Fast lookups by ordering the word list by Quicksort or another sort algorithm: That will speed up compilation and interpretation. The search starts at the last word and then walks back over the linked list.
    • “If all words are all C functions it makes no difference. It's only when you read in Forth source code that the order matters”.
    • “You need a fast way to parse and compile Forth. Or better, a binary representation of compiled Forth words which is fast to read. Compiled words are extremely simple of course but they need to be linked to tell addresses. To real addresses!”.
    • “Forth programming is like assembly language programming so in systems with memory protection the same rules apply as in assembly language. That usually means that code and data can't share the same memory space. In systems like that an 'execution token' for a primitive routine can't simply represent a memory address just like a data memory address. In those cases the token represents a offset into a table of functions. Colon words on the other hand can be built in normal data memory along with the dictionary”.
  • Experts told me about an important difference between figFORTH & FORTH-79 on one hand, and FORTH-83 on the other:
    • figFORTH & FORTH-79 have a single wordlist in the form of a tree, created by its implementation in the source code, fixed at compile time. By this, the search order is determined and fixed.
    • With FORTH-83, Bill Ragsdale's “ONLY ALSO” principle was implemented: The wordlists are in a list structure. There is a wordlist “ROOT”. By adding different wordlists to the list, the developer may change the search order at runtime.
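  • Sketch in C: a search order that can be changed at runtime can be modelled as a small stack of wordlists. This is a minimal illustration under assumptions. The function also() folds “ALSO <vocabulary>” into a single call, entries carry a plain number instead of code, and all names are invented here. The three wordlists reuse the txt1 / txt2 values 1 to 6 from the batch processing example on this page.

```c
#include <assert.h>
#include <string.h>

typedef struct { const char *name; int value; } Entry;
typedef struct { const Entry *entries; int count; } Wordlist;

#define MAX_ORDER 8
typedef struct { const Wordlist *order[MAX_ORDER]; int depth; } SearchOrder;

/* ONLY: reduce the search order to the root wordlist. */
void only(SearchOrder *so, const Wordlist *root)
{
    so->order[0] = root;
    so->depth = 1;
}

/* ALSO <wordlist>: put one more wordlist on top of the search order. */
void also(SearchOrder *so, const Wordlist *wl)
{
    assert(so->depth < MAX_ORDER);
    so->order[so->depth++] = wl;
}

/* Search the wordlists from the top of the order downwards and,
   within each wordlist, from the newest entry backwards. */
const Entry *find(const SearchOrder *so, const char *name)
{
    for (int w = so->depth - 1; w >= 0; w--)
        for (int i = so->order[w]->count - 1; i >= 0; i--)
            if (strcmp(so->order[w]->entries[i].name, name) == 0)
                return &so->order[w]->entries[i];
    return NULL;
}

static const Entry root_words[] = { { "txt1", 5 }, { "txt2", 6 } };
static const Entry voc1_words[] = { { "txt1", 1 }, { "txt2", 2 } };
static const Entry voc2_words[] = { { "txt1", 3 }, { "txt2", 4 } };

const Wordlist root = { root_words, 2 };
const Wordlist voc1 = { voc1_words, 2 };
const Wordlist voc2 = { voc2_words, 2 };
```

  • Usage: after only(&so, &root), looking up txt1 finds the root entry. After also(&so, &voc1), the same name resolves to the voc1 entry, without recompiling anything.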

Appropriate OpenDirectory Directory Pages

 
en/forth05.html.txt · Last modified: 2025/02/15 17:45 (external edit)