Discussion:
Single-pass or multi-pass Forth compiler
minf...@arcor.de
2023-01-16 11:50:26 UTC
Permalink
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
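The back-patching problem can be sketched in a toy model (Python, with invented instruction names; this is not any particular Forth's encoding): when the compiler reaches the conditional branch, the target is still unknown, so it emits a placeholder and patches it once the join point is known.

```python
# Toy single-pass compilation of IF ... ELSE ... THEN:
# forward jump targets are unknown when the jump is emitted,
# so we emit a placeholder and patch it when the target is reached.

def compile_if_else_then(cond_code, then_code, else_code):
    code = list(cond_code)
    code.append(("0branch", None))      # placeholder: target unknown yet
    patch_if = len(code) - 1            # remember where to back-patch
    code.extend(then_code)
    code.append(("branch", None))       # jump over the ELSE part
    patch_else = len(code) - 1
    code[patch_if] = ("0branch", len(code))   # ELSE part starts here
    code.extend(else_code)
    code[patch_else] = ("branch", len(code))  # THEN: both branches join
    return code

prog = compile_if_else_then([("lit", 0)], [("lit", 1)], [("lit", 2)])
```

Real Forths keep the patch addresses on the data or control-flow stack at compile time rather than in named variables, which is what makes deeply nested or ill-nested control flow awkward.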

OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
- elimination of comments
- resolving of labels for jump targets
- intermediate language e.g. program as database of instruction lists
- peephole optimization, superinstructions
- tail call elimination
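The label-resolution bullet, for instance, is trivial with two passes; a minimal sketch (instruction syntax invented for illustration):

```python
def resolve_labels(src):
    """Pass 1: record the address of each label definition.
    Pass 2: substitute label names with those addresses."""
    labels, code = {}, []
    for item in src:                      # pass 1
        if item.endswith(":"):            # e.g. "loop:" defines a label
            labels[item[:-1]] = len(code)
        else:
            code.append(item)
    resolved = []
    for item in code:                     # pass 2
        op, _, arg = item.partition(" ")
        resolved.append((op, labels[arg]) if arg in labels else (op, arg or None))
    return resolved

prog = resolve_labels(["loop:", "dup", "0branch done", "branch loop", "done:", "drop"])
```

No placeholder patching is needed: by the time the second pass runs, every label, forward or backward, is already known.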

Is there any Forth out there that compiles high-level definitions in this way?
Anybody willing to share some other experience in this field?
Jan Coombs
2023-01-16 14:26:59 UTC
Permalink
On Mon, 16 Jan 2023 03:50:26 -0800 (PST)
Post by ***@arcor.de
- intermediate language e.g. program as database of instruction lists
Maybe this one in 'Holon Forth' and 'Fifth':

Planet Holonforth Wolf Wejgaard EuroForth 2016
http://www.euroforth.org/ef16/papers/wejgaard.pdf

1989-09-00 Paper presented at the EuroFORML conference 1989
Not Screens nor Files but Words Wolf Wejgaard
https://holonforth.com/ef89.html

Further papers at:
https://www.holonforth.com/papers.html

Ref to Fifth was in:
A Fresh Look at the Forth Dictionary
Wolf Wejgaard
Presented at the EuroFORTH conference 1995
https://www.holonforth.com/ef95.html

Jan Coombs
--
minf...@arcor.de
2023-01-16 19:29:33 UTC
Permalink
Post by Jan Coombs
On Mon, 16 Jan 2023 03:50:26 -0800 (PST)
Post by ***@arcor.de
- intermediate language e.g. program as database of instruction lists
Planet Holonforth Wolf Wejgaard EuroForth 2016
http://www.euroforth.org/ef16/papers/wejgaard.pdf
1989-09-00 Paper presented at the EuroFORML conference 1989
Not Screens nor Files but Words Wolf Wejgaard
https://holonforth.com/ef89.html
https://www.holonforth.com/papers.html
A Fresh Look at the Forth Dictionary
Wolf Wejgaard
Presented at the EuroFORTH conference 1995
https://www.holonforth.com/ef95.html
I seem to remember Holon running on DOS. Last time I looked I got the impression
that it was a nice mixture of Tcl and Forth, with the most fun parts pulled in from Tcl libraries.
Good work, not criticizing, but running an interpreter on top of another interpreter
seemed rather heavy to me.

Nevertheless the "modules/groups/dictionary/words in a database" is an appealing
concept.
Anton Ertl
2023-01-16 16:27:17 UTC
Permalink
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
- elimination of comments
- resolving of labels for jump targets
- intermediate language e.g. program as database of instruction lists
- peephole optimization, superinstructions
- tail call elimination
Is there any Forth out there that compiles high-level definitions in this way?
Anybody willing to share some other experience in this field?
Passes in the original sense went out of fashion once RAM sizes
were big enough. Compilers often have intermediate representations
for purposes such as analysis, optimization, and modularization.

iForth and VFX have an intermediate representation that they use for
inlining. AFAIK they don't use it for elimination of comments, jump
target resolution, peephole optimization, superinstructions, or
tail-call elimination.

Gforth uses the threaded-code representation (and its relocatable
variant) either directly, or as a step to a mixture of native code and
threaded-code. It also buffers the threaded code for a straight-line
piece of code before generating native code, to allow better code
generation.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net
minf...@arcor.de
2023-01-17 08:23:41 UTC
Permalink
Post by Anton Ertl
iForth and VFX have an intermediate representation that they use for
inlining. AFAIK they don't use it for elimination of comments, jump
target resolution, peephole optimization, superinstructions, or
tail-call elimination.
Gforth uses the threaded-code representation (and its relocatable
variant) either directly, or as a step to a mixture of native code and
threaded-code. It also buffers the threaded code for a straight-line
piece of code before generating native code, to allow better code
generation.
The VFX docs speak of code inlining vs source inlining, the latter now
improved through tokenizing. This looks similar to gforth's inlining method.
Anton Ertl
2023-01-17 09:08:14 UTC
Permalink
Post by ***@arcor.de
Post by Anton Ertl
iForth and VFX have an intermediate representation that they use for
inlining. AFAIK they don't use it for elimination of comments, jump
target resolution, peephole optimization, superinstructions, or
tail-call elimination.
Gforth uses the threaded-code representation (and its relocatable
variant) either directly, or as a step to a mixture of native code and
threaded-code. It also buffers the threaded code for a straight-line
piece of code before generating native code, to allow better code
generation.
The VFX docs speak of code inlining vs source inlining, the latter now
improved through tokenizing.
The original source inliner of VFX had the same correctness problem as
EVALUATE-based macros. AFAIK the "tokenizing" resolves the names,
BASE etc. when the code is tokenized and therefore does not have this
problem.
Post by ***@arcor.de
This looks similar to gforth's inlining method.
Gforth currently does not inline colon definitions. The native-code
approach has been called "selective inlining" (of primitives) by
Piumarta, and is probably similar to what VFX documents as "code
inlining", but I normally don't call it inlining.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net
minf...@arcor.de
2023-01-17 10:01:41 UTC
Permalink
Post by Anton Ertl
Post by ***@arcor.de
Post by Anton Ertl
iForth and VFX have an intermediate representation that they use for
inlining. AFAIK they don't use it for elimination of comments, jump
target resolution, peephole optimization, superinstructions, or
tail-call elimination.
Gforth uses the threaded-code representation (and its relocatable
variant) either directly, or as a step to a mixture of native code and
threaded-code. It also buffers the threaded code for a straight-line
piece of code before generating native code, to allow better code
generation.
The VFX docs speak of code inlining vs source inlining, the latter now
improved through tokenizing.
The original source inliner of VFX had the same correctness problem as
EVALUATE-based macros. AFAIK The "tokenizing" resolves the names,
BASE etc. when the code is tokenized and therefore does not have this
problem.
Macro correctness is an interesting topic, as it can depend on changed
machine states. Off topic, but correct selection of what has to be included
in saved closure environments falls into the same problem category.

I wonder if the Holon approach (keeping everything in a realtime source
database) did not suffer from the same dilemma.
Hans Bezemer
2023-01-16 16:43:06 UTC
Permalink
Post by ***@arcor.de
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
- elimination of comments
- resolving of labels for jump targets
Again - I can't think of a Forth compiler that doesn't. References are placed on
some stack and used as target addresses when jumping back or as source
addresses to be patched later when jumping forward.
Post by ***@arcor.de
- peephole optimization, superinstructions
4tH has a peephole optimizer used for strength reduction, constant folding and
dead code elimination. It only looks back a single instruction, but can in some
circumstances work recursively.
Post by ***@arcor.de
- tail call elimination
4tH's optimizer does that one as well.
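Tail-call elimination itself is a small rewrite on the compiled body; a hedged sketch (invented instruction encoding, not 4tH's):

```python
# Tail-call elimination: a "call" immediately followed by "exit"
# becomes a plain "branch", so deep (self-)recursion in tail position
# no longer grows the return stack.

def eliminate_tail_calls(code):
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if (op[0] == "call" and i + 1 < len(code)
                and code[i + 1] == ("exit",)):
            out.append(("branch", op[1]))  # jump instead of call+return
            i += 2                         # the exit is consumed too
        else:
            out.append(op)
            i += 1
    return out
```

This is exactly the rewrite that breaks if the word meddles with the return stack, since the return address the word expects to find there was never pushed.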
Post by ***@arcor.de
Is there any Forth out there that compiles high-level definitions in this way?
Anybody willing to share some other experience in this field?
It works fine and transparently - unless you insist on meddling with the return stack
in non-portable ways.

https://sourceforge.net/p/forth-4th/wiki/Optimization/

Hans Bezemer
Marcel Hendrix
2023-01-16 17:39:28 UTC
Permalink
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
[..]
Post by ***@arcor.de
Is there any Forth out there that compiles high-level definitions in this way?
Using an intermediate representation is far more powerful, as one can
do recursive 'context-sensitive' inlining (delay compiling as long as possible).

In iForth's iSPICE package I'll eventually use run-time code generation (e.g.
to eliminate pointer-chasing). This delays instruction selection to the limit.
However, it only works if the target application is sufficiently unsophisticated,
or has predictable execution behavior (like SPICE). I'm sure that there would
be many snags were I to try it for Forth itself.

-marcel
minf...@arcor.de
2023-01-16 19:47:13 UTC
Permalink
Post by Marcel Hendrix
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
[..]
Post by ***@arcor.de
Is there any Forth out there that compiles high-level definitions in this way?
Using an intermediate representation is far more powerful, as one can
do recursive 'context-sensitive' inlining (delay compiling as long as possible).
In iForth's iSPICE package I'll eventually use run-time code generation (e.g.
to eliminate pointer-chasing). This delays instruction selection to the limit.
However, it only works if the target application is sufficiently unsophisticated,
or has predictable execution behavior (like SPICE). I'm sure that there would
be many snags would I try it for Forth itself.
Interesting. I never considered lazy evaluation for Forth.

Without climbing up to Haskell's heights, there are simple lazy evaluators in Python
(copied from Stack Overflow):
######
In a nutshell, lazy evaluation means that the object is evaluated when it is needed,
not when it is created.
In Python 2, range will return a list - this means that if you give it a large number,
it will compute all the values up front:
>>> i = range(100)
>>> type(i)
<type 'list'>
In Python 3, range instead returns a lazy range object:
>>> i = range(100)
>>> type(i)
<class 'range'>
Only when you consume it will it actually be evaluated - in other words, it will only return
the numbers in the range when you actually need them.
######

As a knee-jerk reflex, ISTM this has many similarities with returning macros. It would be
quite easy in Forth to return xt's of words containing EVALUATEs or [[ ]] gforthisms
for delayed evaluation.
none) (albert
2023-01-17 07:36:02 UTC
Permalink
Post by Marcel Hendrix
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
[..]
Post by ***@arcor.de
Is there any Forth out there that compiles high-level definitions in this way?
Using an intermediate representation is far more powerful, as one can
do recursive 'context-sensitive' inlining (delay compiling as long as possible).
In iForth's iSPICE package I'll eventually use run-time code generation (e.g.
to eliminate pointer-chasing). This delays instruction selection to the limit.
What is "pointer chasing"?
Post by Marcel Hendrix
-marcel
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -
Gerry Jackson
2023-01-16 20:29:23 UTC
Permalink
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
- elimination of comments
- resolving of labels for jump targets
- intermediate language e.g. program as database of instruction lists
- peephole optimization, superinstructions
- tail call elimination
Is there any Forth out there that compiles high-level definitions in this way?
Anybody willing to share some other experience in this field?
My system, which was developed from about 2005 and has never been
released (and won't be), compiles Forth code to an intermediate form,
carries out some optimisations, then generates the executable code. I
did it this way (I think) to:
- widen the range of optimisations beyond simple peephole optimisation
- make it more readily portable to different processors and OS's (one
front end, n back ends).
- generate different types of executable code including machine code.

and probably other reasons that I've forgotten.

I did peephole optimisations and super-instructions but then found other
things took priority e.g. using the system, so I've never got round to
getting on with other aims.

It did make things such as quotations easier as it just involves
suspending the outer definition, compiling the quotation, then resuming
the suspended definition. Nested as much as desired.

It enabled me to experiment with other capabilities such as nested colon
definitions, declaration of variables inside a definition, different
entry points, not that I've used them for real.
--
Gerry
minf...@arcor.de
2023-01-17 08:11:31 UTC
Permalink
Post by Gerry Jackson
My system, which was developed from about 2005 and has never been
released (and won't be), compiles Forth code to an intermediate form,
carries out some optimisations, then generates the executable code. I
did it this way (I think) to:
- widen the range of optimisations beyond simple peephole optimisation
- make it more readily portable to different processors and OS's (one
front end, n back ends).
- generate different types of executable code including machine code.
and probably other reasons that I've forgotten.
I did peephole optimisations and super-instructions but then found other
things took priority e.g. using the system, so I've never got round to
getting on with other aims.
It did make things such as quotations easier as it just involves
suspending the outer definition, compiling the quotation, then resuming
the suspended definition. Nested as much as desired.
So it compiles the innermost definitions/quotations first before compiling
outer definitions?

The "classic" way is to jump over quotations, nesting included.
From the reference test:
T{ : q1 [: 1 ;] ; q1 EXECUTE -> 1 }T
T{ : q2 [: [: 2 ;] ;] ; q2 EXECUTE EXECUTE -> 2 }T

Did you experiment with enclosures?
Gerry Jackson
2023-01-17 10:47:26 UTC
Permalink
Post by ***@arcor.de
Post by Gerry Jackson
My system, which was developed from about 2005 and has never been
released (and won't be), compiles Forth code to an intermediate form,
carries out some optimisations, then generates the executable code. I
did it this way (I think) to:
- widen the range of optimisations beyond simple peephole optimisation
- make it more readily portable to different processors and OS's (one
front end, n back ends).
- generate different types of executable code including machine code.
and probably other reasons that I've forgotten.
I did peephole optimisations and super-instructions but then found other
things took priority e.g. using the system, so I've never got round to
getting on with other aims.
It did make things such as quotations easier as it just involves
suspending the outer definition, compiling the quotation, then resuming
the suspended definition. Nested as much as desired.
So it compiles the innermost definitions/quotations first before compiling
outer definitions?
The "classic" way is to jump over quotations, nesting included.
Yes, a crude but effective solution imposed by traditional single-pass
compilation.
Post by ***@arcor.de
T{ : q1 [: 1 ;] ; q1 EXECUTE -> 1 }T
T{ : q2 [: [: 2 ;] ;] ; q2 EXECUTE EXECUTE -> 2 }T
Did you experiment with enclosures?
Not with the built in quotations. But before quotations were
standardised I did develop an ANS Forth implementation of closures in
2011 that I can let you have if you are interested.
--
Gerry
minf...@arcor.de
2023-01-17 12:18:57 UTC
Permalink
Post by Gerry Jackson
Post by ***@arcor.de
Post by Gerry Jackson
My system, which was developed from about 2005 and has never been
released (and won't be), compiles Forth code to an intermediate form,
carries out some optimisations, then generates the executable code. I
did it this way (I think) to:
- widen the range of optimisations beyond simple peephole optimisation
- make it more readily portable to different processors and OS's (one
front end, n back ends).
- generate different types of executable code including machine code.
and probably other reasons that I've forgotten.
I did peephole optimisations and super-instructions but then found other
things took priority e.g. using the system, so I've never got round to
getting on with other aims.
It did make things such as quotations easier as it just involves
suspending the outer definition, compiling the quotation, then resuming
the suspended definition. Nested as much as desired.
So it compiles the innermost definitions/quotations first before compiling
outer definitions?
The "classic" way is to jump over quotations, nesting included.
Yes a crude, but effective, solution imposed by the traditional single
pass compilation.
Post by ***@arcor.de
T{ : q1 [: 1 ;] ; q1 EXECUTE -> 1 }T
T{ : q2 [: [: 2 ;] ;] ; q2 EXECUTE EXECUTE -> 2 }T
Did you experiment with enclosures?
Not with the built in quotations. But before quotations were
standardised I did develop an ANS Forth implementation of closures in
2011 that I can let you have if you are interested.
Thank you for your kind offer! I just sent you a private mail.
Lars Brinkhoff
2023-01-17 06:39:48 UTC
Permalink
Leeloo says: multipass.
none) (albert
2023-01-17 07:50:42 UTC
Permalink
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
OTOH going over the code in several passes offers easy paths to
simplification and optimization, e.g.
- elimination of comments
- resolving of labels for jump targets
- intermediate language e.g. program as database of instruction lists
- peephole optimization, superinstructions
- tail call elimination
My optimiser goes through these steps (experimental):

1. find stack effects for all primitive words (through a sophisticated
assembler analysis) ( + stack effect properties)
2. find stack effects for other words ( + stack effect properties)
3. optimise words from the bottom up, high level, a generalisation of
constant folding
4. inline the machine code. A number of blocks that jump to each
other results.
5. optimise the blocks, individually and as a whole.
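Step 3, a generalisation of constant folding, can be illustrated by symbolically simulating the data stack (a toy sketch, not Albert's actual optimiser):

```python
# Word-level constant folding: simulate the data stack symbolically;
# when an operator's inputs are all known constants, evaluate it at
# compile time and emit a single literal instead.

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def fold_constants(code):
    out, consts = [], []     # consts mirrors the known top of the stack
    for op in code:
        if isinstance(op, int):
            consts.append(op)            # literal: track its value
        elif op in OPS and len(consts) >= 2:
            b, a = consts.pop(), consts.pop()
            consts.append(OPS[op](a, b)) # folded at compile time
        else:
            out.extend(consts)           # unknown word: flush constants
            consts = []
            out.append(op)
    return out + consts
```

Doing this bottom-up over whole words, with stack effects known from steps 1 and 2, is what turns the per-instruction trick into a generalisation of constant folding.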

It results (in some examples) in a speedup comparable to gforth
versus mpeforth.

Highly experimental.
Post by ***@arcor.de
Is there any Forth out there that compiles high-level definitions in this way?
Anybody willing to share some other experience in this field?
Go to my site below. forthlectures.html View lecture 5.
(This has been announced several times.)

Tail call elimination?
If you're programming for speed, you probably avoid tail calls.
(Not that it is hard to do.)

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -
dxforth
2023-01-17 10:23:54 UTC
Permalink
Post by ***@arcor.de
Historically Forth features a single-pass compiler that is triggered e.g.
when the interpreter hits a colon (:). Some control-flow constructs require
back-patching of the compiled code to resolve forward jump targets. This
can be awkward.
m/c assemblers went from two-pass to one-pass. Trust forth to want to go
in the opposite direction :)
Matthias Koch
2023-01-17 15:06:05 UTC
Permalink
I found multiple passes necessary in Mecrisp-Across to re-join different control flow branches that use different local register allocations without relying on canonical fallback. If the numbers of stack elements held in registers at the joint differ, one needs to use the smaller number of stack elements, which may require additional passes to achieve.

Using the larger number of elements causes non-existing elements to be popped from the stack, at least temporarily. Of course, in valid Forth code, these elements will be pushed back, but this may cause reads/writes beyond the beginning of the stack area in memory. Simply adding a guard space at the beginning of the stack would be a solution to keep it single-pass, but in combination with nested interrupts, this may be risky. I solved this with multiple passes in the compiler, and generate elegant code along the way.

Note this compiler is capable of register allocation across ?dup and similar constructs.
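The "use the smaller number of stack elements at the joint" rule can be modelled abstractly (a toy sketch, not Mecrisp-Across's actual algorithm; register names are invented, and the register-move instructions that reshuffle the other branch are omitted):

```python
# Toy model of re-joining two control-flow branches that keep a
# different number of stack elements in registers: the joint uses
# the smaller depth, so the deeper branch must spill its extra
# (bottom-most) elements to the memory stack, and both branches
# then reshuffle into a common register assignment.

def merge_states(regs_a, regs_b):
    """Each state is a list of registers holding stack elements,
    bottom of stack first. Returns (spills_a, spills_b, common):
    the registers each branch spills before the joint, and the
    register set both branches agree on afterwards."""
    depth = min(len(regs_a), len(regs_b))
    spills_a = regs_a[:len(regs_a) - depth]   # deeper branch spills
    spills_b = regs_b[:len(regs_b) - depth]   # (empty for the other)
    common = regs_a[len(regs_a) - depth:]     # branch B reshuffles
    return spills_a, spills_b, common         # into branch A's set

sa, sb, common = merge_states(["r4", "r5", "r6"], ["r7", "r8"])
```

In this example branch A holds three elements and branch B two, so A spills its bottom element and B's registers get moved into A's remaining assignment; discovering how much to spill is what can force the extra compiler passes described above.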
minf...@arcor.de
2023-01-18 09:54:03 UTC
Permalink
Post by Matthias Koch
I found multiple passes necessary in Mecrisp-Across to re-join different control flow branches that use different local register allocations without relying on canonical fallback. If the numbers of stack elements held in registers at the joint differ, one needs to use the smaller number of stack elements, which may require additional passes to achieve.
Using the larger number of elements causes non-existing elements to be popped from the stack, at least temporarily. Of course, in valid Forth code, these elements will be pushed back, but this may cause reads/writes beyond the beginning of the stack area in memory. Simply adding a guard space at the beginning of the stack would be a solution to keep it single-pass, but in combination with nested interrupts, this may be risky. I solved this with multiple passes in the compiler, and generate elegant code along the way.
Note this compiler is capable of register allocation across ?dup and similar constructs.
Looks like some 'spill and iterate' algorithm. Did you consider spilling costs?
I imagine that moves between registers and memory take MUCH more time
than some little register reshuffling at control flow joints.
Matthias Koch
2023-01-18 10:16:18 UTC
Permalink
Can you point me to a text describing the solution or an existing implementation? I am reshuffling registers as much as possible without spilling them to the stack. But I do not know how to handle the situation without partial spilling (and subsequent reshuffling of the common number of elements in registers) when I have a larger total amount of stack elements in registers than in the other branch, as different branches may have different-depth stack effects, and these elements do not exist on the stack at all for one of the two branches.
minf...@arcor.de
2023-01-18 11:17:44 UTC
Permalink
Post by Matthias Koch
Can you point me to a text describing the solution or an existing implementation? I am reshuffling registers as much as possible without spilling them to the stack. But I do not know how to handle the situation without partial spilling (and subsequent reshuffling of the common number of elements in registers) when I have a larger total amount of stack elements in registers than in the other branch, as different branches may have different-depth stack effects, and these elements do not exist on the stack at all for one of the two branches.
I am certainly much less experienced than you are in this field. I just happen to know
that Chaitin's algorithm includes spilling until a new liveness analysis comes up with
an acceptably colored graph. But AFAIK optimal spilling is still a matter of heuristics
or brute-force iteration. To this day there is no known 'formula' for it.

Hence my question, out of curiosity, whether you have gained new insights. It had not
been clear whether you do liveness analysis at all.

Add to this that spilling costs are not constant, given the caching effects in many CPUs.
I found only one lecture note that at least mentions this (but does not deal with it later on):
https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/lectures/17/Slides17.pdf

Of course it also remains a design decision where, for a small Forth compiler for
embedded systems, reasonable usability ends and overkill starts.
minf...@arcor.de
2023-01-18 11:46:59 UTC
Permalink
Post by ***@arcor.de
Post by Matthias Koch
Can you point me to a text describing the solution or an existing implementation? I am reshuffling registers as much as possible without spilling them to the stack. But I do not know how to handle the situation without partial spilling (and subsequent reshuffling of the common number of elements in registers) when I have a larger total amount of stack elements in registers than in the other branch, as different branches may have different-depth stack effects, and these elements do not exist on the stack at all for one of the two branches.
I am certainly much less experienced than you are in this field. I just happen to know
that Chaitin's algorithm includes spilling until a new liveness analysis comes up with
an acceptably colored graph. But AFAIK optimal spilling is still a matter of heuristics
or brute-force iteration. To this day there is no known 'formula' for it.
Hence my question, out of curiosity, whether you have gained new insights. It had not
been clear whether you do liveness analysis at all.
Add to this that spilling costs are not constant, given the caching effects in many CPUs.
https://web.stanford.edu/class/archive/cs/cs143/cs143.1128/lectures/17/Slides17.pdf
Of course it also remains a design decision where, for a small Forth compiler for
embedded systems, reasonable usability ends and overkill starts.
P.S. since you won't do graph coloring, perhaps this paper may be of more interest because
it also deals with data flow analysis:
http://www.christianwimmer.at/Publications/Wimmer10a/Wimmer10a.pdf
Matthias Koch
2023-01-18 12:18:05 UTC
Permalink
Post by ***@arcor.de
P.S. since you won't do graph coloring, perhaps this paper may be of more interest because
http://www.christianwimmer.at/Publications/Wimmer10a/Wimmer10a.pdf
Thanks, looks nice!
By the way, I do not even use SSA intermediate form.

Matthias Koch
2023-01-18 12:10:52 UTC
Permalink
These slides are a very nice intro to register allocation difficulties - but register spilling therein happens when "we have more elements than registers", whereas in Mecrisp-Across I have enough registers for most of the reasonable code. For example, 2OVER gets 6 deep, so for an embedded MSP430 Forth, it is OK to start spilling at 10 elements deep because of lack of registers. The variable liveness analysis is a different kind of beast than a stack. I do not try to perform liveness analysis at all.

What I do is: when stack elements are handled for the first time, they are popped from the stack(s) and put in registers, and the compiler tries to keep them in registers unless forced back to canonical state by non-optimiseable structures.

This is done in Mecrisp-Stellaris RA (data stack only) and Mecrisp-Quintus with the loadable Acrobatics compiler extension (both data and return stack) which is written in Forth itself, my recommendation if you want a starting point. These do canonical fallback when encountering control structures.

Mecrisp-Across is one step ahead and keeps the current set of registers when control flow forks execution, and tries to re-merge the two resulting register sets at the joint, if possible by reshuffling one of the branches to get back to a common register set. When not, it spills the additional elements from the larger branch (which may require an additional pass in my compiler), and reshuffles the remaining ones. This is not done because of lack of registers, but because of the nature of the stacks and the limited scope of the compiler. At the end of a definition, or before an unknown immediate word is executed, a full canonical fallback is done.

A curious mind out there wrote a nice blog post analysing the output of the Mecrisp-Across compiler:

https://joldosh.blogspot.com/2019/12/optimized-msp430-assembly-in-mecrisp.html

Certainly there is no caching, as I am targeting tiny classic MSP430 chips like the MSP430F2012 with Mecrisp-Across, therefore register spills are the same cost under all circumstances.

Caching effects apply when running Mecrisp-Stellaris or Mecrisp-Quintus hosted on ARM/RISC-V/MIPS Linux (and, thanks to Robert Clausecker, FreeBSD), but the family of Mecrisp Forths is designed for deep embedded use and the hosted ports are not the main focus of my efforts.