Discussion:
Toad using many vocabularies
sjack
2024-10-31 15:13:36 UTC
A while ago I reworked all my source to use many vocabularies to test
the FigForth vocabulary design. The many vocabularies were not so much
to deal with scoping but to break the dictionary search into many smaller
paths. I found it not to be much of a problem, but a little awareness of
what changes CONTEXT is essential.

Vocabulary usage demo


..CURRENT and CONTEXT are WRK BASE: 10 Latest: FIO
-VOC \ save original CURRENT
FIO \ set CONTEXT to FIO
open pad/txt/rubyatLI to hdi \ data input handle
create /tmp/foo to hdo \ data output handle
--
-- CONTEXT must be set to FIO in following compile because
-- colon start will reset CONTEXT to CURRENT.
-- After compile, CONTEXT will still be FIO.
--
{ FIO begin mpad0 512 read dup while write repeat 2drop }
{} \ run anonymous
0 close \ close data input
1 close \ close data output
-- See the created file
sys cat /tmp/foo
LI.
The Moving Finger writes; and, having writ,
Moves on: nor all thy Piety nor Wit
Shall lure it back to cancel half a Line,
Nor all thy Tears wash out a Word of it.

--
-- -VOC and +VOC were made to restore CONTEXT to saved CURRENT
-- for the benefit of FORGET MARKER and ANEW which require
-- CURRENT and CONTEXT to be the same.
--
+VOC \ restore CONTEXT to original CURRENT
{fin} \ release anonymous memory
..CURRENT and CONTEXT are WRK BASE: 10 Latest: FIO
--
-- -VOC saved original CURRENT
-- FIO set CONTEXT for open and create
-- FIO set CONTEXT in anonymous compile { ... }
-- CONTEXT is still FIO for close
-- +VOC restored CONTEXT to original CURRENT
-- ( CURRENT never changed in this example )

-fin-
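
For readers on a standard (Forth-2012) system, the save/restore idea in the demo can be approximated with the search-order words. This is only a sketch under assumptions: it is not Toad's source, the names SAVED-CURRENT, CONTEXT! and FIO-WID are made up for illustration, and the first wordlist in the search order is treated as the counterpart of FigForth's CONTEXT.

```forth
\ Hypothetical standard-Forth analogue of -VOC / +VOC (not Toad's code).
variable saved-current

\ Replace the first wordlist in the search order (the "CONTEXT" slot).
\ Assumes the search order is not empty.
: context! ( wid -- )  >r get-order nip r> swap set-order ;

: -voc ( -- )  get-current saved-current ! ;    \ save CURRENT
: +voc ( -- )  saved-current @ context! ;       \ CONTEXT := saved CURRENT

wordlist constant fio-wid   \ made-up stand-in for the FIO vocabulary

also                \ duplicate the top slot so the original order survives
-voc                \ remember CURRENT
fio-wid context!    \ "FIO": search fio-wid first
+voc                \ search the saved CURRENT first again
previous            \ drop the scratch slot
```

The ALSO ... PREVIOUS pair is there because a standard search order is a stack rather than a single CONTEXT cell; without it, overwriting the top slot could hide +VOC itself.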
mhx
2024-10-31 16:30:45 UTC
With vocabularies it quickly becomes a pain when,
e.g., defining words in A, switching to B to define
something there, go back to A and define some more
words ... At least it is when trying to FORGET stuff.
(Assuming you *have* FORGET).

It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )

How did you solve that complexity?

-marcel
sjack
2024-11-01 08:07:37 UTC
Post by mhx
(Assuming you *have* FORGET).
I have FORGET as well as MARKER and ANEW . FORGET was kept simple. MARKER
and ANEW will repair the voclink chain, prune vocabulary word lists and free
allocated buffers as needed. See the job scenario below for ANEW .
Post by mhx
It is also messy to write a definition that needs words from different
vocabularies.
Don't think I've had that joy yet. Most definitions only needed one or two
vocabularies. Perhaps for that application a custom vocabulary which
contains aliases of words from many other needed vocabularies?

Emailed you text to give you background.

I've been operating this way for about a month and a half, with some bumps
along the way but no major issues. Some key points are listed below.

i. Vocabularies made immediate
A vocabulary is often needed at compile time.
ii. These compile-time context changes do not occur at run-time.
ii. Often a context change at compile-time does not need to be
changed back in the compile if the other words are from a common
vocabulary (e.g. FORTH) linked to either the context or current.

i. Colon start changes CONTEXT to CURRENT
This is very nice. It means many definitions can compile in sequence
to the same CURRENT without regard to what CONTEXT was left as by
the previous compile. Of course, the last context change made will need
to be managed.

i. Private vocabularies have little conflict because they are only used by
one parent vocabulary.

i. Use a common vocabulary (MISC) for private words shared among a group of
vocabularies.

i. Support words that save CURRENT and later restore CONTEXT to it are a valuable aid.
Typical job scenario:
WRK DEFS \ set context and current to WRK
\ all words defined will reside in WRK
ANEW job \ clear memory for the job
fload job \ load and run job
-- In the job
-VOC \ save CURRENT
( make context changes in or out of definitions as needed )
+VOC \ restore CONTEXT to saved CURRENT
defs \ only needed if CURRENT was changed in the job
-- Out of the job
ANEW JOB \ clear memory for next job

Note that these support words make your job definitions relative. The
job can be loaded in a scratch (WRK) vocabulary and then forgotten or
loaded into a trunk (TOAD) or FORTH vocabulary for expansion.
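
The "vocabularies made immediate" point can be illustrated on a standard system. This is a sketch under assumptions, not Toad's implementation: FIO-WID, PUSH-ORDER, GREET and DEMO are made-up names, and because a standard colon does not reset the search order the way the FigForth colon resets CONTEXT, the order is restored by hand with PREVIOUS.

```forth
wordlist constant fio-wid                 \ made-up stand-in for FIO

\ Put wid on top of the search order.
: push-order ( wid -- )  >r get-order r> swap 1+ set-order ;

\ An immediate "vocabulary": it runs while compiling and only changes
\ the search order, so nothing happens at run time.
: fio ( -- )  fio-wid push-order ; immediate

\ Define one word inside fio-wid.
get-current  fio-wid set-current
: greet ( -- )  ." hello from fio" ;
set-current

: demo ( -- )  fio greet ;   \ FIO executes at compile time, so GREET is found
previous                     \ undo the order change left by FIO
demo                         \ prints: hello from fio
```

Pushing the wordlist instead of overwriting the top slot keeps `;` and the other Forth words reachable; in FigForth the same effect comes from the vocabulary being chained to its parent.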
sjack
2024-11-01 15:00:09 UTC
Post by mhx
(Assuming you *have* FORGET).
More correctly NIX doesn't repair a broken voc-link chain but fixes it
so that it's not broken after the dictionary is chopped. If the voc-link
chain becomes broken by some unrelated means, it will stay broken until
fixed by the user.

Overview
i. FORGET is kept simple; it chops the dictionary but does not prevent
a broken voclink chain, nor does it purge wordlists of loaded
vocabularies, nor does it free allocated buffers that no longer
have links in the chopped dictionary.

i. NIX is the main word for chopping the dictionary and restoring a
valid voclink chain, purging wordlists of remaining vocabularies
and freeing allocated buffers which no longer are linked to the
chopped dictionary.

i. [FORGET] is a factor of FORGET called by NIX to chop the dictionary.

i. -VOCLINK is called by NIX to walk voclink chain to restore it to
a valid start address.

i. -CONTEXT is called by -VOCLINK to purge wordlists in remaining
vocabularies.

i. -BUF is called by NIX to walk the buffer allocation list, freeing all
buffers no longer linked to the chopped dictionary.

i. MARKER's compile-time action saves LATEST; its run-time action performs NIX .

i. ANEW performs the existing marker and creates a new marker of the same name.
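
For comparison, ANEW is commonly layered on the standard MARKER as sketched below. This is the widely circulated definition, not Toad's: a standard marker restores the dictionary and search-order state, whereas the markers described above go through NIX and also free buffers. JOB and JOB-WORD are made-up names.

```forth
\ Commonly used ANEW on top of standard MARKER (not Toad's source).
: anew ( "name" -- )
  >in @                        \ remember where the name starts
  bl word find                 \ look the name up
  if execute else drop then    \ run the old marker if it exists
  >in !                        \ rewind so MARKER parses the same name
  marker ;

anew job                       \ first load: only creates the marker JOB
: job-word ( -- )  ." loaded" ;
anew job                       \ reload: forgets JOB and JOB-WORD, re-creates JOB
```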
Anton Ertl
2024-11-01 09:48:42 UTC
Post by mhx
With vocabularies it quickly becomes a pain when,
e.g., defining words in A, switching to B to define
something there, go back to A and define some more
words ... At least it is when trying to FORGET stuff.
(Assuming you *have* FORGET).
That's not particularly hard in the absence of sections: the forgotten
word has an address F, and FORGET just needs to go through all the
words and check if they are above or below F, and eliminate the words
above F from the data structures associated with the dictionary. You
need some way to enumerate all the words; e.g., in a system with a
linked-list implementation of wordlists, you would need a way to
enumerate every wordlist (e.g., the wordlists themselves could also be
organized as a linked list), and then walk through each linked list
until the first word below F is found, and make that the new head of
the wordlist. Wordlists that themselves have their data above F also
need to be removed, and in that case also from the search order and
from CURRENT.
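
The pruning walk described above can be tried out on a toy model. The sketch below is an assumption-laden illustration, not code from any real system: every name in it is made up, "words" are bare link cells, the lists live in a single section with increasing addresses, and no dictionary space is actually reclaimed.

```forth
\ Toy model: a wordlist is 2 cells ( newest node | link to older wordlist ),
\ a node is 1 cell ( link to older node in the same wordlist ).
variable model-voc-link  0 model-voc-link !   \ newest wordlist, 0 = none

: new-list ( -- wl )  align here  0 ,  model-voc-link @ ,  dup model-voc-link ! ;
: add-node ( wl -- )  align here  over @ ,  swap ! ;

\ True if addr is 0 (end of chain) or lies below the forget address f.
: below? ( f addr -- flag )  dup 0= if 2drop true exit then  u> ;

\ Make the first node below f the new head of wl.
: prune-list ( f wl -- )
  dup >r @
  begin 2dup below? 0= while @ repeat
  r> !  drop ;

: forget-above ( f -- )
  model-voc-link @
  begin 2dup below? 0= while cell+ @ repeat   \ unlink wordlists at or above f
  dup model-voc-link !
  begin dup while 2dup prune-list cell+ @ repeat   \ prune the survivors
  2drop ;

\ Usage: two old lists, then a forget point, then newer material.
new-list constant wl-a   new-list constant wl-b
wl-a add-node  wl-b add-node
here constant mark-f
wl-a add-node  new-list constant wl-c  wl-c add-node  wl-b add-node
mark-f forget-above
model-voc-link @ wl-b = .    \ expected -1: wl-c is unlinked
wl-a @ mark-f u< .           \ expected -1: head of wl-a is below the forget point
wl-b @ mark-f u< .           \ expected -1: same for wl-b
```

A real FORGET would additionally remove forgotten wordlists from the search order and from CURRENT, as the paragraph above notes.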

With sections, FORGET is a bigger problem, because the forgotten word
resides in one section, and provides only information about what is
newer or older in that section. One way to deal with that is to store
the HERE of every section with every word, but do we really want to go
to these lengths in order to support FORGET? MARKER is a better
interface in the presence of sections.

But even MARKER appears to be much more trouble than it is worth,
mainly because I consider its value to be 0. I have never used it in
production code.
Post by mhx
It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )
The search order provides a way to deal with that, especially if the
names in the wordlists don't conflict. Gforth (development) also
includes a scope recognizer, where you write, say, FOO:BAR, and it
uses the word BAR in vocabulary FOO:

vocabulary foo \ ok
also foo definitions previous \ ok
: bar ." bar" ; \ ok
bar
\ *the terminal*:4:1: error: Undefined word
\ >>>bar<<<
\ ...
foo:bar \ bar ok

Still, I usually find it preferable to have everything in the same
wordlist (FORTH-WORDLIST), which makes debugging easier. I did use a
wordlist in my garbage collector library where the internal words of
the library are defined in a separate wordlist that's not in the
search order in applications using that library. The interface words
of the library are defined in the default wordlist (FORTH-WORDLIST
unless you SET-CURRENT differently before loading the library).

I expect that with a very big program like that by CCS one would find
that the balance shifts towards using wordlists/vocabularies more.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2024: https://euro.theforth.net
Stephen Pelc
2024-11-01 12:06:50 UTC
Post by mhx
With vocabularies it quickly becomes a pain when,
e.g., defining words in A, switching to B to define
something there, go back to A and define some more
words ... At least it is when trying to FORGET stuff.
(Assuming you *have* FORGET).
The FORGET issue is mostly a red herring because of compile speed; for
code less than 100k lines we use EMPTY to clean the dictionary and then
start again. 1M lines of code take about 30 seconds to compile.
Post by mhx
It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )
How did you solve that complexity?
Gerald Wodni implemented the VOC-DOT notation for VFX as a
recogniser. To reference a word in another vocabulary, just use
<voc>.<word>
This notation has proven to be very useful, especially when dealing
with a range of byte-oriented serial devices, e.g:
i2c.emit
spi.emit

The notation also reads well. I have no idea who invented it originally
and where the original source code is.
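
Without a recogniser, a qualified reference can be approximated in standard Forth with SEARCH-WORDLIST. This is not the VFX VOC-DOT implementation; I2C-WID and IN-VOC are made-up names, and the sketch works in interpretation state only.

```forth
wordlist constant i2c-wid          \ made-up wordlist standing in for "i2c"

get-current  i2c-wid set-current
: emit ( c -- )  ." i2c<" emit ." >" ;   \ inner EMIT is still the standard one
set-current

\ Execute NAME from the given wordlist, regardless of the search order.
: in-voc ( i*x wid "name" -- j*x )
  parse-name rot search-wordlist
  0= abort" not found in that wordlist"
  execute ;

char A i2c-wid in-voc emit         \ prints: i2c<A>
```

A recogniser gives the same lookup a nicer spelling (i2c.emit) and makes it usable inside definitions.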

Stephen
--
Stephen Pelc, ***@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com
MPE website
http://www.vfxforth.com/downloads/VfxCommunity/
downloads
a***@spenarnc.xs4all.nl
2024-11-01 14:04:55 UTC
Post by Stephen Pelc
Post by mhx
With vocabularies it quickly becomes a pain when,
e.g., defining words in A, switching to B to define
something there, go back to A and define some more
words ... At least it is when trying to FORGET stuff.
(Assuming you *have* FORGET).
The FORGET issue is mostly a red herring because of compile speed; for
code less than 100k lines we use EMPTY to clean the dictionary and then
start again. 1M lines of code take about 30 seconds to compile.
Forget is a relic from the time you had blocks on mag tape and
you hoped to avoid the rewinding.
Post by Stephen Pelc
Post by mhx
It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )
How did you solve that complexity?
Gerald Wodni implemented the VOC-DOT notation for VFX as a
recogniser. To reference a word in another vocabulary, just use
<voc>.<word>
This notation has proven to be very useful, especially when dealing
with a range of byte-oriented serial devices, e.g:
i2c.emit
spi.emit
The notation also reads well. I have no idea who invented it originally
and where the original source code is.
Once the concept of PREFIX sinks in,
it is pretty trivial, (using voc's with builtin ALSO. 1] )

NAMESPACE APE
: APE. APE NAME EVALUATE PREVIOUS ;
( "NAME EVALUATE" is present under various names, i.e. WINTERPRET in noforth)
\ Now define something in the wordlist :
APE DEFINITIONS 12 CONSTANT ORANGUTAN PREVIOUS DEFINITIONS
( back in Forth)
APE. ORANGUTAN .
12 OK
: APE. APE NAME EVALUATE PREVIOUS ; PREFIX
APE.ORANGUTAN .
12 OK
: APE. APE NAME EVALUATE PREVIOUS ; PREFIX IMMEDIATE
: test APE.ORANGUTAN ;
test .
12 OK

All this is defined by facilities present in the simplest of Forth.
Except PREFIX , that cost two or three lines in a kernel. 2]


1] not essential, but I publish only tested code.
2] in a well designed Forth.
Groetjes Albert
--
Temu exploits Christians: (Disclaimer, only 10 apostles)
Last Supper Acrylic Suncatcher - 15Cm Round Stained Glass- Style Wall
Art For Home, Office And Garden Decor - Perfect For Windows, Bars,
And Gifts For Friends Family And Colleagues.
sjack
2024-11-01 15:08:04 UTC
FORGET is relic
I still use it for quickies:
0 VALUE FOO ' FOO @ FORGET FOO CONSTANT DOEVAL
0 0 2VALUE FOO ' FOO @ FORGET FOO CONSTANT DOE2VAL
0 0 2CONSTANT FOO ' FOO @ FORGET FOO CONSTANT DOE2CON
a***@spenarnc.xs4all.nl
2024-11-02 09:16:34 UTC
FORGET is relic
Puzzling. What the hell is hiding in the first cell of ' FOO ?
Assuming that it is even an address and not a token.


Groetjes Albert
sjack
2024-11-02 16:05:44 UTC
Post by a***@spenarnc.xs4all.nl
Puzzling. What the hell is hiding in the first cell of ' FOO ?
Pointer to code following a DOES> . The pointed-to address can be
used to identify the type of a defined word.
a***@spenarnc.xs4all.nl
2024-11-03 11:39:14 UTC
Post by sjack
Post by a***@spenarnc.xs4all.nl
Puzzling. What the hell is hiding in the first cell of ' FOO ?
Pointer to code following a DOES> . The pointed-to address can be
used to identify the type of a defined word.
This only serves to prove that FORGET is only useful in a specific context
that you are familiar with and is probably incompatible with other Forths.

Groetjes Albert
Ruvim
2024-11-01 14:28:08 UTC
Post by Stephen Pelc
Post by mhx
It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )
How did you solve that complexity?
Gerald Wodni implemented the VOC-DOT notation for VFX as a
recogniser. To reference a word in another vocabulary, just use
<voc>.<word>
This notation has proven to be very useful, especially when dealing
with a range of byte-oriented serial devices, e.g:
i2c.emit
spi.emit
The notation also reads well. I have no idea who invented it originally
and where the original source code is.
Such a syntax is used in SP-Forth/4 since 2001, in the form
<voc>::<word> or <voc1>::<voc2>::<word>

Where <voc> is a word that returns wid, or a word that is created with
`vocabulary`.

This syntax in SP-Forth probably came after C++ "::" operator,
introduced in 1998. The same operator was in C# from its initial
release in 2000.

The dot "." operator for accessing nested packages in Java was
introduced in 1995.

In Forth, a dot is often used as part of plain names, so it was less
suitable as a namespace separator.


--
Ruvim
Anthony Howe
2024-11-01 15:08:20 UTC
Post by Ruvim
Such a syntax is used in SP-Forth/4 since 2001, in the form <voc>::<word> or
<voc1>::<voc2>::<word>
Where <voc> is a word that returns wid, or a word that is created with
`vocabulary`.
This syntax in SP-Forth probably came after C++ "::" operator, introduced in
1998.  The same operator was in C# from its initial release in 2000.
The dot "." operator for accessing nested packages in Java was introduced in 1995.
In Forth, a dot is often used as part of plain names, so it was less suitable as
a namespace separator.
Yep. Bad idea to use dot.

Never liked `::`, found it visually distasteful.

Dare I suggest C's `->` which trumps others by decades.

Or simply create a new one like `:>` or `~` (oh I'll be stoned for this one).
I would prefer a single character, but there are so few good choices.
--
Anthony C Howe
***@snert.com BarricadeMX & Milters
http://nanozen.snert.com/ http://software.snert.com/
Ruvim
2024-11-01 17:35:39 UTC
Post by Ruvim
Such a syntax is used in SP-Forth/4 since 2001, in the form
<voc>::<word> or <voc1>::<voc2>::<word>
Where <voc> is a word that returns wid, or a word that is created with
`vocabulary`.
This syntax in SP-Forth probably came after C++ "::" operator,
introduced in 1998.  The same operator was in C# from its initial
release in 2000.
The dot "." operator for accessing nested packages in Java was introduced in 1995.
In Forth, a dot is often used as part of plain names, so it was less
suitable as a namespace separator.
Post by Anthony Howe
Yep. Bad idea to use dot.
Do you mean to use dot in names or to use dot as a namespace separator
in Forth?
Post by Anthony Howe
Never liked `::`, found it visually distasteful.
Agreed. But there are no very good choices, as you mentioned below ))
Post by Anthony Howe
Dare I suggest C's `->` which trumps others by decades.
BTW, as far as I understand, the `->` operator in C is not about
namespaces, but about dereferencing a pointer and accessing a structure
member.

So, `ptr->field` is a shorthand for `(*ptr).field`

That is, `ptr` is not a namespace. The namespace (in which the name
`field` is resolved) is inferred from the *data type* of `ptr`.
Post by Anthony Howe
Or simply create a new one like `:>` or `~` (oh I'll be stoned for this
one). I would prefer a single character, but there are so few good choices.
At the moment, I like the XML namespaces syntax the best.

The idea is that we associate a long namespace identifier with a short
prefix, which is only valid within its lexical scope.

The syntax is: <prefix>:<name>

Since a prefix is valid in a limited scope, it can be short without
risk of conflicts/clashes. Typically, only a few sibling prefixes are
used in a module.

So, a name is qualified by its namespace using only one short prefix,
regardless how deep this namespace is nested.

Of course, there can be some predefined prefixes (which can still be
shadowed).


For example, I want to use a module:
<https://github.com/ForthHub/fep-recognizer/blob/master/implementation/lib/string-match.fth>

There can be a special mapping `github` to treat modules from GitHub.

At an API level it could be like this:

"github:ForthHub/fep-recognizer/implementation/lib/string-match.fth"
"str" module:push-prefix

"foobar" "foo" str:starts-with . \ should print -1

module:drop-prefix


Under the hood, the system should download the package
"github:ForthHub/fep-recognizer" (if it isn't already cached), then the
module "implementation/lib/string-match.fth" from this package should be
instantiated in memory in its own word list (if it hasn't been already), then
the prefix "str" is associated with the word list of this module.

The lexeme "str:starts-with" is processed by the recognizer of prefixes,
which extracts the prefix "str", obtains the corresponding word list,
finds "starts-with" in that word list, and returns the name token for
the word and the name token translator, as `( nt tt-nt )`. The Forth
text interpreter executes `tt-nt` (which is an xt) to perform the
compilation semantics or interpretation semantics for the word according
to the current state.
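
The split-and-look-up step of such a prefix recognizer can be sketched in standard Forth. This is not the recognizer API described above: it returns the `( xt flag )` result of SEARCH-WORDLIST rather than `( nt tt-nt )`, it assumes the prefix is an ordinary word that returns a wid (as in the SP-Forth description earlier), it must run in interpretation state, and all names (COLON-INDEX, RESOLVE, RUN-QUALIFIED, DEMO-NS) are made up.

```forth
\ Position of the first ':' in the string, or -1 if there is none.
: colon-index ( c-addr u -- i | -1 )
  0 ?do
    dup i + c@ [char] : = if drop i unloop exit then
  loop
  drop -1 ;

\ Split "prefix:name", run PREFIX to get a wid, look NAME up in it.
\ Assumes a non-empty prefix that is a known word returning a wid.
: resolve ( c-addr u -- xt flag | 0 )
  2dup colon-index                     ( c-addr u i )
  dup 0< if drop 2drop 0 exit then     \ no ':' at all
  >r  over r@ evaluate                 ( c-addr u wid )
  r> 1+ swap >r  /string  r>           ( c-addr' u' wid )
  search-wordlist ;

: run-qualified ( c-addr u -- )
  resolve if execute else ." not found" then ;

\ Usage with a made-up namespace.
wordlist constant demo-ns
get-current  demo-ns set-current
: hello ( -- )  ." hello" ;
set-current

s" demo-ns:hello" run-qualified        \ prints: hello
```

In the scheme described above, the prefix would instead come from a lexically scoped prefix table, and the result would be handed back to the text interpreter as a token plus translator.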


--
Ruvim
Ruvim
2024-11-01 15:14:45 UTC
Post by Ruvim
Post by Stephen Pelc
Post by mhx
It is also messy to write a definition that needs
words from different vocabularies.
( Like a book with footnotes that span multiple pages,
or where a chapter can not be read on its own. )
How did you solve that complexity?
Gerald Wodni implemented the VOC-DOT notation for VFX as a
recogniser. To reference a word in another vocabulary, just use
   <voc>.<word>
This notation has proven to be very useful, especially when dealing
with a range of byte-oriented serial devices, e.g:
   i2c.emit
   spi.emit
The notation also reads well. I have no idea who invented it originally
and where the original source code is.
Such a syntax is used in SP-Forth/4 since 2001, in the form
<voc>::<word> or <voc1>::<voc2>::<word>
Where <voc> is a word that returns wid, or a word that is created with
`vocabulary`.
This syntax in SP-Forth probably came after C++ "::" operator,
introduced in 1998.  The same operator was in C# from its initial
release in 2000.
The dot "." operator for accessing nested packages in Java was
introduced in 1995.
In Forth, a dot is often used as part of plain names, so it was less
suitable as a namespace separator.
Another piece of history.

In Tcl, the sequence "::" for accessing namespaces was introduced in
1997[1].

In Erlang, the sequence ":" (sic, one colon) for accessing namespaces
(which are essentially modules) has been available since its initial release in 1995.


[1] <http://tcl.tk/software/tcltk/8.0.html#incompatibilities>


--
Ruvim
sjack
2024-11-01 15:16:09 UTC
Post by Ruvim
In Forth, a dot is often used as part of plain names, so it was less
suitable as a namespace separator.
In FigForth ID. prints word name. In Toad plans are to change printing
words to likewise have the dot at the word name end; dot at the
beginning of name to suggest an executable that has a reference.