Discussion:
Differentiable Forth
mhx
2024-07-16 09:53:41 UTC
Permalink
Unfortunately, this wasn't about what I thought it would be about.
Nevertheless, an interesting read: https://arxiv.org/pdf/1605.06640v1 .

-marcel
legalize+ (Richard)
2024-07-16 16:21:24 UTC
Permalink
[Please do not mail me a copy of your followup]
Post by mhx
Unfortunately, this wasn't about what I thought it would be about.
Nevertheless, an interesting read: https://arxiv.org/pdf/1605.06640v1 .
Just out of curiosity, what did you think it would be about?
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
mhx
2024-07-16 18:10:01 UTC
Permalink
Some programming languages (Julia, .. ) claim to be able
to generate the derivative of *any* numeric computation
with a cool trick (delta in parallel variable, accessed
like COMPLEX). I'd like to experiment with such a feature.

-marcel
legalize+ (Richard)
2024-07-16 22:18:04 UTC
Permalink
[Please do not mail me a copy of your followup]
Post by mhx
Some programming languages (Julia, .. ) claim to be able
to generate the derivative of *any* numeric computation
with a cool trick (delta in parallel variable, accessed
like COMPLEX). I'd like to experiment with such a feature.
Have you looked at slang? <https://shader-slang.com/>

-- Richard
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
mhx
2024-07-16 23:00:28 UTC
Permalink
Post by legalize+ (Richard)
Have you looked at slang? <https://shader-slang.com/>
It's the wrong type of differentiation again. AI-people
are overloading our vocabulary, redefining our words,
and overflowing our stacks.

-marcel
Paul Rubin
2024-07-16 23:12:50 UTC
Permalink
It's the wrong type of differentiation again. AI-people are
overloading our vocabulary, redefining our words, and overflowing our
stacks.
The automatic differentiation there looks like what you were talking
about, built into the language sort of like some languages have built in
complex numbers.

Here is a 2005 article about AD that you might like:

http://blog.sigfpe.com/2005/07/automatic-differentiation.html

It mentions at the end that you can approximate this method using
complex numbers. Just pretend that the imaginary part is nilpotent, I
guess.
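That nilpotent-imaginary-part trick is usually called the "complex-step" derivative. A minimal Python sketch of it (the function name here is just illustrative):

```python
import cmath
import math

def complex_step_derivative(f, x, h=1e-20):
    # For real-analytic f, Im f(x + i*h) = h*f'(x) + O(h^3); dividing
    # by h recovers f'(x) with no subtractive cancellation, because
    # the tiny imaginary part is never subtracted from anything large.
    return f(complex(x, h)).imag / h

# d/dx sin(x) at x = 1 is cos(1)
d = complex_step_derivative(cmath.sin, 1.0)
print(abs(d - math.cos(1.0)) < 1e-12)  # True
```

Unlike a finite difference, h can be taken absurdly small, so the result is accurate to machine precision; the pretense is exactly that i*h behaves like a nilpotent eps.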
minforth
2024-07-17 02:59:10 UTC
Permalink
AD is a powerful method when you need the derivatives of calculated
(not measured) functions. But if you don't know what a Jacobian
matrix is, you probably won't need AD.

There is a whole website dedicated to AD, tools and applications:
https://www.autodiff.org/

Projected onto Forth, an autodiff wordset could be thought of as similar
to a complex number wordset mirroring the standard fp-number wordset.
However, care must be taken with non-differentiable regions in
e.g. fabs, fsqrt and some (inverse) trigonometric functions.
(AD libraries in C++ use operator overloading)
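The operator-overloading approach those C++ libraries take can be sketched in a few lines of Python; the class and helper names here are illustrative, not any particular library's API:

```python
import math

class Dual:
    """A dual number a + b*eps with eps**2 == 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule lives in the eps part: (a+b.eps)(c+d.eps) = ac + (ad+bc).eps
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    __rmul__ = __mul__

def dsin(x):  # sin lifted to dual numbers: d/dx sin = cos
    return Dual(math.sin(x.a), math.cos(x.a) * x.b)

def derivative(f, x):
    # seed the eps part with 1 and read the derivative back out
    return f(Dual(x, 1.0)).b

# f(x) = x*sin(x); f'(x) = sin(x) + x*cos(x)
d = derivative(lambda x: x * dsin(x), 2.0)
print(abs(d - (math.sin(2.0) + 2.0 * math.cos(2.0))) < 1e-12)  # True
```

The seeding step is exactly what a Forth f>dl_d-style word would do on the fp-stack.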
mhx
2024-07-17 05:12:49 UTC
Permalink
Post by minforth
https://www.autodiff.org/
Useful!
Post by minforth
Projected onto Forth, an autodiff wordset could be thought of as similar
to a complex number wordset mirroring the standard fp-number wordset.
That is just what I hoped to find. The Julia blurb did not mention
complex number similarities (probably I should've dug deeper).
Post by minforth
However, care must be taken with non-differentiable regions in
e.g. fabs, fsqrt and some (inverse) trigonometric functions.
(AD libraries in C++ use operator overloading)
Exactly the problem I face when applying state-space modeling to
power electronics combined with digital control. It can be
handled relatively easily because everything already revolves
around knowing when such discontinuities happen.

-marcel
minforth
2024-07-17 06:13:16 UTC
Permalink
See intro in chapter 1.6:
https://github.com/coin-or/ADOL-C/blob/master/ADOL-C/doc/adolc-manual.pdf

Spice state space modelling with AD in chapter 3.2ff:
https://vision.lakeheadu.ca/publications/freeda_proc.pdf
legalize+ (Richard)
2024-07-17 21:16:12 UTC
Permalink
[Please do not mail me a copy of your followup]
Post by mhx
Post by legalize+ (Richard)
Have you looked at slang? <https://shader-slang.com/>
It's the wrong type of differentiation again. AI-people
are overloading our vocabulary, redefining our words,
and overflowing our stacks.
Well, it's "differentiable" in the mathematical sense so that
gradients can be computed to drive fitness algorithms.

In what sense did you mean "differentiable"?
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
ahmed
2024-07-17 15:26:54 UTC
Permalink
Hi,
Here is a program that gives examples of using dual numbers to calculate
derivatives
(it is not optimized,..., just a proof of concept)

The code begins here

: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ;
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;


: dl. ( f: a b -- ) fswap f. ." + eps " f. ;
: dl(,). ( f: a b -- ) ." (" fswap f. ." , " f. ." )" ;

: dl+ ( f: a b c d -- a+c b+d)
frot ( f: a c d b)
f+ ( f: a c b+d)
-frot ( f: b+d a c)
f+ ( f: b+d a+c)
fswap ( f: a+c b+d) ;

: dl- ( f: a b c d -- a-c b-d)
frot ( f: a c d b)
fswap f- ( f: a c b-d)
-frot ( f: b-d a c)
f- ( f: b-d a-c)
fswap ( f: a-c b-d) ;


fvariable b*c
: dl* ( f: a b c d -- a*c a*d+b*c)
-frot ( f: a d b c)
ftuck f* ( f: a d c b*c)
b*c f! ( f: a d c)
frot ( f: d c a)
ftuck ( f: d a c a)
f* ( f: d a a*c)
-frot ( f: a*c d a)
f* b*c f@ f+ ( f: a*c a*d+b*c)
;

: 1/dl ( f: a b -- 1/a -b*1/a^2)
fswap 1/f ( f: b 1/a)
ftuck ( f: 1/a b 1/a)
fdup f* ( f: 1/a b 1/a^2)
f* fnegate ( f: 1/a -b/a^2)
;

: dl/ ( f: a b c d -- a/c [b*c-a*d]/c^2)
1/dl dl*
;


fvariable dl^f_c
: dl^f ( f: a b c -- a^c b*c*a^[c-1])
fdup dl^f_c f! ( f: a b c)
frot ftuck ( f: b a c a)
fswap 1e f- f** ( f: b a a^[c-1])
ftuck f* ( f: b a^[c-1] a^c)
-frot f* ( f: a^c b*a^[c-1])
dl^f_c f@ f* ( f: a^c b*c*a^[c-1])
;

\
: dldup ( f: a b -- a b a b) fover fover ;
: dlnip ( f: a b c d -- c d) frot fdrop frot fdrop ;
: dldrop ( f: a b -- ) fdrop fdrop ;

fvariable dlswap_temp
: dlswap ( f: a b c d -- c d a b)
frot dlswap_temp f! ( f: a c d)
frot ( f: c d a)
dlswap_temp f@ ;

fvariable dlover_temp1
fvariable dlover_temp2
: dlover ( f: a b c d -- a b c d a b)
dlswap ( f: a b c d -- c d a b)
dlover_temp2 f! dlover_temp1 f! ( f: c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: c d a b)
dlswap ( f: a b c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: a b c d a b)
;


: dltuck dlswap dlover ;

\ dual number functions of dual number variables

: dlvariable create 2 floats allot ;
: dl! ( dlvar -- ) ( f: f1 f2 -- ) dup float+ f! f! ;
: dl@ ( dlvar -- ) ( f: -- f1 f2) dup f@ float+ f@ ;


: dlident ( f: a b -- a b) ;
: dlsin ( f: a b -- c d) fswap fdup fsin fswap fcos frot f* ;
: dlcos ( f: a b -- c d) fswap fdup fcos fswap fsin fnegate frot f* ;
: dlexp ( f: a b -- c d) fswap fdup fexp fswap fexp frot f* ;
: dlln ( f: a b -- c d) fswap fdup fln fswap 1/f frot f* ;
\ ..... add others





\ derivatives
variable func
: der 1e ' func ! func @ execute dl>eps ;

\ examples
1 [if]
: dlf() 1e 0e dl+ ; \ f(x) = x + 1
: dlf2() dldup dl* ; \ f2(x) = x^2


: dlf3() dldup dlf2() dlswap dlf() dl+ ; \ f3(x) = x^2 + x + 1
: der_f3() ( f: x -- y) 2e f* 1e f+ ; \ d/dx(f3) = 2*x + 1
cr 1e der dlf3() f. \ 3. ok
cr 1e der_f3() f. \ 3. ok
cr cr

: dlf4() dlf3() dlexp ; \ f4(x) = exp(x^2+x+1)
: der_f4() ( f: x -- y) \ d/dx(f4) = (2*x+1)*exp(x^2+x+1)
fdup 2e f* 1e f+ fswap
fdup fdup f* f+ 1e f+ fexp f*
;
cr 2e der dlf4() f. \ 5483.16579214229 ok, calculated at 2e
cr 2e der_f4() f. \ 5483.16579214229 ok
cr cr

: dlf5() 1/dl ;
: der_f5() ( f: x -- y) \ d/dx(f5) = -1/x^2
fdup f* 1/f fnegate
;

cr 2e der dlf5() f. \ -0.25 ok calculated at 2e
cr 2e der_f5() f. \ -0.25
cr cr


: dlf6() dldup dldup dl* dlswap dlsin dl+ 1/dl ; \ f6(x) = 1/(x^2+sin(x))
\ using the derivative calculated analytically: d/dx f6(x) = -(2*x+cos(x))/(x^2+sin(x))^2
: der_f6() ( f: x -- y) fdup fdup fdup f* fswap fsin f+ fdup f* 1/f
fswap fdup 2e f* fswap fcos f+ f* fnegate ;

cr 1e der dlf6() f. \ -0.749127330692909 ok calculated at x = 1
cr 1e der_f6() f. \ -0.749127330692909 ok
cr cr


: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x --y) \ its deriv: d/dx(f7) = x*cos(x^2)-2*x*sin(x^2)
fdup fdup f* ( f: x x^2)
fsincos ( f: x s c )
frot ( f: s c x)
ftuck ( f: s x c x)
f* ( f: s x cx)
-frot ( f: cx s x)
f* 2e f* f-
;

cr 1e der_f7() f. \ -1.14263966374765 ok calculated at 1e
cr 1e der dlf7() f. \ -1.14263966374765 ok


cr cr cr

[then]

The code ends here.


Ahmed
ahmed
2024-07-17 16:22:57 UTC
Permalink
On Wed, 17 Jul 2024 15:26:54 +0000, ahmed wrote:
..
Post by ahmed
: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x --y) \ its deriv: d/dx(f7) = x*cos(x^2)-2*x*sin(x^2)
fdup fdup f* ( f: x x^2)
fsincos ( f: x s c )
frot ( f: s c x)
ftuck ( f: s x c x)
f* ( f: s x cx)
-frot ( f: cx s x)
f* 2e f* f-
;
cr 1e der_f7() f. \ -1.14263966374765 ok calculated at 1e
cr 1e der dlf7() f. \ -1.14263966374765 ok
There is an error in der_f7() analytic formula.

the correct version is hereafter:


: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x -- y) \ its deriv: d/dx(f7) = cos(x^2)-2*x^2*sin(x^2)
fdup f* ( f: x^2)
fdup fsincos ( f: x^2 s c )
-frot ( f: c x^2 s )
f* 2e f* ( f: c 2s*x^2)
f-
;

cr 2e der_f7() f. \ 5.40077634159981 ok calculated at 2e
cr 2e der dlf7() f. \ 5.40077634159981 ok


cr cr cr
ahmed
2024-07-17 16:24:01 UTC
Permalink
So the whole program is here (with the correction of der_f7())

: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ;
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;


: dl. ( f: a b -- ) fswap f. ." + eps " f. ;
: dl(,). ( f: a b -- ) ." (" fswap f. ." , " f. ." )" ;

: dl+ ( f: a b c d -- a+c b+d)
frot ( f: a c d b)
f+ ( f: a c b+d)
-frot ( f: b+d a c)
f+ ( f: b+d a+c)
fswap ( f: a+c b+d) ;

: dl- ( f: a b c d -- a-c b-d)
frot ( f: a c d b)
fswap f- ( f: a c b-d)
-frot ( f: b-d a c)
f- ( f: b-d a-c)
fswap ( f: a-c b-d) ;


fvariable b*c
: dl* ( f: a b c d -- a*c a*d+b*c)
-frot ( f: a d b c)
ftuck f* ( f: a d c b*c)
b*c f! ( f: a d c)
frot ( f: d c a)
ftuck ( f: d a c a)
f* ( f: d a a*c)
-frot ( f: a*c d a)
f* b*c f@ f+ ( f: a*c a*d+b*c)
;

: 1/dl ( f: a b -- 1/a -b*1/a^2)
fswap 1/f ( f: b 1/a)
ftuck ( f: 1/a b 1/a)
fdup f* ( f: 1/a b 1/a^2)
f* fnegate ( f: 1/a -b/a^2)
;

: dl/ ( f: a b c d -- a/c [b*c-a*d]/c^2)
1/dl dl*
;


fvariable dl^f_c
: dl^f ( f: a b c -- a^c b*c*a^[c-1])
fdup dl^f_c f! ( f: a b c)
frot ftuck ( f: b a c a)
fswap 1e f- f** ( f: b a a^[c-1])
ftuck f* ( f: b a^[c-1] a^c)
-frot f* ( f: a^c b*a^[c-1])
dl^f_c f@ f* ( f: a^c b*c*a^[c-1])
;

\
: dldup ( f: a b -- a b a b) fover fover ;
: dlnip ( f: a b c d -- c d) frot fdrop frot fdrop ;
: dldrop ( f: a b -- ) fdrop fdrop ;

fvariable dlswap_temp
: dlswap ( f: a b c d -- c d a b)
frot dlswap_temp f! ( f: a c d)
frot ( f: c d a)
dlswap_temp f@ ;

fvariable dlover_temp1
fvariable dlover_temp2
: dlover ( f: a b c d -- a b c d a b)
dlswap ( f: a b c d -- c d a b)
dlover_temp2 f! dlover_temp1 f! ( f: c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: c d a b)
dlswap ( f: a b c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: a b c d a b)
;


: dltuck dlswap dlover ;

\ dual number functions of dual number variables

: dlvariable create 2 floats allot ;
: dl! ( dlvar -- ) ( f: f1 f2 -- ) dup float+ f! f! ;
: dl@ ( dlvar -- ) ( f: -- f1 f2) dup f@ float+ f@ ;


: dlident ( f: a b -- a b) ;
: dlsin ( f: a b -- c d) fswap fdup fsin fswap fcos frot f* ;
: dlcos ( f: a b -- c d) fswap fdup fcos fswap fsin fnegate frot f* ;
: dlexp ( f: a b -- c d) fswap fdup fexp fswap fexp frot f* ;
: dlln ( f: a b -- c d) fswap fdup fln fswap 1/f frot f* ;
\ ..... add others





\ derivatives
variable func
: der 1e ' func ! func @ execute dl>eps ;

\ examples
1 [if]
: dlf() 1e 0e dl+ ; \ f(x) = x + 1
: dlf2() dldup dl* ; \ f2(x) = x^2


: dlf3() dldup dlf2() dlswap dlf() dl+ ; \ f3(x) = x^2 + x + 1
: der_f3() ( f: x -- y) 2e f* 1e f+ ; \ d/dx(f3) = 2*x + 1
cr 1e der dlf3() f. \ 3. ok
cr 1e der_f3() f. \ 3. ok
cr cr

: dlf4() dlf3() dlexp ; \ f4(x) = exp(x^2+x+1)
: der_f4() ( f: x -- y) \ d/dx(f4) = (2*x+1)*exp(x^2+x+1)
fdup 2e f* 1e f+ fswap
fdup fdup f* f+ 1e f+ fexp f*
;
cr 2e der dlf4() f. \ 5483.16579214229 ok, calculated at 2e
cr 2e der_f4() f. \ 5483.16579214229 ok
cr cr

: dlf5() 1/dl ;
: der_f5() ( f: x -- y) \ d/dx(f5) = -1/x^2
fdup f* 1/f fnegate
;

cr 2e der dlf5() f. \ -0.25 ok calculated at 2e
cr 2e der_f5() f. \ -0.25
cr cr


: dlf6() dldup dldup dl* dlswap dlsin dl+ 1/dl ; \ f6(x) = 1/(x^2+sin(x))
\ using the derivative calculated analytically: d/dx f6(x) = -(2*x+cos(x))/(x^2+sin(x))^2
: der_f6() ( f: x -- y) fdup fdup fdup f* fswap fsin f+ fdup f* 1/f
fswap fdup 2e f* fswap fcos f+ f* fnegate ;

cr 1e der dlf6() f. \ -0.749127330692909 ok calculated at x = 1
cr 1e der_f6() f. \ -0.749127330692909 ok
cr cr


: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x -- y) \ its deriv: d/dx(f7) = cos(x^2)-2*x^2*sin(x^2)
fdup f* ( f: x^2)
fdup fsincos ( f: x^2 s c )
-frot ( f: c x^2 s )
f* 2e f* ( f: c 2s*x^2)
f-
;

cr 1e der_f7() f. \ -1.14263966374765 ok calculated at 1e
cr 1e der dlf7() f. \ -1.14263966374765 ok
cr cr

cr 2e der_f7() f. \ 5.40077634159981 ok calculated at 2e
cr 2e der dlf7() f. \ 5.40077634159981 ok

cr cr cr

[then]

Ahmed
mhx
2024-07-17 16:42:41 UTC
Permalink
Post by ahmed
So the whole program is here (with the correction of der_f7())
: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ;
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;
[..]
Post by ahmed
[then]
Ahmed
Problem solved on one page of Forth :--)

It might already be useful for implementing
a behavioral source in iSPICE.

Try the exponential matrix ( expm(A) where A
is a square matrix ) next.

-marcel
ahmed
2024-07-17 17:17:23 UTC
Permalink
On Wed, 17 Jul 2024 16:42:41 +0000, mhx wrote:

..
Post by mhx
Try the exponential matrix ( expm(A) where A
is a square matrix ) next.
-marcel
By expm(A) do you mean expm(A*t), where t is time (the independent
variable) and A the matrix in the state model:

x_dot = A*x + B*u
y = C*x + D*u

and PHI = expm(A*t) the transition matrix that permits obtaining the
time solution of the state model?

Or do you mean exactly expm(A), where A is the independent variable?
In the latter case I don't know how to calculate derivatives with
respect to matrices analytically.

In the former case, time t is real; we pass to the dual number
(t, 1e) = t + 1e*eps, where eps^2 = 0, then use the formulas known
for floating-point numbers, generate versions for dual numbers, and
use them as in the posted program.


Ahmed
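The former-case recipe (pass t in as the dual number (t, 1e) and reuse the ordinary formulas) works for a series implementation of expm(A*t), since the series is built entirely from sums and products. A small Python sketch using 2x2 matrices of (value, derivative) pairs; the rotation generator, the 30-term truncation, and all names are my illustration choices:

```python
import math

# dual numbers as (value, derivative) pairs with eps**2 = 0
def dadd(x, y): return (x[0] + y[0], x[1] + y[1])
def dmul(x, y): return (x[0] * y[0], x[0] * y[1] + x[1] * y[0])
def dscale(c, x): return (c * x[0], c * x[1])

def mat_mul(M, N):  # 2x2 matrices of dual scalars
    return [[dadd(dmul(M[i][0], N[0][j]), dmul(M[i][1], N[1][j]))
             for j in range(2)] for i in range(2)]

def expm_dual(A, t, terms=30):
    # expm(A*t) by truncated Taylor series, with t = (t, 1) a dual
    # scalar, so every entry of the result carries its d/dt alongside.
    td = (t, 1.0)
    At = [[dmul((A[i][j], 0.0), td) for j in range(2)] for i in range(2)]
    S = [[(1.0 if i == j else 0.0, 0.0) for j in range(2)] for i in range(2)]
    P = [row[:] for row in S]          # running term (A*t)^k / k!
    for k in range(1, terms):
        P = mat_mul(P, At)
        P = [[dscale(1.0 / k, P[i][j]) for j in range(2)] for i in range(2)]
        S = [[dadd(S[i][j], P[i][j]) for j in range(2)] for i in range(2)]
    return S

# A generates rotations: expm(A*t) = [[cos t, sin t], [-sin t, cos t]],
# so the (0,0) entry at t = 0.7 should be (cos 0.7, -sin 0.7).
A = [[0.0, 1.0], [-1.0, 0.0]]
S = expm_dual(A, 0.7)
print(abs(S[0][0][0] - math.cos(0.7)) < 1e-9)  # True
print(abs(S[0][0][1] + math.sin(0.7)) < 1e-9)  # True
```

This also matches the identity d/dt expm(A*t) = A*expm(A*t), which gives an independent check.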
minforth
2024-07-17 18:55:49 UTC
Permalink
Correct, time is scalar and positive and as such propagates through
differentiation. However, dt can become negative when doing error
correction during reverse accumulation.
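For readers who haven't met reverse accumulation: instead of seeding one input at a time as in the forward (dual-number) mode, it records the computation and sweeps backwards once, yielding all partial derivatives together. A toy Python sketch with illustrative names (a real implementation would sweep in topological order; this suffices for the small example):

```python
import math

class Var:
    """One node of a computation tape: value, parent links with local
    partial derivatives, and an adjoint accumulated by backward()."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
    def __add__(self, o):
        return Var(self.value + o.value, [(self, 1.0), (o, 1.0)])
    def __mul__(self, o):
        return Var(self.value * o.value, [(self, o.value), (o, self.value)])

def vsin(x):
    return Var(math.sin(x.value), [(x, math.cos(x.value))])

def backward(out):
    # push adjoints from the output back toward the leaves
    out.grad = 1.0
    stack = [out]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += node.grad * local
            stack.append(parent)

# f(x, y) = x*y + sin(x): one backward sweep gives both partials
x, y = Var(2.0), Var(3.0)
z = x * y + vsin(x)
backward(z)
print(abs(x.grad - (3.0 + math.cos(2.0))) < 1e-12)  # True
print(abs(y.grad - 2.0) < 1e-12)                    # True
```

Forward mode needs one pass per input variable; reverse mode needs one pass per output, which is why it wins for gradients of many-input scalar functions.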
ahmed
2024-07-18 17:31:18 UTC
Permalink
Some documentation (pdf slides,...) about automatic differentiation:

https://cseweb.ucsd.edu/~tzli/cse291/sp2024/lectures/

Ahmed
Paul Rubin
2024-07-18 22:14:28 UTC
Permalink
Some documentation (pdf slides,...) about automatic differentiation: ...
The course description is interesting too:

https://cseweb.ucsd.edu/~tzli/cse291/sp2024/
minforth
2024-07-19 05:42:11 UTC
Permalink
Looks to me like the basics for "AI making AI-powered robots".

Unfortunately, a large part of our student generation seems
more concerned with wokism, fighting pseudo-Nazis and climate
change, and other 'righteous' battles. While pursuing their
idealisms, I fear they will be swept away by technological
advances they should better be following and learning.

Putting aside this "ranting grandpa attitude", the course
deals with new compiler and optimisation techniques through
introspection and mathematical description of software source
code. I wonder if a tokenised Forth would be a good intermediate
language to study such techniques (although calling Forth a
differentiable language would be a long stretch).
ahmed
2024-07-19 07:39:04 UTC
Permalink
I added some words and examples for the case of functions with two
variables.
See functions f9() and f10() in the examples

Here the code begins.

: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ; \ real to dual value
: f>dl_d ( f: a -- a 1) 1e ; \ real to dual value of the variable to differentiate with respect to
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;


: dl. ( f: a b -- ) fswap f. ." + eps " f. ;
: dl(,). ( f: a b -- ) ." (" fswap f. ." , " f. ." )" ;

: dl+ ( f: a b c d -- a+c b+d)
frot ( f: a c d b)
f+ ( f: a c b+d)
-frot ( f: b+d a c)
f+ ( f: b+d a+c)
fswap ( f: a+c b+d) ;

: dl- ( f: a b c d -- a-c b-d)
frot ( f: a c d b)
fswap f- ( f: a c b-d)
-frot ( f: b-d a c)
f- ( f: b-d a-c)
fswap ( f: a-c b-d) ;


fvariable b*c
: dl* ( f: a b c d -- a*c a*d+b*c)
-frot ( f: a d b c)
ftuck f* ( f: a d c b*c)
b*c f! ( f: a d c)
frot ( f: d c a)
ftuck ( f: d a c a)
f* ( f: d a a*c)
-frot ( f: a*c d a)
f* b*c f@ f+ ( f: a*c a*d+b*c)
;

: 1/dl ( f: a b -- 1/a -b*1/a^2)
fswap 1/f ( f: b 1/a)
ftuck ( f: 1/a b 1/a)
fdup f* ( f: 1/a b 1/a^2)
f* fnegate ( f: 1/a -b/a^2)
;

: dl/ ( f: a b c d -- a/c [b*c-a*d]/c^2)
1/dl dl*
;


fvariable dl^f_c
: dl^f ( f: a b c -- a^c b*c*a^[c-1])
fdup dl^f_c f! ( f: a b c)
frot ftuck ( f: b a c a)
fswap 1e f- f** ( f: b a a^[c-1])
ftuck f* ( f: b a^[c-1] a^c)
-frot f* ( f: a^c b*a^[c-1])
dl^f_c f@ f* ( f: a^c b*c*a^[c-1])
;

\
: dldup ( f: a b -- a b a b) fover fover ;
: dlnip ( f: a b c d -- c d) frot fdrop frot fdrop ;
: dldrop ( f: a b -- ) fdrop fdrop ;

fvariable dlswap_temp
: dlswap ( f: a b c d -- c d a b)
frot dlswap_temp f! ( f: a c d)
frot ( f: c d a)
dlswap_temp f@ ;

fvariable dlover_temp1
fvariable dlover_temp2
: dlover ( f: a b c d -- a b c d a b)
dlswap ( f: a b c d -- c d a b)
dlover_temp2 f! dlover_temp1 f! ( f: c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: c d a b)
dlswap ( f: a b c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: a b c d a b)
;


: dltuck dlswap dlover ;

: dlvariable create 2 floats allot ;
: dl! ( dlvar -- ) ( f: f1 f2 -- ) dup float+ f! f! ;
: dl@ ( dlvar -- ) ( f: -- f1 f2) dup f@ float+ f@ ;

dlvariable dlrot_temp1
dlvariable dlrot_temp2
: dlrot ( dl: d1 d2 d3 -- d2 d3 d1)
dlrot_temp1 dl! ( dl: d1 d2)
dlswap ( dl: d2 d1)
dlrot_temp2 dl! ( dl: d2)
dlrot_temp1 dl@ ( dl: d2 d3)
dlrot_temp2 dl@ ( dl: d2 d3 d1)
;

\ dual number functions of dual number variables



: dlident ( f: a b -- a b) ;
: dlsin ( f: a b -- c d) fswap fdup fsin fswap fcos frot f* ;
: dlcos ( f: a b -- c d) fswap fdup fcos fswap fsin fnegate frot f* ;
: dlexp ( f: a b -- c d) fswap fdup fexp fswap fexp frot f* ;
: dlln ( f: a b -- c d) fswap fdup fln fswap 1/f frot f* ;
\ ..... add others





\ derivatives
variable func
: der 1e ' func ! func @ execute dl>eps ;

\ examples
1 [if]
: dlf() 1e 0e dl+ ; \ f(x) = x + 1
: dlf2() dldup dl* ; \ f2(x) = x^2


: dlf3() dldup dlf2() dlswap dlf() dl+ ; \ f3(x) = x^2 + x + 1
: der_f3() ( f: x -- y) 2e f* 1e f+ ; \ d/dx(f3) = 2*x + 1
cr 1e der dlf3() f. \ 3. ok
cr 1e der_f3() f. \ 3. ok
cr cr

: dlf4() dlf3() dlexp ; \ f4(x) = exp(x^2+x+1)
: der_f4() ( f: x -- y) \ d/dx(f4) = (2*x+1)*exp(x^2+x+1)
fdup 2e f* 1e f+ fswap
fdup fdup f* f+ 1e f+ fexp f*
;
cr 2e der dlf4() f. \ 5483.16579214229 ok, calculated at 2e
cr 2e der_f4() f. \ 5483.16579214229 ok
cr cr

: dlf5() 1/dl ;
: der_f5() ( f: x -- y) \ d/dx(f5) = -1/x^2
fdup f* 1/f fnegate
;

cr 2e der dlf5() f. \ -0.25 ok calculated at 2e
cr 2e der_f5() f. \ -0.25
cr cr


: dlf6() dldup dldup dl* dlswap dlsin dl+ 1/dl ; \ f6(x) = 1/(x^2+sin(x))
\ using the derivative calculated analytically: d/dx f6(x) = -(2*x+cos(x))/(x^2+sin(x))^2
: der_f6() ( f: x -- y) fdup fdup fdup f* fswap fsin f+ fdup f* 1/f
fswap fdup 2e f* fswap fcos f+ f* fnegate ;

cr 1e der dlf6() f. \ -0.749127330692909 ok calculated at x = 1
cr 1e der_f6() f. \ -0.749127330692909 ok
cr cr


: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x -- y) \ its deriv: d/dx(f7) = cos(x^2)-2*x^2*sin(x^2)
fdup f* ( f: x^2)
fdup fsincos ( f: x^2 s c )
-frot ( f: c x^2 s )
f* 2e f* ( f: c 2s*x^2)
f-
;

cr 1e der_f7() f. \ -1.14263966374765 ok calculated at 1e
cr 1e der dlf7() f. \ -1.14263966374765 ok
cr cr

cr 2e der_f7() f. \ 5.40077634159981 ok calculated at 2e
cr 2e der dlf7() f. \ 5.40077634159981 ok
cr cr

: dlf8() \ f8(x) = sin(sin(sin(x)))
dlsin dlsin dlsin
;
: der_f8() \ d/dx(f8) = (sin(sin(x)))'*cos(sin(sin(x)))
\ = (sin(x))'*cos(sin(x)*cos(sin(sin(x)))
\ = cos(x)*cos(sin(x))*cos(sin(sin(x)))

( f: x -- y)
fsincos ( f: s c)
fswap ( f: c s)
fdup ( f: c s s)
fcos ( f: c s cs)
fswap ( f: c cs s)
fsin ( f: c cs ss)
fcos ( f: c cs css)
f* f*
;

cr 2e der dlf8() f. \
cr 2e der_f8() f. \
cr cr

( f9 function)
: dl_f9() ( dl: x y -- z) \ f9(x,y) = x + 5*y + x*y
dlover dlover dl* dlswap 5e f>dl dl* dl+ dl+
;

: df9/dx ( f: x y -- z) \ df9/dx = 1 + y
fnip 1e f+
;

: df9/dy ( f: x y -- z) \ df9/dy = 5 + x
fdrop 5e f+
;

cr 2e f>dl_d 3e f>dl dl_f9() dl>eps f. \
cr 2e 3e df9/dx f. \
cr

cr 2e f>dl 3e f>dl_d dl_f9() dl>eps f. \
cr 2e 3e df9/dy f. \
cr cr

( f10 function)
: dl_f10() ( dl: x y -- z) \ f10(x,y) = exp(x + 5*y) * sin(x*y)
dlover dlover dl* dlsin ( dl: x y sxy)
dlrot dlrot ( dl: sxy x y)
5e f>dl dl* dl+ dlexp ( dl: sxy e^[x+5y])
dl*
;

: df10/dx ( f: x y -- z) \ df10/dx = exp(x+5y)*(sin(x*y)+y*cos(x*y))
fover fover 5e f* f+ fexp ( f: x y e^[x+5y])
frot frot ( f: e^[x+5y] x y)
ftuck ( f: e^[x+5y] y x y)
f* fsincos ( f: e^[x+5y] y sxy cxy)
frot f* f+ ( f: e^[x+5y] sxy+ycxy)
f*
;

: df10/dy ( f: x y -- z) \ df10/dy = exp(x+5y)*(5*sin(x*y) + x*cos(x*y))
fover fover 5e f* f+ fexp ( f: x y e^[x+5y])
frot frot ( f: e^[x+5y] x y)
fover f* ( f: e^[x+5y] x yx)
fsincos ( f: e^[x+5y] x sxy cxy)
frot f* fswap 5e f* f+ ( f: e^[x+5y] 5sxy+xcxy)
f*
;

cr 2e f>dl_d 3e f>dl dl_f10() dl>eps f. \
cr 2e 3e df10/dx f. \
cr

cr 2e f>dl 3e f>dl_d dl_f10() dl>eps f. \
cr 2e 3e df10/dy f. \
cr cr

cr cr cr

[then]

Here the code ends.

Ahmed
ahmed
2024-07-19 09:39:22 UTC
Permalink
And some other examples for gradient calculation ( 2 variables), with
different implementations (fstack juggling, using fvalues, using
floating point arrays (2 variants))

The code begins here:


cr .( gradient)
: gradient ( f: x y -- df/dx df/dy)
' func !
fover fover ( f: x y x y)
fswap f>dl_d ( f: x y y x 1)
frot f>dl ( f: x y x 1 y 0)
func @ execute fnip ( f: x y df/dx)
frot frot ( f: df/dx x y)
fswap f>dl ( f: df/dx y x 0)
frot f>dl_d ( f: df/dx x 0 y 1)
func @ execute fnip ( f: df/dx df/dy)
;

cr 2e 3e gradient dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using values)
0e fvalue x
0e fvalue y
: grad ( f: x y -- df/dx df/dy)
' dup
to y to x
x f>dl_d y f>dl execute fnip
x f>dl y f>dl_d execute fnip
;

( defined)
cr 2e 3e grad dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using arrays)

create xy 2 floats allot
xy 2 floats erase

: grad1 ( f: x y -- df/dx df/dy)
' dup
xy float+ f!
xy f!

xy f@ f>dl_d xy float+ f@ f>dl execute fnip
xy f@ f>dl xy float+ f@ f>dl_d execute fnip
;

( defined)
cr 2e 3e grad1 dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using arrays)

create xy2 2 floats allot
xy2 2 floats erase

: to_xy2 xy2 float+ f! xy2 f! ;
: df/dx xy2 f@ f>dl_d xy2 float+ f@ f>dl execute fnip ;
: df/dy xy2 f@ f>dl xy2 float+ f@ f>dl_d execute fnip ;

: grad2 ( f: x y -- df/dx df/dy)
' dup
to_xy2
df/dx
df/dy
;

( defined)
cr 2e 3e grad2 dl_f10() fswap cr f. cr f.
cr cr



cr cr cr


The code ends here

This code is a continuation of the code in my previous post.

Ahmed
minforth
2024-07-19 18:01:39 UTC
Permalink
I don't know if gforth has z: locals for complex numbers. If yes, they
could be used for dual fp-numbers as well to reduce code overhead and
improve readability.
ahmed
2024-07-19 19:19:24 UTC
Permalink
Post by minforth
I don't know if gforth has z: locals for complex numbers. If yes, they
could be used for dual fp-numbers as well to reduce code overhead and
improve readability.
Yes, z: locals exist in gforth, but I haven't used them. I didn't want
to mix dual and complex numbers.
I have completed the code with the elementary functions and their
derivatives (exp, ln, trig, inverse trig, hyperbolic, inverse
hyperbolic, ...).
The code is not optimized and can be improved. I've done it just as a
proof of concept.

I added examples for functions with multiple variables (3 variables x y
and z)

Perhaps one should separate the dual numbers, the dual-number-valued
functions of dual-number variables, ..., and the examples.


Here is the code:



: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ; \ real to dual value
: f>dl_d ( f: a -- a 1) 1e ; \ real to dual value of the variable to differentiate with respect to
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;


: dl. ( f: a b -- ) fswap f. ." + eps " f. ;
: dl(,). ( f: a b -- ) ." (" fswap f. ." , " f. ." )" ;

: dl+ ( f: a b c d -- a+c b+d)
frot ( f: a c d b)
f+ ( f: a c b+d)
-frot ( f: b+d a c)
f+ ( f: b+d a+c)
fswap ( f: a+c b+d) ;

: dl- ( f: a b c d -- a-c b-d)
frot ( f: a c d b)
fswap f- ( f: a c b-d)
-frot ( f: b-d a c)
f- ( f: b-d a-c)
fswap ( f: a-c b-d) ;


fvariable b*c
: dl* ( f: a b c d -- a*c a*d+b*c)
-frot ( f: a d b c)
ftuck f* ( f: a d c b*c)
b*c f! ( f: a d c)
frot ( f: d c a)
ftuck ( f: d a c a)
f* ( f: d a a*c)
-frot ( f: a*c d a)
f* b*c f@ f+ ( f: a*c a*d+b*c)
;

: 1/dl ( f: a b -- 1/a -b*1/a^2)
fswap 1/f ( f: b 1/a)
ftuck ( f: 1/a b 1/a)
fdup f* ( f: 1/a b 1/a^2)
f* fnegate ( f: 1/a -b/a^2)
;

: dl/ ( f: a b c d -- a/c [b*c-a*d]/c^2)
1/dl dl*
;


fvariable dl^f_c
: dl^f ( f: a b c -- a^c b*c*a^[c-1])
fdup dl^f_c f! ( f: a b c)
frot ftuck ( f: b a c a)
fswap 1e f- f** ( f: b a a^[c-1])
ftuck f* ( f: b a^[c-1] a^c)
-frot f* ( f: a^c b*a^[c-1])
dl^f_c f@ f* ( f: a^c b*c*a^[c-1])
;


\
: dldup ( f: a b -- a b a b) fover fover ;
: dlnip ( f: a b c d -- c d) frot fdrop frot fdrop ;
: dldrop ( f: a b -- ) fdrop fdrop ;

fvariable dlswap_temp
: dlswap ( f: a b c d -- c d a b)
frot dlswap_temp f! ( f: a c d)
frot ( f: c d a)
dlswap_temp f@ ;

fvariable dlover_temp1
fvariable dlover_temp2
: dlover ( f: a b c d -- a b c d a b)
dlswap ( f: a b c d -- c d a b)
dlover_temp2 f! dlover_temp1 f! ( f: c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: c d a b)
dlswap ( f: a b c d)
dlover_temp1 f@ dlover_temp2 f@ ( f: a b c d a b)
;


: dltuck dlswap dlover ;

: dlvariable create 2 floats allot ;
: dl! ( dlvar -- ) ( f: f1 f2 -- ) dup float+ f! f! ;
: dl@ ( dlvar -- ) ( f: -- f1 f2) dup f@ float+ f@ ;

dlvariable dlrot_temp1
dlvariable dlrot_temp2
: dlrot ( dl: d1 d2 d3 -- d2 d3 d1)
dlrot_temp1 dl! ( dl: d1 d2)
dlswap ( dl: d2 d1)
dlrot_temp2 dl! ( dl: d2)
dlrot_temp1 dl@ ( dl: d2 d3)
dlrot_temp2 dl@ ( dl: d2 d3 d1)
;

\ dual number functions of dual number variables

: dlident ( f: a b -- a b) ;

: dlexp ( f: a b -- c d) fswap fdup fexp fswap fexp frot f* ;
: dlln ( f: a b -- c d) fswap fdup fln fswap 1/f frot f* ;
: dl^ ( dl: a b -- a^b) \ a^b = exp(b*ln(a))
dlswap dlln dl* dlexp
;
: dlsqrt ( dl: x --y) 0.5e f>dl dl^ ;

: dlsin ( f: a b -- c d) fswap fdup fsin fswap fcos frot f* ;
: dlcos ( f: a b -- c d) fswap fdup fcos fswap fsin fnegate frot f* ;
: dltan ( dl: x -- y) dldup dlsin dlswap dlcos dl/ ;
: dlcot ( dl: x -- y) dltan 1/dl ;

: dlsinh ( dl: x -- y) dldup dlexp dlswap -1e f>dl dl* dlexp dl- 2e f>dl dl/ ;
: dlcosh ( dl: x -- y) dldup dlexp dlswap -1e f>dl dl* dlexp dl+ 2e f>dl dl/ ;
: dltanh ( dl: x -- y) dldup dlsinh dlswap dlcosh dl/ ;
: dlcoth ( dl: x -- y) dltanh 1/dl ;

: dlasin ( f: a b -- c d) fswap fdup fasin fswap fdup f* 1e fswap f- fsqrt 1/f frot f* ;
: dlacos ( f: a b -- c d) fswap fdup facos fswap fdup f* 1e fswap f- fsqrt 1/f fnegate frot f* ;
: dlatan ( f: a b -- c d) fswap fdup fatan fswap fdup f* 1e f+ 1/f frot f* ;

: dlasinh ( f: a b -- c d) fswap fdup fasinh fswap fdup f* 1e f+ fsqrt 1/f frot f* ;
: dlacosh ( f: a b -- c d) fswap fdup facosh fswap fdup f* 1e f- fsqrt 1/f frot f* ;
: dlatanh ( f: a b -- c d) fswap fdup fatanh fswap fdup f* 1e fswap f- 1/f frot f* ;


\ ..... add others


\ derivatives
variable func
: der 1e ' func ! func @ execute dl>eps ;

\ examples
1 [if]
: dlf() 1e 0e dl+ ; \ f(x) = x + 1
: dlf2() dldup dl* ; \ f2(x) = x^2


: dlf3() dldup dlf2() dlswap dlf() dl+ ; \ f3(x) = x^2 + x + 1
: der_f3() ( f: x -- y) 2e f* 1e f+ ; \ d/dx(f3) = 2*x + 1
cr 1e der dlf3() f. \ 3. ok
cr 1e der_f3() f. \ 3. ok
cr cr

: dlf4() dlf3() dlexp ; \ f4(x) = exp(x^2+x+1)
: der_f4() ( f: x -- y) \ d/dx(f4) = (2*x+1)*exp(x^2+x+1)
fdup 2e f* 1e f+ fswap
fdup fdup f* f+ 1e f+ fexp f*
;
cr 2e der dlf4() f. \ 5483.16579214229 ok, calculated at 2e
cr 2e der_f4() f. \ 5483.16579214229 ok
cr cr

: dlf5() 1/dl ;
: der_f5() ( f: x -- y) \ d/dx(f5) = -1/x^2
fdup f* 1/f fnegate
;

cr 2e der dlf5() f. \ -0.25 ok calculated at 2e
cr 2e der_f5() f. \ -0.25
cr cr


: dlf6() dldup dldup dl* dlswap dlsin dl+ 1/dl ; \ f6(x) = 1/(x^2+sin(x))
\ using the derivative calculated analytically: d/dx f6(x) = -(2*x+cos(x))/(x^2+sin(x))^2
: der_f6() ( f: x -- y) fdup fdup fdup f* fswap fsin f+ fdup f* 1/f
fswap fdup 2e f* fswap fcos f+ f* fnegate ;

cr 1e der dlf6() f. \ -0.749127330692909 ok calculated at x = 1
cr 1e der_f6() f. \ -0.749127330692909 ok
cr cr


: dlf7() dldup dldup dl* dlcos dl* ; \ f7(x) = x*cos(x^2),
: der_f7() ( f: x -- y) \ its deriv: d/dx(f7) = cos(x^2)-2*x^2*sin(x^2)
fdup f* ( f: x^2)
fdup fsincos ( f: x^2 s c )
-frot ( f: c x^2 s )
f* 2e f* ( f: c 2s*x^2)
f-
;

cr 1e der_f7() f. \ -1.14263966374765 ok calculated at 1e
cr 1e der dlf7() f. \ -1.14263966374765 ok
cr cr

cr 2e der_f7() f. \ 5.40077634159981 ok calculated at 2e
cr 2e der dlf7() f. \ 5.40077634159981 ok
cr cr

: dlf8() \ f8(x) = sin(sin(sin(x)))
dlsin dlsin dlsin
;
: der_f8() \ d/dx(f8) = (sin(sin(x)))'*cos(sin(sin(x)))
\ = (sin(x))'*cos(sin(x)*cos(sin(sin(x)))
\ = cos(x)*cos(sin(x))*cos(sin(sin(x)))

( f: x -- y)
fsincos ( f: s c)
fswap ( f: c s)
fdup ( f: c s s)
fcos ( f: c s cs)
fswap ( f: c cs s)
fsin ( f: c cs ss)
fcos ( f: c cs css)
f* f*
;

cr 2e der dlf8() f. \
cr 2e der_f8() f. \
cr cr

( f9 function)
: dl_f9() ( dl: x y -- z) \ f9(x,y) = x + 5*y + x*y
dlover dlover dl* dlswap 5e f>dl dl* dl+ dl+
;

: df9/dx ( f: x y -- z) \ df9/dx = 1 + y
fnip 1e f+
;

: df9/dy ( f: x y -- z) \ df9/dy = 5 + x
fdrop 5e f+
;

cr 2e f>dl_d 3e f>dl dl_f9() dl>eps f. \
cr 2e 3e df9/dx f. \
cr

cr 2e f>dl 3e f>dl_d dl_f9() dl>eps f. \
cr 2e 3e df9/dy f. \
cr cr

( f10 function)
: dl_f10() ( dl: x y -- z) \ f10(x,y) = exp(x + 5*y) * sin(x*y)
dlover dlover dl* dlsin ( dl: x y sxy)
dlrot dlrot ( dl: sxy x y)
5e f>dl dl* dl+ dlexp ( dl: sxy e^[x+5y])
dl*
;

: df10/dx ( f: x y -- z) \ df10/dx = exp(x+5y)*(sin(x*y)+y*cos(x*y))
fover fover 5e f* f+ fexp ( f: x y e^[x+5y])
frot frot ( f: e^[x+5y] x y)
ftuck ( f: e^[x+5y] y x y)
f* fsincos ( f: e^[x+5y] y sxy cxy)
frot f* f+ ( f: e^[x+5y] sxy+ycxy)
f*
;

: df10/dy ( f: x y -- z) \ df10/dy = exp(x+5y)*(5*sin(x*y) + x*cos(x*y))
fover fover 5e f* f+ fexp ( f: x y e^[x+5y])
frot frot ( f: e^[x+5y] x y)
fover f* ( f: e^[x+5y] x yx)
fsincos ( f: e^[x+5y] x sxy cxy)
frot f* fswap 5e f* f+ ( f: e^[x+5y] 5sxy+xcxy)
f*
;

cr 2e f>dl_d 3e f>dl dl_f10() dl>eps f. \
cr 2e 3e df10/dx f. \
cr

cr 2e f>dl 3e f>dl_d dl_f10() dl>eps f. \
cr 2e 3e df10/dy f. \
cr cr

cr .( gradient)
: gradient ( f: x y -- df/dx df/dy)
' func !
fover fover ( f: x y x y)
fswap f>dl_d ( f: x y y x 1)
frot f>dl ( f: x y x 1 y 0)
func @ execute fnip ( f: x y df/dx)
frot frot ( f: df/dx x y)
fswap f>dl ( f: df/dx y x 0)
frot f>dl_d ( f: df/dx x 0 y 1)
func @ execute fnip ( f: df/dx df/dy)
;

cr 2e 3e gradient dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using values)
0e fvalue x
0e fvalue y
: grad ( f: x y -- df/dx df/dy)
' dup
to y to x
x f>dl_d y f>dl execute fnip
x f>dl y f>dl_d execute fnip
;

( defined)
cr 2e 3e grad dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using arrays)

create xy 2 floats allot
xy 2 floats erase

: grad1 ( f: x y -- df/dx df/dy)
' dup
xy float+ f!
xy f!

xy f@ f>dl_d xy float+ f@ f>dl execute fnip
xy f@ f>dl xy float+ f@ f>dl_d execute fnip
;

( defined)
cr 2e 3e grad1 dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using arrays)

create xy2 2 floats allot
xy2 2 floats erase

: to_xy2 xy2 float+ f! xy2 f! ;
: df/dx xy2 f@ f>dl_d xy2 float+ f@ f>dl execute fnip ;
: df/dy xy2 f@ f>dl xy2 float+ f@ f>dl_d execute fnip ;

: grad2 ( f: x y -- df/dx df/dy)
' dup
to_xy2
df/dx
df/dy
;

( defined)
cr 2e 3e grad2 dl_f10() fswap cr f. cr f.
cr cr

cr .( gradient using arrays, 3 variables)

: dl_f11() ( dl : x y z -- t) \ f11(x,y,z) = exp(-x^2)*(cos(2y)+sin(z))
dlsin dlswap 2e f>dl dl* dlcos dl+ ( dl: x cos[2y]+sin[z])
dlswap dldup dl* -1e f>dl dl* dlexp dl*
;

\ for verif
: df11/dx ( f: x y z -- df11/dx) \ df11/dx = -2*x*exp(-x^2)*(cos(2y)+sin(z))
fsin fswap 2e f* fcos f+ fover f* -2e f* ( f: x -2x*[cos[2y]+sin[z]])
fswap fdup f* -1e f* fexp f*
;

: df11/dy ( f: x y z -- df11/dy) \ df11/dy = exp(-x^2)*(-2)*sin(2*y)
fdrop 2e f* fsin -2e f* fswap fdup f* -1e f* fexp f*
;

: df11/dz ( f: x y z -- df11/dz) \ df11/dz = exp(-x^2)*cos(z)
fnip fcos fswap fdup f* -1e f* fexp f*
;

create xyz 3 floats allot
xyz 3 floats erase

: to_xyz xyz 2 floats + f! xyz float+ f! xyz f! ;
: df/dx xyz f@ f>dl_d xyz float+ f@ f>dl xyz 2 floats + f@ f>dl
execute fnip ;
: df/dy xyz f@ f>dl xyz float+ f@ f>dl_d xyz 2 floats + f@ f>dl
execute fnip ;
: df/dz xyz f@ f>dl xyz float+ f@ f>dl xyz 2 floats + f@ f>dl_d
execute fnip ;

: grad2 ( f: x y z -- df/dx df/dy df/dz)
' dup dup
to_xyz
df/dx
df/dy
df/dz
;

( defined)
cr 1e 2e 3e grad2 dl_f11() frot cr f. fswap cr f. cr f.
cr cr
cr 1e 2e 3e df11/dx f.
cr 1e 2e 3e df11/dy f.
cr 1e 2e 3e df11/dz f.
cr cr

: dl_f12() ( dl: x -- 2^x) 2e f>dl dlswap dl^ ;
: der_f12() ( f: x -- d[2^x]/dx) \ df12/dx = ln(2) * 2^x
2e fswap f** 2e fln f*
;

cr 2e der dl_f12() f.
cr 2e der_f12() f.
cr
cr 1e der dl_f12() f.
cr 1e der_f12() f.
cr cr

: dl_f13() ( dl: x -- x^2) 2e f>dl dl^ ;
: der_f13() ( f: x -- d[x^2]/dx) \ df13/dx = 2*x
2e f*
;

cr 2e der dl_f13() f.
cr 2e der_f13() f.
cr
cr 1e der dl_f13() f.
cr 1e der_f13() f.
cr cr



cr cr cr

[then]

Here the code ends.

Ahmed
minforth
2024-07-20 06:49:18 UTC
Post by ahmed
Post by minforth
I don't know if gforth has z: locals for complex numbers. If yes, they
could be used for dual fp-numbers as well to reduce code overhead and
improve readability.
Yes, z: locals exist in gforth but I haven't used them. I didn't want to
mix dual and complex numbers.
I have completed the code with the elementary functions and their
derivatives (exp, ln, trigs, inverse trig, hyperbolic, inverse
hyperbolic, ...).
The code is not optimized and can be improved. I've done it just as a
proof of concept.
Thanks for sharing this concept. It is often surprising how few lines
of Forth code it takes to express seemingly complex topics.

Automatic differentiation, especially in its reverse form, is a
fundamental algorithm for e.g. machine learning, where it processes huge
sparse matrices. Forth is lightyears away from the domains of Jax and
TensorFlow. Nevertheless, I believe that if Forth makes any progress
at all, it should be in the direction of better data handling. You can
do only so much with one or two stacks ...
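The reverse mode mentioned here can be sketched in remarkably few lines outside Forth as well. A minimal Python sketch (the Node class and helper names are hypothetical, for illustration only; real frameworks like Jax and TensorFlow are far more elaborate): each node records its parents and the local partial derivatives, and a single backward sweep accumulates adjoints, i.e. back-propagation.

```python
import math

class Node:
    def __init__(self, value, parents=()):
        self.value = value        # primal value
        self.parents = parents    # [(parent, d_output/d_parent), ...]
        self.grad = 0.0           # adjoint, filled in by backward()

    def __add__(self, other):
        return Node(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Node(self.value * other.value,
                    [(self, other.value), (other, self.value)])

def sin(x):
    return Node(math.sin(x.value), [(x, math.cos(x.value))])

def backward(out):
    # topological order by depth-first search; fine for this tiny example
    order, seen = [], set()
    def visit(n):
        if id(n) not in seen:
            seen.add(id(n))
            for p, _ in n.parents:
                visit(p)
            order.append(n)
    visit(out)
    out.grad = 1.0
    for n in reversed(order):      # one sweep, accumulating adjoints
        for p, local in n.parents:
            p.grad += n.grad * local

# f(x) = x * sin(x); f'(x) = sin(x) + x*cos(x)
x = Node(1.0)
y = x * sin(x)
backward(y)
print(x.grad)   # = sin(1) + cos(1)
```

Note that x feeds two nodes (the product and the sine), so its adjoint is accumulated from both paths; that accumulation is the essential difference from the forward (dual-number) mode used in the Forth code of this thread.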
mhx
2024-07-20 08:37:54 UTC
Post by minforth
Forth is lightyears away from the domains of Jax and
TensorFlow. Nevertheless, I believe that if Forth makes any progress
at all, it should be in the direction of better data handling. You can
do only so much with one or two stacks ...
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.

-marcel
minforth
2024-07-20 09:03:38 UTC
That would mean making a GPU core behave like, and be programmable
like, a general-purpose CPU core. I fear generated heat would be the
limit and thus the end of that idea. Modern GPUs have thousands of
cores.
Paul Rubin
2024-07-20 18:12:31 UTC
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
mhx
2024-07-20 20:55:47 UTC
Post by Paul Rubin
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
The transputers were a better fit. I predict there time will come.

-marcel
dxf
2024-07-21 03:42:56 UTC
Post by mhx
Post by Paul Rubin
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
The transputers were a better fit. I predict there time will come.
Not as specific as Mark 13:30 - but hey, what can one expect from a dabbler ;)
a***@spenarnc.xs4all.nl
2024-07-21 08:40:16 UTC
Post by mhx
Post by Paul Rubin
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
The transputers were a better fit. I predict there time will come.
S/there/their/

At that time I posed:
" it make no sense to operate cpu's in parallel, unless the CPU's are
as powerful as they can get."
Smoke that, GA144 fans.

The arrival of a 64-bit T80000 with GHz clock speeds and eight 1-GHz
links is long overdue. With AI assistance the Chinese will be able to
pull it off within a short time.
I plan to open a mirror from my github at gitcode.com .
Time to learn Chinese.

Radar signal processing was a forte of the transputer at the time.
So support from the military is not out of the question.
Post by mhx
-marcel
Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat purring. - the Wise from Antrim -
mhx
2024-07-21 17:31:16 UTC
Post by a***@spenarnc.xs4all.nl
Post by mhx
Post by Paul Rubin
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
The transputers were a better fit. I predict there time will come.
S/there/their/
I think that "there" works too. Like Mark 13:30, make sure predictions
are multi-interpretable.
Post by a***@spenarnc.xs4all.nl
" it make no sense to operate cpu's in parallel, unless the CPU's are
as powerful as they can get."
[..]

The transputers were definitely better than a pc XT at the time, and a
T800 could compete with a 386+387. Given a few thousand of them, and
the right problem, they will be interesting.

As a top-of-the-line CPU burns 250 Watts, 2048 of them need 512 kW or
159 Euros/hour.

-marcel
minforth
2024-07-22 08:00:53 UTC
Post by mhx
Post by Paul Rubin
Post by mhx
Really? A three-stack Forth with CSP hardware on each GPU core would
be quite a good fit.
That's sort of what the GA144 is...
The transputers were a better fit. I predict there time will come.
I predict that the time of systolic arrays will come. The nodes will not
be transputers, but they will share similar characteristics.

Today's AI is based on neural networks, but mathematically described in
terms of linear algebra. Matrix descriptions are also used for deep
learning and back-propagation. So there is a split between network
topology and matrix topology (e.g. to keep Bayesian correlations of
parameters). This has a negative impact on memory footprint and power
consumption.

By far most correlations occur between neighbouring parameters (which is
natural in artificial or biological networks). This is the reason why
most sparse AI matrices show significant correlations only near their
diagonals. In other words, the biggest part of such matrices is ballast
because it does not contribute to the solution.

Systolic arrays describe networks much better. In addition they provide
an efficient way to store data directly in the nodes, eliminating the
need to shuffle data around to storage devices. I guess the math is just
not there yet. Time will tell.
ahmed
2024-07-22 10:12:21 UTC
I organized the work a bit.
Here are 3 programs:
- dual_numbers.fs: contains the operations and elementary functions on dual numbers
- autodiff.fs: includes dual_numbers.fs; defines der, gradient, jacobian
- ad_tests.fs: includes autodiff.fs; contains some examples:
1 func of 1 var ---> der
1 func of 3 vars ---> gradient
2 funcs of 3 vars ---> jacobian
3 funcs of 3 vars ---> jacobian
4 funcs of 2 vars ---> jacobian
4 funcs of 3 vars ---> jacobian

Here begins the code for dual_numbers.fs
: -frot frot frot ;
: f>dl ( f: a -- a 0) 0e ; \ real to dual value
: f>dl_d ( f: a -- a 1) 1e ; \ real to dual value with respect to which we differentiate
: dl>rl ( f: a b -- a) fdrop ;
: dl>eps ( f: a b -- b) fnip ;


: dl. ( f: a b -- ) fswap f. ." + eps " f. ;
: dl(,). ( f: a b -- ) ." (" fswap f. ." , " f. ." )" ;

: dl+ ( f: a b c d -- a+c b+d)
frot ( f: a c d b)
f+ ( f: a c b+d)
-frot ( f: b+d a c)
f+ ( f: b+d a+c)
fswap ( f: a+c b+d) ;

: dl- ( f: a b c d -- a-c b-d)
frot ( f: a c d b)
fswap f- ( f: a c b-d)
-frot ( f: b-d a c)
f- ( f: b-d a-c)
fswap ( f: a-c b-d) ;


fvariable b*c
: dl* ( f: a b c d -- a*c a*d+b*c)
-frot ( f: a d b c)
ftuck f* ( f: a d c b*c)
b*c f! ( f: a d c)
frot ( f: d c a)
ftuck ( f: d a c a)
f* ( f: d a a*c)
-frot ( f: a*c d a)
f* b*c f@ f+ ( f: a*c a*d+b*c)
;

: 1/dl ( f: a b -- 1/a -b*1/a^2)
fswap 1/f ( f: b 1/a)
ftuck ( f: 1/a b 1/a)
fdup f* ( f: 1/a b 1/a^2)
f* fnegate ( f: 1/a -b/a^2)
;

: dl/ ( f: a b c d -- a/c [b*c-a*d]/c^2)
1/dl dl*
;


: dl^f ( f: a b c -- a^c c*a^[c-1]*b) \ dual base, real exponent: chain rule
frot fover 1e f- ( f: b c a c-1)
fover fswap f** ( f: b c a a^[c-1])
ftuck f* ( f: b c a^[c-1] a^c)
fswap frot f* ( f: b a^c c*a^[c-1])
frot f* ( f: a^c c*a^[c-1]*b)
;

: dlnegate ( f: a b -- c d)
fnegate fswap fnegate fswap
;
Here, dual_numbers.fs finishes
--------------------------------
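For reference, the dual-number rules implemented in dual_numbers.fs above can be transcribed into Python (a hypothetical Dual class, not part of the Forth sources): a dual a + b*eps with eps^2 = 0 carries a value and its derivative in lockstep, so seeding b = 1 (the f>dl_d word) yields the derivative in the eps part.

```python
import math

class Dual:
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b     # value, eps part (derivative)

    def __add__(self, o):                                            # dl+
        return Dual(self.a + o.a, self.b + o.b)
    def __sub__(self, o):                                            # dl-
        return Dual(self.a - o.a, self.b - o.b)
    def __mul__(self, o):                                            # dl*
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    def inv(self):                                                   # 1/dl
        return Dual(1.0 / self.a, -self.b / (self.a * self.a))
    def __truediv__(self, o):                                        # dl/
        return self * o.inv()
    def powf(self, c):   # dual^real: (a + b eps)^c = a^c + c*a^(c-1)*b eps
        return Dual(self.a ** c, c * self.a ** (c - 1) * self.b)

def dsin(x):             # dlsin: sin carries cos as local derivative
    return Dual(math.sin(x.a), math.cos(x.a) * x.b)

# derivative of f(x) = x*sin(x) at x = 1: seed the eps part with 1 (f>dl_d)
x = Dual(1.0, 1.0)
y = x * dsin(x)
print(y.a, y.b)          # f(1) and f'(1) = sin(1) + cos(1)
```

This is the same forward-mode trick the Forth code plays with two floats per dual on the FP stack; the eps slot simply rides along through every operation.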
Here, autodiff.fs begins
include dual_numbers.fs

\ -----------
\ vector valued functions of several variables

: funcarray ( n -- )
create dup , cells allot
does> dup @
;

: funcarray! ( func address size i --)
nip cells + cell+ !
;

: funcarray@ ( address size i -- func)
nip cells + cell+ @
;



\ derivatives
\ derivative 1 func of 1 var
variable func
: der 1e ' func ! func @ execute dl>eps ;


\ gradient and jacobian
: dlarray ( n --) \ does part ( -- address size)
create dup , 2* floats allot does> dup @ ;

: dlarray! ( address size i --) ( dl: dl --)
nip 2* floats + cell+ dl! ;

: dlarray@ ( address size i --) ( dl: -- dl)
nip 2* floats + cell+ dl@ ;

variable func
: (gradient) ( address size --)
' func !
dup 0 do
dup 0 do
2dup i dlarray@
i j = if
0e 1e dl+
then
loop
func @ execute dl>eps
loop
2drop
;

: ([gradient]) ( address size --)
dup 0 do
dup 0 do
2dup i dlarray@
i j = if
0e 1e dl+
then
loop
func @ execute dl>eps
loop
2drop
;

: (.gradient)
' func !
dup 0 do
dup 0 do
2dup i dlarray@
i j = if
0e 1e dl+
then
loop
func @ execute dl>eps
cr f.
loop
2drop
;

: ([.gradient])
dup 0 do
dup 0 do
2dup i dlarray@
i j = if
0e 1e dl+
then
loop
func @ execute dl>eps
f.
loop
2drop
;


\ gradient
: .gradient 0 funcarray@ func ! ([.gradient]) ;
: gradient 0 funcarray@ func ! ([gradient]) ;

\ jacobian
2variable (funcarray)
2variable (dlarray)

: .jacobian
(funcarray) 2!
(dlarray) 2!
(funcarray) 2@ nip 0 do
(funcarray) 2@ i funcarray@ func !
(dlarray) 2@ ([.gradient])
cr
loop
;

: jacobian
(funcarray) 2!
(dlarray) 2!
(funcarray) 2@ nip 0 do
(funcarray) 2@ i funcarray@ func !
(dlarray) 2@ ([gradient])
loop
;

Here, autodiff.fs finishes
-----------------------------
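The seeding pattern of (gradient) above -- set the eps part of exactly one variable to 1 per pass and read the eps part of the result -- can be mirrored in a few lines of Python (hypothetical Dual/gradient/jacobian names, illustrative only; one forward pass per input variable, one gradient row per function):

```python
class Dual:
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b                 # value, eps part
    def __add__(self, o):
        return Dual(self.a + o.a, self.b + o.b)
    def __mul__(self, o):
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

def gradient(f, xs):
    grads = []
    for i in range(len(xs)):
        # seed eps = 1 on variable i only, mirroring the i j = test
        duals = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grads.append(f(*duals).b)             # eps part = df/dx_i
    return grads

def jacobian(fs, xs):
    return [gradient(f, xs) for f in fs]      # one row per function

# f(x,y,z) = x + y*z at (1,2,3): gradient is (1, z, y) = (1, 3, 2)
print(gradient(lambda x, y, z: x + y * z, [1.0, 2.0, 3.0]))
```

Like the Forth version, this costs one full function evaluation per input variable, which is why forward mode suits few inputs/many outputs and reverse mode the opposite.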
Here, the ad_tests.fs begins

include autodiff.fs

\ 1 func of 1 var
: dl_f() ( dl: x -- y) dldup dlsin dl* ; \ f(x) = x*sin(x)
\ for x = 1, df/fx = ?
cr 1e der dl_f() f.
: df/dx ( f: x -- y) fdup fdup fcos f* fswap fsin f+ ;
cr 1e df/dx f.
cr cr

\ 3 variables x, y and z in the array xyz
3 dlarray xyz

\ x = 1, y = 2 and z = 3
1e 0e xyz 0 dlarray!
2e 0e xyz 1 dlarray!
3e 0e xyz 2 dlarray!

\ 1 func of 3 vars
: dl_f() ( dl: x y z -- x + y * z) dl* dl+ ;

cr xyz (gradient) dl_f() frot f. fswap f. f.
cr xyz (.gradient) dl_f()
cr


\ -----------
\ vector valued functions
\ 1 func of 3 vars
1 funcarray f()
' dl_f() f() 0 funcarray!

cr xyz f() .gradient
\ cr xyz f() gradient \ the gradient values are in fpstack ( f: df/dx df/dy df/dz)


\ 2 func of 3 var
: dl_f1() ( dl: x y z -- x+y*z) dl* dl+ ;
: dl_f2() ( dl: x y z -- x*y+z) dlrot dlrot dl* dl+ ;

2 funcarray f2()
' dl_f1() f2() 0 funcarray!
' dl_f2() f2() 1 funcarray!


cr xyz f2() .jacobian
\ cr xyz f2() jacobian \ the jacobian values are in fpstack ( f: df1/dx df1/dy df1/dz df2/dx df2/dy df2/dz )


\ -----------
\ 3 func of 3 var


: dl_f1() ( dl: x y z -- x+y*z) dl* dl+ ;
: dl_f2() ( dl: x y z -- x*y+z) dlrot dlrot dl* dl+ ;
: dl_f3() ( dl: x y z -- y+z*x) dlrot dl* dl+ ;

3 funcarray f3()
' dl_f1() f3() 0 funcarray!
' dl_f2() f3() 1 funcarray!
' dl_f3() f3() 2 funcarray!

cr xyz f3() .jacobian
\ cr xyz f3() jacobian \ the jacobian values are in fpstack ( f: df1/dx df1/dy df1/dz ... df3/dx df3/dy df3/dz )


\ -----------
\ 4 func of 2 var

2 dlarray xy
5e f>dl xy 0 dlarray!
6e f>dl xy 1 dlarray!

: dl_f1() ( dl: x y -- x+2*y) 2e f>dl dl* dl+ ;
: dl_f2() ( dl: x y -- x*y) dl* ;
: dl_f3() ( dl: x y -- x^y) dl^ ;
: dl_f4() ( dl: x y -- y^x) dlswap dl^ ;

4 funcarray f4()
' dl_f1() f4() 0 funcarray!
' dl_f2() f4() 1 funcarray!
' dl_f3() f4() 2 funcarray!
' dl_f4() f4() 3 funcarray!

cr xy f4() .jacobian
\ cr xy f4() jacobian \ the jacobian values are in fpstack ( f: df1/dx df1/dy ... df4/dx df4/dy )



\ 4 func of 3 var
: dl_f1() ( dl: x y z -- x+y*z) dl* dl+ ;
: dl_f2() ( dl: x y z -- x*y+z) dlrot dlrot dl* dl+ ;
: dl_f3() ( dl: x y z -- y+z*x) dlrot dl* dl+ ;
: dl_f4() ( dl: x y z -- x*y*z) dl* dl* ;

4 funcarray f4()
' dl_f1() f4() 0 funcarray!
' dl_f2() f4() 1 funcarray!
' dl_f3() f4() 2 funcarray!
' dl_f4() f4() 3 funcarray!

cr xyz f4() .jacobian
\ cr xyz f4() jacobian \ the jacobian values are in fpstack ( f: df1/dx df1/dy df1/dz ... df4/dx df4/dy df4/dz )

cr cr .( done) cr cr

Here, ad_tests.fs finishes

Ahmed
minforth
2024-07-24 10:25:48 UTC
Thank you! In addition, I found a nice introduction to AD here:
https://mostafa-samir.github.io/auto-diff-pt2/
ahmed
2024-07-25 05:56:09 UTC
Thanks,
I found this book on the web

Evaluating Derivatives
Principles and Techniques of Algorithmic Differentiation

by Andreas Griewank

Ahmed
ahmed
2024-07-26 12:27:06 UTC
There is another problem with dl^.

The definition of dl^ given previously has problems:

a b dl^ doesn't work when:

- a is a dual number with a negative real part (for example: a = -5.6 + eps c, c in R)
and (at the same time)
- b is a dual number whose real part is an even integer (for example: b = 8 + eps d, d in R)

For floats, we have -5.6e 8e f** f. 967173.11574016 ok
But with dl^, we get: -5.6e f>dl 8e f>dl dl^ dl. NaN+ eps NaN ok


Here is the new definition of dl^, it corrects this problem.

: is_even_integer? ( f: a --) ( -- f)
fdup f>s dup s>f f=
swap 2 mod 0= and
;

: dl^ ( dl: a b -- a^b) \ a^b = exp(b*ln(a))
dldup dl>rl f0= if dldrop dldrop 1e 0e exit then
dldup dl>rl is_even_integer? if
dlswap
fswap fabs fswap
dlln dl* dlexp exit
then
dlover dl>rl f0<> if dlswap dlln dl* dlexp exit then
dldrop dldrop 0e 0e
;
For the same example, it gives
-5.6e f>dl 8e f>dl dl^ dl. 967173.11574016 + eps 0.
which is ok.
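As a cross-check of the even-integer case: for a < 0 and even integer n, a^n is well defined, and the eps part should carry d(a^n)/da = n*a^(n-1), which is negative here (n-1 is odd). A small Python sketch comparing that closed form against a central finite difference (illustrative only, names hypothetical):

```python
# derivative of a^n for even integer n and negative base a
def pow_deriv(a, n):
    return n * a ** (n - 1)

a, n, h = -5.6, 8, 1e-6
fd = ((a + h) ** n - (a - h) ** n) / (2 * h)   # central finite difference
print(a ** n)           # close to 967173.11574016, the f** value above
print(pow_deriv(a, n))  # negative: 8 * (-5.6)^7
print(abs(fd - pow_deriv(a, n)) < 1.0)
```

So while exp(b*ln|a|) recovers the correct value a^n, the sign of the eps part still needs care, which is exactly what the follow-up definition below addresses.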

Ahmed
ahmed
2024-07-26 14:07:03 UTC
In fact, dl^ has too many use cases.
In the last post I've given a definition for dl^; it works for
calculating the values (the real part) but fails to get the eps part
(which gives the derivative).
Here is a new definition that corrects this problem:



: is_even_integer? ( f: a --) ( -- f)
fdup f>s dup s>f f=
swap 2 mod 0= and
;

: dl^ ( dl: a b -- a^b) \ a^b = exp(b*ln(a))
dldup dl>rl is_even_integer? 0=
dlover dl>rl f0< and if
dldrop dldrop
cr ." Can't be defined for Real numbers!"
exit
then

dldup dl>rl f0= if dldrop dldrop 1e 0e exit then

dldup dl>eps f0=
dlover f0= and
f0< and
dldup dl>rl is_even_integer? and
if
dlswap
dlnegate
dlln dl* dlexp
fdrop 0e
exit
then


dlover f0=
f0< and
dldup dl>rl is_even_integer? and
if
dlover
dlnegate
dlln dl* dlexp
dlswap dl>rl fln fnip
exit
then

dlover f0<>
f0< and
dldup dl>rl is_even_integer? and
if
dlswap
dlnegate
dlln dl* dlexp
exit
then

dlover dl>rl f0<> if dlswap dlln dl* dlexp exit then
dldrop dldrop 0e 0e
;

And here, some examples:

-1e f>dl 2e f>dl dl^ dl. 1. + eps 0. ok
-1e f>dl_d 2e f>dl dl^ dl. 1. + eps -2. ok
-1e f>dl 2e f>dl_d dl^ dl. 1. + eps NaN ok
-1e f>dl 2.2e f>dl dl^
Can't be defined for Real numbers! ok
-1.2e f>dl 2e f>dl dl^ dl. 1.44 + eps 0. ok
-1.2e f>dl_d 2e f>dl dl^ dl. 1.44 + eps -2.4 ok
-1.2e f>dl 2e f>dl_d dl^ dl. 1.44 + eps NaN ok
-1.2e f>dl 2.2e f>dl_d dl^
Can't be defined for Real numbers! ok
-1.2e f>dl -2.2e f>dl dl^
Can't be defined for Real numbers! ok
-1.2e f>dl -2e f>dl dl^ dl. 0.694444444444445 + eps 0. ok
-1.2e f>dl_d -2e f>dl dl^ dl. 0.694444444444445 + eps 1.15740740740741
ok
-1.2e f>dl -2e f>dl_d dl^ dl. 0.694444444444445 + eps NaN ok
-1.2e f>dl 0e f>dl_d dl^ dl. 1. + eps 0. ok
-1.2e f>dl 0e f>dl dl^ dl. 1. + eps 0. ok
-1.2e f>dl_d 0e f>dl dl^ dl. 1. + eps 0. ok
0e f>dl_d 0e f>dl dl^ dl. 1. + eps 0. ok

It is ok for the real part and the eps part.

It is not optimized.

Ahmed
ahmed
2024-07-25 09:44:54 UTC
I have noticed that my definition of dl^ in dual_numbers.fs is not good.
It doesn't give correct values for a b dl^ when a = 0 (a and b are dual
numbers), because the definition is based on dlln, which uses fln, and
fln(0) is -inf.
For example, with gforth I got

1e 0e 0e 0e dl^ dl. 1. + eps 0. ok
5e 0e 0e 0e dl^ dl. 1. + eps 0. ok
5e 1e 0e 0e dl^ dl. 1. + eps 0. ok
0e 1e 0e 0e dl^ dl. NaN+ eps NaN ok , here it must be 1e + eps 0e
0e 0e 0e 0e dl^ dl. NaN+ eps NaN ok , it must be 1e + eps 0e
0e 0e 2e 0e dl^ dl. 0. + eps NaN ok , it must be 0e + eps 0e

I modified it and the new definition is:

: dl^ ( dl: a b -- a^b) \ a^b = exp(b*ln(a))
dldup dl>rl f0= if dldrop dldrop 1e 0e exit then
dlover dl>rl f0<> if dlswap dlln dl* dlexp exit then
dldrop dldrop 0e 0e
;

which corrects the results.

I also added the definition for dlabs for the absolute value:

: dlabs ( f: a b -- |a| b*sign[a]) fswap fdup fabs fswap fdup f0<> if fdup fabs f/ then frot f* ;

An example of usage:

: dl_f3() ( dl: x -- y) \ y = |x^2-5*x+1|
dldup 2e f>dl dl^ dlswap 5e f>dl dl* dl- 1e f>dl dl+ dlabs ;

and some calculations:

0e f>dl dl_f3() dl. 1. + eps 0. ok
0e f>dl_d dl_f3() dl. 1. + eps -5. ok
1e f>dl_d dl_f3() dl. 3. + eps 3. ok
2e f>dl_d dl_f3() dl. 5. + eps 1. ok
2.5e f>dl_d dl_f3() dl. 5.25 + eps -0.000000000000000888178419700125 ok
3e f>dl_d dl_f3() dl. 5. + eps -1. ok
4e f>dl_d dl_f3() dl. 3. + eps -3. ok
4.5e f>dl_d dl_f3() dl. 1.25 + eps -4. ok
4.9e f>dl_d dl_f3() dl. 0.510000000000005 + eps 4.8 ok

21e fsqrt 5e f+ 2e f/ f. 4.79128784747792 ok \ here the quadratic inside dl_f3() crosses the x axis (dl_f3()=0)
notice how the eps part changes its sign

4.79128784747792e f>dl_d dl_f3() dl. 0.0000000000000035527136788005 + eps -4.58257569495584 ok
4.79128784747793e f>dl_d dl_f3() dl. 0.000000000000049737991503207 + eps 4.58257569495586 ok
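The dlabs examples follow the rule d|f|/dx = sign(f(x)) * f'(x) away from f(x) = 0; for f(x) = x^2 - 5x + 1 that is sign(f) * (2x - 5). A quick Python check of the values in the session above (dabs_f is a hypothetical name):

```python
import math

def dabs_f(x):
    # derivative of |x^2 - 5x + 1| via sign(f) * f'
    f, df = x * x - 5 * x + 1, 2 * x - 5
    return math.copysign(1.0, f) * df

print(dabs_f(0.0))   # -5.0, matching "0e f>dl_d dl_f3() dl. 1. + eps -5."
print(dabs_f(4.5))   # -4.0, matching the 4.5e line
root = (5 + math.sqrt(21)) / 2    # where |f| touches zero and the sign flips
print(dabs_f(root - 1e-6) < 0 < dabs_f(root + 1e-6))
```

The sign flip straddling the root is exactly the eps-part sign change shown in the two transcript lines around 4.79128784747792.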


And I also tried to do differentiation with respect to parameters.
Here is an example of use:


include autodiff.fs

\ 3 variables x, y and z. The array xyz
3 dlarray xyz
2 dlarray params

\ x = 0.5, y = 4 and z = -2.3
0.5e 0e xyz 0 dlarray!
4e 0e xyz 1 dlarray!
-2.3e 0e xyz 2 dlarray!

\ p1 = 2 and p2 = 3
2e 0e params 0 dlarray!
3e 0e params 1 dlarray!


\ 1 func of the 2 parameters p1 and p2

variable func_count
: addfunc >r 2dup r> -rot func_count @ funcarray! 1 func_count +! ;
: end_addfuncs 2drop 0 func_count ! ;

1 funcarray f1()
0 func_count !

dlvariable x
dlvariable y
dlvariable z

dlvariable p1
dlvariable p2

f1()
:noname ( dl: p1 p2 -- r) \ f(x,y,z; p1,p2) = sin(p1*x^(y+z)) - 3*z*ln(p2*x^2*y^3)
p2 dl! p1 dl!
x dl@ y dl@ z dl@ dl+ dl^ p1 dl@ dl* dlsin
x dl@ 2e f>dl dl^ y dl@ 3e f>dl dl^ dl* p2 dl@ dl* dlln z dl@ dl* 3e f>dl dl*
dl-
;
addfunc
end_addfuncs

xyz 0 dlarray@ x dl!
xyz 1 dlarray@ y dl!
xyz 2 dlarray@ z dl!

cr params f1() .jacobian

\ verif
( verification)

fvariable x_ 0.5e x_ f!
fvariable y_ 4e y_ f!
fvariable z_ -2.3e z_ f!

fvariable p1_ 2e p1_ f!
fvariable p2_ 3e p2_ f!

: df/dp1 x_ f@ y_ f@ z_ f@ f+ f** fdup p1_ f@ f* fcos f* ; \ df/dp1 = x^(y+z)*cos(p1*x^(y+z))
cr df/dp1 f.

: df/dp2 z_ f@ -3e f* p2_ f@ f/ ; \ df/dp2 = -3*z/p2
cr df/dp2 f.

cr cr .( done) cr cr


Ahmed
