Reverse SCAN SPLIT


dxf

2024-10-07 08:52:15 UTC

Earlier I mentioned scanning in reverse. Here's an implementation.

[undefined] dxforth [if]
: \CHAR ( a u -- a2 u2 c ) 1- 2dup + c@ ;
[then]

\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;

\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

\ example

: /T ( a u -- hour min sec )
3 0 do
[char] : split< 0 0 2swap >number 2drop drop -rot
( u ... a u) dup if 1- then
loop 2drop swap rot ;

: T /t cr rot . ." hr " swap . ." min " . ." sec " ;

s" 1:2:3" t
s" 02:03" t
s" 06:" t
s" 03" t
s" 23:59:59" t
s" 0:00:03" t
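
The stack effects above can also be obtained with standard words only. The following is a minimal sketch, not the dxforth definitions: the names scan-back and split-back are hypothetical, and it assumes one character per address unit.

```forth
\ Sketch only: scan-back and split-back are hypothetical names.

\ Shorten the string until its last char is c; u2 is 0 if c is absent.
: scan-back ( a u c -- a u2 )
  >r
  begin dup while              \ characters left to examine?
    2dup 1- + c@ r@ <> while   \ last char is not c: drop it
    1-
  repeat then
  r> drop ;

\ Head keeps the delimiter; the tail (after the last c) is topmost.
: split-back ( a u c -- a2 u2 a3 u3 )
  >r 2dup r> scan-back 2swap 2 pick /string ;

: demo ( -- )  s" 23:59:59" [char] : split-back type ."  | " type ;
demo   \ prints 59 | 23:59:
```

When the delimiter is absent the head comes back empty and the tail is the whole string, which is what the /T loop relies on for inputs such as "03".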

Ahmed

2024-10-07 09:55:34 UTC

What about this:

: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
0 -rot bounds dup >r swap do
i c@ [char] : = if 1+ i 1+ 2 rot then
1 -loop 1+ r> 2 rot ;

: .t ( n --)
case
1 of type space ." sec" endof
2 of type space ." min" space type space ." sec" endof
3 of type space ." hrs" space type space ." min" space type space
." sec" endof
endcase ;

s" 10:20:30" :t .t 10 hrs 20 min 30 sec
s" 20:30" :t .t 20 min 30 sec
s" 30" :t .t 30 sec

Ahmed

Ahmed

2024-10-07 10:03:28 UTC

And with 00 for hours and minutes when they are absent

: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
0 -rot bounds dup >r swap do
i c@ [char] : = if 1+ i 1+ 2 rot then
1 -loop 1+ r> 2 rot ;

: .t ( n --)
case
1 of ." 00 hrs" space ." 00 min" space type space ." sec" endof
2 of ." 00 hrs" space type space ." min" space type space ." sec"
endof
3 of type space ." hrs" space type space ." min" space type space
." sec" endof
endcase ;

s" 10:20:30" :t .t 10 hrs 20 min 30 sec ok
s" 20:30" :t .t 00 hrs 20 min 30 sec ok
s" 30" :t .t 00 hrs 00 min 30 sec ok

Ahmed

dxf

2024-10-07 12:07:16 UTC

Post by Ahmed
And with 00 for hours and minutes when they are absent
: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
0 -rot bounds dup >r swap do
i c@ [char] : = if 1+ i 1+ 2 rot then
1 -loop 1+ r> 2 rot ;
: .t ( n --)
case
    1 of ." 00 hrs"   space ." 00 min" space type space ." sec" endof
    2 of ." 00 hrs"   space type space ." min" space type space ." sec"
endof
    3 of type space ." hrs"   space type space ." min" space type space
." sec" endof
endcase ;
s" 10:20:30" :t .t 10 hrs 20 min 30 sec ok
s" 20:30" :t .t 00 hrs 20 min 30 sec ok
s" 30" :t .t 00 hrs 00 min 30 sec ok

Interesting. I'd do the numeric conversion in the main routine if possible.
There's a parsing issue with s" :30"

Ahmed

2024-10-07 19:25:20 UTC

On Mon, 7 Oct 2024 12:07:16 +0000, dxf wrote:

..

Post by dxf
Interesting. I'd do the numeric conversion in the main routine if possible.
There's a parsing issue with s" :30"

And what about this:

: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
bounds ( end start)
dup ( end start start)
>r ( end start ) ( r: start)
swap ( start end ) ( r: start)
dup ( start end pa)
-rot ( pa start end )
do ( pa)
i ( pa add)
c@ ( pa c)
[char] : = ( pa f)
if ( pa)
i ( pa add)
- ( pa-add)
dup ( pa-add pa-add)
2 ( pa-add pa-add 2)
> ( pa-add t|f)
if ( pa-add)
drop ( )
i ( add)
dup ( add add)
1+ ( add add+1)
2 ( add add+1 2)
rot ( add+1 2 add)
else ( pa-add)
1 = if ( )
s" 00" ( add 2)
i ( add 2 add)
else ( )
i ( add)
dup 1+ 1 ( add add+1 1)
rot ( add+1 1 add)
then
then
then
-1 +loop ( ... add+1 1|2 pa)
r> ( pa start)
tuck ( start pa start)
- ( start pa-st)
dup 0= if 2drop s" 00" then
;

: .t ( s_add s_cnt m_add m_cnt h_add h_cnt)
type space ." hr" space
type space ." min" space
type space ." sec"
;

with stack juggling !!!!!!!!!!

Some tests:

s" 10:1:2" :t .t 10 hr 1 min 2 sec ok
s" :10:" :t .t 00 hr 10 min 00 sec ok
s" ::" :t .t 00 hr 00 min 00 sec ok
s" ::1" :t .t 00 hr 00 min 1 sec ok
s" :10:1" :t .t 00 hr 10 min 1 sec ok
s" :10:" :t .t 00 hr 10 min 00 sec ok
s" 10:10:" :t .t 10 hr 10 min 00 sec ok
s" 10::" :t .t 10 hr 00 min 00 sec ok

Ahmed

dxf

2024-10-08 02:58:47 UTC

Post by Ahmed
..

Interesting. I'd do the numeric conversion in the main routine if
possible.
There's a parsing issue with s" :30"

: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
bounds ( end start)
dup     ( end start start)
>r      ( end start )   ( r: start)
swap    ( start end   ) ( r: start)
dup       ( start end pa)
-rot    ( pa start end )
do      ( pa)
    i    ( pa add)
    c@   ( pa c)
    [char] : = ( pa f)
    if         ( pa)
       i       ( pa add)
       -       ( pa-add)
    dup     ( pa-add pa-add)
       2       ( pa-add pa-add 2)
       >       ( pa-add t|f)
       if      ( pa-add)
      drop ( )
         i     ( add)
         dup     ( add add)
         1+    ( add add+1)
         2     ( add add+1 2)
         rot   ( add+1 2 add)
       else    ( pa-add)
         1 = if ( )
        s" 00" ( add 2)
        i      ( add 2 add)
      else ( )
           i     ( add)
           dup 1+ 1 ( add add+1 1)
           rot     ( add+1 1 add)
         then
    then
    then
    -1 +loop    ( ... add+1 1|2 pa)
    r>          (               pa start)
    tuck        (               start pa start)
    -           (               start pa-st)
    dup 0= if 2drop s" 00" then
;
: .t ( s_add s_cnt m_add m_cnt h_add h_cnt)
type space ." hr"   space
type space ." min" space
type space ." sec"
;
with stack juggling !!!!!!!!!!

swap dup -rot --> over

Post by Ahmed
s" 10:1:2" :t .t 10 hr 1 min 2 sec ok
s" :10:" :t .t 00 hr 10 min 00 sec ok
s" ::" :t .t 00 hr 00 min 00 sec ok
s" ::1" :t .t 00 hr 00 min 1 sec ok
s" :10:1" :t .t 00 hr 10 min 1 sec ok
s" :10:" :t .t 00 hr 10 min 00 sec ok
s" 10:10:" :t .t 10 hr 10 min 00 sec ok
s" 10::" :t .t 10 hr 00 min 00 sec ok

I wouldn't care about "x:" or "x::x"
Maybe also ":x"
But "5" needs to work :)

Ahmed

2024-10-08 06:02:06 UTC

On Tue, 8 Oct 2024 2:58:47 +0000, dxf wrote:
..

Post by dxf
swap dup -rot --> over

Changed
...

Post by dxf
But "5" needs to work :)

I think now it works.

Here is the new version:

: :t ( add cnt -- add 2 1 | add1 2 add2 2 2 | add1 2 add2 2 add3 2 3)
0 ( add cnt n)
-rot ( n add cnt)
bounds ( n end start)
dup ( n end start start)
>r ( n end start ) ( r: start)
over ( n pa start end )
do ( n pa)
i ( n pa add)
c@ ( n pa c)
[char] : = ( n pa f)
if ( n pa)
swap ( pa n)
1+ ( pa n+1)
i ( pa n+1 add)
rot ( n+1 add pa)
swap ( n+1 pa add)
- ( n+1 pa-add)
dup ( n+1 pa-add pa-add)
2 ( n+1 pa-add pa-add 2)
> ( n+1 pa-add t|f)
if ( n+1 pa-add)
drop ( n+1)
i ( n+1 add)
swap ( add n+1)
i ( add n+1 add)
1+ ( add n+1 add+1)
2 ( add n+1 add+1 2)
rot ( add add+1 2 n+1)
>r ( add add+1 2 ) ( r: n+1)
rot ( add+1 2 add)
r> ( add+1 2 add n+1)
swap ( add+1 2 n+1 add)
else ( n+1 pa-add)
1 = if ( n+1)
s" 00" ( n+1 add 2)
rot ( add 2 n+1)
i ( add 2 n+1 add)
else ( n+1)
i ( n+1 add)
swap ( add n+1)
i 1+ 1 ( add n+1 add+1 1)
rot ( add add+1 1 n+1)
>r ( add add+1 1) ( r: n+1)
rot ( add+1 1 add)
r> ( add+1 1 add n+1)
swap ( add+1 1 n+1 add)
then
then
then
-1 +loop ( ... add+1 1|2 n pa)
r> ( n pa start)
tuck ( n start pa start)
- ( n start pa-st)
dup 0= if 2drop else rot 1+ then ;

: .t ( n --)
case
1 of ." 00 hr" space ." 00 min" space type space ." sec" endof
2 of ." 00 hr" space type space ." min" space type space ." sec"
endof
3 of type space ." hr" space type space ." min" space type space
." sec" endof
endcase ;

Some tests:

s" " :t .t ok
s" 1" :t .t 00 hr 00 min 1 sec ok
s" 15" :t .t 00 hr 00 min 15 sec ok
s" :15" :t .t 00 hr 00 min 15 sec ok
s" 2:15" :t .t 00 hr 2 min 15 sec ok
s" 20:15" :t .t 00 hr 20 min 15 sec ok
s" :20:15" :t .t 00 hr 20 min 15 sec ok
s" 3:20:15" :t .t 3 hr 20 min 15 sec ok
s" 13:20:15" :t .t 13 hr 20 min 15 sec ok
s" 1:2:1" :t .t 1 hr 2 min 1 sec ok
s" 1::1" :t .t 1 hr 00 min 1 sec ok
s" ::1" :t .t 00 hr 00 min 1 sec ok
s" :1:" :t .t 00 hr 1 min 00 sec ok
s" 1:1:" :t .t 1 hr 1 min 00 sec ok
s" 1::" :t .t 1 hr 00 min 00 sec ok
s" :1" :t .t 00 hr 00 min 1 sec ok
s" :" :t .t 00 hr 00 min 00 sec ok
s" ::" :t .t 00 hr 00 min 00 sec ok

Ahmed

Ruvim

2024-10-07 10:02:05 UTC

Post by dxf
Earlier I mentioned scanning in reverse. Here's an implementation.
[undefined] dxforth [if]
[then]
\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;
\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

\ example
: /T ( a u -- hour min sec )
3 0 do
[char] : split< 0 0 2swap >number 2drop drop -rot
( u ... a u) dup if 1- then
loop 2drop swap rot ;
           ^^^^^^^^ (1)
: T /t cr rot . ." hr " swap . ." min " . ." sec " ;
          ^^^           ^^^^ (2)

Why do you prefer the order ( u.hour u.min u.sec ) rather than ( u.sec
u.min u.hour ) ?

The latter order makes the code simpler in (1) and (2); it also follows
the order of parameters in `TIME&DATE`.

Moreover, if you want to convert three components to seconds, you need
to reverse their order again:

: time3-to-seconds ( u.hour u.min u.sec -- u.sec-total )
swap rot 60 * + 60 * +
;

If the order of parameters is reversed, the above definition takes a
simpler form:

: time3-to-seconds ( u.sec u.min u.hour -- u.sec-total )
60 * + 60 * +
;
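
As a quick check of the reversed-order definition, here is a small worked example. The sample time 1:02:03 is only illustrative; it should come to (1*60 + 2)*60 + 3 = 3723 seconds.

```forth
\ Reversed parameter order: seconds deepest, hours on top.
: time3-to-seconds ( u.sec u.min u.hour -- u.sec-total )
  60 * +      \ hours folded into minutes
  60 * + ;    \ minutes folded into seconds

3 2 1 time3-to-seconds .   \ prints 3723
```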

--
Ruvim

dxf

2024-10-07 11:22:57 UTC

...
Why do you prefer the order ( u.hour u.min u.sec ) rather than ( u.sec u.min u.hour ) ?
The later order makes code simpler in (1) and (2), also, it follows the order of parameters in `TIME&DATE`.
: time3-to-seconds ( u.hour u.min u.sec -- u.sec-total )
swap rot 60 * + 60 * +
;
: time3-to-seconds ( u.sec u.min u.hour -- u.sec-total )
60 * + 60 * +
;

I thought it would be easier to convert to total secs (on 16-bit it has to be a double).
But perhaps not.
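
For the 16-bit case, a double-cell total might look like the following sketch. The name time3-to-dseconds is hypothetical, and the code assumes the Double-Number word set (D+ and D.) is available and that the input is a valid time of day.

```forth
\ Hypothetical double-cell variant for systems with 16-bit cells.
: time3-to-dseconds ( u.sec u.min u.hour -- ud.sec-total )
  60 * +        \ minutes since midnight; at most 1439, fits one cell
  60 um*        \ seconds as an unsigned double
  rot 0 d+ ;    \ add u.sec, widened to a double

59 59 23 time3-to-dseconds d.   \ prints 86399
```

Only the last multiplication can overflow a 16-bit cell, so UM* is needed there and nowhere else.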

dxf

2024-10-07 12:13:53 UTC

Post by dxf
Earlier I mentioned scanning in reverse. Here's an implementation.
[undefined] dxforth [if]
[then]
\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;
\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

Checking I see F-PC has -SCAN which appears to work similarly to SCAN< .
I'll be renaming my def's to -SCAN and -SPLIT respectively.

dxf

2024-10-10 06:00:43 UTC

Post by dxf
Earlier I mentioned scanning in reverse. Here's an implementation.
[undefined] dxforth [if]
[then]
\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;

Cleaned-up version:

: SCAN< ( a u c -- a2 u2 | a 0 )

Post by dxf
\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

a***@spenarnc.xs4all.nl

2024-10-10 08:00:02 UTC

Post by dxf
Earlier I mentioned scanning in reverse. Here's an implementation.
[undefined] dxforth [if]
[then]
\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;

Compare that with the meticulously designed, exhaustively specified
and eminently useful -- $/ -- .
After 40 years it has not taken over the world ...

NAME: $/

STACKEFFECT: sc c --- sc1 sc2

DESCRIPTION: []

Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and sc2
may be empty strings (i.e. their count is zero), if c is the last or
first character in sc .
(sc is c-addr len )

The subtle difference between an empty string (a-add 0 ) and
a null-string ( 0 0 ) allows you to handle empty lines in a file
gracefully.
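
The behaviour described in this specification can be approximated in standard Forth. The following is a minimal sketch under that reading of the text, not the ciforth implementation: split-first is a hypothetical name, and one character per address unit is assumed.

```forth
\ Sketch only: split-first is a hypothetical stand-in for the described $/ .
\ a1 u1 = text after the first c, a2 u2 = text before it.
\ If c is absent: a1 u1 = 0 0 (null string), a2 u2 = the original string.
: split-first ( a u c -- a1 u1 a2 u2 )
  >r 2dup                       ( a u a u )  ( r: c )
  begin dup while               \ anything left to examine?
    over c@ r@ <> while         \ not the delimiter: step past it
    1 /string
  repeat then                   ( a u a' u' )  \ u' = 0 means c not found
  r> drop
  dup 0= if 2drop 0 0 2swap exit then
  1 /string 2swap drop          ( a1 u1 a )
  2 pick over - 1- ;            ( a1 u1 a k )

: demo ( -- )  s" orang utan" bl split-first type ."  | " type ;
demo   \ prints orang | utan
```

Returning a zero address only when the delimiter is missing is what lets a caller tell "nothing after the delimiter" apart from "no delimiter at all".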

Post by dxf
\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

If you go for SPLIT< from end define SCAN< .
I have named it $\

Get the name of an executable from the source file:
"aap.frt" &. $\ 2DROP TYPE
aap OK

These words are elementary and should be defined in the core
in assembler, possibly (Intel) taking advantage of the string
words.

SSLAS0:
POP AX _C{ char}
POP CX _C{ count}
MOV BX,CX
POP DI _C{ addr}
OR DI,DI _C{Clear zero flag.}
MOV DX,DI _C{ Copy}
CLD _C{ INC DIRECTION}
REPNZ SCASB _C{ Compare BYTE} <<<<<<<<<<<
JZ SSLAS1
<loads of stuff to handle the corner cases)
<SNIP>

Groetjes Albert

--
Temu exploits Christians: (Disclaimer, only 10 apostles)
Last Supper Acrylic Suncatcher - 15Cm Round Stained Glass- Style Wall
Art For Home, Office And Garden Decor - Perfect For Windows, Bars,
And Gifts For Friends Family And Colleagues.

dxf

2024-10-10 09:57:43 UTC

Post by a***@spenarnc.xs4all.nl

Post by dxf
Earlier I mentioned scanning in reverse. Here's an implementation.
[undefined] dxforth [if]
[then]
\ As for SCAN but scan from end
: SCAN< ( a u c -- a2 u2 | a 0 )
rot drop rdrop ;

Compare that with the meticulously designed, exhaustedly specified
and eminently useful -- $/ -- .
After 40 years it has not taken over the world ...
NAME: $/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and sc2
may be empty strings (i.e. their count is zero), if c is the last or
first character in sc .
(sc is c-addr len )
The subtle difference between an empty string (a-add 0 ) and
a null-string ( 0 0 ) allows you to handle empty lines in a file
gracefully.

Post by dxf
\ As for SPLIT but scan from end. Latter string is topmost.
: SPLIT< ( a u c -- a2 u2 a3 u3 )
>r 2dup r> scan< 2swap 2 pick /string ;

If you go for SPLIT< from end define SCAN< .
I have named it $\
...

I wasn't aware you had reverse split.
I was coming to the conclusion SCAN< as I defined it was of little
value on its own and planned to subsume it into reverse split.
OTOH a reverse SCAN that gave the same results as forward SCAN might
be useful.

Post by a***@spenarnc.xs4all.nl
"aap.frt" &. $\ 2DROP TYPE
aap OK

I didn't need it for CP/M or DOS but separating filename from filetype
in Windows would be easier when scanned from the end.

Hans Bezemer

2024-10-16 16:13:33 UTC

Post by dxf
I wasn't aware you had reverse split.
I was coming to the conclusion SCAN< as I defined it was of little
value on it's own and planned to subsume it into reverse split.
OTOH a reverse SCAN that gave the same results as forward SCAN might
be useful.

I got a full load of the whole shebang - since I have to parse some
crazy stuff sometimes. It may not be pretty, but it served me well
through the years (since 2004).

Basically I can scan whatever I like however I like it:

---8<---
: (NO) NOT ;
: (YES) ;

defer is-type ( c -- f)

: (-tokenize) ( a1 n2 xt -- a2 n2 )
is ?not begin dup while 2dup 1- chars + c@ is-type ?not while 1- repeat
;
( a1 n2 xt -- a2 n2)
: (tokenize) is ?not begin dup while over c@ is-type ?not while chop
repeat ;
: scan> ['] (no) (tokenize) ; ( a1 n1 -- a2 n2 )
: scan< ['] (no) (-tokenize) ; ( a1 n1 -- a2 n2 )
: skip> ['] (yes) (tokenize) ; ( a1 n1 -- a2 n2 )
: skip< ['] (yes) (-tokenize) ; ( a1 n1 -- a2 n2 )
: split> 2dup scan> 2swap >r over r> swap - ;
: split< dup >r scan< 2dup chars + -rot r> over - -rot ;
( a1 n1 -- a2 n2 a3 n3)
---8<---

It's still 4tH stuff so your mileage may vary.

Hans Bezemer

a***@spenarnc.xs4all.nl

2024-10-17 08:28:26 UTC

Post by Hans Bezemer

Post by dxf
I wasn't aware you had reverse split.
I was coming to the conclusion SCAN< as I defined it was of little
value on it's own and planned to subsume it into reverse split.
OTOH a reverse SCAN that gave the same results as forward SCAN might
be useful.

I got a full load of the whole shebang - since I have to parse some
crazy stuff sometimes. It may not be pretty, but it served me well
through the years (since 2004).
---8<---
: (NO) NOT ;
: (YES) ;
defer is-type ( c -- f)
: (-tokenize) ( a1 n2 xt -- a2 n2 )
;
( a1 n2 xt -- a2 n2)
repeat ;
: scan> ['] (no) (tokenize) ; ( a1 n1 -- a2 n2 )
: scan< ['] (no) (-tokenize) ; ( a1 n1 -- a2 n2 )
: skip> ['] (yes) (tokenize) ; ( a1 n1 -- a2 n2 )
: skip< ['] (yes) (-tokenize) ; ( a1 n1 -- a2 n2 )
: split> 2dup scan> 2swap >r over r> swap - ;
: split< dup >r scan< 2dup chars + -rot r> over - -rot ;
( a1 n1 -- a2 n2 a3 n3)
---8<---
It's still 4tH stuff so your mileage may vary.

Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.

$/

STACKEFFECT: sc c --- sc1 sc2

DESCRIPTION: []

Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and sc2
may be empty strings (i.e. their count is zero), if c is the last or
first character in sc .


minforth

2024-10-17 09:16:09 UTC

All good if the input string contains tokens that are delimited
by special characters. This is not always the case, especially
if tokens follow each other directly without a gap (can happen
when OCR scanning documents, for example). Then you need
lexical tokenisers which are more difficult to implement.

mhx

2024-10-17 10:29:32 UTC

[..]

Post by a***@spenarnc.xs4all.nl
Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.
$/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and
sc2 may be empty strings (i.e. their count is zero), if c is the
last or first character in sc .

Wil Baden chose to keep c in sc2. Do you have a reason to
remove it?

It seems logical to remove it. I normally use lots of
`1 /STRING' and `-LEADING' or `-TRAILING' sequences in further
processing of Split-At-Char results, but not always.
Maybe because an empty sc2 is less informative than an sc2 of
size 1?

-marcel

a***@spenarnc.xs4all.nl

2024-10-17 10:48:24 UTC

Post by mhx
[..]

Post by a***@spenarnc.xs4all.nl
Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.
$/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and
sc2 may be empty strings (i.e. their count is zero), if c is the
last or first character in sc .

Wil Baden chose to keep c in sc2. Do you have a reason to
remove it?
It seems logical to remove it. I normally use lots of
`1 /STRING' and `-LEADING' or `-TRAILING' sequences in further
processing of Split-At-Char results, but not always.
Maybe because an empty sc2 is less informative than an sc2 of
size 1?

In the rare case that you want the delimiter :
"orang utan" BL $/
( *utan" "orang" )
you simply do
1+
( *utan" "orang " )

The idiom ... $/ DUP WHILE ... is so pervasive that I
must insist that an empty sc2 is immensely informative.


Groetjes Albert


dxf

2024-10-18 14:41:52 UTC

Post by a***@spenarnc.xs4all.nl

Post by mhx
[..]

Post by a***@spenarnc.xs4all.nl
Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.
$/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and
sc2 may be empty strings (i.e. their count is zero), if c is the
last or first character in sc .

Wil Baden chose to keep c in sc2. Do you have a reason to
remove it?
It seems logical to remove it. I normally use lots of
`1 /STRING' and `-LEADING' or `-TRAILING' sequences in further
processing of Split-At-Char results, but not always.
Maybe because an empty sc2 is less informative than an sc2 of
size 1?

"orang utan" BL $/
( *utan" "orang" )
you simply do
1+
( *utan" "orang " )

Applying 1+ is not foolproof in the case of empty sc2.

a***@spenarnc.xs4all.nl

2024-10-19 11:36:37 UTC

Post by a***@spenarnc.xs4all.nl

Post by mhx
[..]

Post by a***@spenarnc.xs4all.nl
Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.
$/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and
sc2 may be empty strings (i.e. their count is zero), if c is the
last or first character in sc .

Wil Baden chose to keep c in sc2. Do you have a reason to
remove it?
It seems logical to remove it. I normally use lots of
`1 /STRING' and `-LEADING' or `-TRAILING' sequences in further
processing of Split-At-Char results, but not always.
Maybe because an empty sc2 is less informative than an sc2 of
size 1?

"orang utan" BL $/
( *utan" "orang" )
you simply do
1+
( *utan" "orang " )

Applying 1+ is not foolproof in the case of empty sc2.

Realize that
" " BL $/ results in (OKAY)
"" ""
"" BL $/ results in (this goes wrong)
0.0 ""

You mean if the character is missing, resulting in a sc1
that is a null-string.
If the character is present 1+ works all the time.

If the character is not present, you shouldn't do that, right.
In that case sc1 is a null string, (not merely empty) and
you should test sc1 first.
That could happen if you split a Linux file on linefeeds and the
last linefeed is missing. Normally I would do
BEGIN ^J $/ TYPE CR OVER 0= UNTIL
Or splitting on $A.
BEGIN $A $/ TYPE $A EMIT OVER 0= UNTIL

Groetjes Albert


dxf

2024-10-18 01:10:51 UTC

Post by mhx
[..]

Post by a***@spenarnc.xs4all.nl
Compare to what I'm doing. Promoting the actual API specification
so that you can decide whether you want to actually use it.
$/
STACKEFFECT: sc c --- sc1 sc2
DESCRIPTION: []
Find the first c in the string constant sc and split it at that
address. Return the strings after and before c into sc1 and sc2
respectively. If the character is not present sc1 is a null string
(its address is zero) and sc2 is the original string. Both sc1 and
sc2 may be empty strings (i.e. their count is zero), if c is the
last or first character in sc .

Wil Baden chose to keep c in sc2. Do you have a reason to
remove it?
It seems logical to remove it. I normally use lots of
`1 /STRING' and `-LEADING' or `-TRAILING' sequences in further
processing of Split-At-Char results, but not always.
Maybe because an empty sc2 is less informative than an sc2 of
size 1?

For Baden's implementation (below) it's easier for the application to
discard 'char' than have SPLIT do it.

: SPLIT ( str /str char -- str+i /str-i str i )
>R 2dup R> SCAN 2SWAP 2 PICK - ;

SCAN is/was a machine-code routine and not readily changeable.
