Discussion:
Order preserving Sort algorithm ?
(too old to reply)
Bob Armstrong
2023-09-02 19:51:14 UTC
Permalink
It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .

Anybody got a good ` sort algorithm ?
I've been working on getting all my accounting " business ready " in CoSy . It's only been in the last few years approaching the facilities I had in my legacy K ( Kx.com ) CoSy .
See my SV-FIG presentation : https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
I am surprised to find I never implemented a variant of the sort algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats .
The fundamental algorithm , while impressively elegant has a more pervasive problem that it is not order preserving on identical elements . That means if sorting a table using the permutations associated with a sort , the equal elements in sorting of a later column can disrupt the order of previously sorted columns .
APLs & K only return the sorting permutation , aka ` grade , rather than the sorted list itself . I've yet to figure out how to get the permutation ( to apply to all columns in a table ) w/o doing the sort itself .
In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .
Hugh Aguilar
2023-09-03 05:23:02 UTC
Permalink
Post by Bob Armstrong
In any case , if anyone has an order preserving sort which applies to floats
as well as ints & strings , let us know .
I have a MergeSort in the novice package.
AFAIK, it is stable, although I never gave any thought to whether it was or not.

Obviously it works floats, strings, or whatever! Duh!
Nobody with any knowledge of computer science implements complicated
data-structures that are specific to one type of data --- what a waste of time!
My LIST data-structure is general-purpose --- it can contain any kind of data.
You pass the xt of a comparison function into the sort function.

I also have HeapSort for arrays --- AFAIK, it is not stable.
I also have an LLRB tree that orders the data as it is inserted.

I rely heavily on linked lists in my own programming.
I don't use arrays or trees unless there is some compelling reason to do so.
Linked lists work well most of the time --- I got the idea of having a
single data-structure that is generally used from Factor that has "sequences"
that it uses for pretty much all data.
Bob Armstrong
2023-09-11 05:20:13 UTC
Permalink
Post by Hugh Aguilar
Post by Bob Armstrong
In any case , if anyone has an order preserving sort which applies to floats
as well as ints & strings , let us know .
I have a MergeSort in the novice package.
AFAIK, it is stable, although I never gave any thought to whether it was or not.
You pass the xt of a comparison function into the sort function.
That's essentially what's done with Helmar's algorithm .
Post by Hugh Aguilar
I rely heavily on linked lists in my own programming.
I don't use arrays or trees unless there is some compelling reason to do so.
Linked lists work well most of the time --- I got the idea of having a
single data-structure that is generally used from Factor that has "sequences"
that it uses for pretty much all data.
I've disliked linked lists for 60 years . I consider them too complex and difficult to transform .
CoSy is a vocabulary evolved from Arthur Whitney's K of simple lists of lists . I don't think it is possible to match the speed for many operations with linked lists .

I looked at Factor back in http://cosy.com/CoSy/CoSy/ReCoSy20101200.html . And gave some comments on a presentation on Factor to SV-FIG last year ,
. Frankly I think it regressed thru excessive complexification over time .
B. Pym
2024-06-04 02:40:23 UTC
Permalink
Post by Hugh Aguilar
I have a MergeSort in the novice package.
My LIST data-structure is general-purpose --- it can contain
any kind of data.
You pass the xt of a comparison function into the sort function.
I rely heavily on linked lists in my own programming.
Your "Novice Package" includes lists? That is an advanced
feature that doesn't come with GForth or 4th.

People don't like to call themselves novices. It would be
better for marketing to call your library the "Forth Power
Package".
dxf
2024-06-04 05:30:24 UTC
Permalink
Post by B. Pym
Post by Hugh Aguilar
I have a MergeSort in the novice package.
My LIST data-structure is general-purpose --- it can contain
any kind of data.
You pass the xt of a comparison function into the sort function.
I rely heavily on linked lists in my own programming.
Your "Novice Package" includes lists? That is an advanced
feature that doesn't come with GForth or 4th.
People don't like to call themselves novices. It would be
better for marketing to call your library the "Forth Power
Package".
Perhaps he considers them novices in which case it's spot on.
Hans Bezemer
2024-06-13 15:25:49 UTC
Permalink
Post by B. Pym
Your "Novice Package" includes lists? That is an advanced
feature that doesn't come with GForth or 4th.
Not entirely true. 4tH has a library called "leaf", which is basically
an associated array, having a key, value and link field. I made it
originally for indexes to my DBMS - but after careful consideration I
found out I didn't need it.

https://sourceforge.net/p/forth-4th/code/HEAD/tree/trunk/4th.src/lib/leaf.4th

It is, however, extensively used in the INI file manager. This lib
allows you to read *and* modify an INI file. The INI file interpreter
just reads one. It's much simpler.

There's also:
https://sourceforge.net/p/forth-4th/code/HEAD/tree/trunk/4th.src/lib/lists.4th

It may not be much for list addicts, but it's not nothing.

Hans Bezemer

none) (albert
2023-09-03 09:21:11 UTC
Permalink
In article <b0ac03f1-f64f-4fef-aa9a-***@googlegroups.com>,
Bob Armstrong <***@cosy.com> wrote:
<SNIP>
Post by Bob Armstrong
In any case , if anyone has an order preserving sort which applies to
floats as well as ints & strings , let us know .
This is trivial. As long as you can discriminate between objects
they are different. Now add some more test to the comparison.
: <c OVER @ OVER @ < DSSWAP < AND ;

DSSWAP is supposed to swap a double and a single.
[Normal people use either ROT or -ROT, but that is too hard.]

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -
Heinrich Hohl
2023-09-03 12:53:05 UTC
Permalink
Post by Bob Armstrong
In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .
An order-preserving sorting algorithm is also called a "stable" sorting algorithm.

There are two good stable sorting algorithms:

Insertion sort:
An elementary sorting algorithms with sorting time of order O(n^2).
Very simple algorithm. Extremely fast if the number of items to be sorted is small.
Gets too slow if the number of items to be sorted exceeds a certain limit.

Merge sort:
An advanced sorting algorithm with sorting time of order O(n*log(n)).
Top-down approaches (recursive) as well as bottom-up approaches (iterative)
of merge sort are possible and work equally well. More complicated than insertion sort.
Would be needlessly complicated if the number of data items is small.
On the other hand, this algorithm can handle much larger data sets within
reasonable time.

I would suggest that you look at the algorithms and implement the sorting routines
yourself. This is easier than adapting someone else's code.

A good reference is:
Hans Werner Lang: “Sequential and parallel sorting algorithms”
https://hwlang.de/algorithmen/sortieren/algoen.htm

Start with insertion sort and find out if the sorting time is still acceptable in the worst
case scenario (large data set). Otherwise look at Merge sort. The recursive method of
merge sort is slightly simpler to implement than the iterative one.

Bubble sort and Gnome sort are other elementary sorting methods that are also stable.
Both methods are flawed by design and should not be used in real applications.

Henry
Hugh Aguilar
2023-09-03 23:11:48 UTC
Permalink
Post by Heinrich Hohl
An elementary sorting algorithms with sorting time of order O(n^2).
Very simple algorithm. Extremely fast if the number of items to be sorted is small.
Gets too slow if the number of items to be sorted exceeds a certain limit.
I have Insertion Sort for lists too. I used this in my <SWITCH construct primarily because it detects duplicate entries when the duplicate is first found so I can abort the compile with a helpful error message pointing at where the error occurred. Insertion Sort did prove to be quite slow for my processor simulator, but the slowness is at compile-time so it is just a minor nuisance.
Post by Heinrich Hohl
An advanced sorting algorithm with sorting time of order O(n*log(n)).
Top-down approaches (recursive) as well as bottom-up approaches (iterative)
of merge sort are possible and work equally well. More complicated than insertion sort.
Would be needlessly complicated if the number of data items is small.
On the other hand, this algorithm can handle much larger data sets within
reasonable time.
I would suggest that you look at the algorithms and implement the sorting routines
yourself. This is easier than adapting someone else's code.
Nobody has to adapt (rewrite) my code --- it is general-purpose so it can be
put to use unmodified for whatever data you might have.

Reading books on algorithms and studying an "advanced" sort algorithm
such as Merge Sort is a waste of time --- this has been well known for decades.
There is a lot of work involved in writing the code that I have already written.
dxforth
2023-09-04 02:20:22 UTC
Permalink
Post by Hugh Aguilar
Post by Heinrich Hohl
An elementary sorting algorithms with sorting time of order O(n^2).
Very simple algorithm. Extremely fast if the number of items to be sorted is small.
Gets too slow if the number of items to be sorted exceeds a certain limit.
I have Insertion Sort for lists too. I used this in my <SWITCH construct primarily because it detects duplicate entries when the duplicate is first found so I can abort the compile with a helpful error message pointing at where the error occurred. Insertion Sort did prove to be quite slow for my processor simulator, but the slowness is at compile-time so it is just a minor nuisance.
Post by Heinrich Hohl
An advanced sorting algorithm with sorting time of order O(n*log(n)).
Top-down approaches (recursive) as well as bottom-up approaches (iterative)
of merge sort are possible and work equally well. More complicated than insertion sort.
Would be needlessly complicated if the number of data items is small.
On the other hand, this algorithm can handle much larger data sets within
reasonable time.
I would suggest that you look at the algorithms and implement the sorting routines
yourself. This is easier than adapting someone else's code.
Nobody has to adapt (rewrite) my code --- it is general-purpose so it can be
put to use unmodified for whatever data you might have.
Reading books on algorithms and studying an "advanced" sort algorithm
such as Merge Sort is a waste of time --- this has been well known for decades.
There is a lot of work involved in writing the code that I have already written.
If thinking means anything, it is looking at what is useful to one and discarding
what is not.
Bob Armstrong
2023-09-11 06:03:40 UTC
Permalink
Post by Heinrich Hohl
Post by Bob Armstrong
In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .
An order-preserving sorting algorithm is also called a "stable" sorting algorithm.
An advanced sorting algorithm with sorting time of order O(n*log(n)).
Top-down approaches (recursive) as well as bottom-up approaches (iterative)
of merge sort are possible and work equally well. More complicated than insertion sort.
Would be needlessly complicated if the number of data items is small.
On the other hand, this algorithm can handle much larger data sets within
reasonable time.
CoSy is a ` user level system ( altho in open-to-the-chip Forth ) .
An ` ordinary person just wants a sort which works . Thus ` merge is the choice .
Post by Heinrich Hohl
I would suggest that you look at the algorithms and implement the sorting routines
yourself. This is easier than adapting someone else's code.
Win32Forth must have a decent one that I could virtually cut&paste .
Thinking is a last resort . A limited resource .
Post by Heinrich Hohl
Hans Werner Lang: “Sequential and parallel sorting algorithms”
https://hwlang.de/algorithmen/sortieren/algoen.htm
Thanks . That does look rather clear if I have to resort to rolling my own .
Post by Heinrich Hohl
Start with insertion sort and find out if the sorting time is still acceptable in the worst
case scenario (large data set). Otherwise look at Merge sort. The recursive method of
merge sort is slightly simpler to implement than the iterative one.
A great deal of the power of CoSy comes from recursive operations on its lists-of-lists .
I moved from the APL and legacy https://cosy.com/K/CoSy.htm world . If it can't handle Big Data it's not competitive .
L W
2023-09-23 23:27:25 UTC
Permalink
It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .
Anybody got a good ` sort algorithm ?
I've been working on getting all my accounting " business ready " in CoSy . It's only been in the last few years approaching the facilities I had in my legacy K ( Kx.com ) CoSy .
See my SV-FIG presentation : https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
I am surprised to find I never implemented a variant of the sort algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats .
The fundamental algorithm , while impressively elegant has a more pervasive problem that it is not order preserving on identical elements . That means if sorting a table using the permutations associated with a sort , the equal elements in sorting of a later column can disrupt the order of previously sorted columns .
APLs & K only return the sorting permutation , aka ` grade , rather than the sorted list itself . I've yet to figure out how to get the permutation ( to apply to all columns in a table ) w/o doing the sort itself .
In any case , if anyone has an order preserving sort which applies to floats as well as ints & strings , let us know .
Seems to me the only time sorting meant anything was to prepare data to be saved to a GCR tape. Even a tape-to-tape merge/purge from ancient times is essentially a source to destination sort only having to actually sort the introduced data. So let's modernize ... Loading the table elements, given a data standardization exists, into a b+tree index solves the issues. 1) Never have to sort in the first place. 2) Add or remove data any time. 3) Drill down ability. 4) On an Intel system the space is trivial. 5) Order added is preserved on duplicates. 6) Why are you sorting in the first place? 7) Is there a real need to sort? Well, besides a human data presentation?

This comment comes on following an understanding of temperature measurement and dealing with the associated instrumentation, say an 'S' type thermocouple. The linearization process is to keep performing a 7th order polynomial against the acquisition data or calc it once and have a lookup table based on said values. Well, CM would say "Calc it", as memory was rather expensive and limited 'in the day', not so much now, "as there is likely enough processor to do so". When you have 32 bit processors that can run on a 3V lithium coin cell for a decade+ while transmitting sub-GHz data every minute or so the entire while, the battery will die of old age before you deplete it, energy usage is now paramount. Maybe you need to factor your concept on needing to sort ...

Sorting seems from days past a way to benchmark a system ... And only thus.

Just put the data into the index in the first place. The index is the table. Forward, backward, subsets, content, selective resolution, duplicate check, expansion and contraction, you decide the data selection. Sorting ... LOL
none) (albert
2023-09-24 09:57:19 UTC
Permalink
Post by Bob Armstrong
It's been too long since I checked comp.lang.forth . comp.lang.apl has
too little activity to bother with & I didn't realize that is far from
true here .
Anybody got a good ` sort algorithm ?
I've been working on getting all my accounting " business ready " in
CoSy . It's only been in the last few years approaching the facilities I
had in my legacy K ( Kx.com ) CoSy .
https://www.cosy.com/CoSy/y23/SV-FIG_20230826_commentary.html .
I am surprised to find I never implemented a variant of the sort
algorithm I use , https://cosy.com/4thCoSy/Code/CoSy/sort.f , for floats
.
The fundamental algorithm , while impressively elegant has a more
pervasive problem that it is not order preserving on identical elements
. That means if sorting a table using the permutations associated with a
sort , the equal elements in sorting of a later column can disrupt the
order of previously sorted columns .
APLs & K only return the sorting permutation , aka ` grade , rather
than the sorted list itself . I've yet to figure out how to get the
permutation ( to apply to all columns in a table ) w/o doing the sort
itself .
In any case , if anyone has an order preserving sort which applies to
floats as well as ints & strings , let us know .
Seems to me the only time sorting meant anything was to prepare data to
be saved to a GCR tape. Even a tape-to-tape merge/purge from ancient
times is essentially a source to destination sort only having to
actually sort the introduced data. So let's modernize ... Loading the
table elements, given a data standardization exists, into a b+tree index
solves the issues. 1) Never have to sort in the first place. 2) Add or
remove data any time. 3) Drill down ability. 4) On an Intel system
the space is trivial. 5) Order added is preserved on duplicates. 6)
Why are you sorting in the first place? 7) Is there a real need to
sort? Well, besides a human data presentation?
This comment comes on following an understanding of temperature
measurement and dealing with the associated instrumentation, say an 'S'
type thermocouple. The linearization process is to keep performing a
7th order polynomial against the acquisition data or calc it once and
have a lookup table based on said values. Well, CM would say "Calc it",
as memory was rather expensive and limited 'in the day', not so much
now, "as there is likely enough processor to do so". When you have 32
bit processors that can run on a 3V lithium coin cell for a decade+
while transmitting sub-GHz data every minute or so the entire while, the
battery will die of old age before you deplete it, energy usage is now
paramount. Maybe you need to factor your concept on needing to sort ...
Sorting seems from days past a way to benchmark a system ... And only thus.
Just put the data into the index in the first place. The index is the
table. Forward, backward, subsets, content, selective resolution,
duplicate check, expansion and contraction, you decide the data
selection. Sorting ... LOL
You can't give hard and fast rules with respect to sorting in
general.

One of my jobs was to sort (actually merge unsorted data unto) to the
most prestigious Dutch dictionary (de dikke van Dalen). 1]
1. It makes no sense to have an index into this kind of jobs.
A dictionary is a stream of characters, only during sorting it is
considered a heap of records.
2. The rules for comparison were so complicated that the program code
for comparison run over several pages. They went to great length to
prevent a situation that paragraphs sorted equal.
So "stable sorting" ... LOL

1] This was an overnight job on the PDP. My job was to create an VAX
program in c exactly equivalent to the previous assembler program.
It run in minutes, and was actually io-bound.

Groetjes Albert
--
Don't praise the day before the evening. One swallow doesn't make spring.
You must not say "hey" before you have crossed the bridge. Don't sell the
hide of the bear until you shot it. Better one bird in the hand than ten in
the air. First gain is a cat spinning. - the Wise from Antrim -
Hans Bezemer
2023-09-24 12:31:00 UTC
Permalink
It's been too long since I checked comp.lang.forth . comp.lang.apl has too little activity to bother with & I didn't realize that is far from true here .
Anybody got a good ` sort algorithm ?
I got plenty. They come in two flavors, address based and index based. The address based ones work with any structure, as long as they exist as pointer arrays. The comparison uses a call back function:

Algorithm Library
Slow sort slowsort.4th
Stooge sort stoosort.4th
Cycle sort cyclsort.4th
Pancake sort pancsort.4th
Radix sort LSB radxsort.4th
Bubble sort bublsort.4th
Cocktail sort cocksort.4th
Cocktail sort
(improved) coc2sort.4th
Simple sort simpsort.4th
Simple sort
(improved) ismpsort.4th
Selection sort selcsort.4th
Insertion sort instsort.4th
Insertion sort
(improved) ins2sort.4th
Binary Insertion sort binssort.4th
Oyelami sort (MDIS) oyelsort.4th
Circle sort circsort.4th
Circle sort (improved) cir2sort.4th
Bitonic sort bitosort.4th
Merge sort mergsort.4th
Odd-even merge sort odevsort.4th
Tim sort (simple) timsort.4th
Shell sort shelsort.4th
Shell sort (A033622) shelsort.4th
Shell sort (A108870) shelsort.4th
Comb sort com2sort.4th
Heap sort hea2sort.4th
Binary Quick sort binquick.4th
Intro sort intrsort.4th
Quick sort qsort.4th
Quick sort
(unsafe) qsort.4th

The index based ones work with any structure as well, as long as they're random accessible (by using an index). Both the compare and the exchange words are callbacks:

Algorithm Library
Heap sort heapsort.4th
Comb sort combsort.4th
Selection sort sel2sort.4th
Bubble sort bub2sort.4th
Gnome sort gnomsort.4th
Gnome sort (improved) gno2sort.4th

They're written for 4tH, so some assembly may be required. You can find them here: https://sourceforge.net/p/forth-4th/code/HEAD/tree/trunk/4th.src/lib/

Hans Bezemer
Loading...