ordered associative arrays

forth

Re: ordered associative arrays

Postby seeWebInstead » Sat, 13 Nov 2010 14:49:48 GMT

gt; From: Hugh Aguilar < XXXX@XXXXX.COM >

Then perhaps you'd be interested in my initiatives:
- Hierarchial&mixin intentional types and their specific implementations.
- No-syntax software development via a menu+keyword-driven IDE.


You might also like PHP for similar reasons. Associative arrays are
a built-in part of the language. Using them, it's almost trivial to
implement a wide variety of intentional datatypes.


Thanks for clarifying by confirming my guess.


I don't believe that. All I know about forth is:
- Each word in sequence performs an immediate action on the stack,
except some words start special modes such as defining new words
or terminating such a definition.
- swap and dup.
- Literal numeric constants self-push on the stack.
- Arithmetic operators combine the top two items on the stack and
push the result in their place.
I know there are about a hundred words defined in Pocket Forth, the
only version of Forth available to me here, but I don't remember
any of them specifically, except as listed above. I don't even know
how to define a function.

I wish I knew somebody who could tell me how to use MacPPP from Forth.


So you want the order-relation (i.e. the sorting predicate) to be
implicit in the datatype, like what Java terms the "natural" order?

There are actually two ways to do it:
- Have a generic container, but the keys are in a class that has
the order relation. All the keys in any given container must be
in the same class, or the container can't auto-sort.
- Have an order-relation specific to each class of container, thus
the same exact keys in two different classes of container would
auto-sort differently.


Only if the order-relations for the two classes (of keys, or of
containers) are the same, i.e. the keys are of the same class (with
its natural ordering), or the containers are of the same class
(ditto). If you try to merge two generic containers that happen to
contain keys from different classes, or if you try to merge two
order-specific containers from different-ordered classes, it isn't
going to work.


Why? Floating-point numbers are supposed to be mere approximations,
shots-in-dark, not exact values that have well-defined order. If
you compute exactly the same mathematical value using two different
values, roundoff error could resut in different floats, thus the
same mathematical value would be less than itsef. Contrarywise,
with fixed-precision floats, there are only a finite number of
different floats, hence with an infinite number of possible
mathematical values at least two different values will float the
same.


"White man speak with forked tongue."
What you said there is meaningless.
"Just like an array" implies:
- Whenever you insert a new element, or delete an old element, all
entries after that point must be *moved* forward or backward to
make room for the new element or fill the gap from the old
element respectively.
- Indexes of an array are a contiguous segment of number space.
Whenever you insert/delete you must change the keys of each of
the elements that were moved to get out of the way of the new
element or fill the void from the old element.
From what else you write, I'm quite sure you do *not* want that behaviour.

If you said "just like a sparse array", that would make more sense.
Sparse arrays are typically implemented as self-balancing search
trees, either b

Re: ordered associative arrays

Postby ron » Wed, 17 Nov 2010 14:12:06 GMT



This is why you will never be a professional programmer, Hugh.  By
your own admission you "never research".  By your demonstrated actions
here, you never test alternative hypotheses or test your own.  You
program by "assertion".

You have been told countless times, and it has been *demonstrated*,
that there are better structures for symbol tables than the one you
came up with.  Yet you continue to "assert" the superiority of your
solution, without one single *fact* to back it up.  Do you understand
why that is a problem?


Nobody "gets a kick" out of telling you anything, Hugh.  Before you
can progress as a programmer, you need to show some level of maturity
and accept criticism (even if unfortunately stated at times).  If one
person calls you a donkey, you can ignore it.  But if it becomes a
commonplace, perhaps you should invest in a bridle?

Tell me something:  would you be happy going to a medical doctor who
"never did research", and just asserted that "this treatment is the
best"?  Why or why not?

Re: ordered associative arrays

Postby Hugh Aguilar » Sun, 21 Nov 2010 10:05:14 GMT

n Nov 18, 12:00m, XXXX@XXXXX.COM (Robert Maas,
http://tinyurl.com/uh3t) wrote:

Certainly there have been advances in programming --- not too many
people are using Hollerith cards these days! The important point
however, is that all of these advances allowed programmers to program
more easily than they had done previously --- but they were still
programmers.

The reason why I'm turned off by point-and-click "programming" is that
the primary motivation here is to allow non-programmers to program,
without ever requiring them to learn anything. There are a lot of MBA
managers who realize that they don't know *anything* about
programming, but who also wish that they didn't have to pay those
pesky programmers' salaries, so they ask: "Why isn't there some
software available that generates computer programs, and all I will
have to do is describe what I want the program to do, and the program
will be handed to me on a silver platter?" They basically want
Aladdin's Jinni to build their palace for them --- this is magic, not
technology.

I'm not saying that you won't be able to make an improvement in
software development --- improvements have been made before, and will
continue to be made --- I'm just saying that if you try to please the
MBA types, you will never succeed, because you're not a Jinni.


I don't have any money, so I couldn't invest in it even if I did think
that it would work. Also, I'm still just working my way through
"Practical Common Lisp," so I'm not prepared to help with the code
either.


Most PLC software is proprietary, and specific to a PLC board. Also,
I've never used a PLC, so I don't really know much about the subject.

When I worked at Testra, there was some interest there in developing a
PLC front-end for their motion-control board, but nothing ever came of
this. I did learn a little about what PLCs are though. They do offer
quick development compared to using a programming language such as
Forth or C, but they also have limitations --- they will never replace
Forth and C, and they weren't designed to do so.

In programming micro-controllers, there is such a thing as a "timed
loop." The idea is that you have a loop with a delay function on the
end of it, which causes the loop to take *exactly* the same amount of
time for each iteration (for example, 1/60th of a second, or one
"jiffy"). The programmer writes functions that go into the loop, and
he has to make sure that none of these are so time-consuming as to
overflow the alloted time. Often, there will be count-down counters
that cause a particular function to execute once every so many
iterations through the loop. By syncing all of these functions'
counters properly, the programmer can make the machine push and pull
pistons or whatever, without the machine parts hitting each other.
There was a book out by Mototola a long time ago, (I think it was
called: "Programming the MC6805") that described timed-loops for
novices.

PLCs can do what timed loops can do, but you program them using
graphical diagrams (so-called "ladder diagrams") rather than assembly-
language. There are obviously some limitations here as to what you can
accomplish, but within those limitations, PLCs work pretty well.


I'm just barely learning Lisp --- I have sample code in Forth only.

I may be more helpful in the future when I know more Lisp --- until
then, good luck with your work. :-)

Re: ordered associative arrays

Postby Hugh Aguilar » Mon, 22 Nov 2010 05:22:20 GMT

n Nov 18, 1:11m, XXXX@XXXXX.COM (Robert Maas,
http://tinyurl.com/uh3t) wrote:

I didn't invent or name EXECUTE --- Chuck Moore did --- that word has
been with us since the very first Forth implementations of the 1970s.

EXECUTE is a lot simpler than what you are supposing. It doesn't
examine the argument list to determine if the function is matches or
not. There is actually no way in Forth to examine an argument list,
because there is no explicit argument list --- there is just data on
the parameter stack which may or may not be used by the function.
EXECUTE just does an indirect CALL through the xt address. EXECUTE
typically compiles into a handful of machine-language instructions ---
super-simple!

Forth is a *very* low-level language compared to Lisp. To a large
extent, Forth is just a kind of macro-assembler --- all of the Forth
words expand into a handful of machine-language instructions. I
consider Forth to be the best micro-controller language invented, and
I expect this to continue to be true forever. On the other hand, I am
learning Lisp so that I can become a better desktop-computer
programmer --- I no longer believe that Forth is appropriate for that
environment, and I think that Lisp may be best choice, although there
are actually quite a lot of valid choices in that environment that
have their adherents --- it is largely a matter of taste. I'm mostly
choosing Lisp because it has macros, which I am familiar with from
Forth, whereas languages such as Ruby and Python don't.

> > a>>
> The *keep* that total ordering whenever you simplify the data to b>
> more efficient to compare>
> Or if you never have exact values to begin with, onl>
> approximations based on inexact real-world data sampling, the>
> don't try to make a totally-ordered set, instead make an explici>
> partial ordering. For example, if each real-world data value has a>
> exact lower bound and upper bound, you could have a two-leve>
> self-balancing search tree, where the top level sorts into bins pe>
> lower bound, and then inside each bin another level sorts accordin>
> to upper bound, and so a search for items satisfying a query woul>
> have require a two-level nested traversal. Or with the sam>
> lower+upper bound you could have a sort of "radix" classification>
> where values with wide margin were directly located in a cell nea>
> the root while values more accurate would be located within >
> deeper cell that had more digits of precision fixed by its locatio>
> in the hierarchy. This nexted-radix tree would be efficient fo>
> finding all values whose error-interval overlaps with a given exac>
> value or which overlaps with a given error-interval, because you'>
> only have to look in one or two top-level cells and follow only on>
> or two branches down each level until you hit the level where th>
> query interval was wider then the radix intervals.

The problem is that A=B and B=C, but><A<>C. With F~, we are just
measuring the distance between the numbers, so A and C can be too far
apart to be considered equal, although they are both close enough to
the in-between value B to be considered equal to it.

This is easily solved by rounding off each number to a lower-
precision, and then doing an *exact* comparison for equality. If we
determine that the last few digits of the floats are mush, then we
just round-off to a level of prec

Re: ordered associative arrays

Postby Coos Haak » Thu, 25 Nov 2010 01:20:55 GMT

Op Mon, 22 Nov 2010 23:34:34 -0800 schreef Robert Maas,
 http://www.**--****.com/ :

<snip>

man gforth
works both in Firefox and under Linux

Gforth is a good example of a modern, ANS compliant Forth.

Trimmed newsgroups to comp.lang.forth

-- 
Coos

CHForth, 16 bit DOS applications
 http://www.**--****.com/ 

Re: ordered associative arrays

Postby Albert van der Horst » Thu, 25 Nov 2010 04:27:47 GMT

In article < XXXX@XXXXX.COM >,



<SNIP>


Already installed ... Are you happy with one binary in the path
and a library sources in the current directory?
"Installation" is sooo "unsimple".


There are simple Forth on unices.

lina for linux is only 28 kbyte. It is a statically linked binary that
reportedly runs on FreeBSD 386 unaltered. With a library of
300 kbyte source you are good to go.

You can build it on linux:

nasm lina.asm -felf -o lina.o
ld -N lina.o -o lina

OR

as lina.s
ld -N  a.out -o lina

This is hello world:

cat >hello.frt
: hello  ." hello world " ;
^D

lina -c hello.frt

Groetjes Albert

P.S.
Didn't you try  apropos Forth ?

If I ask my debian package manager on Ubuntu about Forth I
get loads of things.


Re: ordered associative arrays

Postby mdw » Fri, 26 Nov 2010 04:25:18 GMT

Hugh Aguilar < XXXX@XXXXX.COM > writes:


It's possible to do this, but not statically.  Suppose you have to write
a program which accepts an integer n as input and outputs the nth
Fibonacci number F(n) as output.  Suppose you know that the input is
bounded above by 2^32.

As a closed form, F(n) = 1/sqrt(5) (phi^n - phibar^n), where phi is the
golden ratio (1 + sqrt(5)/2 and phibar is its conjugate (1 - sqrt(n))/2.
This doesn't lend itself to precise calculation, but it does let us
estimate the size of the result.  Then F(n) < 1/sqrt(5) phi^n; so,
taking binary logarithms, lg F(n) < n lg phi - 1/2 lg 5 < 7/10 n.  This
isn't looking so good for n around 2^32.  I guess we could compute the
size of the result dynamically, but it's hard to see why that would be
an improvement on having the bignum implementation work out how much
space to allocate itself.


I thought that Forth only provided words for single and double precision
integer arithmetic, and not for anything bigger.


I agree.  I like Forth (though I've not played with it very much); it's
probably the lowest level language I know of that has what I'd think of
as being adequate abstraction capabilities.  And I've given some thought
as to how to make a Forth with dynamic typing and garbage collection --
the things which (to my mind) make working in Lisp even more enjoyable
-- and they simply don't fit.  Forth's basic capabilities are tied quite
closely to its simple memory model; introducing dynamic types and
garbage collection complicates the memory model so much that you've
basically lost the Forthness.  You'd have been better off starting with
PostScript.

-- [mdw]

Re: ordered associative arrays

Postby Hugh Aguilar » Fri, 26 Nov 2010 05:59:41 GMT

n Nov 24, 12:25m, XXXX@XXXXX.COM (Mark Wooding) wrote:

Forth has mixed-precision words, such as */ the scalar ( a b c -- n )
that calculates (a*b)/c, with the intermediate value being double
precision. Similarly, we have M*/ ( da b c -- dn ) in which da and dn
are double-precision, and the intermediate value is triple precision.
Going the other way, we have words to divide a double-precision by a
single-precision to obtain a single-precision quotient and remainder.
There are more. There are weird gaps though. We don't have a word for
dividing a double-precision by another double-precision --- I lobbied
for this with the Forth-200x committee, but was ignored as usual. In
my novice package I provide this, but it is slow because it is written
in Forth rather than in assembly-language. I didn't write it --- I got
most of my integer arithmetic code form Nathaniel Grossman's famous
article in the Sept/Oct-1984 "Forth Dimensions." This article provided
a continued-fraction program that is used for calculating rational
approximations of floating-point constants for use by */ and M/*. For
example, here are some I calculated for pi:

macro: pi* ( s -- s-result ) \ signed single-precision
1068966896 340262731 */ ;

macro: upi* ( u -- u-result ) \ unsigned single-precision
3618458675 1151791169 u*/ ;

macro: dpi* ( d -- d-result ) \ signed double-precision
1068966896 340262731 m*/ ;

macro: udpi* ( ud -- ud-result ) \ unsigned double-precision
3083975227 1963319607 ut*/ d2* ;

Most math, including what you described regarding Fibonacci sequences,
is above my head. For the most part, I would expect that unlimited-
size integers would only be used in desktop computer programs. Forth
is primarily used in micro-controllers. We are mostly dealing with 16-
bit data (because that is the precision of ADC and DAC ports), so we
may have to use double-precision intermediate values, or maybe even
triple-precision in some cases. I can't really imagine any use for
unlimited-size integers though. Also, we generally avoid using code
that has a widely varying execution time on a micro-controller,
because the applications are real-time and we have to be careful not
to overrun the time allotment, or chaos will result --- that pretty
much precludes anything fancy like unlimited-precision arithmetic.


What do you mean by "abstraction capabilities?"
>> And I've given some thought >> as to how to make a Forth with dynamic typing and garbage collection -- >> the things which (to my mind) make working in Lisp even more enjoyable >> -- and they simply don't fit. orth's basic capabilities are tied quite >> closely to its simple memory model; introducing dynamic types and >> garbage collection complicates the memory model so much that you've >> basically lost the Forthness. ou'd have been better off starting with >> PostScript.

I've only written one PostScript program in my life, and I learned
PostScript specifically for this (and I stopped reading as soon as I
learned enough for my program). This was the slide-rule program; the
PostScript was generated by a Forth program. I originally wrote the
program to generate gcode for a CNC milling machine (and I still think
this is reasonable for onesy-twosy manufacture), but then I found out
(from Don Lancaster) about photo-lithography, that can be used for
mass-production, so I upgraded to gener

Re: ordered associative arrays

Postby Albert van der Horst » Fri, 26 Nov 2010 13:38:31 GMT

In article < XXXX@XXXXX.COM >,



<SNIP>


If by stack manipulation words you mean ``DUP DROP SWAP ROLL ''
they are called `` dup pop exch roll '' in PostScript.
[Not many people will read this, but I couldn't let this pass. ]

Groetjes Albert


Re: ordered associative arrays

Postby seeWebInstead » Fri, 26 Nov 2010 19:58:09 GMT

> From:  XXXX@XXXXX.COM  (Mark Wooding)

Pocket Forth is a free Forth interactive-interpretor that runs fine
on my Macintosh "Performa 600" (68030-CPU) System 7.5.5.

Is there a free Postscript interactive-interpretor that runs on the
same machine+system? If not, then Postscript isn't even an option
for me here.

Re: ordered associative arrays

Postby seeWebInstead » Fri, 26 Nov 2010 20:20:42 GMT

> From: Albert van der Horst < XXXX@XXXXX.COM >

What's the URL of your demo so that I can try it?


If I'm just clicking on a URL to your demo, none of that matter.
If I'm going to copy your demo to my FreeBSD account and try to get
it running here, ugh, pain, but maybe worth it.


No I can't. I have no access to any linux that has any way to
connect to the net to download lina. (I have a RedHat Linux laptop
whose modem stopped working several years ago, and there's nobody
in Sunnyvale who knows Linux who is willing to help me diagnose the
modem problem, where it refuses to go on-hook because it says it's
already on-hook, and I don't know whether it's a lock file that is
wedged or a bug in the OS or the modem itself burned out.)


Trying it now:
Forth: nothing appropriate
forth: nothing appropriate
cmucl(1)                 - CMU Common Lisp
lisp(1)                  - CMU Common Lisp programming environment
python(1)                - an interpreted, interactive, object-oriented programming language
python: /usr/local/bin/python /usr/local/man/man1/python.1.gz /usr/ports/lang/python

Re: ordered associative arrays

Postby Andrew Haley » Fri, 26 Nov 2010 20:53:04 GMT



I don't think that's actually true: as an example of how you might do
it, HP's System RPL is pretty close.  There is a paper in the Journal
of Forth Application And Research from 1988 that explains the
low-level details of how it was done.  It's very Forth.

Garbage collection should be pretty easy, though.  A conservative GC
can scan the stack and the dictionary, and this is pretty
straightforward to implement.

Andrew.

Re: ordered associative arrays

Postby Doug Hoffman » Fri, 26 Nov 2010 21:44:19 GMT



Mops is another free Forth that will run fine on your Mac.  Mops has a 
very complete dynamic OOP system built in (including multiple 
inheritance).  It may also have garbage collection (the later version 
for OS X definitely has gc, can't remember if the 680x0 version does).

-Doug

Re: ordered associative arrays

Postby mdw » Sat, 27 Nov 2010 05:39:17 GMT

ugh Aguilar < XXXX@XXXXX.COM > writes:


An abstraction mechanism lets you capture the essence of a `pattern',
capture its essence, and then invoke it, explaining only what's unusual
about this particular instance. A function is a common kind of
abstraction, but functions can only abstract over patterns of
computations of values. Making more kinds of things be first-class
values extends the reach of functional abstraction, but there are
limits: e.g., functional arguments usually have to be decorated in some
way to make it clear that their contents aren't to be evaluated
immediately, and the weight of this notation can become prohibitive.

Forth's words are an obvious example of an abstraction mechanism.
`Ordinary' words (I forget the Forth technical term), which just append
execution semantics to a definition in progress, are Forth's analogue to
conventional functions (more or less). You can get a fair way with
these kinds of words, but the limitations would become apparent to
advanced users, at least. But Forth doesn't stop there: immediate words
which combine compile- and execution-time semantics can be used to
produce very powerful effects, and build convenient new languages out of
very primitive parts. (You know all this. I'm writing it anyway, (a)
for the Lisp users' benefit, and (b) as a familiar example of a powerful
abstraction tool.)

Forth, as is used in real life, itself is (or at least can be) such a
language, built from some very simple memory-management tools: I think I
really started to understand where Forth was coming from by reading the
source code to a simple Forth implementation written in i386 assembler.
It seemed to me at the time that the character of Forth arises as a
combination of the powerful abstraction of immediate words, the openness
of the compilation and execution processes, and the simplicity of the
communication mechanisms (the various stacks and the ALLOT memory
space), all of which fit together beautifully to produce an extremely
capable language from almost nothing at all.


I mentioned it as a (presumably) familiar example of a stack-based
concatenative language with a Lisp-like memory model, rather than as a
glowing recommendation; I failed to make this clear, so I'm sorry about
that. My point was really that such things are rather alien to the way
Forth constructs definitions and similar, and that these mechanisms are
fundamental to Forth's character -- so much so that to change them means
that you don't really have Forth any more.


Others have also mentioned Factor, which looks like a better attempt.
I've started reading the documentation, but don't have anything coherent
to say about it yet except to thank you and others for the pointer.


This looks to be in line with my speculation above.

It's probably not clear: I /want/ to be wrong. I'd dearly like to be
able to capture the Forthness of Forth in a richer world of dynamic
objects. And it may be that Factor is the right answer; it certainly
seems like it's worth a look. My own deliberations reached the
conclusion that the two were incompatible in a fairly fundamental way;
maybe I was mistaken, and maybe Factor is the solution I was looking
for.

But I also like Common Lisp, and I certainly wouldn't wish to discourage
anyone from trying or using it.

-- [mdw]

Re: ordered associative arrays

Postby mdw » Sat, 27 Nov 2010 05:39:17 GMT

ugh Aguilar < XXXX@XXXXX.COM > writes:


An abstraction mechanism lets you capture the essence of a `pattern',
capture its essence, and then invoke it, explaining only what's unusual
about this particular instance. A function is a common kind of
abstraction, but functions can only abstract over patterns of
computations of values. Making more kinds of things be first-class
values extends the reach of functional abstraction, but there are
limits: e.g., functional arguments usually have to be decorated in some
way to make it clear that their contents aren't to be evaluated
immediately, and the weight of this notation can become prohibitive.

Forth's words are an obvious example of an abstraction mechanism.
`Ordinary' words (I forget the Forth technical term), which just append
execution semantics to a definition in progress, are Forth's analogue to
conventional functions (more or less). You can get a fair way with
these kinds of words, but the limitations would become apparent to
advanced users, at least. But Forth doesn't stop there: immediate words
which combine compile- and execution-time semantics can be used to
produce very powerful effects, and build convenient new languages out of
very primitive parts. (You know all this. I'm writing it anyway, (a)
for the Lisp users' benefit, and (b) as a familiar example of a powerful
abstraction tool.)

Forth, as is used in real life, itself is (or at least can be) such a
language, built from some very simple memory-management tools: I think I
really started to understand where Forth was coming from by reading the
source code to a simple Forth implementation written in i386 assembler.
It seemed to me at the time that the character of Forth arises as a
combination of the powerful abstraction of immediate words, the openness
of the compilation and execution processes, and the simplicity of the
communication mechanisms (the various stacks and the ALLOT memory
space), all of which fit together beautifully to produce an extremely
capable language from almost nothing at all.


I mentioned it as a (presumably) familiar example of a stack-based
concatenative language with a Lisp-like memory model, rather than as a
glowing recommendation; I failed to make this clear, so I'm sorry about
that. My point was really that such things are rather alien to the way
Forth constructs definitions and similar, and that these mechanisms are
fundamental to Forth's character -- so much so that to change them means
that you don't really have Forth any more.


Others have also mentioned Factor, which looks like a better attempt.
I've started reading the documentation, but don't have anything coherent
to say about it yet except to thank you and others for the pointer.


This looks to be in line with my speculation above.

It's probably not clear: I /want/ to be wrong. I'd dearly like to be
able to capture the Forthness of Forth in a richer world of dynamic
objects. And it may be that Factor is the right answer; it certainly
seems like it's worth a look. My own deliberations reached the
conclusion that the two were incompatible in a fairly fundamental way;
maybe I was mistaken, and maybe Factor is the solution I was looking
for.

But I also like Common Lisp, and I certainly wouldn't wish to discourage
anyone from trying or using it.

-- [mdw]

Similar Threads:

1.Whining (was ordered associative arrays)

2.ordered associative arrays

> From: Hugh Aguilar < XXXX@XXXXX.COM >
> To a large extent, I'm interested in learning CL because it
> offers flexibility. Unlike Python in which there is *one* way to
> do anything, CL offers plenty of choices.

Then perhaps you'd be interested in my initiatives:
- Hierarchial&mixin intentional types and their specific implementations.
- No-syntax software development via a menu+keyword-driven IDE.

> Also, unlike in Forth, I won't have to write everything myself
> from scratch --- I can do something useful, like write an
> application, rather than getting bogged down in implementing
> things like LLRB trees before I can even start on the application.

You might also like PHP for similar reasons. Associative arrays are
a built-in part of the language. Using them, it's almost trivial to
implement a wide variety of intentional datatypes.

> Yes. When I say "region" that is what you're describing as a
> "segment."

Thanks for clarifying by confirming my guess.

> The code (association.4th) is actually pretty short and simple.
> If you know any Forth at all, you could figure it out quickly.

I don't believe that. All I know about forth is:
- Each word in sequence performs an immediate action on the stack,
   except some words start special modes such as defining new words
   or terminating such a definition.
- swap and dup.
- Literal numeric constants self-push on the stack.
- Arithmetic operators combine the top two items on the stack and
   push the result in their place.
I know there are about a hundred words defined in Pocket Forth, the
only version of Forth available to me here, but I don't remember
any of them specifically, except as listed above. I don't even know
how to define a function.

I wish I knew somebody who could tell me how to use MacPPP from Forth.

> No, I don't want to sort by some order-relation (meaning to sort
> by some data in the payload) --- I want the association to be
> *always* sorted by the key.

So you want the order-relation (i.e. the sorting predicate) to be
implicit in the datatype, like what Java terms the "natural" order?

There are actually two ways to do it:
- Have a generic container, but the keys are in a class that has
   the order relation. All the keys in any given container must be
   in the same class, or the container can't auto-sort.
- Have an order-relation specific to each class of container, thus
   the same exact keys in two different classes of container would
   auto-sort differently.

> When two associations are merged together, they necessarily
> result in a new association that is also sorted by the key.

Only if the order-relations for the two classes (of keys, or of
containers) are the same, i.e. the keys are of the same class (with
its natural ordering), or the containers are of the same class
(ditto). If you try to merge two generic containers that happen to
contain keys from different classes, or if you try to merge two
order-specific containers from different-ordered classes, it isn't
going to work.

> In my application, the key will be a floating point number.

Why? Floating-point numbers are supposed to be mere approximations,
shots-in-dark, not exact values that have well-defined order. If
you compute exactly the same mathematical value using two different
values, roundoff error could resut in different floats, thus the
same mathematical value would be less than itsef. Contrarywise,
with fixed-precision floats, there are only a finite number of
different floats, hence with an infinite number of possible
mathematical values at least two different values will float the
same.

> The association will be just like an array, except that the
> indices don't have to be integers.

"White man speak with forked tongue."
What you said there is meaningless.
"Just like an array" implies:
- Whenever you insert a new element, or delete an old element, all
   entries after that point must be *moved* forward or backward to
   make room for the new element or fill the gap from the old
   element respectively.
- Indexes of an array are a contiguous segment of number space.
   Whenever you insert/delete you must change the keys of each of
   the elements that were moved to get out of the way of the new
   element or fill the void from the old element.
From what else you write, I'm quite sure you do *not* want that behaviour.

If you said "just like a sparse array", that would make more sense.
Sparse arrays are typically implemented as self-balancing search
trees, either binary (with the usual need to modify a log(n) path
each time an element is inserted or deleted), or ternary+ (with the
need to skip void spots when traversing, and when you over-fill a
node you just need to split into two not-full nodes under a third
node, never a log(n) re-organization needed).

> In my slide-rule program, I was working with data sets containing
> over 15,000 items.

If you have that many items in a single sorted-container, and you
want insert/delete/merge/copySegment to be efficient, then you need
some kind of self-balancing search-tree, rather than any of the
container types built into Common Lisp.

> I don't want to have to run a SORT function on the data structure
> *every* time that I insert a new item --- I want it to maintain
> itself sorted at all times.

Hence: Self-balancing search tree.

> > By the way, a few years ago I implemented my own SBBST, which
> > balances on the basis of total number of nodes rather than maximum
> > depth of each sub-tree, and allows you to find the nth element in
> > log(n) time which some of the other SBBSTs don't allow.

> I would be interested in seeing your code.

I haven't been paid for my work. I don't have the money to file a
patent on it. The only way I would show the code to anyone is if
he/she meets me in person and signs a non-disclosure agreement. But
if you think of what I already said, that it's a self-balancing
binary-search-tree, based on balancing the number of nodes in each
sub-tree rather than the depth of each sub-tree, there are only a
few additional design decisions needed before the code will be
obvious to any competant software designer. The interesting case is
when one sub-tree is completely empty, under what condition do you
rebalance and otherwise leave as-is slightly unbalanced.

> I find binary trees to be pretty fascinating. I get criticized
> for this a lot because *everybody* knows that hash tables are
> better, but I just *like* binary trees --- sue me!

Hash tables aren't better at all!!!!! Hash tables are *different*.
They do *not* in any way guarantee fast per-insert time. If they
are mis-designed, they might not even guarantee fast amortized
insertion time. And in particular they don't in any way provide a
built-in order relation that can be maintained efficiently, nor a
way to extract segments efficiently.

This question just occurred to me: How long does it take to merge
two self-balancing binary-search-trees (assuming they use the same
order relation)? Is it o(N) where N is the total number of
elements? Or is it o(N * log(N))?

> You might be interested in my symtab package, which uses an
> algorithm I invented myself.

Just curious: Does it rebalance when the sub-tree node-counts are
too lopsided, or when the sub-tree longest-paths are too lopsided?



Return to forth

 

Who is online

Users browsing this forum: No registered users and 30 guest