RfD: String comparison words version 0

Re: RfD: String comparison words version 0

Postby Bernd Paysan » Mon, 08 Nov 2010 06:27:56 GMT



Once you go really beyond ASCII, case insensitive compare becomes a 
nightmare.  In German, you have to treat ß=ss, and probably ä=ae, ö=oe, 
ü=ue; not that much.  In Chinese, you probably have to treat traditional 
and simplified Chinese equal, since that's just two ways of writing the 
same character (pre-unicode Chinese encodings "solve" the problem by 
being either simplified or traditional, and never both).

Furthermore: since Unicode allows several ways to write accented 
letters, you may have to map them all together (e.g. accent+e vs. é).
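
As an illustration of the point about accented letters (a Python sketch, not part of the original post): the precomposed and the combining-accent spellings of the same letter are different code point sequences, so a plain comparison fails until both are mapped to one normalization form. The helper name `canonical_equal` is made up for this example; `unicodedata.normalize` and `str.casefold` are standard library calls.

```python
import unicodedata

composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def canonical_equal(a: str, b: str) -> bool:
    """Compare two strings after mapping both to normalization form NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw code point sequences differ ...
print(composed == decomposed)
# ... but they are the same text once both are normalized.
print(canonical_equal(composed, decomposed))

# Full Unicode case folding also covers the German case mentioned above:
# the sharp s folds to "ss".
print("stra\u00dfe".casefold() == "STRASSE".casefold())
```

None of this is needed for the ASCII-only folding discussed in the rest of the thread; it only shows how much extra machinery the general case needs.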

However, the reason why we have case-insensitive Forth systems is that 
Forth originally had all words in capital letters, and nobody today 
wants to shout all the time when programming.  So people now write 
mostly lower case, and the case insensitivity is for backward 
compatibility only.  The character set that matters here is ASCII only.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
 http://www.**--****.com/ ~paysan/

Re: RfD: String comparison words version 0

Postby BruceMcF » Mon, 08 Nov 2010 07:22:43 GMT



And converting all upper case ASCII dictionary entries to all lower
case, and all upper case ASCII tokens to all lower case, and using
case sensitive compare, also works. Number conversion for BASE above 10
must then accept both cases, and BASE is limited to a maximum of 36, but
I don't like extending BASE into the punctuation range anyway.

Indeed, with the fig approach of storing the token for comparison at the
place where it will end up if it is compiled into the dictionary, you can
test the copy for upper case ASCII and fix up only those tokens that need
it before starting the search; the fix-up for the dictionary entry is
then already done.

You could equally well convert to upper, but then WORDS ends up
shouting at you.
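
A minimal sketch of that scheme, in Python rather than Forth and purely for illustration: names are folded to lower case once when they are defined, each incoming token is folded once before the search, and the search itself is an ordinary case-sensitive lookup. The `Dictionary` class and `ascii_lower` helper are hypothetical stand-ins, not any Forth system's actual data structures.

```python
def ascii_lower(token: bytes) -> bytes:
    """Lower-case only the ASCII letters A-Z; every other byte passes through."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in token)


class Dictionary:
    """Toy stand-in for a Forth dictionary whose names are stored lower-cased."""

    def __init__(self):
        self.words = {}

    def define(self, name: bytes, xt):
        # The fix-up happens once, at definition time.
        self.words[ascii_lower(name)] = xt

    def find(self, token: bytes):
        # Fold the token once, then do a plain case-sensitive lookup.
        return self.words.get(ascii_lower(token))


d = Dictionary()
d.define(b"DUP", "xt-dup")
print(d.find(b"dup"), d.find(b"Dup"), d.find(b"drop"))
print(sorted(d.words))   # the stored names are lower case, so a listing does not shout
```

Because only bytes in the range A-Z are touched, bytes above 127 are left alone, which matches the ASCII-only scope argued for earlier in the thread.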

Re: RfD: String comparison words version 0

Postby Doug Hoffman » Mon, 08 Nov 2010 21:34:06 GMT



...

Excellent.

Should a case insensitive SEARCH be included in this discussion?  I 
would think so.

-Doug

Re: RfD: String comparison words version 0

Postby Andrew Haley » Mon, 08 Nov 2010 22:36:46 GMT



I don't think this argument is really valid: You can define a
case-insensitive COMPARE in terms of a word that first upcases
everything and then calls COMPARE.  As a factor, this would be more
useful than ISTR= , and for that reason IMO would be a much better
candidate for standardization.

Andrew.

Re: RfD: String comparison words version 0

Postby Coos Haak » Tue, 09 Nov 2010 01:12:06 GMT

On Sun, 07 Nov 2010 07:36:46 -0600, Andrew Haley wrote:




I don't like the side effect of words that change the case of letters under
my nose. In my COMPARE-CI and SEARCH-CI the letters are upcased in the CPU,
not in memory. That may be slower, but it does not change the case of words
in the dictionary.
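
A rough sketch of the "upcase in the CPU, not in memory" idea, in Python for illustration only: each byte is folded as it is fetched, so neither operand is modified and no copy is made. The names `fold`, `compare_ci_nocopy` and `search_ci` are invented for this example and are not CHForth's actual definitions; the -1/0/1 result convention is assumed to follow the standard COMPARE.

```python
def fold(b: int) -> int:
    """Upper-case a single byte if it is an ASCII letter a-z."""
    return b - 32 if 0x61 <= b <= 0x7A else b


def compare_ci_nocopy(a, b) -> int:
    """Three-way compare (-1, 0, 1) that folds bytes on the fly."""
    for x, y in zip(a, b):
        fx, fy = fold(x), fold(y)
        if fx != fy:
            return -1 if fx < fy else 1
    # Equal up to the shorter length: the shorter string sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))


def search_ci(haystack, needle) -> int:
    """Index of the first case-insensitive match, or -1 if there is none."""
    n = len(needle)
    for i in range(len(haystack) - n + 1):
        if all(fold(haystack[i + j]) == fold(needle[j]) for j in range(n)):
            return i
    return -1


name = bytearray(b"Dup")              # mutable, like a name field in memory
print(compare_ci_nocopy(name, b"DUP"))
print(name)                           # still spelled as it was stored
print(search_ci(b"Hello World", b"WORLD"))
```

The cost is one fold per byte fetched on every comparison, which is the speed trade-off mentioned above.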

-- 
Coos

CHForth, 16 bit DOS applications
 http://www.**--****.com/ 

Re: RfD: String comparison words version 0

Postby Andrew Haley » Tue, 09 Nov 2010 01:31:57 GMT




Of course, I'm not suggesting that a case-insensitive COMPARE should
actually change the strings: you just define it as if it were done
that way, on copies of the strings.  
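
One way to picture that definition, sketched in Python for illustration: the case-insensitive comparison is specified as the ordinary comparison applied to upcased copies, with the upcasing step available as a separate factor. The function names are hypothetical, and `compare` here only mimics the -1/0/1 result convention of the standard COMPARE.

```python
def compare(a: bytes, b: bytes) -> int:
    """Three-way byte-wise compare in the style of COMPARE: -1, 0 or 1."""
    return (a > b) - (a < b)


def upcase(s: bytes) -> bytes:
    """Return an upper-cased copy (ASCII a-z only); the argument is untouched."""
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in s)


def compare_ci(a: bytes, b: bytes) -> int:
    """Case-insensitive compare defined as COMPARE on upcased copies."""
    return compare(upcase(a), upcase(b))


print(compare_ci(b"Hello", b"HELLO"))
print(compare_ci(b"apple", b"Banana"))
print(compare(b"apple", b"Banana"))   # the case-sensitive result differs
```

An implementation is free to get the same results without making copies; the definition only fixes what the answer must be.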

Andrew.

Re: RfD: String comparison words version 0

Postby Alex McDonald » Tue, 09 Nov 2010 02:45:08 GMT


<snipped>


I'm going to make substantial changes to the RfD, in particular to
extend it out to include some of the options discussed by others here,
and to give a longer description of the drawbacks of using COMPARE as
a sole source for other words. It will take a week or so due to other
commitments, but I'd be happy to get some more information on various
Forth implementations of string comparison words and the names used,
any experiences and other relevant comments.

Re: RfD: String comparison words version 0

Postby Alex McDonald » Tue, 09 Nov 2010 07:03:08 GMT




[heavily snipped]


I'd be interested in any research that supports this assertion.



It was an attempt at humour. It failed.


What evidence supports this assertion?


Perhaps you misunderstand what I am trying to propose here. I will
clarify the RfD.


I was looking for Forths that exposed either ISTR= or an equivalent
word. I have MPE VFX Forth, gForth, Win32Forth, SP-Forth; are there
others? pForth only supports CMOVE CMOVE> and COMPARE from the string
wordset.


DragonForth has seen no development since 2004-01-20. SP-Forth
supports COMPARE-U. Future support for UCOMPARE doesn't make any
sense; what am I to do with that information?




Re: RfD: String comparison words version 0

Postby Albert van der Horst » Tue, 09 Nov 2010 09:38:32 GMT





My base word is CORA : compare area.

It compares ADDRES1 and ADDRES2 over N address units (bytes).
That looks more like a basic building block.
It returns the difference between the first differing bytes, or zero
if the two areas are equal.
(And of course it is a single instruction with REP prefix on the
Pentium.)
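
To make the building-block idea concrete, here is a small Python sketch of a CORA-like primitive and a string comparison layered on top of it. This is an illustration only, not Albert's code: the sign convention (first area's byte minus the second's) and the way lengths are handled in `compare` are assumptions.

```python
def cora(area1, area2, n: int) -> int:
    """Compare n bytes of two areas.

    Returns the difference of the first differing pair of bytes,
    or 0 if the first n bytes are identical.
    """
    for i in range(n):
        diff = area1[i] - area2[i]
        if diff != 0:
            return diff
    return 0


def compare(a, b) -> int:
    """A COMPARE-style result (-1, 0, 1) built on the area comparison."""
    diff = cora(a, b, min(len(a), len(b)))
    if diff == 0:
        # Common prefix is identical: the shorter string sorts first.
        diff = len(a) - len(b)
    return (diff > 0) - (diff < 0)


print(cora(b"abc", b"abd", 3))
print(cora(b"abc", b"abd", 2))
print(compare(b"abc", b"abcd"))
```

The primitive knows nothing about string lengths or case; those concerns stay in the words built on top of it.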

Greetings, Albert


