... I strongly agree. Cheers, Elizabeth -- ================================================== Elizabeth D. Rather (US & Canada) 800-55-FORTH FORTH Inc. +1 310.999.6784 5959 West Century Blvd. Suite 700 Los Angeles, CA 90045 http://www.**--****.com/ "Forth-based products and Services for real-time applications since 1973." ==================================================
Not with my grep:

  ~/proj/gforth> grep '$=' $(find . -name '*.fs')
  ~/proj/gforth>

Maybe there was some code in regexp.fs which had a word $= at some time, but lost it in the meantime.

-- Bernd Paysan "If you want it done right, you have to do it yourself!" http://www.**--****.com/ ~paysan/
On Tue, 23 Nov 2010 16:20:37 +0100, Bernd Paysan wrote:

Version 0.7.0 has it in regexp.fs: once as its definition, and once where it is used in ,=" . Replacing it with `tuck compare' is no big deal.

-- Coos CHForth, 16 bit DOS applications http://www.**--****.com/
In article < XXXX@XXXXX.COM >, <SNIP>

In the library of ciforth there is $= ( adr len adr' len' - flag). It short-circuits if the lengths are unequal, then does a byte compare (CORA) over that length at both addresses.

Greetings, Albert
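The behaviour described above can be sketched in ANS Forth roughly as follows. This is only an illustration under assumptions: BYTES= is a made-up name, and the sketch uses the standard COMPARE for the equal-length case instead of ciforth's CORA, so it is not the ciforth library source.

```forth
\ Sketch of an equality test with a length shortcut.
\ BYTES= is a hypothetical name, not the ciforth word $= itself.
: BYTES= ( c-addr1 u1 c-addr2 u2 -- flag )
   ROT OVER <> IF             \ lengths differ: no character is fetched
      DROP 2DROP FALSE EXIT
   THEN                       ( c-addr1 c-addr2 u )
   TUCK COMPARE 0= ;          \ equal lengths: compare u characters
```

With unequal lengths the word returns FALSE without touching either address, which is where the saving over a bare COMPARE 0= would come from.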
Any COMPARE that does so is nonstandard. The specification of COMPARE is straightforward, and does not permit such things. Andrew.
Andrew Haley < XXXX@XXXXX.COM > writes:

Demonstrate it, because if one follows the wording of the (draft) standard, your understanding doesn't follow.

-- HE CE3OH...
Well, let's see. "The strings are compared, beginning at the given addresses, character by character, up to the length of the shorter string or until a difference is found." So, each character in s1 is compared with the corresponding character in s2 until a difference is found. There is no permission, for example, to treat a pair of characters in s1 as being the same as a single character in s2. A character type is a numeric type (A.3.1.2), so any value from 0..255 may be put into a string and COMPAREd. There is no restriction to printable ASCII. Therefore, there is no way to interpret this text as anything other than comparing two arrays of bytes, byte-for-byte. (Where a byte is the size of a char, not necessarily an octet.) Andrew.
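To make that reading concrete, here is a sketch of a character-by-character comparison written in ANS Forth. REF-COMPARE is a hypothetical name used for illustration only; it is not taken from any system's source, and it assumes characters are compared as unsigned values, as C@ delivers them.

```forth
\ Reference-style sketch: compare char by char up to the shorter length,
\ stop at the first difference, otherwise let the lengths decide.
: REF-COMPARE ( c-addr1 u1 c-addr2 u2 -- n )
   ROT 2DUP 2>R MIN                 ( c-addr1 c-addr2 umin ) ( R: u2 u1 )
   0 ?DO                            ( c-addr1 c-addr2 )
      OVER I CHARS + C@             ( c-addr1 c-addr2 char1 )
      OVER I CHARS + C@             ( c-addr1 c-addr2 char1 char2 )
      2DUP <> IF
         < IF -1 ELSE 1 THEN        ( c-addr1 c-addr2 n )
         NIP NIP
         UNLOOP 2R> 2DROP EXIT      \ discard the saved lengths
      THEN
      2DROP
   LOOP
   2DROP 2R> SWAP                   ( u1 u2 )
   2DUP U< IF  2DROP -1 EXIT  THEN  \ string 1 is a proper prefix of string 2
   U> IF 1 ELSE 0 THEN ;
```

Nothing in the loop looks at what a character "means"; values such as 0 or 255 are compared like any other.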
Remember that standards specify *minimum* requirements. Any COMPARE has to work according to this description in order to be standard. I think it would be perfectly *legal* for a COMPARE to do such things as Alexei suggests, but any program that depends on that enhanced behavior would obviously have a dependency on it.

Cheers, Elizabeth
I would not expect aeroplane to match oplane without a locale, and the standard contains no such imprecation.
>>>> Andrew.
>> Remember that standards specify *minimum* requirements. Any COMPARE has
>> to work according to this description in order to be standard. I think
>> it would be perfectly *legal* for a COMPARE to do such things as Alexei
>> suggests, but any program that depends on that enhanced behavior would
>> obviously have a dependency on it.

Yes, and it seems the minimum requirement is that if the character ~ .... given that: char >> +n >> n >> x, and char >> +n >> u >> x .... ~ is to be compared to the length of the shorter or "until a difference is found", it seems like c1 not different from c2 implies that as signed numbers n1 not different from n2 and as unsigned numbers u1 not different from u2, since c1 is a refinement of n1 or u2, not the other way around.

Generically ICOMPARE would be a thing on a string of xchars, and generically the ISTR= benefit of testing the length first may well go away, since the length of a string of xchars is the number of pchars in the string, not the number of xchars in the string.
I accept the principle, but I do not believe that the specification for COMPARE has sufficient wiggle room to allow it in this case. A character can be anything between 0..255, as the standard says. Andrew.
1.RfD: String comparison words version 1 (was version 0)
2.RfD: String comparison words version 0
RfD: String comparison words

Change history
2010-11-05 Initial proposal, incomplete

Problem
-------

Although ANS Forth provides COMPARE for string comparisons, it has two attributes that make it problematic: it is case sensitive, and it is expensive to execute due to the range of return values.

Extend the functionality of COMPARE to provide comparisons that are case-insensitive, and comparisons that test for only equality.

STR= ( c-addr1 u1 c-addr2 u2 -- n ) STRING-EXT

Compare the string specified by c-addr1 u1 to the string specified by c-addr2 u2. c-addr1 and c-addr2 point at read-only areas, which must not be modified. If the strings are of different lengths (u1 is not equal to u2), n is zero (0). Otherwise the strings are compared, beginning at the given addresses, character by character, up to the equal length of the strings or until a difference is found. Characters are considered identical if they have the same numeric value. If the two strings are identical, n is zero.

ISTR= ( c-addr1 u1 c-addr2 u2 -- n ) STRING-EXT

Compare the string specified by c-addr1 u1 to the string specified by c-addr2 u2. c-addr1 and c-addr2 point at read-only areas, which must not be modified. If the strings are of different lengths, n is zero (0). If both strings are null (u1 and u2 are both zero), n is one (1). Otherwise the strings are fetched, beginning at the given addresses, character by character. Characters are considered identical if they have the same numeric value, or if the characters fall between ASCII values 'A' thru 'Z' they are considered to be identical to the corresponding character values in the range 'a' thru 'z'. If the two strings are identical, n is one (1); otherwise n is zero (0).

Remarks
-------

Why standardize these words? They can be defined in ANS Forth;

  : STR= COMPARE 0= ;
  : ISTR= <definition required> ;

For the following reasons:

* Many systems define STR= or ISTR= (or the equivalent with another name)
* Case insensitive Forths require such a word to search the dictionary in a case-insensitive manner, and many expose these words or their equivalents to the user.
* The commonest use of COMPARE is in the form COMPARE 0=.
* They will be implemented more efficiently on many systems. Although string manipulation and handling is not employed extensively, text processing applications benefit significantly.

Why no case-insensitive COMPARE?

While 'a' and 'A' can be considered equal, it is problematic to assign a meaning to a comparison of 'a' against 'B' in terms of 'greater than' or 'less than'. Numerically, 'B' (66 decimal) is less than 'a' (97 decimal), but collation sequences are normally defined in terms of case-insensitive tests; 'A' precedes 'ab', which precedes 'B'. This RfD does not attempt to address these issues.

Note that the implementation of STR= and ISTR= does not describe the values of c-addr1 or c-addr2 when u1 <> u2 (unequal length strings), or when u1=u2=0 (null strings). Given that different implementations may address these in their own way, supplying invalid values of c-addr1 and c-addr2 in those cases (those that would cause an error if a single character was fetched from either of those addresses) is an ambiguous condition.

Experience
----------

As a case insensitive Forth, Win32Forth exposes ISTR= , used to search wordlists, as defined here.

<others>
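The draft leaves ISTR= as <definition required>. One way it might be filled in, as a minimal sketch in ANS Forth: ASCII-UPPER is a hypothetical helper name, the folding is ASCII-only as the text describes, and the sketch returns a well-formed flag (TRUE/FALSE) where this draft says 1/0. It is not Win32Forth's implementation.

```forth
\ Hypothetical helper: fold ASCII 'a'..'z' to 'A'..'Z', leave all else alone.
: ASCII-UPPER ( char -- char' )
   DUP [CHAR] a [CHAR] z 1+ WITHIN IF  32 -  THEN ;

\ Sketch of a case-insensitive equality test with a length shortcut.
: ISTR= ( c-addr1 u1 c-addr2 u2 -- flag )
   ROT OVER <> IF  DROP 2DROP FALSE EXIT  THEN   ( c-addr1 c-addr2 u )
   0 ?DO                                         ( c-addr1 c-addr2 )
      OVER I CHARS + C@ ASCII-UPPER
      OVER I CHARS + C@ ASCII-UPPER
      <> IF  2DROP FALSE UNLOOP EXIT  THEN       \ first mismatch ends it
   LOOP
   2DROP TRUE ;                                  \ also reached for null strings
```

Neither string is modified or copied, and for unequal lengths or null strings no character is fetched, which matches the note about invalid addresses in those cases.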
3.RfD: String comparison words version 1
Alex McDonald < XXXX@XXXXX.COM > writes:
> RfD: String comparison words (Draft version 1)
>
> Change history
> 2010-11-05 Initial proposal, incomplete
> 2010-11-22 Expanded Remarks section
>            Expanded Experience section
>            Correction of errors
>
> Problem
> -------
>
> Although ANS Forth provides COMPARE for string comparisons, it has two
> attributes that make it problematic; it is case sensitive, and
> expensive to execute for equality or inequality (the common cases) due
> to the need to complete the comparison to return greater-than or less-
> than return values.
>
> Add to the existing functionality of COMPARE to provide comparisons
> that are case-insensitive and that only test for equality.
>
> STR= ( c-addr1 u1 c-addr2 u2 -- flag ) STRING-EXT

I still don't like standard words being contractions when there's no problem using the full form. "STRING=" is much better by all means except length, and it is only 3 letters longer. You don't know if a programmer wants to use the "STR" prefix for dynamic strings or not.

> Compare the string specified by c-addr1 u1 to the string specified by
> c-addr2 u2. c-addr1 and c-addr2 point at read-only areas, which must
> not be modified. If the strings are of different lengths (u1<>u2),
> flag is FALSE. Otherwise the strings are compared, beginning at the
> given addresses, character by character, up to the equal length of the
> strings or until a difference is found. If the two strings are
> identical, flag is TRUE.

"Identical" or "equal"? "Identical" means that the objects are the same; "equal" means that the objects are allowed to be different and still equal. Look up "object identity" in programming literature.

> ISTR= ( c-addr1 u1 c-addr2 u2 -- flag ) STRING-EXT
>
> Compare the string specified by c-addr1 u1 to the string specified by
> c-addr2 u2. c-addr1 and c-addr2 point at read-only areas, which must
> not be modified. If the strings are of different lengths, flag is
> FALSE. Otherwise the strings are fetched, beginning at the given
> addresses, character by character. Characters are considered to match
> if they have the same numeric value, or, if the characters fall
> between ASCII values 'A' thru 'Z', they are considered to be identical
> to the corresponding character values in the range 'a' thru 'z'.

This is unacceptable since it ignores natural string comparison rules.

> If
> the two strings are identical, flag is TRUE, otherwise FALSE.

Same as above.

> Remarks
> -------
>
> Why standardize these words? They can be defined in ANS Forth, for
> example;
>
> : STR= COMPARE 0= ;
>
> For the following reasons:
>
> Most uses of COMPARE are for string equality or inequality for string
> prefices.
>
> Using COMPARE to test for inequality is inefficient, as strings with
> unequal lengths can immediately be declared unequal; but COMPARE must
> continue to fetch and check characters to determine whether it should
> return greater-than or less-than, even though the result of this
> additional work will be discarded.
>
> For strings of equal length, the overhead is less significant, but the
> result of the comparison must still be adjusted to indicate the
> required result.

These are very weak arguments, since all these cases are eliminated with a primitive peephole optimiser.

> Although string manipulation and handling is not employed extensively,
> text processing applications benefit significantly. Letting the
> compiler optimize uses of COMPARE 0= into a more efficient word is
> possible, but the programmer must employ an expensive COMPARE followed
> by tests to reduce the range of the result on systems that do not
> synthesize more efficient tests for equality.
>
> Case insensitive Forths require words to search the dictionary in a
> case-insensitive manner. These tests and tests for prefixes require
> that the tested argument is either converted to all upper case (or all
> lower case), which generally requires copying the original string to a
> transient area and performing a suitable case translation, followed by
> an expensive COMPARE operation.
>
> Why no case-insensitive COMPARE?
>
> There are a wide variety of case-insensitive words employed by Forths
> for this function; ICOMPARE, COMPARE(NC), UCOMPARE amongst others.
> Standardising such widely varying words would be problematic.

Standardising STRING= STR= S= $= and so on is still problematic, yet you're writing this proposal somehow. Standardising a case insensitive COMPARE would solve the remaining string comparison problems.

> Why no STR<, STR>, STR>= and so on?
>
> The implementation of any test beyond equality requires inspecting all
> the characters for the length of the shortest.

A proper implementation of an equality test for strings requires inspecting all the characters anyway.

> The differentiation
> between greater-than and less-than is trivial for implementations of
> COMPARE to determine, as it is set on meeting the first non-equal
> character, or on exhausting one or other of the strings. All of these
> variants can be efficiently written using COMPARE.
>
> : STR< COMPARE 0 > ;
> : STR> COMPARE 0 < ;
> : STR>= COMPARE 1 < ;
>
> and so on.
>
> The current proposal does not allow the synthesizing of case
> insensitive comparisons due to a lack of appropriate ICOMPARE (or
> COMPARE(NC) etc).

And this is a major drawback of this proposal.

> Experience
> ----------
>
> As a case insensitive Forth, Win32Forth exposes ISTR= , used to search
> wordlists, as defined here, and supplies a STR= not based on COMPARE.
>
> MPE's VFX Forth supplies STR= S= and IS=. S= is a buffer compare with
> the signature ( c-addr1 c-addr2 u -- flag ); IS= is the case
> insensitive equivalent. S= and IS= can be efficiently synthesized from
> STR= and ISTR= respectively;
>
> : S= ( c-addr1 c-addr2 u -- flag ) TUCK STR= ;
> : IS= ( c-addr1 c-addr2 u -- flag ) TUCK ISTR= ;
>
> [ Does VFX Forth provide an equivalent to ISTR=? ]
>
> Gforth supplies STR= STR< and STRING-PREFIX?.
>
> STRING-PREFIX? can be synthesized from STR= ;
>
> : STRING-PREFIX? ( c-addr1 u1 c-addr2 u2 -- flag )
> TUCK 2>R MIN 2R> STR= ;
>
> [ Information on other Forths required here ]

Create it. There are more freely available Forths than Win32Forth and Gforth, and some of them are portable.

> Comments
> --------
>
> The ANS definition of COMPARE does not explicitly declare whether the
> input strings are read-only. Since COMPARE states that characters are
> "compared", the assumption is that they are read-only since no
> reasonable implementation needs to employ a destructive test.

This gives grounds to review the practice and amend the standard to require non-destructive comparison.

> With
> case-sensitive string comparisons, this RfD makes it clear that they
> are read-only, as implementors might be tempted to lower- or upper-
> case one or both of the strings prior to comparison.
>
> Note that the implementations do not assign a meaning to the values of
> c-addr1 or c-addr2 when u1<>u2 (unequal length strings), or when
> u1=u2=0 (null strings which always return TRUE). Given that different
> implementations may address these in their own way, supplying invalid
> values of c-addr1 and c-addr2 in those cases (those that would cause
> an error if a single character was fetched from either of those
> addresses) is an ambiguous condition.
>
> Case-insensitivity only considers ASCII 'A' thru 'Z' to be equal to
> the corresponding ASCII characters 'a' thru 'z'. No other characters
> outside that range are considered equal.

This is definitely wrong. If your words are not useful for anything except the internal problems of your implementation and your programs, they should not be standardised at all, let alone take useful names.

-- HE CE3OH...
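A side note on the ordering words quoted above. ANS COMPARE returns -1 when the first string is less than the second, 0 when they are equal and 1 otherwise, so read against that definition the quoted STR< (COMPARE 0 >) appears to test for "greater", and likewise for the other two. A sketch with the sense of each test spelled out; the STRING< style names are hypothetical, chosen here so as not to clash with Gforth's existing STR< :

```forth
\ Ordering tests synthesized from COMPARE ( -1 less, 0 equal, 1 greater ).
\ The names are illustrative only; they are not part of the RfD.
: STRING<  ( c-addr1 u1 c-addr2 u2 -- flag )  COMPARE 0< ;
: STRING>  ( c-addr1 u1 c-addr2 u2 -- flag )  COMPARE 0> ;
: STRING>= ( c-addr1 u1 c-addr2 u2 -- flag )  COMPARE 0< 0= ;
```

Here a shorter string that is a prefix of a longer one sorts first, so s" ab" s" abc" STRING< is true.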