World: r3wp

[Parse] Discussion of PARSE dialect

Ladislav
1-Nov-2011
[5917]
Using a Hash! contributed a lot to Ladislav's speed -- when I tried 
it as a Block! it was only slightly faster than Geomol's.....What 
a pity R3 removes hash!
 - no problem, in R3 you can use map!
Sunanda
1-Nov-2011
[5918]
That's true, but map! is a bit awkward for just looking up an item 
in a list.....Map! is optimised for retrieving a value associated 
with a key.
Ladislav
1-Nov-2011
[5919x4]
as follows: 

entities-map: make map! []
foreach entity entities-block [entities-map/:entity: true]
so, keys are the entities, and the value is either true (for an entity) 
or none
I think it is OK that way
Another solution is to use a sorted block and a binary search, which 
should be about the same speed as hash
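
The two lookup approaches mentioned above can be sketched as follows. This is a minimal illustration in Python rather than REBOL; the entity names and the helper names `is_entity` and `is_entity_sorted` are hypothetical, and no claim is made here about the relative speed of hash!, map! or a sorted block in REBOL.

```python
from bisect import bisect_left

# Hypothetical entity names, for illustration only.
entities = ["quot", "amp", "lt", "gt"]

# Approach 1: a key/value map used as a set.
# Every entity is a key whose value is True; a missing key yields None,
# mirroring "the value is either true (for an entity) or none".
entity_map = {name: True for name in entities}

def is_entity(name):
    """Membership test via the map."""
    return entity_map.get(name) is True

# Approach 2: a sorted list searched with binary search.
sorted_entities = sorted(entities)

def is_entity_sorted(name):
    """Membership test via binary search on the sorted list."""
    i = bisect_left(sorted_entities, name)
    return i < len(sorted_entities) and sorted_entities[i] == name
```

Both functions answer the same yes/no question; the map needs one stored value per key, while the sorted list needs the data to be kept in order.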
Sunanda
1-Nov-2011
[5923]
Yes, it is doable with map! -- but, as I said, awkward.


Another issue (or perhaps just unfixed bug) is the lack of case sensitivity 
with map!
    select/case make map! ["A" true] "a"
    == true

The current work-around is to use binary rather than string data:
    select make map! reduce [to-binary "A" true] to-binary "a"
    == none
Ladislav
1-Nov-2011
[5924x3]
yes, right, that is an issue
BTW, I think there is a possible optimization that does not use the 
charset you mention
Are you still interested?
Sunanda
1-Nov-2011
[5927]
Yes please!
Ladislav
14-Nov-2011
[5928x4]
Sorry for not continuing with it, Sunanda, but when I gave it a second 
thought, it did not look like a possible speed-up could be worth 
the source code complication.
Another Parse discussion subject:


It looked to me like a good idea to be able, in one Parse pass, to 
sometimes match some strings in a case-sensitive way and other strings 
in a case-insensitive way. This is not possible using the /CASE refinement, 
since the refinement makes all comparisons case sensitive, or, if not 
used, all comparisons are case insensitive. Wouldn't it be good to 
be able to adjust the comparison sensitivity on-the-fly during parsing?
I think that it should not be overly complicated to achieve the 
goal, e.g. by using a CASE keyword in PARSE.
(for switching to case-sensitive mode, and e.g. a NO-CASE for switching 
to case-insensitive mode)
BrianH
14-Nov-2011
[5932x4]
How about a CASE operation that applies to the next rule, which could 
be a block? No NO-CASE operation required, and it would integrate better 
with backtracking.
It would be a modifier, like OPT or 1.
While we're at it, the KEEP operation from Topaz would be useful. 
I use PARSE wrapped in COLLECT, calling KEEP in parens, quite a bit.
You'd miss the /into option for incremental collecting and preallocation, 
but at least you wouldn't need to BIND/copy your rules.
Ladislav
14-Nov-2011
[5936]
How about a CASE operation that applies to the next rule, which could 
be a block? No NO-CASE operation required
 - that is an error, even in that case you *would* need NO-CASE
BrianH
14-Nov-2011
[5937]
OK, but you wouldn't need NO-CASE to end a CASE. It would be another 
modifier, not a mode. Modes like that don't work with backtracking 
very well. So it would be like this:
	case ["a" no-case "b" "c"]
not like this:
	case "a" no-case "b" case "c" no-case
The two directives would be implemented as flags, like NOT.
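
Why a modifier interacts cleanly with backtracking can be sketched with a toy matcher. This is a minimal illustration in Python, not REBOL's PARSE implementation; the rule encoding (`"lit"`, `"seq"`, `"alt"`, `"case"`, `"no-case"` tuples) and the `match` function are assumptions made only for this example. Case sensitivity is passed down as an argument, so it applies only to the rule it wraps and cannot leak out when an alternative fails.

```python
def match(rule, text, pos, case_sensitive=False):
    """Return the position after a match, or None on failure.

    Hypothetical rule encoding:
      ("lit", "abc")       literal string
      ("seq", [r1, r2])    all rules in order
      ("alt", [r1, r2])    first rule that matches (backtracks to pos)
      ("case", r)          match r case-sensitively
      ("no-case", r)       match r case-insensitively
    """
    kind, arg = rule
    if kind == "lit":
        chunk = text[pos:pos + len(arg)]
        if case_sensitive:
            ok = chunk == arg
        else:
            ok = chunk.lower() == arg.lower()
        return pos + len(arg) if ok else None
    if kind == "seq":
        for sub in arg:
            pos = match(sub, text, pos, case_sensitive)
            if pos is None:
                return None
        return pos
    if kind == "alt":
        for sub in arg:
            # Each alternative restarts from the same pos with the same
            # sensitivity, so a failed branch leaves no state behind.
            end = match(sub, text, pos, case_sensitive)
            if end is not None:
                return end
        return None
    if kind == "case":
        return match(arg, text, pos, True)
    if kind == "no-case":
        return match(arg, text, pos, False)
    raise ValueError("unknown rule kind: %r" % (kind,))


# Equivalent of: case ["a" no-case "b" "c"]
rule = ("case", ("seq", [("lit", "a"),
                         ("no-case", ("lit", "b")),
                         ("lit", "c")]))

# A case-sensitive first alternative that fails, then a default one.
alt_rule = ("alt", [("case", ("lit", "X")), ("lit", "x")])
```

With this encoding, `match(rule, "aBc", 0)` succeeds while `match(rule, "Abc", 0)` and `match(rule, "abC", 0)` fail, because only the `"b"` is wrapped in the case-insensitive modifier.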
Ladislav
14-Nov-2011
[5938]
OK, but you wouldn't need NO-CASE to end a CASE.

 -  What I did propose was just the existence of such keywords, the 
 exact implementation should be the one that is the simplest to implement, 
 which may well be the one you mention.
BrianH
14-Nov-2011
[5939]
OK, cool. You have to be careful with the "mode" term though. That 
tripped up some of the last round of parse proposals, such as REVERSE.
Ladislav
14-Nov-2011
[5940]
Hmm, REVERSE has more issues, I think
BrianH
14-Nov-2011
[5941]
The biggest of which is that it hasn't been implemented yet :(
Ladislav
14-Nov-2011
[5942x2]
Well, I am not pushing for it.
But, CASE should be a simpler case ;-)
BrianH
14-Nov-2011
[5944]
I liked it at the time, at least the bounded modifier version, but 
of the unimplemented proposals it's not my highest priority.
Ladislav
14-Nov-2011
[5945]
OK, so, do you think I should put the CASE proposal (mentioning your 
variant) to the article?
BrianH
14-Nov-2011
[5946x4]
Sure :)
We really should go over that article and note which of the proposals 
were implemented, in which version, and which were denied and why.
article -> page
It's especially important to document the denied proposals, since 
the reasons for their denial would be instructive.
Ladislav
14-Nov-2011
[5950]
Will have a look, and, will also use one ticket to let Carl know.
BrianH
14-Nov-2011
[5951]
What do you think of the KEEP operation from Topaz? A good idea, 
or out of scope for PARSE?
Ladislav
14-Nov-2011
[5952x2]
BTW, the limitation of CASE to just the next rule is not exactly 
necessary. I would like to point you e.g. to the description of the 
#localize-on #localize-off user-defined directive pair, which is 
defined so that it will not have any problem with multitasking or 
recursion, yet the directives are not limited to just the subsequent 
value. (Robert plans to publish the source code and the documentation 
soon)
Regarding a KEEP keyword: may be a reasonable addition. I surely 
prefer KEEP, when choosing between KEEP and CHANGE.
BrianH
14-Nov-2011
[5954x3]
I would definitely not make that choice. I need CHANGE too, and the 
full version with the value you're changing to be an expression in 
a paren - the last part of the proposal that isn't implemented yet. 
That's at the top of my list.
Ladislav, multitasking and recursion is not the same thing as backtracking. 
We already have backtracking bugs, we don't need to mandate more.
(bad English grammar day)
Ladislav
15-Nov-2011
[5957x4]
I need CHANGE too, and the full version with the value you're changing 
to be an expression in a paren

 - this changing during parsing is known to be O(n), i.e. highly inefficient. 
 For any serious code it is a disaster
Anyway, I am happy this does not influence my code
Regarding CASE and backtracking: it is not a problem when the effect 
of the keyword is limited to the nearest enclosing block.
(which is exactly the case of the #localize-on / -off directives 
as well)
BrianH
15-Nov-2011
[5961x2]
O(n) isn't bad if n is small, especially compared to other parts 
of the process. Most of my apps are bound by database or filesystem 
speed.
Backtracking often happens within blocks too, but yes, that does 
limit the scope of the problems caused (it doesn't eliminate the 
problem, it just limits its scope). Mode operations also don't interact 
well with flow control operations like OPT, NOT and AND. What would 
NOT CASE mean if CASE has effect on subsequent code without being 
tied to it? As a comparison, NOT CASE "a" has a much clearer meaning.
Gregg
15-Nov-2011
[5963]
I like the idea of a CASE option. There haven't been many times I've 
needed it, but a few. Other things are higher on my priority list 
for R3, but I wouldn't complain if this made its way in there.
Ladislav
15-Nov-2011
[5964]
Hmm, to not complicate matters and hoping that it is the simpler 
variant I modified the CASE/NO-CASE proposal to use the

    CASE RULE

and

    NO-CASE RULE


syntax, since it really looks simpler to implement than other 
possible alternatives.
Endo
1-Dec-2011
[5965]
I want to keep the digits and remove all the rest,

t: "abc56xyz"
parse/all t [some [digit (prin "d") | x: (prin "." remove x)]]
print head t

This does the work but never finishes. If I add a "skip" to the second 
part, the result is "b56y".
How do I do this?
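
The "b56y" result can be reproduced outside PARSE. The following is a minimal sketch in Python, not REBOL, and both function names are hypothetical. It assumes the cause is the usual one when deleting in place: removing a character shifts the rest of the string left, so advancing afterwards steps over the character that just moved into the current position.

```python
DIGITS = "0123456789"

def keep_digits_with_skip(text):
    """Remove a non-digit and then also advance (the 'skip' variant)."""
    chars = list(text)
    i = 0
    while i < len(chars):
        if chars[i] in DIGITS:
            i += 1
        else:
            del chars[i]  # the next character slides into position i
            i += 1        # ...and this advance steps over it
    return "".join(chars)

def keep_digits(text):
    """Remove a non-digit and stay at the same position."""
    chars = list(text)
    i = 0
    while i < len(chars):
        if chars[i] in DIGITS:
            i += 1
        else:
            del chars[i]  # do not advance: re-test the same position
    return "".join(chars)
```

Here `keep_digits_with_skip("abc56xyz")` returns `"b56y"` and `keep_digits("abc56xyz")` returns `"56"`. The second loop always terminates because each step either advances or shortens the input, and it stops at the end of the input instead of attempting another removal there.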
Geomol
1-Dec-2011
[5966]
Alternative not using parse:

>> t: "abc56xyz"
== "abc56xyz"
>> non-digit: ""
== ""
>> for c #"a" #"z" 1 [append non-digit c]
== "abcdefghijklmnopqrstuvwxyz"
>> for c #"A" #"Z" 1 [append non-digit c]
== {abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ}
>> trim/with t non-digit
== "56"