World: r3wp

[Parse] Discussion of PARSE dialect

Steeve
8-Nov-2008
[3028x2]
here we go for proposals
Brian i hear you ;-)
BrianH
8-Nov-2008
[3030x2]
It occurred to me (as I'm sure that it has occurred to others) that 
it is possible for parse rules to do one bad thing even if you exclude 
all of the modification statements, word setting statements, and 
parens: ANY and SOME can go into infinite loops if they don't advance 
the position. I would like to propose that there be some form of 
warning or error if SOME or ANY loop again on the same position they 
did last time. This condition should be screened for with a PARSE 
refinement. If the refinement is set then when the point is reached 
where ANY or SOME would repeat at the same position, the rule would 
fail (and possibly backtrack to the next alternate).
Maybe that and a few other restrictions could be enabled when a /safer 
refinement is used.
Steeve
8-Nov-2008
[3032]
i'm thinking...
BrianH
8-Nov-2008
[3033x3]
Because of get-words there may be times where you don't want the 
position to advance, so this would have to be an option rather than 
standard behavior, or it would be a backwards compatibility problem 
that might not be worth it.
The way the new behavior would be formulated is this: ANY or SOME 
would only succeed if one of these conditions happened:
- The rule argument fails (after the first round for SOME).
- The rule argument succeeds *and* the position changes.
I'm not sure how REVERSE would fit in, but it sounds workable so 
far...
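
The two conditions above can be sketched outside of PARSE. This is only a rough Python model of the proposed check, not how PARSE is implemented: repeat_safer, literal and none_rule are hypothetical names, and a rule is assumed to be a function that returns the new position on success or None on failure.

```python
def repeat_safer(rule, data, pos, min_rounds=0):
    """Model of ANY (min_rounds=0) or SOME (min_rounds=1) with the proposed check.

    Returns the final position on success, or None if the loop fails.
    """
    rounds = 0
    while True:
        new_pos = rule(data, pos)
        if new_pos is None:
            # The rule argument failed: the loop ends here. ANY always
            # succeeds; SOME succeeds only after at least one full round.
            return pos if rounds >= min_rounds else None
        if new_pos == pos:
            # The rule succeeded but the position did not change (stasis):
            # fail instead of looping forever.
            return None
        pos = new_pos  # any change counts, forward or backward
        rounds += 1


def literal(text):
    """Hypothetical helper: build a rule that matches a fixed string."""
    def rule(data, pos):
        return pos + len(text) if data.startswith(text, pos) else None
    return rule


def none_rule(data, pos):
    """Model of a rule like NONE: always succeeds, never advances."""
    return pos
```

With these definitions, repeat_safer(literal("a"), "aaab", 0) stops at position 3, while repeat_safer(none_rule, "aaab", 0) fails immediately rather than spinning on the same position.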
Steeve
8-Nov-2008
[3036x2]
I never had such a case. I don't really see your point. When all 
rules fail in an ANY block, then we have a break.
It's the responsibility of sub-rules to do some skip to avoid such 
cases.
BrianH
8-Nov-2008
[3038]
Ah, but if one of the rules succeeds but doesn't advance the position 
(like NONE), you get an infinite loop.
Steeve
8-Nov-2008
[3039]
yes, but who does such things?
BrianH
8-Nov-2008
[3040]
The reason you would enable this option is to catch sub-rule logic 
errors.
Steeve
8-Nov-2008
[3041x2]
I always take care to advance in the series
ah ok, it's to help with debugging?
BrianH
8-Nov-2008
[3043]
Yeah, or malicious rules from third parties because you can't easily 
statically determine whether the bug would happen. Most malicious 
rules can be screened for, but others can't. It's a bad idea to run 
third-party rules anyway, but some people can't avoid it.
Steeve
8-Nov-2008
[3044]
but I see the interest of forbidding backward movement in some cases; 
we could have a special command (like FREEZE) to throw an error when 
we have a backward effect
BrianH
8-Nov-2008
[3045]
Backwards isn't a problem (especially with REVERSE) - stasis is.
Steeve
8-Nov-2008
[3046]
i have to reread the effect of REVERSE
BrianH
8-Nov-2008
[3047x4]
At the very least the infrastructure should be added to detect this 
kind of error so that PARSE tracing can warn about it.
REVERSE changes the direction of PARSE. This helps with recognizing 
language patterns that LR is better at than LL is.
I have some ideas about how REVERSE and should work with backtracking 
that I haven't written down yet.
and ->
Steeve
8-Nov-2008
[3051]
I just have a doubt: in my scripts I like using parse to change a 
series. Some of these rules are recursive, so that they can apply several 
modifications on top of modifications. In such a case I would break your 
rule about not returning to the same point.
BrianH
8-Nov-2008
[3052x2]
That is why it would be optional.
By recursive you meant iterative. Recursive rules don't use loops.
Steeve
8-Nov-2008
[3054x2]
I have a set of rules in an ANY block which apply modifications; the 
same block of rules is applied until there are no more modifications.
I don't know how to call that
BrianH
8-Nov-2008
[3056]
That is iterative changes. Each round of changes is an iteration.
Steeve
8-Nov-2008
[3057]
ok
BrianH
8-Nov-2008
[3058]
That kind of thing can be very powerful. As long as you have a fixpoint 
(a reachable condition to break you out of the loop) you'll be fine.
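
The iterate-until-fixpoint idea can be illustrated with a small Python sketch. It uses plain string replacement rather than PARSE rules, and rewrite_to_fixpoint and its max_rounds guard are assumptions made for the example, not anything from REBOL.

```python
def rewrite_to_fixpoint(text, rules, max_rounds=100):
    """Apply every (old, new) rewrite in each round until a round changes nothing.

    max_rounds is a safety net for rule sets that never reach a fixpoint.
    """
    for _ in range(max_rounds):
        new_text = text
        for old, new in rules:
            new_text = new_text.replace(old, new)
        if new_text == text:
            # Fixpoint: a whole round made no modification, so stop.
            return text
        text = new_text
    raise RuntimeError("no fixpoint reached within max_rounds")


# Collapsing runs of "a" takes several rounds, then settles.
collapsed = rewrite_to_fixpoint("aaaaa", [("aa", "a")])
```

A rule set such as [("a", "aa")] grows the text on every round and never settles, which is the case the guard is there to catch.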
Steeve
8-Nov-2008
[3059]
yeah I know it's difficult to create, but when it works it's incredibly 
compact and powerful, I like compactness
BrianH
8-Nov-2008
[3060]
Rewrite rules usually work this way.
Steeve
8-Nov-2008
[3061]
about your proposal, I will just say that it's not a high-priority request, but 
if it's free I'll take it
BrianH
8-Nov-2008
[3062]
I think it will be better to include this in the tracing infrastructure.
Steeve
8-Nov-2008
[3063x3]
ok another proposal
currently when we are parsing a series we can't mix constant string! and 
constant binary! values together
we have to choose depending on the type of the series
BrianH
8-Nov-2008
[3066x3]
That will be even more so in R3 because string! and binary! aren't compatible 
any more.
Unicode changes. A binary is a series of bytes, a string is a series 
of codepoints.
AS-STRING and AS-BINARY are gone.
Steeve
8-Nov-2008
[3069]
but at a certain point a string can be converted to a binary and vice 
versa
BrianH
8-Nov-2008
[3070x2]
Yes, but that is a real conversion, with encoding and decoding.
Just like converting from a .jpg to an image! and back.
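
As an analogy only (this is Python, not R3, and says nothing about R3's actual conversion functions): Python makes the same split between a string of codepoints and a series of bytes, and moving between them is a real encode/decode step whose result depends on the chosen encoding.

```python
# A string is a series of codepoints; escapes keep the example unambiguous.
text = "d\u00e9j\u00e0 vu"            # 7 codepoints

# Converting to bytes is an encoding step, and the encoding matters.
utf8_bytes = text.encode("utf-8")      # the two accented letters take 2 bytes each
latin1_bytes = text.encode("latin-1")  # one byte per codepoint here

codepoint_count = len(text)
utf8_length = len(utf8_bytes)
latin1_length = len(latin1_bytes)

# Converting back is a decoding step with the matching encoding.
round_trip = utf8_bytes.decode("utf-8")
```

The same text gives byte series of different lengths under the two encodings, which is why a string and a binary cannot simply be treated as aliases of each other.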
Steeve
8-Nov-2008
[3072]
when you say 'real' you mean a large, time-consuming conversion?
BrianH
8-Nov-2008
[3073x2]
Hopefully not time-consuming but definitely large, depending on the 
size of the string.
The Unicode changes were pretty significant and deep.
Steeve
8-Nov-2008
[3075x3]
off topic, but speaking of expensive operations, what about 
the old idea of having a subset of a series (a range) without the need to 
COPY/PART the series?
Carl has spoken of that in the past
*about
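
For comparison, and only as an analogy from another language, Python's memoryview gives this kind of range over a byte series without copying. The sketch below assumes nothing about how a REBOL range type would behave.

```python
data = bytearray(b"hello world")

# A slice of a memoryview is a window onto the same buffer: no bytes are copied.
view = memoryview(data)[6:11]

# A plain slice is an independent copy, comparable to COPY/PART.
copy = bytes(data[6:11])

# Changing the underlying series shows through the view but not the copy.
data[6] = ord("W")
view_now = bytes(view)
```

After the modification, the view reads b"World" while the copy still holds b"world".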