World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

mhinson
3-May-2009
[1994]
Hi, I have been studying the example from Pekr and developed the 
following adaptation.

b: [to "bb" break]
y: [to "yy" break]

parse/all {zyybbc} [some [b | y break | skip] copy result thru "c" (print result)]


However, this seems to loop forever, and I don't understand why. 
Any wise words would be appreciated.  Sorry to be so needy. I am 
beginning to wonder whether, as I am having so much trouble with this 
basic stuff, perhaps this is a programming language that I am just 
not suited to?  Let me know if I am causing a problem by posting 
here so often & I will admit defeat with the parsing & go back to 
something more familiar.
[unknown: 5]
3-May-2009
[1995]
what are you trying to accomplish?
mhinson
3-May-2009
[1996]
I am really trying to learn the parse grammar rules, but they trip 
me up at every turn.
[unknown: 5]
3-May-2009
[1997]
There are some gotchas, but practice is the key to them.
mhinson
3-May-2009
[1998]
the example from Pekr was to show me how to backtrack in order to 
make the ["b" | "y"] construct work in a way that is non-greedy, that 
is to say, so it finds the first occurrence of either.
[unknown: 5]
3-May-2009
[1999]
ahhhh i see.
mhinson
3-May-2009
[2000]
my modification to add "to" seems to break it & I don't feel I have 
enough of a grip on the syntax to work out why.
Maxim
3-May-2009
[2001]
mhinson... for backtracking, at least in my experience, using to 
and thru is pretty dangerous... it's pretty much the same as using 
a goto within structured programming.
mhinson
3-May-2009
[2002]
eek
Maxim
3-May-2009
[2003x2]
it creates something of a parallel to a closure.
it can be useful when you know exactly what can happen, but otherwise 
you are better off skipping.
mhinson
3-May-2009
[2005]
I was engaging the use of "to" in the belief that it was the only 
way to include the key I was using to find my data in the output 
from my copy.
Maxim
3-May-2009
[2006x4]
I find that there are several broad families of parse rules.
-some find and extract values OUT of some arbitrary string.
-some are used as advanced control structures, where you can pretty 
much simulate lisp functional programming,
-some are used to convert data streams
-others simply match a whole data set to make sure it conforms.
(+probably others)


strangely, each type of setup has a somewhat different global approach, 
but it's very case specific, obviously.
generally, you should realise that when you build parse rules you 
need to have some sort of sense of  "context",  i.e. where you are 
in your data. this will help you a lot.
where, as in at what point in the string, AND where, in the sense of 
what step you are now at in the parsing.
thinking in those terms makes the rules much easier to place in what 
I like to call  "PARSE SPACE"  where everything is a little bit odd 
 ;-)
mhinson
3-May-2009
[2010]
The type I am most interested in at the moment is constructs to 
extract data from strings read from files, sometimes changing the 
subsequent behaviour depending on what has already been found.
Maxim
3-May-2009
[2011]
ok.  I have only about 10 minutes to give you, but I'll make the 
best of them.
mhinson
3-May-2009
[2012]
Thank you
Maxim
3-May-2009
[2013]
lesson one, find text AFTER some identifiable (and unmistakable) 
token:
mhinson
3-May-2009
[2014]
That I think I can do
Maxim
3-May-2009
[2015x2]
given:

data: "THIS IS A TEST STRING WITH A <TAG> IN IT "
if we want to extract what is after that single tag, then you can 
easily use to or even better thru:


but let's do it using skip.  starting with a simple example will make 
lesson 2 more obvious.
mhinson
3-May-2009
[2017]
great
[unknown: 5]
3-May-2009
[2018]
What would be ideal is to be able to use TO on charsets within the 
parse rule.
mhinson
3-May-2009
[2019]
parse/all data [thru "<TAG>" copy result to end] print result
Maxim
3-May-2009
[2020x3]
parse/all data [
	some [
		["<TAG>" here: (print rejoin ["we are passed the <TAG!> : '" here "'"])] | skip
	]
]
note, we are using skip to get familiar with the basics of rollback...
does the above make sense to you?
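
A minimal sketch of the same skip-based rule, keeping the text after the tag in a word instead of printing it. It assumes REBOL 2 parse behaviour; the word `result` is a hypothetical name added for illustration, while `data` is the string given earlier.

```rebol
REBOL []

data: "THIS IS A TEST STRING WITH A <TAG> IN IT "
result: none   ; hypothetical word, not part of the original rule

parse/all data [
    some [
        ; first alternative: only matches when the parser is AT the tag
        "<TAG>" here: (result: copy here)
        ; second alternative: otherwise move forward by one character
        | skip
    ]
]

print mold result   ; expected: " IN IT "
```

Because the tag alternative is tried first at every position, skip only consumes a character when the tag is not there.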
mhinson
3-May-2009
[2023]
does "some" make it repeat?
Maxim
3-May-2009
[2024]
yes, and at least one rule must match or else it fails and rolls 
back to the parent rule.
mhinson
3-May-2009
[2025]
but skip will always succeed?
Maxim
3-May-2009
[2026]
so can you explain to me why the ["<TAG>" ... ] rule is BEFORE the 
skip rule?
mhinson
3-May-2009
[2027]
is it because they are iterated in order?
Maxim
3-May-2009
[2028x4]
that is the reason for the lesson... once you grasp that you are 
about 50% there already  :-)
some will loop over and over until ALL rules fail. if that happens 
on the very first pass, then IT fails too.
'SOME  will loop ....
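
A small sketch of that looping behaviour, assuming REBOL 2. The word `count` is a hypothetical counter added for illustration.

```rebol
REBOL []

count: 0   ; hypothetical counter, incremented on every successful pass

; SOME repeats the "a" rule until it fails, then "b" must match
print parse/all "aaab" [some ["a" (count: count + 1)] "b"]   ; true
print count                                                   ; 3

; here the "a" rule never matches, so SOME itself fails
print parse/all "b" [some "a" "b"]                            ; false
```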
so why did it only print the string once?  simple question... but 
not so obvious to answer for a newbie  :-)
[unknown: 5]
3-May-2009
[2032]
Here is a crude way to do what we did earlier that doesn't use parse 
but copies from the first occurrence of either letter

>> chars: charset "by"
== make bitset! #{
0000000000000000000000000400000200000000000000000000000000000000
}
>> copy/part z: find {aabbyyc} chars next find z "c"
== "bbyyc"
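
A parse-based sketch of the same extraction, assuming REBOL 2. Rather than using TO on a charset, it skips over every character that is not a key by matching a complemented charset. The words `non-key` and `result` are hypothetical names added for illustration.

```rebol
REBOL []

non-key: complement charset "by"   ; every character except "b" and "y"
result: none

; ANY consumes the leading non-key characters, then COPY takes the rest up to and including "c"
parse/all {aabbyyc} [any non-key copy result thru "c"]

print mold result   ; expected: "bbyyc"
```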
mhinson
3-May-2009
[2033]
because the match only occurred once?
Maxim
3-May-2009
[2034x2]
yep... it was only able to match the first rule when it got AT that 
point in the data
every other time (including after it matched the tag) the second, 
alternative, rule matched, and we were able to hit the end of the 
string.
mhinson
3-May-2009
[2036]
<TAG> here:

here only gets the remaining part of the string assigned to it if 
the previous key parse returns true
Maxim
3-May-2009
[2037]
set words within parse rules are set to the point in the series that 
the parser IS AT, at that exact moment.
mhinson
3-May-2009
[2038]
I read about that & how [index? here] will return that position
Maxim
3-May-2009
[2039x2]
so if you had put the HERE: like so:
[here: "<TAG>" ...  

then the tag itself would be part of the printed string.
but note that at that point, it has not matched the "<TAG>" yet!
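
A short sketch of that variation, assuming REBOL 2 and the same `data` string as before. With the set-word placed before the tag, `here` is set at every position the parser tries, but the paren only runs once the tag has actually matched.

```rebol
REBOL []

data: "THIS IS A TEST STRING WITH A <TAG> IN IT "

parse/all data [
    some [
        ; here: is set BEFORE the tag is tested, so it still includes the tag
        here: "<TAG>" (print mold here)
        | skip
    ]
]
; expected: "<TAG> IN IT "
```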
mhinson
3-May-2009
[2041x2]
so is the first position in the parse expression special in that 
it is always evaluated?
I see it is     
parse {123} [(print "hello")]
Maxim
3-May-2009
[2043]
it evaluates until it finds a given character that it cannot match