World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

Maxim 1-May-2009 [1968]	mike: the best tip I can give you is to picture a cursor in your head when you type your rules. parse really is about iterating over a series one byte at a time and matching the next expected character.
Gregg 1-May-2009 [1969]	My example above may not be exactly what you want. e.g. you might want to clear mark-1: in the skip rule.
Maxim 1-May-2009 [1970x6]	whenever it doesn't find what it expects, it "rolls back" to the last complete finished rule and tries the next one, if an alternative was given. this is hierarchical. so if you're inside the 15th depth of parse and suddenly an unexpected character is encountered, you might roll back up to the very first rule!
	if that is the first alternative given within the parse rules.
	since it only moves forward, it's very fast. in order to do look-ahead assertions and things like that (which are slow in regexp anyways) you must learn a few tricks in order to manually set and retrieve the "current" character index.
	also note that when there is a "roll back", the cursor and rules roll back together. (unless you are doing manual cursor manipulation tricks)
	so if your first rule only matched 2 characters, the second one fails, and another alternative is given... the alternative effectively starts checking at character 3
	even if the second rule failed 15 rules deep at character 3000
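A minimal sketch of the roll-back behaviour described above, assuming REBOL 2 `parse` semantics. The input string, the rules and the word `pos` are invented for illustration and are not from the discussion.

```rebol
REBOL []

; Hypothetical input: 2 characters for the first rule, then "XY".
data: "abXY"

ok: parse/all data [
    "ab"                      ; first rule: matches 2 characters
    [
        "X" "Z"               ; second rule: "X" matches, "Z" fails on "Y"
      | pos: "XY"             ; alternative: cursor is back at character 3
    ]
]

print index? pos              ; 3
print ok                      ; true
```

The set-word `pos:` records the cursor position at the moment the alternative starts, which is how the roll-back can be observed.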
Graham 1-May-2009 [1976x5]	I presume that .txt is not going to appear in the 'random' text?? And XXX is really a fixed set of characters?
	this is a cheat ... >> s: {randomXXXrandom.log XXXrandom.txtrandom} == "randomXXXrandom.log XXXrandom.txtrandom" >> parse/all reverse s [ to "txt." copy txt to "XXX" ( reverse txt ) to end ] == true >> txt == "random.txt"
	that's for those of us who don't like to backtrack :)
	or if .log never appears in the random text you can just skip past it
	>> parse/all s [ thru ".log" thru "XXX" copy txt thru ".txt" to end ] == true >> txt == "random.txt"
PeterWood 2-May-2009 [1981]	I usually adopt a different approach which is to write a rule to match my target and use any and skip to apply that rule progressively through the input. It may not be the fastest way but it seems easier to grasp than backtracking. >> haystack: {randomXXXrandom.log XXXrandom.txtrandom} >> alpha: charset [#"a" - #"z" #"A" - #"Z"] == make bitset! #{ 0000000000000000FEFFFF07FEFFFF0700000000000000000000000000000000 } >> digit: charset [#"0" - #"9" ] == make bitset! #{ 000000000000FF03000000000000000000000000000000000000000000000000 } >> alphanumeric: union alpha digit == make bitset! #{ 000000000000FF03FEFFFF07FEFFFF0700000000000000000000000000000000 } >> needle: ["XXX" some alphanumeric ".txt"] >> parse/all haystack [any [copy result needle (print skip result 3) | skip]] random.txt == true As you can see with this approach, you have to manually extract the "XXX" once you've grasped the needle.
mhinson 2-May-2009 [1982]	You may be surprised to hear that I understand what you are all saying. I was sort of already beginning to draw my own conclusion that the answer to my problem might not be as trivial as I wanted it to be. I think this mini tutorial on parse for people with an understanding of regular expressions would be very handy for others. Could the essence of your replies be put somewhere that they will be found by other newbies? What fun this is, thanks for the continued help.
Sunanda 2-May-2009 [1983x3]	Mike, this group has [web-public] in its title, meaning it is being published to the web. So the replies are available online for anyone. http://www.rebol.org/aga-display-posts.r?post=r3wp174x1957 A tutorial to put it all in context would be even better.
	If you have not found it already, this is a detailed tutorial on 'parse, building up slowly from the basics: http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse
	And this gives you easy access to many useful mailing list threads -- often containing "worked examples" in answer to specific problems: http://www.rebol.org/ml-topic-index.r?i=parse
mhinson 2-May-2009 [1986]	I am becoming despondent today because although I now feel I understand some very useful concepts & have all these great links to documentation, I still stumble over the syntax of very basic things. e.g. this is wrong & one of dozens of different ways I have tried to arrange the brackets & quote marks. It is frustrating to spend an hour or so making no progress at all on something I know must be obvious. result: [] parse/all {zabc} [ to ["b" | "y"] copy result thru "c" (print result) ] parse/all {zybc} [ to ["b" | "y"] copy result thru "c" (print result) ]
Izkata 2-May-2009 [1987]	Put the 'to inside the block, like this: parse/all {zabc} [ [to "b" | to "y"] copy result thru "c" (print result) ] IIRC, your syntax has been a long-standing request to improve 'parse
mhinson 2-May-2009 [1988]	Thanks Izkata, that is exactly the help I needed. Since I failed to find the information about this in the documentation I thought I would try & find it now I know the answer, but I still can't. Should it be inferred from some basic principle I have missed? Or am I just bad at searching the documents?
Pekr 2-May-2009 [1989x2]	to "b" | to "y" might not work as you would expect though. Better check what you are expecting, because you should really read it as - try to locate "b", and in the case you will not find it, try to locate "y". It simply does not mean find first occurrence of "b" or "y", return the first match ...
Pekr 2-May-2009 [1989x2]	Such functionality is a long-standing parse enhancement request, and is planned to be implemented ...
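A short sketch of the trap Pekr describes, assuming REBOL 2 `parse`. The words `first-try` and `second-try` are made up for this example.

```rebol
REBOL []

; "y" comes before "b" here, yet [to "b" | to "y"] jumps to "b":
; the second alternative is only tried when the first one fails.
parse/all {zybc} [[to "b" | to "y"] copy first-try thru "c"]
print first-try    ; bc

; With no "b" in the input, to "b" fails and to "y" is tried instead.
parse/all {zyc} [[to "b" | to "y"] copy second-try thru "c"]
print second-try   ; yc
```

A "whichever comes first" match would have started the first copy at "y" and produced "ybc".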
mhinson 2-May-2009 [1991]	Pekr, I see now this is the same newbie trap you pointed out to me over 2 weeks ago. Difference is that this time I understand what you mean. (i hope). I think I need to study backtracking now. I just discovered this discussion that seems very helpful on the subject. http://www.mail-archive.com/[list-:-rebol-:-com]/msg03371.html I wonder if I should start using Rebol 3? Does it have this enhancement yet? Is it stable enough for educational use would you say? Or is it really only the domain of testers as yet? Thanks.
Pekr 3-May-2009 [1992x2]	no R3 parse enhancement is implemented yet, but they are planned for R3.0 beta release, so hopefully they come. I am not sure Carl will implement all of them, but maybe he will. You can look at it here - http://www.rebol.net/wiki/Parse_Project
Pekr 3-May-2009 [1992x2]	Here's my version: parse/all {zybc} [ some ["b" break | "y" break | skip] copy result thru "c" (print result) ] Simple explanation - we try to match at least one occurrence of "b" or "y". There is currently no other chance than skip by one char (don't worry, it is not slow). Once you reach the char, you have to "break", or the rule will be still applied, because "skip" will always occur, even if "b" or "y" are not matched.
mhinson 3-May-2009 [1994]	Hi, I have been studying the example from Pekr and developed the following adaptation. b: [to "bb" break] y: [to "yy" break] parse/all {zyybbc} [ some [b | y break | skip] copy result thru "c" (print result) ] however this seems to loop for ever, but I don't understand why. Any wise words would be appreciated. Sorry to be so needy, I am beginning to wonder that as I am having so much trouble with this basic stuff, perhaps this is a programming language that I am just not suited to? Let me know if I am causing a problem by posting here so often & I will admit defeat with the parsing & go back to something more familiar.
[unknown: 5] 3-May-2009 [1995]	what are you trying to accomplish?
mhinson 3-May-2009 [1996]	I am really trying to learn the parse grammar rules, but they trip me up at every turn.
[unknown: 5] 3-May-2009 [1997]	There are some gotchas but practice is key to them.
mhinson 3-May-2009 [1998]	the example from Pekr was to show me how to backtrack in order to make the ["b" | "y"] construct work in a way that is non-greedy, that is to say so it finds the first occurrence of either.
[unknown: 5] 3-May-2009 [1999]	ahhhh i see.
mhinson 3-May-2009 [2000]	my modification to add "to" seems to break it & I don't feel I have enough of a grip on the syntax to work out why.
Maxim 3-May-2009 [2001]	mhinson... for backtracking, at least in my experience, using to and thru is pretty dangerous... it's pretty much the same as using a goto within structured programming.
mhinson 3-May-2009 [2002]	eek
Maxim 3-May-2009 [2003x2]	it creates something of a parallel to a closure.
Maxim 3-May-2009 [2003x2]	it can be useful when you know exactly what can happen, but otherwise you are better off skipping.
mhinson 3-May-2009 [2005]	I was engaging the use of "to" in the belief that it was the only way to include the key I was using to find my data in the output from my copy.
Maxim 3-May-2009 [2006x4]	I find that there are several types of broad parse rule families. -some find and extract values OUT of some arbitrary string. -some are used as advanced control structures, where you can pretty much simulate lisp functional programming, -some are used to convert data streams -others simply can match a whole data set to make sure it conforms. (+probably others) strangely each type of setup has sort of a different global approach, but it's very case specific obviously.
	generally, you should realise that when you build parse rules you need to have some sort of sense of "context" i.e. Where are you in your data. this will help you a lot.
	where as in at what point in the string AND where in the sense of what step you are now at in the parsing.
	thinking in those terms makes the rules much easier to place in what I like to call "PARSE SPACE" where everything is a little bit odd ;-)
mhinson 3-May-2009 [2010]	The type I am most interested in at the moment are constructs to extract data from strings read from files. Sometimes changing the subsequent behaviour dependent on what has already been found.
Maxim 3-May-2009 [2011]	ok. I have only about 10 minutes to give you, but I'll make the best of them.
mhinson 3-May-2009 [2012]	Thank you
Maxim 3-May-2009 [2013]	lesson one, find text AFTER some identifiable (and unmistakable) token:
mhinson 3-May-2009 [2014]	That I think I can do
Maxim 3-May-2009 [2015x2]	given: data: "THIS IS A TEST STRING WITH A <TAG> IN IT "
Maxim 3-May-2009 [2015x2]	if we want to extract what is after that single tag, then you can easily use to or even better thru: but let's do it using skip. starting with a simple example will make lesson 2 more obvious.
mhinson 3-May-2009 [2017]	great