World: r3wp


[Parse] Discussion of PARSE dialect

Henrik 28-Jun-2006 [1053]	reichart, I wrote one in the wikibook, don't know if it's useful.
[unknown: 9] 28-Jun-2006 [1054]	Since you wrote one, do you know of a better one? This is not a reflection on yours, but it is a great way to know what you considered the next best thing.
Tomc 28-Jun-2006 [1055x2]	salvation from regular expressions
Tomc 28-Jun-2006 [1055x2]	I may have added some to the REBOL wikibook
BrianH 29-Jun-2006 [1057x3]	To use the simpler of the CS terms: Parse is a rule-based, recursive-descent string and structure parser with backtracking. It is not a parser generator (like Lex/Yacc) or compiler (like most regex engines) - the engine follows the rules directly. Since Parse is recursive-descent it can handle patterns that regular expressions wouldn't be able to. Since Parse backtracks it can handle patterns that ordinary recursive-descent parsers can't. Basically, it puts the text and structure processing abilities of Perl 5 to shame, let alone those of the lesser regex engines. In theory, Perl 6 has caught up with REBOL, but Perl 6 only exists in theory for now. By the time it becomes actual REBOL should surpass it (especially if I have anything to say about it).
	It's pretty easy to demonstrate patterns that regular expressions can't handle. It's only somewhat difficult to demonstrate patterns that can't be handled by a recursive descent parser without backtracking or unlimited lookahead. I have never run into a pattern that can't be handled by Parse in theory - its only limits are in implementation (available memory and recursion depth). I am not qualified to describe its limits. Still, you have to be careful about how you write the rules or they will trip you up.
	A little dry as explanations go, I suppose. You may get better luck by showing some magic parse code tricks :)
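A minimal sketch of the recursive-descent point above, assuming REBOL 2 (`parse/all`). Balanced parentheses are a standard example of a pattern that classical regular expressions cannot match; the rule name `balanced` is made up for illustration.

```rebol
REBOL []

; Hypothetical rule name. A rule may refer to itself by name:
; zero or more groups, each a "(" ... ")" pair that may itself
; contain further groups.
balanced: [any ["(" balanced ")"]]

print parse/all "(()())" balanced   ; nested and adjacent pairs -> true
print parse/all "(()" balanced      ; unclosed group -> false
```

In the second call the inner group fails on the missing ")", so `any` backtracks to zero iterations and the rule ends before the input does, which is why `parse` returns false.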
Volker 29-Jun-2006 [1060]	Somewhat buzzy: It's a simplified compiler-compiler. Could be used to build a Java compiler (e.g. such complex syntax), but it's also as easy as regex for simpler things. But still readable. (Less buzzy: not always that easy due to the poorer lookahead.)
BrianH 29-Jun-2006 [1061]	Volker, it's more like it can do what a compiler-compiler can do without needing to compile :) And backtracking is about the same as unlimited lookahead, but more powerful.
[unknown: 9] 29-Jun-2006 [1062]	Thanks Brian, but as is the theme with questions I ask, I don't ask for myself, but rather so that the "world" can learn what "we" know. So perhaps you should add your 2 cents to Henrik's and Tom's in a public forum of the Wikibook.
Volker 29-Jun-2006 [1063]	the compiling is no big argument as compiler-compilers are for compiled languages anyway ;) the point is, you can mix a grammar and actions for semantics easily.
BrianH 29-Jun-2006 [1064]	Reichart, I figured as much (hence the "dry" comment). I'll look over the Wikibook and see if I can help.
Volker 29-Jun-2006 [1065]	Your points are OK, I only wanted to try something somewhat shorter
BrianH 29-Jun-2006 [1066]	Volker, it still might be a good point that you can skip a step with parse, depending on the listener. Parse is more of a compiler-interpreter really. The real point I was making was about the lookahead.
Volker 29-Jun-2006 [1067]	I can plug in handcrafted parsers with some cocos too.
JaimeVargas 29-Jun-2006 [1068]	I agree, Brian: parse allows you to write interpreters easily. Regarding compilation, I guess it does that too. But the problem is more difficult.
Volker 29-Jun-2006 [1069]	aah. a compiler-compiler produces sourcecode to be compiled, but you can interpret data with it.
BrianH 29-Jun-2006 [1070]	Most compiler-compilers have fixed lookahead. Backtracking is equivalent to unlimited lookahead.
Volker 29-Jun-2006 [1071]	i guess that depends on the coco. the point is, a BNF by default, and code inside the rules, instead of putting things in vars and processing later. IMHO.
BrianH 29-Jun-2006 [1072x2]	Jaime, I meant that parse is itself an interpreter, not a compiler. It interprets compiler specs (or interpreter specs, etc.).
BrianH 29-Jun-2006 [1072x2]	Volker, I've used a lot of compiler-compilers before and reviewed many more, and unlimited lookahead or backtracking are rare.
JaimeVargas 29-Jun-2006 [1074]	Brian, in this you are right: parse is an interpreter that allows easy construction of other interpreters, with the emphasis on DSLs.
Volker 29-Jun-2006 [1075]	then the advantages of parse are being like a compiler-compiler and having unlimited lookahead etc?
BrianH 29-Jun-2006 [1076x3]	Yup :)
	I'm not sure whether not having a separate tokenizer is a plus or a minus, though.
	I guess you could think of block parsing as using load as a tokenizer.
Volker 29-Jun-2006 [1079x2]	IMHO that would add overhead for the simple things.
Volker 29-Jun-2006 [1079x2]	and you can use parse to tokenize first?
BrianH 29-Jun-2006 [1081]	Two rounds of parsing, one for tokenizing and one to parse? Interesting. That would work if you don't have control over the source syntax - otherwise load works pretty well for simple languages.
Volker 29-Jun-2006 [1082]	That's where I got the idea: tokenize first and use block-parser :)
BrianH 29-Jun-2006 [1083]	I've been using that approach for XML processing.
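A rough sketch of the "load as a tokenizer, then block-parse" idea discussed above, assuming REBOL 2 and a source syntax that happens to be loadable. The tiny `move`/`turn` command language and the names `source`, `tokens`, `total` and `n` are invented for illustration only.

```rebol
REBOL []

source: "move 10 turn 90 move 5"   ; hypothetical mini-language
tokens: load source                ; LOAD yields a block: [move 10 turn 90 move 5]

total: 0
ok: parse tokens [
    any [
        'move set n integer! (total: total + n)   ; sum the distances moved
        | 'turn integer!                          ; accept turns, ignore them
    ]
]

print ok      ; true
print total   ; 15
```

Because the block parser matches on datatypes (`integer!`) and literal words (`'move`), the rules never have to deal with whitespace or digit characters. When the source syntax is not loadable, a first string-parse pass that builds such a block would take the place of `load`.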
Volker 29-Jun-2006 [1084]	sounds good. if one finds a good tokenized representation. I am not an xml-guru :(
BrianH 29-Jun-2006 [1085x2]	My next personal project is to go through the XML/XSL/REST specs and create exactly that. I already have an efficient structure, I just need to fill out the semantics to support the complete logical model of XML.
BrianH 29-Jun-2006 [1085x2]	I am also not an XML guru, but I will be by the time I'm done :)
Volker 29-Jun-2006 [1087]	After I read "go through the XML/XSL/REST specs" I thought so. Being undecided whether I prefer to run away or participate curiously.
BrianH 29-Jun-2006 [1088x2]	Well, I know enough to know where to look to figure out the rest.
BrianH 29-Jun-2006 [1088x2]	Still, "run away" is a common and sensible reaction to XML.
Volker 29-Jun-2006 [1090]	nod
BrianH 29-Jun-2006 [1091]	Later, I must run errands...
Volker 29-Jun-2006 [1092]	cu
Gordon 29-Jun-2006 [1093]	I'm a bit stuck because this parse stops after the first iteration. Can anyone give me a hint as to why it stops after one line? Here is some code:
	    data: read to-file Readfile
	    print length? data
	    224921
	    d: parse/all data [thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr thru newline (print index? data)]
	    1
	    == false
	Data contains hundreds of "memos" in a CSV file with three fields: Memo, Category and Flag ("0"|"1"). All fields are enclosed in quotes and separated by commas. It would be real simple if the Memo field didn't contain double-quoted words; then parse data none would even work; but alas many memos contain other "words". It would even be simple if the memos didn't contain commas, then parse data "," or parse/all data "," would work; but alas many memos contain commas in the body.
JaimeVargas 29-Jun-2006 [1094]	Is every field quoted?
MikeL 29-Jun-2006 [1095]	Gordon, can you post a copy of short lines of the data?
Izkata 29-Jun-2006 [1096]	If QuoteStr = "\"", then this looks like it to me:
	    "Note", "Category", "Flag"
	    "Note", "Category", "Flag"
	But you don't have a loop or anything - try this:
	    d: parse/all data [
	        some [
	            thru QuoteStr copy Note to QuoteStr thru QuoteStr
	            thru quotestr copy Category to QuoteStr thru QuoteStr
	            thru quotestr copy Flag to QuoteStr
	            thru newline (print index? data)
	        ]
	    ]
Gordon 29-Jun-2006 [1097]	Jaime: Yes, every field is quoted. Izkata: Sorry, I left that out.
	    QuoteStr: to-char 34
	    probe QuoteStr
	    == #"^""
Izkata 29-Jun-2006 [1098]	hm, I was thinking in C++.... very unusual for me lol
Gordon 29-Jun-2006 [1099]	Do you need to loop? I thought parse looped by itself ie: data: parse data none
Izkata 29-Jun-2006 [1100x2]	not as far as I know
Izkata 29-Jun-2006 [1100x2]	This change in the parse looks like it works:
	    >> data: {"Note", "Category", "Flag"
	    {    "Note", "Category", "Flag"
	    {    "Note", "Category", "Flag"
	    {    "Note", "Category", "Flag"
	    {    }
	    == {"Note", "Category", "Flag"
	    "Note", "Category", "Flag"
	    "Note", "Category", "Flag"
	    "Note", "Category", "Flag"
	    }
	    >> QuoteStr: to-char 34
	    == #"^""
	    >> d: parse/all data [
	    [    some [
	    [        X: thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr
	    [        copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr
	    [        thru newline (print index? :X)
	    [        ]
	    [    ]
	    1
	    29
	    57
	    85
	    == true
Gordon 29-Jun-2006 [1102]	Okay, trying it now. I see that the phrase: "print index? data" stays stuck on "1". I see that you have posted a new example. I'll try that. Be right back.
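A small self-contained sketch of the fix discussed above, assuming REBOL 2. The sample data and the names `quote-str`, `records` and `pos` are made up for illustration. It shows the two points from the exchange: a rule block is matched once unless it is wrapped in `some` (or `any`), and `index? data` keeps printing 1 because the word `data` still refers to the head of the string; a set-word inside the rule (`pos:`) captures the current parse position instead.

```rebol
REBOL []

quote-str: to-char 34   ; the double-quote character

; Hypothetical sample: quoted fields, a comma inside the first memo,
; and a newline ending each record.
data: {"Buy milk, eggs", "Home", "0"
"Call Bob", "Work", "1"
}

records: copy []
ok: parse/all data [
    some [
        pos:                                             ; mark where this record starts
        thru quote-str copy note to quote-str thru quote-str
        thru quote-str copy category to quote-str thru quote-str
        thru quote-str copy flag to quote-str
        thru newline
        (append/only records reduce [index? pos note category flag])
    ]
]

probe ok        ; true when every record matched up to the end of the input
probe records   ; one [start-index note category flag] block per record
```

This still breaks on memos that contain double quotes, since `to quote-str` stops at the first quote it finds; handling those needs a rule that knows how the inner quotes are escaped in the file.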