World: r3wp
[Parse] Discussion of PARSE dialect
Ladislav 27-Apr-2011 [5664x2] | And, surely, one of the rules having this property is the END rule. |
So, while the TO rule advances to the head of the subrule match, the THRU rule advances to the tail of the subrule match, which happen to be identical in case the subrule match does not advance the cursor. | |
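A small console sketch of that head/tail distinction, assuming a literal string as the subrule. The word `mark` is only a hypothetical name for the position marker; it does not come from the discussion above.

```rebol
; TO stops at the head of the subrule match, THRU at its tail.
parse "abc" [to "b" mark: to end]
print mark    ; prints bc

parse "abc" [thru "b" mark: to end]
print mark    ; prints c
```

When the subrule itself does not advance the cursor, head and tail are the same position, which is the case discussed for END.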
Geomol 27-Apr-2011 [5666] | Argh, I was confused by sentences like "Where do you think the cursor is after matching the [end] rule?" :-) Old def. of thru: "advance input thru a value or datatype". New def. of thru: "scan forward in input for matching rules, advance input to tail of the match". Then it can be argued that the tail of end (of a series) is still the end. Or it can be argued that thru always advances some way further (as in all cases except at end). I understand why [thru end] not failing is confusing. (And stop saying I should read the doc. I have read it ... in full this time.) ;-) |
Ladislav 27-Apr-2011 [5667x2] | "it can be argued, that thru always advances some way further" - actually, it cannot be argued, taking into account that it has been documented |
"I was confused by sentences like Where do you think the cursor is after matching the [end] rule?" Interesting, so you do not know where the cursor is after matching the [end] rule? Otherwise, such a question cannot confuse anybody knowing where the cursor is. | |
Geomol 27-Apr-2011 [5669] | That I don't agree with. I don't see it say anywhere in the doc that thru does not advance the input. |
Ladislav 27-Apr-2011 [5670x2] | That is not what I said |
I said, that, in general, PARSE may, or may not advance the input after successfully matching a rule. Which is true. | |
Geomol 27-Apr-2011 [5672x3] | "Interesting, so, you do not know where the cursor is after matching the [end] rule?" I assume I know it when it comes to parse in R2. I'm not sure with R3, as I don't have much experience with that. |
Yes, true about parse may or may not. | |
I'm quite impressed (or surprised at least) by how much parse has grown from R2 to R3. I haven't studied it closely, but what's your opinion? Are all those rules necessary? I feel it might be too complex to use. | |
Ladislav 27-Apr-2011 [5675x4] | And, I guess, that the "past" word does not express as clearly what Carl had in mind, and that he wanted to explain the general case, explaining all possible outcomes of subrule matching. |
(in one sentence) | |
"...how PARSE has grown from R2 to R3" - actually, not at all. That is only a superficial difference. As can be seen in the http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions article (it is especially obvious when the "Idioms" section is examined), all the constructs from R3 are possible in R2 as well. | |
See also http://www.rebol.org/view-script.r?script=parseen.r | |
Geomol 27-Apr-2011 [5679] | Oh, ok. |
Ladislav 27-Apr-2011 [5680x3] | The "Idioms" section actually suggests, that even some R2 constructs are "superfluous" in the sense, that they can be derived from more elementary constructs like sequence and choice. |
For example, a: [opt b] is actually the same as a: [b |] | |
etc. | |
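The equivalence of OPT and a choice with an empty alternative can be checked directly. A minimal sketch, assuming a hypothetical subrule `b` that matches the string "x":

```rebol
b: "x"    ; hypothetical subrule, chosen only for illustration

; [opt b] and [b |] accept exactly the same inputs:
print parse "x" [opt b]     ; true
print parse ""  [opt b]     ; true
print parse "x" [[b |]]     ; true
print parse ""  [[b |]]     ; true

; and both leave unmatched input behind, so PARSE returns false:
print parse "y" [opt b]     ; false
print parse "y" [[b |]]     ; false
```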
Geomol 27-Apr-2011 [5683] | Interesting. Could it be an idea to 'create' a minimum parse? Maybe just the specification. |
Ladislav 27-Apr-2011 [5684] | You should examine the "Idioms" section to get the idea. |
Geomol 27-Apr-2011 [5685x3] | I will. |
Has anyone made PARSE as a function? It should be possible, right? | |
Found a trick to parse integers in blocks. Let's say I want to parse this block: [year 2011]. The rule can't be ['year 2011], because 2011 in this case is a repeat count for the next element (none here). So normally, I would do something like ['year set y integer! ( ... )], check the y variable, and create a fail rule in case it's not 2011. But this is the trick:
    >> parse [year 2011] ['year 1 1 2011]
    == true
Two numbers mean repeat the next pattern a number of times, and in this case, the pattern can be an integer itself. | |
onetom 27-Apr-2011 [5688] | :) nice |
Gregg 27-Apr-2011 [5689] | I wouldn't call it a trick John, just a non-obvious syntax. I haven't used it much, but I wrote a func a long time ago when I needed it for something.
    literalize-int-rules: func [template /local mark] [
        ; Turn a single integer value into a quantity-of-one integer
        ; rule for parse (e.g. 1 becomes 1 1 1, 4 becomes 1 1 4).
        rule: [
            any [
                into rule
                | mark: integer! (insert mark [1 1]) 2 skip
                | skip
            ]
        ]
        parse template rule
        template
    ] |
Ladislav 27-Apr-2011 [5690] | Yes, John, handling of such values has been discussed a while ago. That is why in R3 the QUOTE directive has been defined. |
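For comparison with the `1 1 2011` form, a brief sketch of the R3 QUOTE directive mentioned here. This is R3 only and is not taken from the discussion itself; QUOTE matches the next value literally, so the integer is not read as a repeat count.

```rebol
; R3 only: match the integer 2011 as a value, not as a count.
print parse [year 2011] ['year quote 2011]    ; true
print parse [year 2012] ['year quote 2011]    ; false
```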
Geomol 28-Apr-2011 [5691x2] | Nice! |
In http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse/Parse_expressions#Parse_idioms the idiom
    Description: "Range of times operator"
    Operation: a: [m n b]
    Idiom: a: [m b (k: n - m) [k [b | c: fail] | :c]]
only seems to be true when n >= m. When n < m, parse works as if the rule was a: [n b] | |
Ladislav 28-Apr-2011 [5693x4] | That is somewhat surprising, do you see any difference? |
(I don't) | |
aha, sorry, you are right | |
Corrected, should be better now. | |
Sunanda 29-Apr-2011 [5697] | Can an R2 parse expert help me with an efficient parse, please? I've got a set of bbcode-type tags, eg:
    tags: [ "[a]" "[b]" "[cc]" ]
And I've got a data string that includes those (and other) tags, eg:
    data: "xxxx[a]aa aa[b]xxxx[a] yyyy[d]yyy[cc]dd[e]ddd[b][A]zz[zz"
What I'd like is the data string split at the designated tags, eg:
    [ "[a]" "aa aa" "[b]" "xxxx" "[a]" " yyyy[d]yyy" "[cc]" "dd[e]ddd" "[b]" "" "[A]" "zz[zz" ]
Thanks! |
Maxim 29-Apr-2011 [5698] |
    rebol []
    =tags=: [ "[a]" | "[b]" | "[cc]" ]
    data: "xxxx[a]aa aa[b]xxxx[a] yyyy[d]yyy[cc]dd[e]ddd[b][A]zz[zz"
    blk: []
    parse/all data [
        start:
        any [
            here: copy tag =tags= there: (
                append blk copy/part start here
                append blk tag
            ) start:
            | skip
        ]
        (append blk start)
    ]
    ?? blk
    ask "" |
Steeve 29-Apr-2011 [5699] | should be better including [ to "[" ] at the right place |
Maxim 29-Apr-2011 [5700x2] | No, notice that he's not loading all [tags], just those he really wants. |
(maybe I misunderstood why you'd want a [ to "[" ] :-) | |
Steeve 29-Apr-2011 [5702x2] | Even so, just replace [skip] by [skip opt to "["] (not tested) |
better in the sense: faster | |
Sunanda 29-Apr-2011 [5704] | Steeve, that looks good, thanks! Only difference from my "expected results" is that you've also returned the "pre-tag" "xxxx" .... that's okay -- incidental issues like that are completely negotiable in the search for a solution. |
Steeve 29-Apr-2011 [5705x2] | [skip to "[" | to end] should be even better |
[skip to "[" | end skip] skips an extra loop by exiting with a fail | |
Maxim 29-Apr-2011 [5707] | Sunanda, wrt the first element, I thought it was a typo on your part ;-) |
Sunanda 29-Apr-2011 [5708] | :) --- in the real-life app, I'd insert a dummy tag at the start to hoover up any pre-tag data. |
Steeve 29-Apr-2011 [5709] | Maxim can alter his parser to avoid such a hack, easy task :-) |
Geomol 29-Apr-2011 [5710] | In R2:
    >> parse [a b c [d e f] g h i] [to [d e f] mark: (probe mark) to end]
    [[d e f] g h i]
    == true
Here the block after TO isn't a sub-rule, but a value to search for (a block of words). Doing the same in R3:
    >> parse [a b c [d e f] g h i] [to [d e f] mark: (probe mark) to end]
    ** Script error: PARSE - invalid rule or usage of rule: e
Is the block a sub-rule here? I've tried to search the docs, but haven't found an explanation. |
BrianH 29-Apr-2011 [5711x3] | TO and THRU were changed to support multi-rules, so they aren't really comparable to their R2 versions. And there are some bugs in the implementation where some rules that don't match the acceptable syntax are just treated as not matching instead of triggering an error the way they should. This has made it difficult to properly document their current behavior. |
PARSE is definitely something I wish was more open, because there are bugs I would like to fix. | |
I think that there is no direct equivalent in R3 to R2's TO/THRU inline block. R3's TO/THRU inline block treats the block as its sub-dialect for TO/THRU multi, and that doesn't allow complex values or more than one value in a single alternate. The direct R3 equivalent of what you are requesting would be this, but it doesn't work:
    >> parse [a b c [d e f] g h i] [to [[d e f]] mark: (probe mark) to end]
    ** Script error: PARSE - invalid rule or usage of rule: [d e f]
Instead you have to do a trick with to block! in a loop and then match the block to quote [d e f] explicitly, and keep looking if it doesn't match. It's annoying. | |