World: r3wp
[Parse] Discussion of PARSE dialect
BrianH 22-Aug-2005 [340] | Whoops, an error. Change:
    ["*" a: skip b: to "*" c: skip d: | "~" a: skip b: to "~" c: skip d: ] :a (
to:
    [a: "*" b: to "*" c: skip d: | a: "~" b: to "~" c: skip d: ] :a (
Silly me :( |
Tomc 22-Aug-2005 [341] |
    w: complement charset "*"
    rule: [
        to "*" here: "*"
        opt [
            copy item some w "*" there:
            (change/part :here join "" [<strong> item </strong>] :there)
        ]
    ]
    parse/all str [some rule] |
BrianH 22-Aug-2005 [342] | Tomc, that will crash older versions of REBOL, and not work on newer versions. You need to reset the parse position to a point before the change, and do so before the paren where you make the change. Otherwise parse will be referencing a point off the end of the string at the end of the paren, before you can reset it. This used to crash REBOL so badly that the interpreter disappeared. |
Tomc 22-Aug-2005 [343] | brianh please supply a str that fails on current versions, so I can see what you mean |
BrianH 22-Aug-2005 [344] | To fix your example, put a :here after the first there: in your rule. |
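A minimal sketch of the fix BrianH describes, assuming REBOL 2 parse semantics: the position is reset with :here before the paren runs, so parse never holds a position inside the region being replaced. The sample string and the use of rejoin with plain strings (instead of Tomc's join with tag! values) are assumptions made for illustration.

```rebol
REBOL []

; Hypothetical sample input; any string with *marked* spans will do.
str: copy "some *bold* text and *more*"

w: complement charset "*"

rule: [
    to "*" here: "*"
    opt [
        copy item some w "*" there:
        :here   ; reset the parse position to before the span that is about to change
        (change/part here rejoin ["<strong>" item "</strong>"] there)
    ]
]

parse/all str [some rule]
print str
```

After the reset, parsing resumes at the start of the replacement text, so the next `to "*"` searches forward from there. The return value of parse is not the interesting part here; the sketch only relies on the side effect on str.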
Tomc 22-Aug-2005 [345] | still haven't found a string that fails, trying all the combos of *'s at the beginning, end, middle ... |
BrianH 22-Aug-2005 [346] | In your case you might not have a crash, because you are replacing a short text with a longer one. Still, it's good to remember that bug for future reference. It really tripped me up when I first came across it, back when it still used to crash REBOL. |
Tomc 22-Aug-2005 [347] | yes, shortening the string you are parsing would pull the rug out from under the interpreter (and I was aware that the string was being lengthened). Note: setting the parse pointer back to :here will position you before the "*"; you may be better off with :here skip to guarantee progress in case the change fails |
BrianH 22-Aug-2005 [348x5] | OK, I tried this:
    parse "abc" [to "bc" a: "bc" (change/part a "b" 2)]
It returns true on View 1.3 and Core 2.6, but false on View 1.2 and Core 2.5.0. |
If the change fails it will throw an error. The trick is to put off the paren performing the change until you have gone through enough rules to ensure that the paren contents will succeed. | |
Remember, for many platforms, Core 2.5.0 is the current version. | |
Here's a simplified version of my example that can handle multiple instances of multiple markup types and be adapted to different end tags (thanks Tomc for the idea!):
    markup-chars: charset "*~"
    non-markup: complement markup-chars
    tag1: ["*" "<strong>" "~" "<i>"]
    tag2: ["*" "</strong>" "~" "</i>"]
    parse/all data [
        any non-markup
        any [
            ; This next block can be generated if you have many markup types...
            [a: copy b "*" copy c to "*" copy d "*" e: |
             a: copy b "~" copy c to "~" copy d "~" e:
            ] :a
            (change/part a rejoin [tag1/:b c tag2/:d] e)
            any non-markup
        ] to end
    ] | |
Tomc: "you may be better off with :here skip to guarantee progress" Put the skip after the paren and I may agree with you there. Of course, you would then skip the number of chars in the replacement text. | |
BrianW 22-Aug-2005 [353x2] | Wow, I'm off getting bored at meetings, come back and you've been working hard! Thanks, folks. |
Here's what I have right now:
    markup-chars: charset "*_@"
    non-markup: complement markup-chars
    inline-tags: [
        "*" "strong"
        "_" "em"
        "@" "code"
    ]
    markup-rule: [
        any non-markup
        any [
            [
                a: "*" b: to "*" c: skip d: |
                a: "_" b: to "_" c: skip d: |
                a: "@" b: to "@" c: skip d:
            ] :a (
                change/part a rejoin [
                    "<" select inline-tags copy/part a b ">"
                    copy/part b c
                    "</" select inline-tags copy/part a b ">"
                ] d
            )
            any non-markup
        ]
        to end
    ]
    parse text markup-rule | |
Tomc 22-Aug-2005 [355] | you almost certainly want parse/all |
BrianW 22-Aug-2005 [356] | whoops |
BrianH 22-Aug-2005 [357] | If you want to guarantee progress with my and your examples (and better support multichar markup tags) change the last any non-markup to any non-markup | skip and that would do it. |
BrianW 22-Aug-2005 [358] | okay, here's a slightly tweaked version that uses a multichar markup tag:
    markup-chars: charset "[*_-:---]"
    non-markup: complement markup-chars
    inline-tags: [
        "*" "strong"
        "_" "em"
        "@" "code"
        "--" "small"
    ]
    markup-rule: [
        any non-markup
        any [
            [
                a: "*" b: to "*" c: skip d: |
                a: "_" b: to "_" c: skip d: |
                a: "@" b: to "@" c: skip d: |
                a: "--" b: to "--" c: skip skip d:
            ] :a (
                change/part a rejoin [
                    "<" select inline-tags copy/part a b ">"
                    copy/part b c
                    "</" select inline-tags copy/part a b ">"
                ] d
            )
            any non-markup | skip
        ]
        to end
    ]
    parse/all text markup-rule |
BrianH 22-Aug-2005 [359] | Your first charset only needs one - |
BrianW 22-Aug-2005 [360] | It passes my simple tests, now I need to throw more interesting tests in (multiple tags on the same line, nested tags, whatever) Thanks BrianH, I'll fix that. |
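BrianH's earlier code comment notes that the block of alternatives "can be generated if you have many markup types". A rough sketch of that idea, assuming the inline-tags table from BrianW's rule; the word alternatives and the use of a repeat count in front of skip (in place of the hand-written skip skip) are illustrative choices, not something from the thread.

```rebol
REBOL []

inline-tags: [
    "*" "strong"
    "_" "em"
    "@" "code"
    "--" "small"
]

; Build one "a: mark b: to mark c: n skip d:" alternative per markup type.
alternatives: copy []
foreach [mark tag] inline-tags [
    ; Separate alternatives with the | word, except before the first one.
    if not empty? alternatives [append alternatives first [|]]
    append alternatives compose [
        a: (mark) b: to (mark) c: (length? mark) skip d:
    ]
]

probe alternatives
```

The generated block could then stand in for the hand-written alternatives by naming it inside the rule, for example `any [alternatives :a (...) any non-markup | skip]`, since parse treats a word that refers to a block as a sub-rule.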
BrianH 22-Aug-2005 [361] | Nested tags of the same type won't work at all unless the start and end tags are different, and they won't work here without either recursion or an algorithm that does nesting counts. Be careful with that, because you'd have to update those counts in parens and you can't backtrack through parens. |
BrianW 22-Aug-2005 [362] | Lucky for me, the rules don't support nested tags of the same type. * *strong* text* would probably parse as <strong> </strong>strong <strong> text</strong> |
BrianH 22-Aug-2005 [363] | Note that my last example keeps track of both the start and end tags, even though I don't need to with the markup chars I used. |
BrianW 22-Aug-2005 [364] | I need to test for *strong and _emphasized_ text.* (for example) |
BrianH 22-Aug-2005 [365] | Yours will test for *strong and _emphasized* text._ as well right now (for example) |
BrianW 22-Aug-2005 [366] | works beautifully, no changes needed. You guys rule. |
BrianH 22-Aug-2005 [367] | The generated html might not be pretty though :) |
BrianW 22-Aug-2005 [368x2] | Pretty comes after I know it works, if at all. :-) |
awesome, it even works for "**bold**" text! | |
Tomc 22-Aug-2005 [370] | and won't touch *********************************************** |
BrianH 22-Aug-2005 [371] | Really? The rules look like they'd translate that to <strong></strong> repeated many times. It doesn't? |
BrianW 22-Aug-2005 [372x4] | Lemme write a test and see |
ah, it turns them into <b></b> pairs | |
10 - Not OK - Rampant asterisks are usually ignored
    Expected <<p>***********************************************</p>>
    Got <<p><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><strong></strong>*</p> | |
Part of me wants to just ignore it for now and get on to other stuff. | |
BrianH 22-Aug-2005 [376x3] | Well, if you want exceptions, you gotta code them in. In this case, before the block of your markup rules, as an alternate. |
Like this (at the beginning of the any block): ["**" any "*"] | | |
(Be back, off to a class) | |
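A sketch of where BrianH's exception could sit, assuming BrianW's single-character rule from earlier in the thread as the base; the sample text is made up. The new first alternative consumes any run of two or more asterisks untouched before the markup alternatives get a chance to pair them up.

```rebol
REBOL []

; Hypothetical sample input with one marked span and one run of asterisks.
text: copy "a *b* c ***** d"

markup-chars: charset "*_@"
non-markup: complement markup-chars
inline-tags: ["*" "strong" "_" "em" "@" "code"]

markup-rule: [
    any non-markup
    any [
        ["**" any "*"]          ; exception: leave runs of asterisks alone
        |
        [
            a: "*" b: to "*" c: skip d: |
            a: "_" b: to "_" c: skip d: |
            a: "@" b: to "@" c: skip d:
        ] :a (
            change/part a rejoin [
                "<" select inline-tags copy/part a b ">"
                copy/part b c
                "</" select inline-tags copy/part a b ">"
            ] d
        )
        any non-markup
        | skip                  ; guarantee progress on unmatched markup chars
    ]
    to end
]

parse/all text markup-rule
print text
```

One consequence of this placement is that doubled markers such as "**bold**" are also skipped rather than converted, so it trades that behaviour for leaving rows of asterisks intact.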
BrianW 22-Aug-2005 [379] | Well, I got things behaving by specifying " " at the beginning. It's a start. |
Tomc 22-Aug-2005 [380x2] | ah the 2:46:49 post avoided that |
with the opt[] block if what followed was not a well-formed marked-up string | |
BrianW 22-Aug-2005 [382x2] | hm. Now I just need to figure out how to allow markup at the start of a line. I'll need to look at your code back there. |
yay, got it working. It's ugly, but I got it working. | |
Josh 15-Sep-2005 [384x3] | I'm not seeing a good solution to this at the moment, so I thought I would ask for help. I'm working with parse and I want to create a new set of grammar rules dynamically based on the input grammar. The simplest example I can present is starting with the following rule
    b: [19]
and in the end I want to have this.
    w: {}
    newb: [(append w "the b value is 19") 19 (append w newline)]
While I can very easily insert an APPEND that doesn't contain values from the original rules into the block,
    insert tail newb [(append w newline)]
I'm not sure of a way to insert a parenthesized APPEND with a string that contains info about the block. I hope this is clear (or even possible), but please ask me questions if it is not. |
I suppose the question I'm asking is whether there is a way to force the arguments of an expression to evaluate without actually evaluating the expression. Taking the above example
    a: join "the b value is " first b
    insert head newb [(append w a)]
where I want to evaluate 'a while ending up with the expression:
    insert head newb [(append w "the b value is 19")]
This seems contrary to the nature of first order functions, but I just wanted to check. | |
Excuse me, first-class functions | |
Ladislav 15-Sep-2005 [387] | interesting questions, unfortunately not having time to answer any |
Romano 15-Sep-2005 [388x2] | a: 19 append/only [] to-paren reduce ['append 'w join "the b value is " a] |
or a: 19 append/only [] to-paren compose [append w (join "the b value is " a)] | |
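Applied to Josh's example, Romano's compose approach might look like the sketch below. It assumes REBOL 2; the /only refinements are there so that each paren goes into the rule as a single value rather than being spliced in element by element.

```rebol
REBOL []

b: [19]
w: copy {}
newb: copy b

; compose evaluates only the inner paren, so the string is built now
; while the append call itself stays unevaluated inside the rule.
insert/only newb to-paren compose [
    append w (join "the b value is " first b)
]
append/only newb to-paren [append w newline]

probe newb
```

The probe should show the block Josh was aiming for, [(append w "the b value is 19") 19 (append w newline)], with the join already evaluated and the two append expressions left for parse to run later.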