World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
22-Aug-2005
[350x3]
Remember, for many platforms, Core 2.5.0 is the current version.
Here's a simplified version of my example that can handle multiple 
instances of multiple markup types and be adapted to different end 
tags (thanks Tomc for the idea!):

markup-chars: charset "*~"
non-markup: complement markup-chars
tag1: ["*" "<strong>" "~" "<i>"]
tag2: ["*" "</strong>" "~" "</i>"]
parse/all data [
    any non-markup
    any [

        ; This next block can be generated if you have many markup types...

        [a: copy b "*" copy c to "*" copy d "*" e: |
         a: copy b "~" copy c to "~" copy d "~" e: ]
        :a (change/part a rejoin [tag1/:b c tag2/:d] e)
        any non-markup
    ]
    to end
]
Tomc: "you may be better off with :here skip to guarantee progress"


Put the skip after the paren and I may agree with you there. Of course 
you would skip the number of chars in the replacement text then.
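
The comment in the rule above says the block of alternatives can be generated when there are many markup types. A minimal sketch of one way to do that in REBOL 2 follows. It assumes the same marker opens and closes a span, borrows the positional a: b: c: d: style and the inline-tags table from the later posts, and adds a "| skip" alternative so the loop always advances. The names alts, mark and name are illustrative only.

```rebol
REBOL [Title: "Generating the markup alternatives (sketch)"]

inline-tags: ["*" "strong" "_" "em" "@" "code"]

; Build [[a: "*" b: to "*" c: skip d:] | [a: "_" ...] | ...] from the table.
alts: copy []
foreach [mark name] inline-tags [
    append alts compose/deep [| [a: (mark) b: to (mark) c: skip d:]]
]
remove alts    ; drop the leading |

text: copy "*bold* and _soft_"
parse/all text [
    any [
        ; On a match, go back to the opening marker and rewrite the span.
        alts :a (
            tag: select inline-tags copy/part a b
            change/part a rejoin ["<" tag ">" copy/part b c "</" tag ">"] d
        )
        | skip    ; anything else: move on by one character
    ]
]
print text
```

With this input the script should print <strong>bold</strong> and <em>soft</em>. Adding a markup type then only means adding a pair to inline-tags.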
BrianW
22-Aug-2005
[353x2]
Wow, I'm off getting bored at meetings, come back and you've been 
working hard! Thanks, folks.
Here's what I have right now:

		markup-chars: charset "*_@"
		non-markup: complement markup-chars
		inline-tags: [
			"*" "strong"
			"_" "em"
			"@" "code"
		]

		markup-rule: [
			any non-markup
			any [
				[ a: "*" b: to "*" c: skip d: |
				  a: "_" b: to "_" c: skip d: | 
				  a: "@" b: to "@" c: skip d: ] :a (
					change/part a rejoin [ 
						"<" select inline-tags copy/part a b ">"
						copy/part b c 
						"</" select inline-tags copy/part a b ">"
					] d
				) any non-markup
			]
			to end
		]
		parse text markup-rule
Tomc
22-Aug-2005
[355]
you almost certainly want parse/all
BrianW
22-Aug-2005
[356]
whoops
BrianH
22-Aug-2005
[357]
If you want to guarantee progress with my example and yours (and 
better support multichar markup tags), change the last
  any non-markup
to
  any non-markup | skip
and that would do it.
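
A cut-down sketch of why the skip alternative matters, assuming REBOL 2 and a single "*" marker (this is not the full rule from the posts above). When the first alternative fails, for example on plain text or on a lone asterisk with no closing partner, skip consumes one character and the loop carries on instead of stopping.

```rebol
REBOL [Title: "Guaranteeing progress with | skip (sketch)"]

text: copy "*b* and 2 * 3"
parse/all text [
    any [
        ; A well-formed span: opening marker, content, closing marker.
        a: "*" b: to "*" c: skip d: :a (
            change/part a rejoin ["<strong>" copy/part b c "</strong>"] d
        )
        ; Everything else, including the unmatched "*", is stepped over.
        | skip
    ]
]
print text
```

This should print <strong>b</strong> and 2 * 3. Without the skip alternative the loop would end at the first character that does not start a well-formed span.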
BrianW
22-Aug-2005
[358]
okay, here's a slightly tweaked version that uses a multichar markup 
tag:

        markup-chars: charset "[*_-:---]"
        non-markup: complement markup-chars
        inline-tags: [
            "*" "strong"
            "_" "em"
            "@" "code"
            "--" "small"
        ]

        markup-rule: [
            any non-markup
            any [
                [ a: "*" b: to "*" c: skip d: |
                  a: "_" b: to "_" c: skip d: | 
                  a: "@" b: to "@" c: skip d: |
                  a: "--" b: to "--" c: skip skip d: ] :a (
                    change/part a rejoin [ 
                        "<" select inline-tags copy/part a b ">"
                        copy/part b c 
                        "</" select inline-tags copy/part a b ">"
                    ] d
                ) any non-markup | skip
            ]
            to end
        ]
        parse/all text markup-rule
BrianH
22-Aug-2005
[359]
Your first charset only needs one -
BrianW
22-Aug-2005
[360]
It passes my simple tests, now I need to throw more interesting tests 
in (multiple tags on the same line, nested tags, whatever)

Thanks BrianH, I'll fix that.
BrianH
22-Aug-2005
[361]
Nested tags of the same type won't work at all unless the start and 
end tags are different, and they won't work here without either recursion 
or an algorithm that does nesting counts. Be careful with that 
because you'd have to update those counts in parens and you can't 
backtrack through parens.
BrianW
22-Aug-2005
[362]
Lucky for me, the rules don't support nested tags of the same type.


* *strong* text* would probably parse as <strong> </strong>strong 
<strong> text</strong>
BrianH
22-Aug-2005
[363]
Note that my last example keeps track of both the start and end tags, 
even though I don't need to with the markup chars I used.
BrianW
22-Aug-2005
[364]
I need to test for *strong and _emphasized_ text.* (for example)
BrianH
22-Aug-2005
[365]
Yours will test for *strong and _emphasized* text._ as well right 
now (for example)
BrianW
22-Aug-2005
[366]
works beautifully, no changes needed. You guys rule.
BrianH
22-Aug-2005
[367]
The generated html might not be pretty though :)
BrianW
22-Aug-2005
[368x2]
Pretty comes after I know it works, if at all. :-)
awesome, it even works for "**bold**" text!
Tomc
22-Aug-2005
[370]
and won't touch ***********************************************
BrianH
22-Aug-2005
[371]
Really? The rules look like they'd translate that to <strong></strong> 
repeated many times. It doesn't?
BrianW
22-Aug-2005
[372x4]
Lemme write a test and see
ah, it turns them into <b></b> pairs
10 - Not OK - Rampant asterisks are usually ignored
Expected <<p>***********************************************</p>>

Got <<p><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><b></b><strong></strong>*</p>
Part of me wants to just ignore it for now and get on to other stuff.
BrianH
22-Aug-2005
[376x3]
Well, if you want exceptions, you gotta code them in. In this case, 
before the block of your markup rules, as an alternate.
Like this (at the beginning of the any block):
    ["**" any "*"] |
(Be back, off to a class)
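
A small sketch of that exception in place, assuming REBOL 2 and only the "*" marker (the real rule has more alternatives). The run-of-asterisks alternative comes first, so it is tried before the markup alternative and simply consumes the run.

```rebol
REBOL [Title: "Skipping runs of asterisks (sketch)"]

text: copy "*** and *b*"
parse/all text [
    any [
        ; Exception first: two or more asterisks in a row are left untouched.
        "**" any "*"
        | a: "*" b: to "*" c: skip d: :a (
            change/part a rejoin ["<strong>" copy/part b c "</strong>"] d
        )
        | skip
    ]
]
print text
```

This should print *** and <strong>b</strong>. One side effect of this particular exception is that "**bold**" is also left alone.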
BrianW
22-Aug-2005
[379]
Well, I got things behaving by specifying " " at the beginning. It's 
a start.
Tomc
22-Aug-2005
[380x2]
ah the 2:46:49 post avoided that
with the opt[] block if what followed was not a well-formed marked 
up string
BrianW
22-Aug-2005
[382x2]
hm. Now I just need to figure out how to allow markup at the start 
of a line. I'll need to look at your code back there.
yay, got it working. It's ugly, but I got it working.
Josh
15-Sep-2005
[384x3]
I'm not seeing a good solution to this at the moment, so I thought 
I would ask for help. I'm working with parse and I want to create 
a new set of grammar rules dynamically, based on the input grammar. 
The simplest example I can present starts with the following 
rule

	b: [19]

and in the end I want to have this.
	w: {}
	newb: [(append w "the b value is 19") 19 (append w newline)]


While I can insert an APPEND that doesn't contain values from the 
original rules into the block very easily, 

insert tail newb [(append w newline)]


I'm not sure of a way to insert a parenthesized APPEND with a string 
that contains info about the block.  I hope this is clear (or even 
possible), but please ask me questions if it is not.
I suppose the question I'm asking is whether there is a way to force 
the arguments of an expression to evaluate without actually evaluating 
the expression itself. Taking the above example:

a: join "the b value is " first b

insert head newb [(append w a)]

where I want to evaluate 'a while ending up with the expression:

insert head newb [(append w "the b value is 19")]


This seems contrary to the nature of first order functions, but I 
just wanted to check.
Excuse me, first-class functions
Ladislav
15-Sep-2005
[387]
interesting questions, unfortunately not having time to answer any
Romano
15-Sep-2005
[388x2]
a: 19 append/only [] to-paren reduce ['append 'w join "the b value is " a]
or

a: 19 append/only [] to-paren compose [append w (join "the b value is " a)]
Ingo
15-Sep-2005
[390x3]
You _just_ beat me to it ...
;-)
Forgot to use /only on first try
Josh
15-Sep-2005
[393]
Thank you Romano and Ingo
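
Put together with Josh's original b and w, the compose approach might look like the sketch below (REBOL 2 assumed). The word action is an illustrative name. The block is only built and printed here, not used in a parse call.

```rebol
REBOL [Title: "Building a parse action from data (sketch)"]

w: copy {}
b: [19]

; COMPOSE evaluates only the parenthesized part, so the message is computed
; now while APPEND itself stays unevaluated inside the new paren.
action: to-paren compose [append w (join "the b value is " first b)]

newb: copy b
insert/only newb action                            ; /only keeps the paren whole
insert/only tail newb to-paren [append w newline]

probe newb
```

This should print [(append w "the b value is 19") 19 (append w newline)]. Without /only the contents of the paren would be spliced into newb as separate values, which is the slip Ingo mentions.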
Graham
21-Sep-2005
[394]
How do you parse for a particular integer value ?

parse [ -1 ] [ integer! ]

but I want

parse [ 1] to fail ...
Geomol
21-Sep-2005
[395x2]
you can do it as a string:
>> parse form [-1] ["-1"]
== true
>> parse form [1] ["-1"]
== false
but as a block... hmmm
Gabriele
21-Sep-2005
[397]
>> parse [-1] [1 1 -1]
== true
>> parse [1] [1 1 -1]
== false
Geomol
21-Sep-2005
[398x2]
lol
How did you do that? Why does it work? :-)
ah of course. Number of instances (1 1 means exactly 1, right?).
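
The same trick wrapped in a helper, as a sketch (REBOL 2 assumed; match-int is a made-up name). A bare integer in a block rule is read as a repeat count, so the value to match has to come third, after a minimum and maximum count of 1.

```rebol
REBOL [Title: "Matching one specific integer in a block (sketch)"]

; Returns the rule [1 1 n]: exactly one occurrence of the integer n.
match-int: func [n [integer!]] [reduce [1 1 n]]

print parse [-1] match-int -1
print parse [1] match-int -1
```

As in Gabriele's console session, the first call should give true and the second false.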