World: r3wp

[Parse] Discussion of PARSE dialect

Graham
13-Aug-2005
[294]
np
BrianW
13-Aug-2005
[295x2]
Hm. Still having issues:

>> rule: [ copy chunk to "^/^/" (append lines chunk) skip 2 ]
== [copy chunk to "^/^/" (append lines chunk) skip 2]
>> text: "Dude.^/^/Sweet!^/"
== "Dude.^/^/Sweet!^/"
>> lines: copy []
== []
>> parse/all text rule
** Script Error: Invalid argument:
** Near: parse/all text rule
>>
oh, wait.
Graham
13-Aug-2005
[297]
oops .. should be "2 skip" and not "skip 2"
BrianW
13-Aug-2005
[298x2]
That's okay, I forgot to include the possibility that there would 
be only one line, so my tests were blowing up somewhere else.
Hm, that misses the last line if the text doesn't end in "^/^/"
Graham
13-Aug-2005
[300]
easy enough .. add {^/^/} to end of text before parsing
BrianW
13-Aug-2005
[301x2]
That *is* a nice and easy solution, thanks a lot!
sweet, all my tests pass now. Time to add more tests :-)
Graham
13-Aug-2005
[303]
sometimes I find it easier to change the data than to change the 
rule :)
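A minimal sketch of the padding trick discussed above, assuming REBOL 2 parse semantics. The name `data` and the `if chunk` guard are additions for illustration and do not appear in the thread; the guard is there because an empty `copy` yields none in REBOL 2.

```rebol
REBOL []

text: "Dude.^/^/Sweet!^/"
lines: copy []

; Pad a copy of the text so the last chunk is also followed by "^/^/".
; JOIN copies its first argument, so TEXT itself is left untouched.
data: join text "^/^/"

parse/all data [
    any [
        copy chunk to "^/^/"            ; everything up to the next blank line
        (if chunk [append lines chunk]) ; skip empty chunks (none in REBOL 2)
        2 skip                          ; step over the two newlines
    ]
    to end
]

probe lines   ; == ["Dude." "Sweet!"]
```

Note that the trailing newline after "Sweet!" is absorbed by the padding, which may or may not be what you want.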
BrianW
13-Aug-2005
[304x2]
Hey, as long as it works.
Working on a textile parser over here to build my 'parse skills and 
make it easier to build my website with Rebol
Graham
13-Aug-2005
[306]
not that you shouldn't do it, but I use http://www.rebol.it/~steel/retools/remark/
BrianW
13-Aug-2005
[307]
I'm <>-phobic ;-)
Graham
13-Aug-2005
[308]
Remark takes care of all of those taggy things
BrianW
13-Aug-2005
[309]
and all of my pages are already in textile format, and I think a 
few of my friends would be more interested in Rebol if I had a textile 
parser for them
Graham
13-Aug-2005
[310]
what's textile ?  A type of fabric ?
BrianW
13-Aug-2005
[311]
http://hobix.com/textile/
Graham
13-Aug-2005
[312]
A structured text variant ...
BrianW
13-Aug-2005
[313x2]
yep
I like some of the different structured text formatting systems
shadwolf
14-Aug-2005
[315]
Volker thank you it works great now and the code rule is tiny ;)
Volker
14-Aug-2005
[316]
:)
BrianW
18-Aug-2005
[317]
Any parse suggestions for trying to find #"(" without a matching 
#")" in text that might also have proper pairs of parens?
Henrik
18-Aug-2005
[318]
you probably need to count them and see where you end up after finding 
all parens. I'm not sure if it can be used to see which are missing...
BrianW
18-Aug-2005
[319]
That would probably work fine. This is for the textile parser, where 
a declaration like "p(." means a paragraph with left margin of 1em, 
repeated for additional ems of margin. Counting will be quite useful.
Henrik
18-Aug-2005
[320]
count one up on #"(" and one down on #")". If correct, the end result 
is zero.
BrianW
18-Aug-2005
[321x3]
thanks
Perfect, Henrik. That took me exactly where I needed to go for this 
feature.
Gonna have to work on my test-simple.r script soon to provide better 
summaries. The number of tests that are passing in this thing is 
getting rather large!
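The counting idea Henrik describes might look roughly like this, assuming REBOL 2. The function name `paren-balance` is hypothetical and not from the thread.

```rebol
REBOL []

; Hypothetical helper: returns the number of "(" minus the number of ")".
paren-balance: func [text [string!] /local count] [
    count: 0
    parse/all text [
        any [
            "(" (count: count + 1)   ; one up on an opening paren
            | ")" (count: count - 1) ; one down on a closing paren
            | skip                   ; any other character
        ]
    ]
    count
]

print paren-balance "p((. indented"    ; 2
print paren-balance "p(. see (this)"   ; 1
print paren-balance "all (paired) up"  ; 0
```

A result of zero only says the totals match; a string like ")(" also gives zero, so checking that the count never goes negative would need an extra test inside the rule.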
BrianW
22-Aug-2005
[324]
Any tips on how to convert " *text* " to " <strong>text</strong>"?
Sunanda
22-Aug-2005
[325]
One way:
   replace text "*" <strong>
   replace text "*" </strong>

If there are multiple pairs of "*", repeat in a loop until the length 
no longer changes
Graham
22-Aug-2005
[326x2]
You should look at make-doc text to see how it parses stuff.  I believe 
it's a similar problem.
source not text.
Geomol
22-Aug-2005
[328]
Brian, you can look at my NicomDoc format http://home.tiscali.dk/john.niclasen/nicomdoc/

Look for the 'magic' in "nicomdoc.r", where you'll find rules for 
such things. (I guess you have to handle multiple ****.)
BrianW
22-Aug-2005
[329]
ah, thanks. Sunanda, that solution won't quite work if a #"*" appears 
without a match. I'll go look at NicomDoc
BrianH
22-Aug-2005
[330x2]
parse/all data [
    any [
        to "*" a: skip b: to "*" c: skip d: :a
        (change/part a rejoin ["<strong>" copy/part b c "</strong>"] d)
    ]
    to end
]
You can make it a little more complicated to add more markup types, 
but the basic structure is the same. The trick is the :a before the 
paren - otherwise it won't work, and you can crash older versions 
of REBOL.
Tomc
22-Aug-2005
[332x2]
something along the lines of   (untested)
;;; make the word set more restrictive if no space etc
;;; but this is most permissive for your example
word: complement charset "*"
rule: [
	to "*" skip
	[copy item some word "*" (append output join "" [<tag> item </tag>])]
	| skip
]
BrianW
22-Aug-2005
[334x2]
That works nicely too! I'll look more at NicomDoc later, but BrianH's 
tip makes tests for "*test*" and "*test" pass
I'll have to explore Tomc's solution when I get back from my meeting. 
Thanks, folks
BrianH
22-Aug-2005
[336x5]
markup-chars: charset "*~"
non-markup: complement markup-chars
tag1: ["*" "<strong>" "~" "<i>"]
tag2: ["*" "</strong>" "~" "</i>"]
parse/all data [
    any non-markup
    any [

        ["*" a: skip b: to "*" c: skip d: | "~" a: skip b: to "~" c: skip 
        d: ] :a (
            change/part a rejoin [
                select tag1 copy/part a b
                copy/part b c
                select tag2 copy/part c d
            ] d
        ) any non-markup
    ]
    to end
]
No nesting, but with a little recursion and different start and end 
tags, this can be adapted to handle that too.
If you want to determine whether there have been any replacements, 
change the second any to some and parse will return true only when 
replacements have been made. Be careful to avois use of the markup 
characters in your replacement text.
avios: avoid
Whoops, an error. Change:

        ["*" a: skip b: to "*" c: skip d: | "~" a: skip b: to "~" c: skip 
        d: ] :a (
to:

        [a: "*" b: to "*" c: skip d: | a: "~" b: to "~" c: skip d: ] :a (

Silly me :(
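With that correction folded in, the whole block might read as follows. This is a sketch assuming REBOL 2; the sample `data` string is made up for illustration, everything else is taken from the messages above.

```rebol
REBOL []

; Sample input, assumed for illustration. COPY keeps the literal unmodified.
data: copy "a *bold* and ~italic~ word"

markup-chars: charset "*~"
non-markup: complement markup-chars
tag1: ["*" "<strong>" "~" "<i>"]
tag2: ["*" "</strong>" "~" "</i>"]

parse/all data [
    any non-markup
    any [
        ; a..b is the opening marker, b..c the text, c..d the closing marker
        [a: "*" b: to "*" c: skip d: | a: "~" b: to "~" c: skip d:]
        :a (
            ; the position is reset to A before the string is changed
            change/part a rejoin [
                select tag1 copy/part a b
                copy/part b c
                select tag2 copy/part c d
            ] d
        ) any non-markup
    ]
    to end
]

print data   ; a <strong>bold</strong> and <i>italic</i> word
```

An unmatched marker such as the one in "*test" makes both alternatives fail, so the loop stops and the string is left as it was.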
Tomc
22-Aug-2005
[341]
w: complement charset "*"
rule: [	
	to "*" here: "*"
	opt[ 
		copy item some w "*" there:
		(change/part :here join "" [<strong> item </strong>] :there)
	]
]
parse/all str [some rule]
BrianH
22-Aug-2005
[342]
Tomc, that will crash older versions of REBOL, and not work on newer 
versions. You need to reset the parse position to before the change, 
before the paren where you make the change. Otherwise parse will 
be referencing a point off the end of the string at the end of the 
paren, before you can reset it. This used to crash REBOL so bad the 
interpreter disappeared.
Tomc
22-Aug-2005
[343]
brianh   please supply a str that fails on current versions, so I 
can see what you mean