World: r3wp

[Parse] Discussion of PARSE dialect

Graham
11-Oct-2005
[503x3]
yes.
Do we have a final function that works yet?
it might be easier to do a parse/all txt #" " and then build up the 
lines one at a time.
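A minimal sketch of the approach Graham describes, assuming REBOL 2 and assuming the goal is lines of at least n characters that end on a word boundary. The name split-by-words is hypothetical and does not come from the thread.

```rebol
; Hypothetical helper: split on spaces, then rebuild lines word by word.
split-by-words: func [txt [string!] n [integer!] /local result line] [
    result: copy []
    line: copy ""
    foreach word parse/all txt " " [
        ; put a single space between words on the same line
        if not empty? line [append line " "]
        append line word
        ; once the line has reached n characters, emit it and start over
        if (length? line) >= n [
            append result copy line
            clear line
        ]
    ]
    ; keep whatever is left as the last, shorter line
    if not empty? line [append result copy line]
    result
]

probe split-by-words "aa bb cc dd" 4
```

Runs of several spaces show up as empty words in this sketch, so the original spacing is not preserved exactly.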
Tomc
11-Oct-2005
[506x2]
I can get rid of the infinite nones by making an error ...
but that doesn't really count
Ladislav
11-Oct-2005
[508]
split-text: func [
	txt n 
	/local frag result bl stop-rule
][
    bl: copy []
    result: copy ""
    stop-rule: none
    end-rule: [to end (stop-rule: [end skip])]
    frag-rule: [
        copy frag [n skip | end-rule] (frag: any [frag ""] print frag append result frag)
        copy frag [to #" " | end-rule] (frag: any [frag ""] print frag append result frag append bl copy result clear result)
    ]
    parse/all txt [some [frag-rule stop-rule]]
    bl
]
Tomc
11-Oct-2005
[509]
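split-text: func [txt [string!] n [integer!]
    /local frag fraglet bl frag-rule bs ws
][  ws: charset [#" " #"^-" #"^/"]   ; whitespace: space, tab, newline
    bs: complement ws                ; everything that is not whitespace
    bl: copy []
    frag-rule: [
        any ws                       ; drop leading whitespace
        copy frag [
            [1 n skip                ; take up to n characters ...
                opt[copy fraglet some bs]   ; ... then finish the current word
            ]
            | to end skip            ; always fails, which ends the SOME loop
        ]
        (all [fraglet join frag fraglet]   ; no effect: JOIN's result is discarded, frag already spans fraglet
         insert tail bl frag 
         print frag
        )
    ]
    parse/all txt [some frag-rule]
    bl
]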
Graham
11-Oct-2005
[510]
I have to say Tomc is a winner !
Tomc
11-Oct-2005
[511]
it was the copy frag 1 n skip
Graham
11-Oct-2005
[512]
Unspecified .. but Tom's function removes leading whitespace as 
well.  Ladislav's preserves whitespace.
Ladislav
11-Oct-2005
[513]
yes
Graham
11-Oct-2005
[514]
should stick both up on the library ..
Tomc
11-Oct-2005
[515x2]
can easily comment out the any ws line if that is not desired  ... 
or make it a refinement
good night thanks for the puzzle
Graham
11-Oct-2005
[517x2]
thanks for the input.
and thanks both for the solution.
Ladislav
11-Oct-2005
[519]
good night, Tom
Graham
11-Oct-2005
[520x2]
Next Devcon they should set aside a few mins for some programming 
contests :)
and virtual spectators being allowed to compete.
Ammon
11-Oct-2005
[522x2]
Not a half bad idea.
Word on the street is that you're offering to host this next one.
Ladislav
11-Oct-2005
[524]
hi Ammon, missed you in Italy
Ammon
11-Oct-2005
[525]
Hello.  I missed you all as well.
Graham
11-Oct-2005
[526]
well, if people want to come to Wellington .. I'll see what I can 
do.
Ladislav
16-Oct-2005
[527x2]
I checked RAMBO and the discussed issue seems to be there: #3579 
by Piotr.
Gabriele: the [copy frag [n skip | to end] (insert tail result any 
[frag ""])] looks too complicated to justify PARSE setting words to 
NONE; I would at least like to add it to the ticket as an example.
MichaelB
23-Oct-2005
[529]
I just found out that I can't do the following:
s: "a b c"
s: "a c b"
parse s ["a" to ["b" | "c"] to end]


The two strings are only meant to show that b and c can come in either 
order. But 'to and 'thru don't work with subrules. The documentation 
doesn't even say that they should, but wouldn't it be natural? Or am 
I missing some complication for the parser if it supported this (in 
the general case the parser would need indefinite look-ahead; is this 
the problem)? How are other people doing things like this? What if 
you want to parse something like "a bla bla bla c" or "a bla bla bla 
d" and you are interested in the "bla bla bla", which might be 
arbitrary text and thus can't be put into rules?
Volker
23-Oct-2005
[530x2]
Carl mentioned performance-problems. Although everyone asks for it.
i use 
 "a" any[ "b" | "c" | skip ] to end
Even slower and less elegantly, but works.
MichaelB
23-Oct-2005
[532]
OK, thanks. Didn't know this. But this solution will work for me 
as well. In a sense this is interesting, as skip isn't a real token, 
but a command - but it's treated as a token. :-)
Chris
23-Oct-2005
[533x2]
Volker, that will return true for "a" as well as "a b c".
I tend to use charsets for this, though again, there is probably 
a performance cost...
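A short illustration of Chris's point, assuming REBOL 2. The word rule is only a hypothetical name for Volker's rule block, and parse/all is used so that whitespace handling plays no part. Because skip matches any character, the ANY loop runs to the end of the input whether or not a "b" or "c" was seen.

```rebol
; Volker's workaround, stored under a hypothetical name
rule: ["a" any ["b" | "c" | skip] to end]

print parse/all "a b c" rule   ; true
print parse/all "a" rule       ; true, although there is no "b" or "c"
print parse/all "a x y" rule   ; true as well
```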
Volker
23-Oct-2005
[535x2]
true. never thought about that.
'some would do half the trick. except if nothing is found, it eats 
all and counts as true here.
Chris
23-Oct-2005
[537]
non-bc: complement charset "bc"
parse "a b c" ["a" any non-bc ["b" | "c"] to end]
Volker
23-Oct-2005
[538]
but if it is longer, like "be", every "bi" would fail too.
Chris
23-Oct-2005
[539]
In that case, you need to elaborate a little.
Volker
23-Oct-2005
[540x2]
s: "abibet"
parse s ["a" any non-bc ["be" | "ce"] to end]
More complex than i thought, with that missing thing.
Chris
23-Oct-2005
[542]
Complex indeed.
Volker
23-Oct-2005
[543]
Maybe i used this?
 parse s ["a" some[ "be" break | "ce" break | skip] p: to end]

if nothing is found, it skips to the end. returns true, but if you 
require something after it,
that fails (because already at end).
Izkata
23-Oct-2005
[544]
Michael>I just found out that I can't do the following:
s: "a b c"
s: "a c b"
parse s ["a" to ["b" | "c"] to end]<

parse s ["a" [to "b" | to "c"] to end]
Chris
23-Oct-2005
[545]
Izkata, that breaks if the "c" comes before the "b".
Izkata
23-Oct-2005
[546]
I agree, it should work the other way, too, though..
Chris
23-Oct-2005
[547]
Iz -- d'oh...
Izkata
23-Oct-2005
[548x2]
^.^
But isn't that what is wanted?  (to ["b" | "c"])
Chris
23-Oct-2005
[550x3]
V: perhaps better in this case to use 'while and 'find rather than 
'parse?
Izkata:
>> non-bc: complement bc: charset "bc"
== make bitset! 64#{////////////////8/////////////////////////8=}
>> s1: "a b c"
== "a b c"
>> s2: "a c b"
== "a c b"
>> parse s1 ["a" [to "b" | to "c"] mk: to end] mk
== "b c"
>> parse s2 ["a" [to "b" | to "c"] mk: to end] mk
== "b"
>> parse s1 ["a" any non-bc mk: ["b" | "c"] to end] mk
== "b c"
>> parse s2 ["a" any non-bc mk: ["b" | "c"] to end] mk
== "c b"
Note the difference when parsing 's2...
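One further way to get the effect of to ["b" | "c"] is a recursive rule. This is a sketch assuming REBOL 2, not something proposed in the thread; the names to-bc, to-be-ce, pos and here are hypothetical. The rule tries the alternatives at the current position and, if none matches, skips one character and calls itself. The :pos step moves the input back to the start of the match, which gives 'to rather than 'thru behaviour.

```rebol
; Emulates: to ["b" | "c"]
to-bc: [pos: ["b" | "c"] :pos | skip to-bc]

parse/all "a b c" ["a" to-bc here: to end]
probe here                                       ; "b c"
parse/all "a c b" ["a" to-bc here: to end]
probe here                                       ; "c b"
print parse/all "a x y" ["a" to-bc to end]       ; false, nothing to stop at

; The same idea with longer alternatives, as in Volker's "abibet" case
to-be-ce: [pos: ["be" | "ce"] :pos | skip to-be-ce]
parse/all "abibet" ["a" to-be-ce here: to end]
probe here                                       ; "bet"
```

The rule recurses once per skipped character, so very long inputs may be a concern; the charset approach above avoids that when the alternatives are single characters.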