World: r3wp

[Parse] Discussion of PARSE dialect

Chris
5-Nov-2008
[2789]
Would a) work?  Would b) reset the string as the first rule didn't 
match?
BrianH
5-Nov-2008
[2790]
a) would work.

b) would not likely reset the string, just like code blocks don't 
undo.
BrianH
6-Nov-2008
[2791x3]
You might be able to do b) like this:

 parse "abcdef123" [use [a] [remove ["abc" a: "123" :a] | remove ["abcd" a: "ef123" :a] to end]]
or like this:

 parse "abcdef123" [use [a] [remove ["abc" a: "123" :a | "abcd" a: "ef123" :a] to end]]
The standard method of putting the longest match first will still 
be the best.
Ooo, I just figured out another possibility for b):

 parse "abcdef123" [to "123" reverse remove ["abc" end] | to "ef123" reverse remove ["abcd" end]]
Chris
6-Nov-2008
[2794]
parse "[this]" [remove "[" to end reverse remove ["]"]]
BrianH
6-Nov-2008
[2795x2]
In reverse, the end is the beginning :)
And the beginning is the end :)
Chris
6-Nov-2008
[2797x2]
Shorter with tail, perhaps?

parse "[this]" [remove "[" tail reverse remove "]"]
How about this?

	parse "abc" ["a" to end reverse "bc"]
Graham
6-Nov-2008
[2799]
Is there anything that can be done to assist block parsing?
BrianH
6-Nov-2008
[2800]
Yup, that would work.
Graham
6-Nov-2008
[2801x2]
If you turn a string into a block and there's something present which 
is an illegal rebol value ... you get an error.
Is there a way such values can be turned into a legal type ....and 
not get an error?
BrianH
6-Nov-2008
[2803]
That sounds like unbound words. In R3 unbound words are not an illegal 
type.
Graham
6-Nov-2008
[2804x2]
great.
so if we have words like asdfs@ etc, we don't get illegal email types 
etc?
BrianH
6-Nov-2008
[2806x3]
In R2 you just bind the words to system/words, or LOAD instead of 
TO-BLOCK.
That's a different issue, that's illegal REBOL syntax. You are out 
of luck there.
Just don't try to parse illegal REBOL syntax as if it were legal 
and you'll be fine. You have string parse for non-REBOL strings.
Chris
6-Nov-2008
[2809]
Brian, would the above example return true, as the entire string 
has been matched?  Or just work in that it matches the "bc"?
Graham
6-Nov-2008
[2810]
well, if you have a data input screen ... then the user can enter 
all sorts of stuff
BrianH
6-Nov-2008
[2811x5]
Unless that user is trained to enter the right syntax, you'll have 
to clean up before you can load the string.
Chris, the above example would not return true.
String parse is a good way to clean up input :)
Graham, the parser that turns strings of REBOL syntax into REBOL 
data is LOAD, not PARSE. If you want to clean up those strings or 
loosen the syntax, LOAD is the function to change.
On the other hand, you can write a parser in string PARSE that would 
read data in whatever format you like.
Graham
6-Nov-2008
[2816x3]
I was asking if parsing data could be made easier by altering the 
way values are turned into rebol types
>> to-block " @"
** Syntax Error: Invalid email -- @
** Near: (line 1) @
string parsing is so tedious!
BrianH
6-Nov-2008
[2819x2]
Now that is interesting. LOAD and TO-BLOCK do that. It would be interesting 
to write a set of rules in PARSE that read REBOL syntax and generate 
REBOL data or warnings instead of errors. For your purposes you might 
consider LOAD/NEXT in a loop inside of TRY blocks.
If you write your own handler for structural delimiters to build 
your own blocks you can use load/next for the simple literal values.
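A minimal sketch of the LOAD/NEXT-in-a-loop idea, assuming REBOL 2 semantics where LOAD/NEXT returns a block of [value remaining-string]. The name load-relaxed, the ws/non-ws charsets, and the choice to keep rejected tokens as strings are illustrative assumptions, not anything posted in this thread. It does not handle structural delimiters: a bad token inside [ ] makes the whole bracketed chunk fall back to whitespace-split strings.

```rebol
REBOL [Title: "Tolerant loader sketch (hypothetical helper)"]

ws:     charset " ^-^/"      ; space, tab, newline
non-ws: complement ws

load-relaxed: func [
    "Load a string value by value; tokens LOAD rejects are kept as strings."
    input [string!]
    /local out res pos
][
    out: copy []
    forever [
        ; skip leading whitespace and stop at the end of the input
        parse/all input [any ws pos: to end]
        input: pos
        if tail? input [break]
        either error? try [res: load/next input] [
            ; LOAD rejected this token: keep the raw text up to the next whitespace
            parse/all input [some non-ws pos: to end]
            append out copy/part input pos
            input: pos
        ][
            ; LOAD accepted one value: keep it and continue after it
            append/only out first res
            input: second res
        ]
    ]
    out
]

probe load-relaxed "abc @ 42"
```

The idea is that a token like @, which makes TO-BLOCK fail with a syntax error, ends up in the result as a string instead of aborting the whole load.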
Graham
6-Nov-2008
[2821]
Perhaps to block can convert "illegal" values into strings instead 
of flagging an error?
BrianH
6-Nov-2008
[2822x2]
Personally, I just prefer string parsing with PARSE, but I'm weird 
that way. We're trying to make PARSE less tedious though.
I would rather TO-BLOCK convert illegal values into error values, 
of the syntax error type.
Graham
6-Nov-2008
[2824]
oh well, make it an option for those of us who prefer differently
BrianH
6-Nov-2008
[2825x3]
Not up to me. I think that the decision was that relaxing REBOL syntax 
was a slippery slope that would lead to uncatchable syntax errors. 
The syntax error mechanism is one of the most valuable debugging 
tools we have. You might need to consider the possibility that if 
what your users are entering isn't legal REBOL syntax, it isn't REBOL 
syntax and needs a different parser.
REBOL syntax is weird by user standards.
It might be better to come up with a good set of flexible predefined 
PARSE rules for data entry that can be reused without much trouble 
by a variety of programmers. That would solve the problem for everyone 
without making it difficult for REBOL debugging.
Graham
6-Nov-2008
[2828]
to-block/relax :)
BrianH
6-Nov-2008
[2829]
You can't do that. TO-BLOCK is a wrapper for TO BLOCK! val and TO 
is an action! with a fixed arity. What you want is a parser for user 
data that isn't in REBOL syntax. That we can do.
btiffin
6-Nov-2008
[2830]
Graham; I've been trying to convince Carl for this. TRANSCODE in 
R3 doesn't throw syntax errors.
Graham
6-Nov-2008
[2831]
what's transcode?
Chris
6-Nov-2008
[2832]
Graham, my 'import (and 'as) function kind of addresses this.  
I've also been working on loading non-rebol data (in some cases numbers 
that use , -- 1,000 or us-style dates mm/dd/yyyy).  This'll be even 
better with some transformational chops in parse...
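As a rough illustration of loading non-REBOL data with string PARSE, here is a sketch for comma-grouped numbers such as 1,000. It assumes REBOL 2; load-number and digit are hypothetical names, and this is not the 'import function mentioned above.

```rebol
REBOL [Title: "Comma-grouped number sketch (hypothetical helper)"]

digit: charset "0123456789"

load-number: func [
    "Return an integer for text like 1,000 or 12,345,678; NONE if it does not match."
    text [string!]
][
    ; one to three leading digits, then any number of ,ddd groups
    if parse/all text [1 3 digit any ["," 3 digit]] [
        to integer! replace/all copy text "," ""
    ]
]

probe load-number "1,000"    ; 1000
probe load-number "12,34"    ; none
```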
btiffin
6-Nov-2008
[2833]
The new LOAD/NEXT
Graham
6-Nov-2008
[2834x2]
I agree that things should be easier to parse - and that includes 
preparing data to be parsed.
Having predefined parse rules is just more stuff to remember.
btiffin
6-Nov-2008
[2836]
Oldes? Henrik? Maarten? posted a nice all-value loader for me, it's 
in some of the construction boss code, but I don't have that source 
ported to this machine yet...  I didn't learn anything, as I just 
cut'n'pasted the snippet.
Graham
6-Nov-2008
[2837]
any-one!
BrianH
6-Nov-2008
[2838]
It's not more to remember, it's just a module to load :)