r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

Graham
6-Nov-2008
[2818]
string parsing is so tedious!
BrianH
6-Nov-2008
[2819x2]
Now that is interesting. LOAD and TO-BLOCK do that. It would be interesting 
to write a set of rules in PARSE that read REBOL syntax and generate 
REBOL data or warnings instead of errors. For your purposes you might 
consider LOAD/NEXT in a loop inside of TRY blocks.
If you write your own handler for structural delimiters to build 
your own blocks you can use load/next for the simple literal values.
Graham
6-Nov-2008
[2821]
Perhaps to block can convert "illegal" values into strings instead 
of flagging an error?
BrianH
6-Nov-2008
[2822x2]
Personally, I just prefer string parsing with PARSE, but I'm weird 
that way. We're trying to make PARSE less tedious though.
I would rather TO-BLOCK convert illegal values into error values, 
of the syntax error type.
Graham
6-Nov-2008
[2824]
oh well, make it an option for those of us who prefer differently
BrianH
6-Nov-2008
[2825x3]
Not up to me. I think that the decision was that relaxing REBOL syntax 
was a slippery slope that would lead to uncatchable syntax errors. 
The syntax error mechanism is one of the most valuable debugging 
tools we have. You might need to consider the possibility that if 
what your users are entering isn't legal REBOL syntax, it isn't REBOL 
syntax and needs a different parser.
REBOL syntax is weird by user standards.
It might be better to come up with a good set of flexible predefined 
PARSE rules for data entry that can be reused without much trouble 
by a variety of programmers. That would solve the problem for everyone 
without making it difficult for REBOL debugging.
Graham
6-Nov-2008
[2828]
to-block/relax :)
BrianH
6-Nov-2008
[2829]
You can't do that. TO-BLOCK is a wrapper for TO BLOCK! val and TO 
is an action! with a fixed arity. What you want is a parser for user 
data that isn't in REBOL syntax. That we can do.
btiffin
6-Nov-2008
[2830]
Graham; I've been trying to convince Carl for this   TRANSCODE in 
R3 doesn't throw syntax errors.
Graham
6-Nov-2008
[2831]
what's transcode?
Chris
6-Nov-2008
[2832]
Graham, my 'import (and 'as) function kind of addresses this.  
I've also been working on loading non-REBOL data (in some cases numbers 
that use , -- 1,000 or US-style dates mm/dd/yyyy).  This'll be even 
better with some transformational chops in parse...
btiffin
6-Nov-2008
[2833]
The new LOAD/NEXT
Graham
6-Nov-2008
[2834x2]
I agree that things should be easier to parse - and that includes 
preparing data to be parsed.
Having predefined parse rules is just more stuff to remember.
btiffin
6-Nov-2008
[2836]
Oldes? Henrik? Maarten? posted a nice all-value loader for me, it's 
in some of the construction boss code, but I don't have that source 
ported to this machine yet...  I didn't learn anything, as I just 
cut'n'pasted the snippet.
Graham
6-Nov-2008
[2837]
any-one!
BrianH
6-Nov-2008
[2838]
It's not more to remember, it's just a module to load :)
Graham
6-Nov-2008
[2839]
Can you give me an example of what you are thinking of?
Tomc
6-Nov-2008
[2840]
the ability to remove/override/extend the set of chars considered 
"special"  in  string splitting
Graham
6-Nov-2008
[2841x2]
like " ?
some SQL dialects have the handy ability to set the current delimiter
Tomc
6-Nov-2008
[2843x3]
the ability to recognize all datatypes, not just a few like integer! 
and tag!, in string parsing
yes Graham, quote is one that can be hard to work around or can show up 
unexpectedly
comma and space seem to be others
Graham
6-Nov-2008
[2846]
block parsing .. you can't match specific integers
Pekr
6-Nov-2008
[2847]
Graham - there is going to be a LIT keyword to match integers ....
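
A minimal sketch of a common workaround for the point above, assuming REBOL 2 block parsing. The words `match-five`, `n`, and `check` are made up for illustration. A bare integer in a rule is read as a repeat count, so the sketch captures the value first and then forces a failure when it is not the one wanted. The intent is that the first call succeeds and the second fails.

```rebol
; Capture any integer, then veto the match unless it equals 5.
match-five: [
    set n integer!
    ; IF returns none when the test is false; a none rule is a no-op.
    ; [end skip] can never match, so it acts as a forced failure.
    (check: if n <> 5 [[end skip]])
    check
]

print parse [5 "a"] [match-five string!]
print parse [6 "a"] [match-five string!]
```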
Tomc
6-Nov-2008
[2848x5]
this may not be a well-formed thought but ... the ability to directly 
feed a port into parse
maybe something like
parse open tcp://where.ever:555 rule
where closing the port would determine when, if ever, parse was finished
prt: open ...
parse/port prt [  ...
	insert  prt something
...]
Pekr
6-Nov-2008
[2853]
hmm, continuous parse ... I requested it a long time ago, 
and IIRC even Carl said that it might be useful. Imagine e.g. 
encoders ... You read a stream from a file in chunks, and you evaluate. 
The problem is that when you need to backtrack, you would need to cache 
the stream. Dunno if something like that is possible at all ...
sqlab
6-Nov-2008
[2854]
I don't like the result of none when parsing strings.


parse/all ",,," [(block: copy []) any [thru "," copy a to "," (append block a)] to end]
should give  
>> block
== ["" ""]
and not
== [none none]
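
A small sketch of one way to get the empty strings with the behaviour described above, assuming REBOL 2, where COPY sets the word to none on a zero-length match. The only change to the rule is the fallback inside the paren.

```rebol
block: copy []
parse/all ",,," [
    any [
        thru "," copy a to ","
        ; fall back to a fresh empty string when COPY captured nothing
        (append block any [a copy ""])
    ]
    to end
]
probe block   ; == ["" ""]
```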
Sunanda
6-Nov-2008
[2855]
My suggested improvement to parse would be a trace (or debug) refinement:
    trace-output-word: copy [] 
    parse/trace string rules trace-output-word 

I'm not entirely sure how it would work. That would depend in part 
on how parse works internally, and so what trace points are possible. 
But, as a minimum, I'd expect it to show me each rule that triggers 
a match, and the current position of the string being parsed.
Parse would append trace info to the trace-output word.


Otherwise, parse is too big a black box for anyone other than very 
patient experts.
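
Until something like that exists, the same information can be collected by hand. The following is only a sketch of manual instrumentation in REBOL 2, not the proposed /trace refinement: each rule marks its start position and logs its own name and index when it matches. The words `trace-log`, `number`, `name`, and `pos` are illustrative.

```rebol
trace-log: copy []
digit:  charset "0123456789"
letter: charset [#"a" - #"z" #"A" - #"Z"]

; Each rule remembers where it started and logs itself on success.
number: [pos: some digit  (repend trace-log ['number index? pos])]
name:   [pos: some letter (repend trace-log ['name   index? pos])]

parse/all "12 ab 7" [any [number | name | " "]]
probe trace-log   ; == [number 1 name 4 number 7]
```

The log only shows rules that matched; failed attempts would need their own logging alternative in each rule.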
Anton
6-Nov-2008
[2856]
Buffered parse - that should be added to the Parse Project  DocBase 
page. It's a big one, though.
Dockimbel
6-Nov-2008
[2857x2]
Streamed parsing with backtracking: sure, it's possible, I've been doing 
that in the postgresql driver since 2001 and more recently in the experimental 
async mysql driver released last year. (It's not done in an easily 
reusable way, though.)
Tracing parse: IMHO, it would be more efficient to add a PARSE mezz 
for debugging parse rules. (That requires emulating the PARSE 
command, which is not a difficult task.)
BrianH
6-Nov-2008
[2859x2]
Tracing support would be good to add, but start it and stop it with 
the TRACE native.
Tomc, parsing an open port has been a wish of mine for years. In 
theory you could handle backtracking with buffering.
Pekr
6-Nov-2008
[2861]
BrianH: how do you know how much to store in cache and when to flush 
it?
Robert
6-Nov-2008
[2862]
Just cache the whole thing ;-)
BrianH
6-Nov-2008
[2863]
Well, when PARSE enters a block it saves the position at the start 
of that block. If you have to backtrack that is as far back as you 
would need to cache. PARSE could optimize this by not saving backtracking 
info unless there is an alternate later on in the block. You could 
then minimize caching in some cases by rearranging your rules to 
use as little backtracking as possible, and none at the top level.
Pekr
6-Nov-2008
[2864]
:-) Very useful for xy MB streamed video :-)
BrianH
6-Nov-2008
[2865]
I was thinking streamed XML, but yes :)
Anton
6-Nov-2008
[2866x2]
Interesting, Brian. I was going to suggest:
DISPENSE: a PARSE command to mark points in the data past which no 
backtracking is needed. PARSE can use this information to dispense 
with older buffer data that is no longer needed; otherwise it holds 
and accumulates the data. This would be used for very large or 
unbounded-length data streams, e.g. internet radio.
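
The idea can be approximated today in user code. The following is a rough sketch in REBOL 2, with plain strings standing in for chunks read from a port. The words `feed`, `buffer`, `lines`, and `mark` are made up for illustration, and there is no DISPENSE keyword involved. A set-word at the end of each completed item records a point that will never be backtracked past, and everything before it is dropped from the buffer after each pass.

```rebol
buffer: make string! 4096
lines:  copy []

feed: func [chunk [string!] /local line mark] [
    append buffer chunk
    mark: buffer
    parse/all buffer [
        any [
            copy line to newline skip
            (append lines any [line copy ""])
            mark:        ; commit point: nothing before here is needed again
        ]
    ]
    remove/part buffer mark   ; discard the consumed part of the buffer
]

feed "alpha^/be"
feed "ta^/gam"
probe lines    ; == ["alpha" "beta"]
probe buffer   ; == "gam"
```

The incomplete tail of each chunk stays in the buffer until the next chunk completes it, so the cache never grows beyond one unfinished item plus the latest chunk.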