World: r3wp
[Parse] Discussion of PARSE dialect
btiffin 6-Nov-2008 [2830] | Graham; I've been trying to convince Carl for this TRANSCODE in R3 doesn't throw syntax errors. |
Graham 6-Nov-2008 [2831] | what's transcode? |
Chris 6-Nov-2008 [2832] | Graham, my 'import (and 'as) function kind of addresses this. I've also been working on loading non-rebol data (in some cases numbers that use , -- 1,000 or us-style dates mm/dd/yyyy). This'll be even better with some transformational chops in parse... |
btiffin 6-Nov-2008 [2833] | The new LOAD/NEXT |
Graham 6-Nov-2008 [2834x2] | I agree that things should be easier to parse - and that includes preparing data to be parsed. |
Having predefined parse rules is just more stuff to remember. | |
btiffin 6-Nov-2008 [2836] | Oldes? Henrik? Maarten? posted a nice all-value loader for me, it's in some of the construction boss code, but I don't have that source ported to this machine yet... I didn't learn anything, as I just cut'n'pasted the snippet. |
Graham 6-Nov-2008 [2837] | any-one! |
BrianH 6-Nov-2008 [2838] | It's not more to remember, it's just a module to load :) |
Graham 6-Nov-2008 [2839] | Can you give me an example of what you are thinking of? |
Tomc 6-Nov-2008 [2840] | the ability to remove/override/extend the set of chars considered "special" in string splitting |
Graham 6-Nov-2008 [2841x2] | like " ? |
some sql dialects have the handy ability to set the current delimiter | |
Tomc 6-Nov-2008 [2843x3] | the ability to recognize all datatypes, not just a few like integer! and tag!, in string parsing |
yes Graham quote is one that can be hard to work around or show up unexpectedly | |
comma and space seem to be others | |
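Editor's sketch of the "settable delimiter" idea raised above. It is written in Python rather than REBOL because it models a wished-for behaviour, not anything PARSE actually provides. The function name `split_fields` and its `delimiters` and `quote` parameters are hypothetical. The caller chooses which characters split fields and whether a quote character is special at all, and empty fields come back as empty strings.

```python
def split_fields(text, delimiters=",", quote=None):
    """Split text on any character in `delimiters`, keeping empty fields as "".

    If `quote` is given, delimiters inside a quoted run are kept literally
    and the quote characters themselves are dropped. With quote=None the
    quote character is treated like any other character.
    """
    fields, current, in_quote = [], [], False
    for ch in text:
        if quote is not None and ch == quote:
            in_quote = not in_quote          # toggle quoted state
        elif ch in delimiters and not in_quote:
            fields.append("".join(current))  # close the current field
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))          # last field, possibly empty
    return fields
```

With the defaults, `split_fields(",,,")` yields four empty strings. Passing `quote='"'` makes `"b,c"` survive as one field, and passing `delimiters=";|"` changes the separator set without touching the quoting rule.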
Graham 6-Nov-2008 [2846] | block parsing .. you can't match specific integers |
Pekr 6-Nov-2008 [2847] | Graham - there is going to be a LIT keyword to match integers .... |
Tomc 6-Nov-2008 [2848x5] | this may not be a well formed thought but ... the ability to directly feed a port into parse |
maybe something like | |
parse open tcp://where.ever:555 rule | |
where the closing port would determine when if ever parse was finished | |
prt: open ... parse/port prt [ ... insert prt something ...] | |
Pekr 6-Nov-2008 [2853] | hmm, continuous parse ... there was my request for it a long time ago, and IIRC even Carl said that it might be useful. Imagine e.g. encoders ... You read a stream from a file in chunks, and you evaluate. The problem is, when you need to backtrack, you would need to cache the stream. Dunno if something like that is possible at all ... |
sqlab 6-Nov-2008 [2854] | I don't like the result of none if parsing strings. parse/all ",,," [(block: copy [] ) any [thru "," copy a to "," (append block a)] to end ] should give >> block == ["" ""] and not == [none none] |
Sunanda 6-Nov-2008 [2855] | My suggested improvement to parse would be a trace (or debug) refinement: trace-output-word: copy [] parse/trace string rules trace-output-word. I'm not entirely sure how it would work. That would depend in part on how parse works internally, and so what trace points are possible. But, as a minimum, I'd expect it to show me each rule that triggers a match, and the current position of the string being parsed. parse would append trace info to trace-output-word. Otherwise, parse is too big a black box for anyone other than very patient experts. |
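Editor's sketch of the minimum trace described above: every rule that matches is recorded together with the position in the input. It is Python, not REBOL, and it is a toy sequential matcher rather than a model of how PARSE works internally. The name `parse_traced` and the shape of the trace entries are assumptions made for illustration.

```python
def parse_traced(text, rules, trace):
    """Match `rules` against `text` in sequence, appending to `trace`.

    Each rule is a literal string or a tuple of alternative literals.
    For every rule that matches, (matched_literal, start, end) is appended.
    On failure, ("FAIL", position, position) is appended and False returned.
    There is no backtracking across rules in this toy matcher.
    """
    pos = 0
    for rule in rules:
        alternatives = rule if isinstance(rule, tuple) else (rule,)
        for alt in alternatives:
            if text.startswith(alt, pos):
                trace.append((alt, pos, pos + len(alt)))
                pos += len(alt)
                break
        else:
            # no alternative matched at this position
            trace.append(("FAIL", pos, pos))
            return False
    return pos == len(text)


trace_output = []
ok = parse_traced("GET /x", [("GET", "POST"), " ", "/x"], trace_output)
```

After this call `ok` is true and `trace_output` holds one entry per matched rule, so a failing parse can be diagnosed by looking at the last entry instead of treating the matcher as a black box.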
Anton 6-Nov-2008 [2856] | Buffered parse - that should be added to the Parse Project DocBase page. It's a big one, though. |
Dockimbel 6-Nov-2008 [2857x2] | Streamed parsing with backtracking: sure, it's possible, I've been doing that in the postgresql driver since 2001 and more recently in the experimental async mysql driver released last year. (It's not done in an easily reusable way, though). |
Tracing parse: IMHO, it would be more efficient to add a PARSE mezz for debugging parse rules. (That requires emulating the PARSE command, which is not a difficult task.). | |
BrianH 6-Nov-2008 [2859x2] | Tracing support would be good to add, but start it and stop it with the TRACE native. |
Tomc, parsing an open port has been a wish of mine for years. In theory you could handle backtracking with buffering. | |
Pekr 6-Nov-2008 [2861] | BrianH: how do you know how much to store in cache and when to flush it? |
Robert 6-Nov-2008 [2862] | Just cache the whole thing ;-) |
BrianH 6-Nov-2008 [2863] | Well, when PARSE enters a block it saves the position at the start of that block. If you have to backtrack that is as far back as you would need to cache. PARSE could optimize this by not saving backtracking info unless there is an alternate later on in the block. You could then minimize caching in some cases by rearranging your rules to use as little backtracking as possible, and none at the top level. |
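Editor's sketch of the caching rule BrianH describes: a position is saved when a rule block is entered, and the stream only has to retain data back to the oldest saved position. It is Python, not REBOL, and says nothing about PARSE's real internals. The class `BacktrackBuffer` and its method names are hypothetical.

```python
class BacktrackBuffer:
    """Pulls chunks from an iterator and retains only what a saved
    position could still need for backtracking."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = ""     # retained data
        self._base = 0     # absolute offset of _buf[0]
        self.pos = 0       # absolute read position
        self._marks = []   # saved positions, one per open rule block

    def read(self, n):
        """Return up to n characters from the current position."""
        while self.pos + n > self._base + len(self._buf):
            chunk = next(self._chunks, None)
            if chunk is None:      # stream exhausted
                break
            self._buf += chunk
        start = self.pos - self._base
        data = self._buf[start:start + n]
        self.pos += len(data)
        if not self._marks:        # top level: nothing to backtrack to
            self._trim()
        return data

    def enter_block(self):
        self._marks.append(self.pos)

    def backtrack(self):
        """Return to the start of the innermost open block."""
        self.pos = self._marks[-1]

    def leave_block(self):
        self._marks.pop()
        self._trim()

    def _trim(self):
        # The oldest mark is the furthest back anyone can rewind.
        keep_from = self._marks[0] if self._marks else self.pos
        self._buf = self._buf[keep_from - self._base:]
        self._base = keep_from

    def buffered(self):
        return len(self._buf)
```

While a block is open the buffer grows, which is the cost Pekr asks about. Once the outermost block is left, everything before the current position is dropped, so rules with little backtracking keep the cache small.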
Pekr 6-Nov-2008 [2864] | :-) Very useful for xy MB streamed video :-) |
BrianH 6-Nov-2008 [2865] | I was thinking streamed XML, but yes :) |
Anton 6-Nov-2008 [2866x2] | Interesting, Brian. I was going to suggest: |
DISPENSE: Parse command to mark points in the data which don't need backtracking past. Parse can use this information to dispense with older buffer data no longer needed. Otherwise it holds and accumulates the data. This would be used for very large or unbounded-length data streams, e.g. internet radio. | |
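Editor's sketch of the DISPENSE proposal: the rule author explicitly marks a point past which backtracking is no longer needed, and the buffer drops everything before it. DISPENSE is only a suggestion in this thread, so this is a Python model of the idea, not a REBOL feature. The class `DispensingBuffer` and its methods are hypothetical names.

```python
class DispensingBuffer:
    """Accumulates stream data until the parser says older data can go."""

    def __init__(self):
        self._buf = ""
        self._base = 0   # absolute offset of the first retained character
        self.pos = 0     # absolute parse position

    def feed(self, chunk):
        """Append newly received data."""
        self._buf += chunk

    def advance(self, n):
        """Move forward by up to n characters of already received data."""
        self.pos = min(self.pos + n, self._base + len(self._buf))

    def seek(self, absolute):
        """Backtrack (or jump) to an absolute offset that is still retained."""
        if absolute < self._base:
            raise ValueError("data before offset %d was dispensed" % self._base)
        self.pos = absolute

    def dispense(self):
        """Drop everything before the current position."""
        self._buf = self._buf[self.pos - self._base:]
        self._base = self.pos

    def retained(self):
        return len(self._buf)
```

Without a call to `dispense()` the buffer simply accumulates, matching the "otherwise it holds and accumulates the data" behaviour in the proposal. After the call, seeking before the dispense point is an error rather than a silent wrong read.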
BrianH 6-Nov-2008 [2868] | Also, I was thinking of parsing file ports opened with seek mode. |
Pekr 6-Nov-2008 [2869] | exactly .... |
Anton 6-Nov-2008 [2870] | Yes, that's another mode, suitable for files (but not internet radio). |
Pekr 6-Nov-2008 [2871] | I was thinking about Amiga-like datatypes, done in REBOL. Such decoders could be slow though .... |
BrianH 6-Nov-2008 [2872] | Interesting, but you wouldn't need DISPENSE if your rules don't have alternates to backtrack to (statically determinable). |
Anton 6-Nov-2008 [2873x2] | Yeah, I hadn't thought of that. |
Perhaps there are cases where alternates are not a good method of determining when to dispense buffer data ? | |
BrianH 6-Nov-2008 [2875x2] | Of course "statically determinable" means that you wouldn't be able to modify the rule block that PARSE is currently working on (which would likely crash PARSE anyways). |
Well, if you have no alternate, you have no backtracking, so you can dispose on the way. | |
Anton 6-Nov-2008 [2877] | What about REVERSE ? |
BrianH 6-Nov-2008 [2878] | Ah, that would require buffering. Darn. |
Anton 6-Nov-2008 [2879] | And also set-words.. |