r3wp [groups: 83 posts: 189283]
World: r3wp

[Parse] Discussion of PARSE dialect

Henrik
5-Mar-2008
[2443]
parse data ['string! (do-this) | 'integer! (do-that)]
[unknown: 5]
5-Mar-2008
[2444x4]
I currently have resorted to a different approach.
Currently, I do something similar to this:

user passes dynamic data captures in a block:
data: ["fname" string! "age" integer!]

Then I do the following:
        data: next data

        forskip data 2 [poke data 1 load mold/all attempt [to-datatype data/1]]
then this:

        data: head data

        unless parse data [some [string! datatype!]][return "Syntax Error!"]
Uses the mold/all method you provided yesterday.
Henrik
5-Mar-2008
[2448]
well, if that's all you do, you don't even have to convert it to 
a datatype, IMHO. it's enough to collect a list of rebol's datatypes 
as words, so they can be used to trigger on an input word which looks 
like a datatype. I can see you are going for correctness, i.e. wanting 
the input to be a real datatype, but if you only use that input as 
a trigger to do something, you don't need to convert it to a real 
datatype. just operate using words.
[unknown: 5]
5-Mar-2008
[2449]
Yeah, I even looked into going thru system/words and collecting all 
the datatypes, but the method I deployed was smaller and more effective.
Henrik
5-Mar-2008
[2450x3]
put weight in that dialects are words and if you take advantage of 
that, your dialect will become simpler
all datatype words are stored in the 'datatypes word
woah, that was nonsense: "dialects are words". I meant "dialects consist 
of words and a few other things"
[unknown: 5]
5-Mar-2008
[2453]
lol
Henrik
5-Mar-2008
[2454]
that means: if your datatypes require some level of serialization 
syntax to work, just consider them words.
[unknown: 5]
5-Mar-2008
[2455x2]
Would be nice to have something that simply says lit-type?
except that this isn't exactly what I think of as being "lit" as 
we know it.
Henrik
5-Mar-2008
[2457x2]
dialects are separate language domains where the normal rules of 
REBOL syntax don't necessarily apply.... about lit-type, then you 
need lit-object, lit-none, lit-whatever :-)
the serialized syntax _is_ the solution to that problem. yes the 
syntax is a bit more cumbersome.
[unknown: 5]
5-Mar-2008
[2459x2]
Maybe something like this is the best solution:
dlt-type?: func [w [word!]][
    foreach item datatypes [if equal? to-word item w [return true]]
    false
]
Henrik
5-Mar-2008
[2461]
or perhaps:

dlt-type?: func [w [word!]] [any [attempt [to-datatype w] false]]
[unknown: 5]
5-Mar-2008
[2462x6]
Yeah which is what I use now.
similar anyway
More like this for my needs:
dlt-type?: func [w] [found? any [attempt [to-datatype w] false]]
drop the [word!] requirement from the argument and report true or 
false.
But that doesn't fill a rule block to be passed to parse which is 
my original intention but is still very useful.
Henrik
5-Mar-2008
[2468]
I'd still not bother with it :-) how many datatypes will you support?
btiffin
5-Mar-2008
[2469x3]
Sorry, try #[datatype! datatype!]; that should restrict the match 
to only datatype values.
Or not.  :)
Or yes, if the source is reduced.

parse reduce ["age" integer!] [string!  set type #[datatype! datatype!] 
(print ['got type 'type? type? type])]
Henrik
5-Mar-2008
[2472]
well, he doesn't like the serialization syntax, and he won't reduce 
because that is a security problem (always wise, though)
btiffin
5-Mar-2008
[2473]
reduce/only is safe for that, no?
Henrik
5-Mar-2008
[2474]
evaluating words can still be unsafe
btiffin
5-Mar-2008
[2475]
Gee, I guess to be secure you need   reduce/only exclude query system/words 
[integer! string! ...]
Henrik
5-Mar-2008
[2476]
or just act on words in your dialect :-)
btiffin
5-Mar-2008
[2477]
Yeah, but ...  :)
Ingo
5-Mar-2008
[2478]
I know it's already been beaten to death, but I guess you don't want 
to support all of REBOL's datatypes, so what is wrong with listing 
them explicitly?

>> types: ['string! | 'integer! ]                         
== ['string! | 'integer!]
>> data: ["age" integer! "name" string!]                        
== ["age" integer! "name" string!]
>> data2: ["age" integer! "name" string! "gobbledygook" object!]
== ["age" integer! "name" string! "gobbledygook" object!]
>> parse data [some [string! types]]                            
== true
>> parse data2 [some [string! types]]                           
== false
Gregg
5-Mar-2008
[2479]
I'm with Ingo on this. And as far as "being simple", this isn't really. 
:-) When I've needed to parse for datatypes, I either reduce/compose 
or set up rules for the types.
[unknown: 5]
5-Mar-2008
[2480x4]
Henrik, pretty much all of them.
Hi Ingo, I'm planning on supporting most of the REBOL datatypes, which 
makes for a very long list when you consider that REBOL has 54 of them.
So setting types to all of those is not very efficient.  At this 
point, using parse to do this is, as Gregg said, not "simple".
So my next question is: if we were to wish for something to be added 
to REBOL to make this task easier and submit it to RAMBO, what would 
be the best way to describe what is desired?
BrianH
5-Mar-2008
[2484x3]
We have already put together a set of requests to enhance PARSE. 
This problem could be solved by at least 3 of them.
You should probably exclude function types from your acceptable types 
to store in your database, as well as library! and a few others.
Right now, the only thing that is protecting REBOL from serialized 
functions and objects is the fact that their bindings are not deserialized 
properly. Small blessings, I guess. In the meantime, screen your 
data.
[unknown: 5]
5-Mar-2008
[2487]
Right now I have a solution in place for the database and have decided 
to continue to allow the types to be inputted.  The pros outweigh 
the cons, in my opinion, for my application.
Gregg
6-Mar-2008
[2488]
"So setting types to all of those is not very efficient."

 -- Do you mean in the parsing, or in the time it takes to set up 
 the rule(s)?
BrianH
6-Mar-2008
[2489]
You could write a script to generate the rules. It could be faster 
than writing them directly.
[unknown: 5]
6-Mar-2008
[2490]
I'm not worried about the coding, I'm concerned about the performance. 
 If I have to parse a million records or something then anything 
that cuts down on the amount of evaluation is necessary.
BrianH
6-Mar-2008
[2491x2]
I'm a little curious as to why you need to have the datatype of a 
field referenced in the record at all, if you are just using the 
REBOL data model. Wouldn't the data itself have a type? It seems 
to me that the datatypes of fields would only need to be specified 
once per table.
This assumes that you aren't taking advantage of REBOL's type system 
to do SQLite-style manifest typing.