r3wp [groups: 83 posts: 189283]

World: r3wp

[Parse] Discussion of PARSE dialect

Steeve
13-Nov-2008
[3183x5]
Except if CHANGE acts only when the next rule is fulfilled (forget 
my remark).
It's a little bit tricky to handle all the possible combinations, 
but the new commands seem really powerful.
I wonder if some CHANGE syntax combinations can be removed, 
especially those with the post-rule modifier.

The AT command should be enough to specify where the change must apply.

AT rule change value  ; to modify the index before the rule 
rule change value ;  to modify the index after the rule
And in your example, to return to the original position even after the change:
at rule at change value
so that the CHANGE syntax could be simplified
BrianH
14-Nov-2008
[3188x2]
The AT is a separate operation that says "recognize this rule AT 
the current position but don't advance". AT doesn't specify a position.
The CHANGE syntax already has been simplified :)
Steeve
14-Nov-2008
[3190]
What do you mean? I never said that AT needs or specifies a position.

My remark stays valid: the CHANGE command syntax can be simplified, but 
if you say that's already done, it's OK.
BrianH
14-Nov-2008
[3191x5]
CHANGE basically needs the same information that the CHANGE REBOL 
function needs:
- A start series/position
- An end position or length
- A value to replace with
- Some options, if you need them

The CHANGE proposal has all those, and there isn't much more we can 
simplify :)
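
For comparison, the four ingredients listed above map onto the existing CHANGE function roughly as follows. This is a minimal REBOL 2 sketch; the sample string and the word names `s` and `after` are made up for illustration and do not come from the proposal.

```rebol
s: "abcdef"

; start position:    at s 3   (points at "c")
; end / length:      /part 2  (the two characters "cd" get replaced)
; replacement value: "XY"
; options:           /part is the only refinement used here
after: change/part at s 3 "XY" 2

probe s       ; "abXYef"
probe after   ; "ef" (CHANGE returns the position just past the change)
```

In the PARSE proposal the start and end would come from where the matched rule begins and ends, instead of being spelled out with AT and /part.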
The rule matching is a bonus :)
Kind of a necessary bonus though, since it lets you specify what 
you want to change. You need to have the change operation before 
the rule so you know where the rule starts and where it ends.
I mean that you have to have the change keyword physically before 
the rule is affects because all of the PARSE operations are prefix.
is -> it
This isn't my day :(
Chris
15-Nov-2008
[3196x5]
If I'm getting this right, OF is designed to do this:

	blk: [1 two 3.0]
	parse blk [of [integer! word! decimal!]]
	== true
	parse blk [of [number! word!]]
	== false (only accounts for one number)
	parse blk [of [word! decimal! string! issue! integer!]]
	== true (can be none if a given type is missing)


I have another scenario to which the word 'of would apply.  There 
are situations where I want to match one item from a block of options. 
 Currently, those options need to be pipe-separated, requiring preprocessing 
if those options come from a data source (see languages: in my emit-rss 
script for an example).   This appears in both string and block parsing. 
 An example (using IN as the hypothetical operator):

	m28: ["Feb"]
	m30: join m28 ["Apr" "Jun" "Sep" "Nov"]
	m31: join m30 ["Jan" "Mar" "May" "Jul" "Aug" "Oct" "Dec"]
	b28: repeat x 28 [append [] 29 - x]
	b30: repeat x 30 [append [] 31 - x]
	b31: repeat x 31 [append [] 32 - x]
	parse date-str [
		in b28 "-" in m28
		| in b30 "-" in m30
		| in b31 "-" in m31
	]

This would be true for "1-Jan" "30-Sep" and false for "31-Feb".
Also, the way 'of works is a little like my 'match syntax (in QM, 
also designed for dialects), the difference being a resultant object 
instead of a block:

	match ["Click Here" #"h" http://click.here/][
		href: file! | url! | path!
		title: string!
		id: opt issue!
		class: any refinement!
		accesskey: opt char!
	]
An equivalent might be:

	parse ["Click Here" #"h" http://click.here/][
		of [
			[file! | url! | path!]
			string!
			opt issue!
			any refinement!
			opt char!
		]
	]
Though it'd raise other questions, I suppose -- what if the block 
were:

	["Click Here" #"h" other: %click/here]

It'd fail, as set-word! is not included in the spec?
It's complex to explain (or document), but is versatile and concise 
(for what it does).
Graham
15-Nov-2008
[3201]
any-type!
Chris
15-Nov-2008
[3202]
The point is though that I'd want it to fail.  The set-word! could 
be used as a delimiter:

	[link-one: %file-one "File One" link-two: %file-two "File Two"]

Would be matched by:

	some [set-word! of link-spec]	

Or in VID:

	some [opt set-word! word! face-spec]
Steeve
15-Nov-2008
[3203x2]
Hmm Chris, what is your request, actually?
I wonder if delect is not more useful in your case.
Chris
15-Nov-2008
[3205x3]
Perhaps, but I thought incorporating 'delect was part of the point 
of 'of
Steeve, two requests -- matching from a block! and a slightly more 
nuanced 'of
Both based on situations I've come upon.
Steeve
15-Nov-2008
[3208x2]
Matching from a block! ... isn't that already the case?
I mean in the wiki definition.
Chris
15-Nov-2008
[3210]
No, as above (you asked me to summarize).
BrianH
17-Nov-2008
[3211x7]
Chris, re: your more nuanced OF, that is covered in the existing 
proposal (including Steeve's alternate and Carl's possible future 
extensions). Carl will have to determine how flexibly OF can be implemented 
without hitting diminishing returns from increased complexity.
About your matching from a block proposal, if the CHECK proposal 
gets accepted then I doubt this will - the usage scenarios where 
you can't just use alternates would be too rare, especially given 
how easily CHECK (FIND ...) could do the job in those cases.
Your example with alternates (and bug fixes, still ignoring leap 
years):


	m31: ["Jan" | "Mar" | "May" | "Jul" | "Aug" | "Oct" | "Dec"]  ; joins were in wrong direction
	m30: join m31 [| "Apr" | "Jun" | "Sep" | "Nov"]
	m28: join m30 [| "Feb"]

	b28: next repeat x 28 [repend [] ['| form x]]  ; next to skip leading |, numbers don't work in string parsing
	b30: ["29" | "30"]  ; optimization based on above reversed joins
	b31: ["31"]
	parse date-str [
		b28 "-" m28
		| b30 "-" m30
		| b31 "-" m31
	]

The above with CHECK instead:

	m31: ["Jan" "Mar" "May" "Jul" "Aug" "Oct" "Dec"]
	m30: join m31 ["Apr" "Jun" "Sep" "Nov"]
	m28: join m30 ["Feb"]
	b28: repeat x 28 [append [] form x]  ; not assuming 
	b30: ["29" "30"]  ; optimization based on above reversed joins
	b31: ["31"]
	parse date-str [
		copy d some digit "-" copy m some alpha
		check (	any [
			all [find b31 d  find m31 m]
			all [find b30 d  find m30 m]
			all [find b28 d  find m28 m]
		])
	]

Which would be faster would depend on the data and scenario.
(the comments on the second example can be ignored)
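
Since CHECK is only a proposal here, a rough REBOL 2 stand-in for the second example can be sketched with the common idiom of computing a sub-rule inside a paren: an empty block lets the match continue, while `[end skip]` can never succeed. The function name `valid-day-month?` and the local word `pass` are hypothetical, the `digit` and `alpha` charsets are assumed definitions, and leap years are still ignored.

```rebol
digit: charset "0123456789"
alpha: charset [#"a" - #"z" #"A" - #"Z"]

m31: ["Jan" "Mar" "May" "Jul" "Aug" "Oct" "Dec"]
m30: append copy m31 ["Apr" "Jun" "Sep" "Nov"]
m28: append copy m30 ["Feb"]

b28: copy []
repeat x 28 [append b28 form x]   ; "1" through "28"
b30: ["29" "30"]
b31: ["31"]

; Hypothetical helper: true for day-month strings such as "30-Sep".
valid-day-month?: func [date-str [string!] /local d m pass] [
    parse/all date-str [
        copy d 1 2 digit "-" copy m 3 alpha
        ; Emulate CHECK: choose a rule that either always matches
        ; or always fails, then apply it as the next rule.
        (pass: either any [
            all [find b31 d  find m31 m]
            all [find b30 d  find m30 m]
            all [find b28 d  find m28 m]
        ] [[]] [[end skip]])
        pass
    ]
]

print valid-day-month? "1-Jan"    ; true
print valid-day-month? "30-Sep"   ; true
print valid-day-month? "31-Feb"   ; false
```

The day and month lists are looked up with FIND rather than matched by PARSE, so they can come straight from a data source without inserting `|` separators.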
Your proposal seems like a slightly faster but more limited version 
of alternates, and not as flexible or optimizable as check. Does 
this situation come up so often that you need direct support for 
it?
Here's a simpler date checker with CHECK:


	parse date-str [copy d [1 2 digit "-" 3 alpha "-" 4 digit] check (attempt [to-date d])]
That requires years too, but at least it gets leap year 29-Feb.
Gabriele
17-Nov-2008
[3218]
Brian, JOIN does a REDUCE on the second block.
BrianH
17-Nov-2008
[3219]
Right you are, whoops. It's been a while since I used it with blocks.
Chris
18-Nov-2008
[3220x3]
'append would do it...
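
A minimal sketch of that fix, assuming the alternates version of the month rules: APPEND inserts the second block's contents without evaluating them, so the `|` words survive, and COPY keeps the shorter rule from being modified in place.

```rebol
m31: ["Jan" | "Mar" | "May" | "Jul" | "Aug" | "Oct" | "Dec"]

; JOIN would REDUCE its second argument and try to evaluate |.
; APPEND COPY leaves the | words as plain data.
m30: append copy m31 [| "Apr" | "Jun" | "Sep" | "Nov"]
m28: append copy m30 [| "Feb"]

print parse "Feb" m28   ; true, m28 now lists all twelve months
print parse "Feb" m30   ; false, m30 has no "Feb" alternate
```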

numbers don't work in string parsing

 - I thought about this when I developed the example, thought it might 
 be possible as the numbers appear outside the dialect.  But 'check 
 seems like the better option.  

joins were in the wrong direction
 - d'oh!

simpler date checker

 - that's only useful if to-date recognizes the date format : )  (and 
 using dates was illustrative - there are other situations with similar 
 needs).  Though on dates, what would be the most succinct way with 
 the proposals on the table to do the following?

	ameridate: "2/15/2008"
	parse ameridate ...rule...
	newdate = 15-Feb-2008

One attempt:

	parse ameridate [
		use [d m][
			change [copy m 1 2 digit "/" copy d 1 2 digit]
			(rejoin [d "/" m])
		]
		"/" 4 digit end check (newdate: to-date ameridate)
	]
(making the assumption it is a valid date)
(and that it's ok to modify the original string)
eFishAnt
22-Nov-2008
[3223x4]
If I am parsing something like javascript that has { and } in it 
like C, how can I put that into a string to parse without using {}?
For my test cases when testing a dialect, I usually use this form:
print parse/all {test case that 
can't have {} inside } parse-rule
There must be some dead-simple guru trick for this...
Sunanda
22-Nov-2008
[3227]
I think I raised the same question on the ML years ago, and got  
a disappointing answer. Maybe things have changed since. Or, if not, 
it may not be too late to add to the R3 parse wishlist:
    http://www.rebol.org/ml-display-thread.r?m=rmlSQHQ
eFishAnt
22-Nov-2008
[3228x3]
My first inclination is to "go binary" on it...;-) but that is inelegant, 
and the MSB of binary gets wonky sometimes.
(MSB bit is a sign bit oftentimes)
I was thinking of taking away the special meaning of { but not yet 
sure how to unset it...it initially seems hardcoded in there....not 
like I would expect.
Oldes
22-Nov-2008
[3231]
I don't understand what you mean.
eFishAnt
22-Nov-2008
[3232]
In the console if you type a {  and then hit Enter, it continues 
on the next line.

:{   and }:   don't seem to work, either.
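
One possible workaround, sketched under the assumption that the standard string escapes are acceptable in test cases: balanced braces nest inside a braced string without any help, and a lone brace can be written with a hex character escape, `^(7B)` for { and `^(7D)` for }. The word names `balanced` and `lone` are made up for the example.

```rebol
; Balanced braces nest inside a braced string as they are.
balanced: {function f() { return 1; }}

; Unbalanced braces can be written as hex character escapes.
lone: {open only: ^(7B) and close only: ^(7D)}

print balanced   ; function f() { return 1; }
print lone       ; open only: { and close only: }
```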