
World: r3wp

[Parse] Discussion of PARSE dialect

Steeve
29-Jan-2010
[4840]
parsing thru a tag eats one more char
Graham
29-Jan-2010
[4841]
Ah .. ?? is a new debugging function
Steeve
29-Jan-2010
[4842]
yep
Graham
29-Jan-2010
[4843x2]
Should have known about it last night!  Would have saved me some time :(
Well, this looks like an unreported bug ...
Steeve
29-Jan-2010
[4845]
exactly
Graham
29-Jan-2010
[4846]
Shall you or I curecode it?
Steeve
29-Jan-2010
[4847x2]
you
;-)
Graham
29-Jan-2010
[4849x2]
okey dokey
Now I know I can't use r3 for parsing xml .... :(

http://www.curecode.org/rebol3/ticket.rsp?id=1449
Steeve
29-Jan-2010
[4851]
you can, just replace <tag> by a real string "<tag>"
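
A minimal sketch of that workaround, assuming a string input; the words `src` and `rest` are illustrative and not from the thread. The tag is written as an ordinary string, so PARSE does plain string matching:

```rebol
REBOL []

; Hypothetical input; `src` and `rest` are illustrative words.
src: "<a><b>text"

; Match the tag as a plain string rather than as a tag! literal.
parse/all src [thru "<a>" copy rest to end]

print rest   ; prints <b>text
```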
Graham
29-Jan-2010
[4852x4]
ugly !  :)
Point taken ...
Is there any likelihood of the parse enhancements making it to r2? 
 Anyone know?
( without the bugs of course )
Steeve
29-Jan-2010
[4856]
0%
BrianH
29-Jan-2010
[4857x2]
And there is a great likelihood of the bugs being fixed in R3. And 
there aren't many in PARSE, just that tag bug afaik.
Graham, I deleted bug #1449 since it was already reported as #682. 
See also #854 and #1160 (and #10, which was incorrectly "fixed").
Graham
29-Jan-2010
[4859]
your response says it was fixed ...
BrianH
29-Jan-2010
[4860]
Partially - it used to be worse. That's why it's marked a "problem".
Graham
29-Jan-2010
[4861]
only eats one char instead of two ... so that's a 50% improvement
BrianH
29-Jan-2010
[4862x2]
The worst was when someone "fixed" #10 to make it compatible with 
R2's buggy behavior. Bad fixes get marked as a problem.
Check out #666 for R3's official policy on bug-for-bug compatibility 
:)
Graham
29-Jan-2010
[4864]
at least it should not introduce new bugs
BrianH
29-Jan-2010
[4865]
Agreed (and the policy agrees too).
Graham
29-Jan-2010
[4866]
I looked for a previous report on this bug but couldn't find it .. 
4 pages of bugs with parse in them.  I wonder if they can be filtered 
to only show active bugs
BrianH
29-Jan-2010
[4867]
Bring it up in the !CureCode group.
Graham
7-Feb-2010
[4868x2]
I want to extract all the dates ( dd-mmm-yy, dd mmm yyyy, d mmmmmmm yy )


extract-dates: func [ txt 
	/local months dates days month year
][
	dates: copy []
	months: copy []
	digit: charset [ #"0" - #"9" ]
	digits: [ some digit ]
	foreach mon system/locale/months [
		repend months [ mon '|  copy/part mon 3 '| ]
	]
	remove back tail months
	parse txt [
		some [
			to 1 2 digits copy days 1 2 digit [ #" " | #"-" ]
			copy month months
			[ #" " | #"-" ]
			copy year [ 4 digits | 2 digits ]
			( repend dates rejoin [ days "-" month "-" year ] ) |
			thru 1 2 digits ??
		]
	]
	dates
]


extract-dates "asdf sdfsf  11 Jan 2008 12-January-10 fasdfsaf asdf as 11 2 3 3  13-Feb-08 asdfasf "
not working ...
Steeve
7-Feb-2010
[4870]
R2 or R3 ?
In any case, the first rule may fail.
you can't do "TO 1 2 digits"
BrianH
7-Feb-2010
[4871]
TO and THRU have limited argument syntax, and don't support full 
rules. Both R2 and R3 support literal value arguments (that don't 
count as rules). R3 also supports a block of literal values delimited 
by |, and those values are less limited.
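
A small sketch of the two argument forms described here, assuming R3 for the block form; the inputs and the words `year` and `year2` are made up for illustration:

```rebol
REBOL []

; Literal string argument to TO (described above as supported in R2 and R3).
parse/all "abc 2010" [to "2010" copy year to end]
print year   ; prints 2010

; Block of literal alternatives delimited by | (described above as R3 only).
parse "abc-2010" [to ["-" | " "] skip copy year2 to end]
print year2
```

Neither form takes a full rule such as `1 2 digits`, which matches the objection to the first EXTRACT-DATES attempt.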
Steeve
7-Feb-2010
[4872x2]
Something weird!
Using a simple charset with TO or THRU should work.
But it fails here with R3.

>> digits: charset "134567890"
>> parse "azaz 34" [to digits ??]
end!: "azaz 34"
Oh my !!!!!
It fails with R2 now too...
Graham
7-Feb-2010
[4874]
R2 & R3 ... I tried
nondigit: complement digit nondigits: [ some nondigit ]

some [
	any nondigits 1 2 ....
]

but it gets stuck on the year
BrianH
7-Feb-2010
[4875]
Steeve, that's a bug that I reported yesterday.
Graham
7-Feb-2010
[4876]
I was using r3 as it's easier to trace the parse ... but perhaps I shouldn't!
Steeve
7-Feb-2010
[4877]
Maybe I'm wrong, I can't remember if TO or THRU ever worked with charsets.
Alzheimer catches me...
Graham
7-Feb-2010
[4878]
XRatio is right .. parse is too difficult!
Steeve
7-Feb-2010
[4879]
hehe
Gabriele
7-Feb-2010
[4880]
to/thru never worked with charsets. that's why we always have those 
complements... :)
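
The complement idiom mentioned here can be sketched as follows. This is a minimal example, not anyone's production rule; `num` is an illustrative word. Instead of `to digit`, the rule skips every character that is not a digit and then collects the digits:

```rebol
REBOL []

digit: charset [#"0" - #"9"]
nondigit: complement digit   ; every character that is NOT a digit

; Skip the non-digits, then copy the run of digits that follows.
parse/all "azaz 34" [
    any nondigit
    copy num some digit
    to end
]

print num   ; prints 34
```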
BrianH
7-Feb-2010
[4881]
Oh crap. Well, it was reported as a bug, and it's staying that way 
until Carl says otherwise :)
Gabriele
7-Feb-2010
[4882]
given that to and thru do "more" in R3, it probably is not bad to 
consider it a bug. (maybe it should be considered a bug in R2 as 
well, given that FIND does work with charsets...)
BrianH
7-Feb-2010
[4883]
Carl seems to think that he can add TO or THRU QUOTE value to block 
parsing too.
Graham
7-Feb-2010
[4884x3]
this works 


extract-dates: func [ txt 
	/local months dates days month year
][
	dates: copy []
	months: copy []
	digit: charset [ #"0" - #"9" ]
	digits: [ some digit ]
	nondigit: complement digit
	nondigits: [ some nondigit ]
	foreach mon system/locale/months [
		repend months [ mon '|  copy/part mon 3 '| ]
	]
	separator: [ #" " | #"-" ] 
	remove back tail months

	date-rule: [
		copy days 1 2 digit separator copy month months separator copy year digits (
			?? days ?? month ?? year
			append dates ajoin [ days "-" month "-" year ]
		)
	]
	parse txt [
		some [
			any nondigits [ date-rule | any digits ]
		]
	]
	dates
]
extract-dates "asdf sdfsf 1 11 Jan 2008 12-January-10 fasdfsaf asdf as 11 2 3 3  13-Feb-08 asdfasf "
days: "11"
month: "Jan"
year: "2008"
days: "12"
month: "January"
year: "10"
days: "13"
month: "Feb"
year: "08"
== ["11-Jan-2008" "12-January-10" "13-Feb-08"]
ahh... correction, it works under R3 and locks up in R2 :(
Graham
8-Feb-2010
[4887]
and finally a parse rule that works under r2 and r3

	parse/all txt [
		some [
			[ end | any nondigits ] [ date-rule | some digits  ] 
		]
	]
Sunanda
13-Apr-2010
[4888]
Parse help needed here:

  http://stackoverflow.com/questions/2631125/change-part-doesnt-work-as-expected-with-parse
Ladislav
13-Apr-2010
[4889]
His style looks strange