World: r3wp
[Parse] Discussion of PARSE dialect
Graham 29-Jan-2010 [4835] | this is what I get for item e>2010-01-29T11:06:18.000ZI3tFREFCRUYwNTY4OTdBMzcwODM2NzJGQUE5MzAwRUE3NjYwMTMwMTY5fQ==</Name><Attribute><Name>Subject</Name><Value>Index working</Value></Attribute><Attribute><Name>Userid</Name><Value>Graham</Value></Attribute><Attribute><Name>UTCDate</Name><Value>2010-01-29T11:06:18.000Z</Value></Attribute> |
Steeve 29-Jan-2010 [4836x2] | >> parse "<a><item>" [thru <a> ??] end!: "item>" == false |
a bug | |
Graham 29-Jan-2010 [4838] | I'm not familiar with that ... what should it say? |
Steeve 29-Jan-2010 [4839x2] | It should say: >> parse "<a><item>" [thru <a> ??] end!: "<item>" == false |
parsing thru a tag eats one more char | |
Graham 29-Jan-2010 [4841] | Ah .. ?? is a new debugging function |
Steeve 29-Jan-2010 [4842] | yep |
Graham 29-Jan-2010 [4843x2] | Should have known about it last night! Would have saved me some time :( |
Well, this looks like an unreported bug ... | |
Steeve 29-Jan-2010 [4845] | exactly |
Graham 29-Jan-2010 [4846] | Shall you or I curecode it? |
Steeve 29-Jan-2010 [4847x2] | you |
;-) | |
Graham 29-Jan-2010 [4849x2] | okey dokey |
Now I know I can't use r3 for parsing xml .... :( http://www.curecode.org/rebol3/ticket.rsp?id=1449 | |
Steeve 29-Jan-2010 [4851] | you can, just replace <tag> by a real string "<tag>" |
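A minimal sketch of the workaround Steeve describes, matching a plain string where the tag! value was used. This is an illustration rather than code from the thread: the word `rest` is a name introduced here, and `parse/all` is used so that no whitespace handling interferes.

```rebol
; Match the literal string "<a>" instead of the tag! value <a>,
; which side-steps the tag-matching bug discussed above.
; `rest` is a hypothetical word used only for this illustration.
parse/all "<a><item>" [thru "<a>" copy rest to end]
probe rest   ; "<item>" (no character is eaten after the match)
```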
Graham 29-Jan-2010 [4852x4] | ugly ! :) |
Point taken ... | |
Is there any likelihood of the parse enhancements making it to r2? Anyone know? | |
( without the bugs of course ) | |
Steeve 29-Jan-2010 [4856] | 0% |
BrianH 29-Jan-2010 [4857x2] | And there is a great likelihood of the bugs being fixed in R3. And there aren't many in PARSE, just that tag bug afaik. |
Graham, I deleted bug #1449 since it was already reported as #682. See also #854 and #1160 (and #10, which was incorrectly "fixed"). | |
Graham 29-Jan-2010 [4859] | your response says it was fixed ... |
BrianH 29-Jan-2010 [4860] | Partially - it used to be worse. That's why it's marked a "problem". |
Graham 29-Jan-2010 [4861] | only eats one char instead of two ... so that's a 50% improvement |
BrianH 29-Jan-2010 [4862x2] | The worst was when someone "fixed" #10 to make it compatible with R2's buggy behavior. Bad fixes get marked as a problem. |
Check out #666 for R3's official policy on bug-for-bug compatibility :) | |
Graham 29-Jan-2010 [4864] | at least it should not introduce new bugs |
BrianH 29-Jan-2010 [4865] | Agreed (and the policy agrees too). |
Graham 29-Jan-2010 [4866] | I looked for a previous report on this bug but couldn't find it .. 4 pages of bugs with parse in them. I wonder if they can be filtered to only show active bugs |
BrianH 29-Jan-2010 [4867] | Bring it up in the !CureCode group. |
Graham 7-Feb-2010 [4868x2] | I want to extract all the dates ( dd-mmm-yy, dd mmm yyyy, d mmmmmmm yy )
    extract-dates: func [ txt /local months dates days month year ][
        dates: copy []
        months: copy []
        digit: charset [ #"0" - #"9" ]
        digits: [ some digit ]
        foreach mon system/locale/months [
            repend months [ mon '| copy/part mon 3 '| ]
        ]
        remove back tail months
        parse txt [
            some [
                to 1 2 digits
                copy days 1 2 digit [ #" " | #"-" ]
                copy month months [ #" " | #"-" ]
                copy year [ 4 digits | 2 digits ]
                ( repend dates rejoin [ days "-" month "-" year ] )
                | thru 1 2 digits ??
            ]
        ]
        dates
    ]
    extract-dates "asdf sdfsf 11 Jan 2008 12-January-10 fasdfsaf asdf as 11 2 3 3 13-Feb-08 asdfasf " |
not working ... | |
Steeve 7-Feb-2010 [4870] | R2 or R3 ? In any case, the first rule may fail. you can't do "TO 1 2 digits" |
BrianH 7-Feb-2010 [4871] | TO and THRU have limited argument syntax, and don't support full rules. Both R2 and R3 support literal value arguments (that don't count as rules). R3 also supports a block of literal values delimited by |, and those values are less limited. |
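The two argument forms BrianH mentions might look like the sketch below. It is an assumption-laden illustration, not code from the thread: the input string and the word `rest` are made up here, and the block-of-alternatives form is R3-only, so its behaviour in the alpha builds current at the time of this discussion is not guaranteed.

```rebol
; Literal value argument: accepted by TO in both R2 and R3.
; `rest` is a hypothetical word used only for this illustration.
parse/all "asdf 11 Jan 2008" [to "Jan" copy rest to end]
probe rest

; Block of literal alternatives separated by | : R3 only (assumed
; to work as BrianH describes; R2 rejects this form).
parse/all "asdf 11 Jan 2008" [to ["Jan" | "Feb"] copy rest to end]
probe rest
```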
Steeve 7-Feb-2010 [4872x2] | Something weird ! Using a simple charset with TO or THRU should work. But it fails here with R3. >> digits: charset "134567890" >> parse "azaz 34" [to digits ??] end!: "azaz 34" |
Oh my !!!!! It fails with R2 now too... | |
Graham 7-Feb-2010 [4874] | R2 & R3 ... I tried nondigit: complement digit nondigits: [ some nondigit ] some [ any nondigits 1 2 .... ] but it gets stuck on the year |
BrianH 7-Feb-2010 [4875] | Steeve, that's a bug that I reported yesterday. |
Graham 7-Feb-2010 [4876] | I was using r3 as it's easier to trace the parse ... but perhaps i shouldn't! |
Steeve 7-Feb-2010 [4877] | Maybe I'm wrong, I can't remember if TO or THRU ever worked with charsets. Alzheimer catches me... |
Graham 7-Feb-2010 [4878] | XRatio is right .. parse is too difficult! |
Steeve 7-Feb-2010 [4879] | hehe |
Gabriele 7-Feb-2010 [4880] | to/thru never worked with charsets. that's why we always have those complements... :) |
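The complement idiom Gabriele refers to can be sketched as follows. This is a minimal illustration under stated assumptions: the words `nondigit` and `num` and the input string are chosen here for the example, and `parse/all` is used so that whitespace is treated as ordinary input.

```rebol
digit: charset "0123456789"
nondigit: complement digit   ; every character that is not a digit

; Instead of [to digit], which TO does not accept, skip the run of
; non-digits explicitly and then copy the digits that follow.
; `num` is a hypothetical word used only for this illustration.
parse/all "azaz 34" [any nondigit copy num some digit to end]
probe num   ; "34"
```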
BrianH 7-Feb-2010 [4881] | Oh crap. Well, it was reported as a bug, and it's staying that way until Carl says otherwise :) |
Gabriele 7-Feb-2010 [4882] | given that to and thru do "more" in R3, it probably is not bad to consider it a bug. (maybe it should be considered a bug in R2 as well, given that FIND does work with charsets...) |
BrianH 7-Feb-2010 [4883] | Carl seems to think that he can add TO or THRU QUOTE value to block parsing too. |
Graham 7-Feb-2010 [4884] | this works
    extract-dates: func [ txt /local months dates days month year ][
        dates: copy []
        months: copy []
        digit: charset [ #"0" - #"9" ]
        digits: [ some digit ]
        nondigit: complement digit
        nondigits: [ some nondigit ]
        foreach mon system/locale/months [
            repend months [ mon '| copy/part mon 3 '| ]
        ]
        separator: [ #" " | #"-" ]
        remove back tail months
        date-rule: [
            copy days 1 2 digit separator
            copy month months separator
            copy year digits
            ( ?? days ?? month ?? year append dates ajoin [ days "-" month "-" year ] )
        ]
        parse txt [ some [ any nondigits [ date-rule | any digits ] ] ]
        dates
    ] |