World: r3wp
[Parse] Discussion of PARSE dialect
BrianH 29-Jun-2006 [1083] | I've been using that approach for XML processing. |
Volker 29-Jun-2006 [1084] | sounds good. if one finds a good tokenized representation. I am not an xml-guru :( |
BrianH 29-Jun-2006 [1085x2] | My next personal project is to go through the XML/XSL/REST specs and create exactly that. I already have an efficient structure, I just need to fill out the semantics to support the complete logical model of XML. |
I am also not an XML guru, but I will be by the time I'm done :) | |
Volker 29-Jun-2006 [1087] | After I read "go through the XML/XSL/REST specs" I thought so. Being undecided if I prefer to run away or participate curiously. |
BrianH 29-Jun-2006 [1088x2] | Well, I know enough to know where to look to figure out the rest. |
Still, "run away" is a common and sensible reaction to XML. | |
Volker 29-Jun-2006 [1090] | *nod* |
BrianH 29-Jun-2006 [1091] | Later, I must run errands... |
Volker 29-Jun-2006 [1092] | cu |
Gordon 29-Jun-2006 [1093] | I'm a bit stuck because this parse stops after the first iteration. Can anyone give me a hint as to why it stops after one line? Here is some code:
    data: read to-file Readfile
    print length? data
    224921
    d: parse/all data [thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr thru newline (print index? data)]
    1
    == false
Data contains hundreds of "memos" in a csv file with three fields: Memo, Category and Flag ("0"|"1"). All fields are enclosed in quotes and separated by commas. It would be real simple if the Memo field didn't contain double quoted words; then parse data none would even work; but alas many memos contain other "words". It would even be simple if the memos didn't contain commas, then parse data "," or parse/all data "," would work; but alas many memos contain commas in the body. |
JaimeVargas 29-Jun-2006 [1094] | Is every field quoted? |
MikeL 29-Jun-2006 [1095] | Gordon, can you post a copy of short lines of the data? |
Izkata 29-Jun-2006 [1096] | if QuoteStr = "\"", then this looks like it to me:
    "Note", "Category", "Flag"
    "Note", "Category", "Flag"
But you don't have a loop or anything - try this:
    d: parse/all data [
        some [
            thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr
            copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr
            thru newline (print index? data)
        ]
    ] |
Gordon 29-Jun-2006 [1097] | James: Yes every field is quoted. Izkata: Sorry, I left that out.
    QuoteStr: to-char 34
    probe QuoteStr
    == #"^"" |
Izkata 29-Jun-2006 [1098] | hm, I was thinking in C++.... very unusual for me lol |
Gordon 29-Jun-2006 [1099] | Do you need to loop? I thought parse looped by itself ie: data: parse data none |
Izkata 29-Jun-2006 [1100x2] | not as far as I know |
This change in the parse looks like it works:
    >> data: {"Note", "Category", "Flag"
    {    "Note", "Category", "Flag"
    {    "Note", "Category", "Flag"
    {    "Note", "Category", "Flag"
    {    }
    == {"Note", "Category", "Flag"
    "Note", "Category", "Flag"
    "Note", "Category", "Flag"
    "Note", "Category", "Flag"
    }
    >> QuoteStr: to-char 34
    == #"^""
    >> d: parse/all data [
    [    some [
    [    X: thru QuoteStr copy Note to QuoteStr thru QuoteStr thru quotestr
    [    copy Category to QuoteStr thru QuoteStr thru quotestr copy Flag to QuoteStr
    [    thru newline (print index? :X)
    [    ]
    [    ]
    1
    29
    57
    85
    == true | |
Gordon 29-Jun-2006 [1102x2] | Okay, trying it now. I see that the phrase: "print index? data" stays stuck on "1". I see that you have posted a new example. I'll try that. Be right back. |
I'm pretty sure that you are right in that I have to loop through the "Data". That was my big stumbling block and the rest is just logic to figure out. Thanks a bunch. | |
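Editor's note: the two symptoms in this exchange (parse returning false after one record, and "print index? data" staying at 1) can be reproduced on a tiny input. The following is only a minimal sketch, assuming REBOL 2 semantics for parse/all; the sample data and the names line-rule and pos are made up for illustration and are not Gordon's real export.

```rebol
REBOL []

; Hypothetical two-record sample, not Gordon's real export.
data: {"a","b","c"
"d","e","f"
}

line-rule: [thru newline]

; A single pass of the rule consumes one line. Input is left over,
; so PARSE reports failure.
print parse/all data line-rule            ; false

; SOME repeats the rule until it stops matching, which here is the end of input.
print parse/all data [some line-rule]     ; true

; PARSE never moves the DATA word itself, so INDEX? DATA is always 1.
; Mark the current position inside the rule with a set-word instead.
parse/all data [some [pos: line-rule (print index? pos)]]   ; prints 1, then 13
```

The set-word pos: plays the same role as X: in Izkata's rule: it captures where the rule currently is in the input, which is why printing its index advances while the index of data does not.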
Izkata 29-Jun-2006 [1104] | No problem (I'm glad I could actually help '^^ ) |
Gordon 29-Jun-2006 [1105x2] | In the phrase. "Print index :x", what does putting a colon before a variable do again? |
Oops I meant "Print index? :x" | |
Izkata 29-Jun-2006 [1107] | Not sure - I remember seeing it in others' parse rules, so I just put it there and it worked '^^ Take it out and see what happens lol |
Gordon 29-Jun-2006 [1108] | :) |
Izkata 29-Jun-2006 [1109] | I think it was like get-word or something |
BrianH 29-Jun-2006 [1110x3] | ; Did you try this?
    data: read/lines to-file Readfile
    fields: [note category flag]
    foreach x data [
        set fields parse x ","
        ; do something
    ] |
In particular, remember not to use parse/all | |
>> parse {"Hello, World", "Blah"} "," == ["Hello, World" "Blah"] | |
Gordon 29-Jun-2006 [1113] | Hi BrianH; Yes I did try that and the problem was that even though I specified the "," as the delimiter, it came across an embedded quote #"^"" and split the input at the quote. Rebol shouldn't have split it up that way, to my understanding. I will post some simple data to test. |
BrianH 29-Jun-2006 [1114x2] | Embedded quotes should be escaped somehow. |
Please include some troublesome data if you could. | |
Gordon 29-Jun-2006 [1116] | This data was exported by PalmOS. I like the Palm desktop for keeping track of notes, memos and addresses but the search engine sucks badly. Therefore I wanted to export the data to allow a nice Rebol search on it. The PalmOS export function does "escape" an embedded quote by quoting it again. Ex: Press the "Home" button becomes Press the ""Home"" button. |
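Editor's note: one way to handle fields that escape an embedded quote by doubling it is to let the field rule accept either a run of non-quote characters or a doubled quote. This is a rough sketch only, assuming REBOL 2 (parse/all, and copy yielding none on an empty match); the sample data and the names dq, non-quote, field, record, row and rows are hypothetical, and the real Palm export may differ in details such as spacing or line endings.

```rebol
REBOL []

; Hypothetical sample in the doubled-quote style described above.
data: {"Press the ""Home"" button, then wait","Tips","0"
"Plain memo","Work","1"
}

dq: #"^""                            ; the double-quote character
non-quote: complement charset {"}    ; any character except a double quote

rows: copy []

; A field is: opening quote, any mix of non-quote runs and doubled quotes,
; closing quote. Doubled quotes are collapsed after the copy.
field: [
    dq copy value any [some non-quote | dq dq] dq
    (append row either value [replace/all value {""} {"}] [copy ""])
]

; A record is one or more comma-separated fields ending in a newline.
record: [
    (row: copy [])
    field any ["," any " " field] newline
    (append/only rows row)
]

print parse/all data [some record]   ; true
probe rows
```

Because commas and quotes inside a field are consumed by the field rule itself, a comma only counts as a separator when it appears between two complete fields.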
Tomc 29-Jun-2006 [1117] | truth (as far as i know) is: word is a shortcut for :word, but there are a few places, such as inside parse, where the shortcut does not work, so you need to make it explicit |
Gordon 29-Jun-2006 [1118x4] | I will get some troubleshooting data posted in a minute. |
Tomc: Do I understand that :word would be like "get word" except in a parse sentence? | |
Wait I said that wrong | |
Tomc: Do I understand that :word would be like "get word" and needed in a parse sentence but you can just use the shortcut 'word' most everywhere else? | |
BrianH 29-Jun-2006 [1122] | The colon before the word prevents the interpreter from evaluating active values like functions and parens. It's a safety thing. |
Tomc 29-Jun-2006 [1123x3] | works for me |
you can use :word everywhere you would use word | |
as far as i know | |
BrianH 29-Jun-2006 [1126] | Except when you want an active value assigned to the word to be evaluated, like when you are calling a function. |
Tomc 29-Jun-2006 [1127] | and that would be get 'word not get word |
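Editor's note: outside of parse rules, the difference between word and :word only shows up when the value is active, as BrianH says. A small sketch, assuming REBOL 2; greet, alias and other are made-up names.

```rebol
REBOL []

greet: func [] [print "hello"]   ; hypothetical function for illustration

greet                     ; plain word: the function is called, prints hello
alias: :greet             ; get-word: the function value is fetched, not called
other: get 'greet         ; same fetch, spelled with GET and a lit-word
print function? :alias    ; true
alias                     ; calling through the new word, prints hello
```

For inactive values such as strings or integers, word and :word fetch the same thing, which is why the plain word works as a shortcut in most code.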
BrianH 29-Jun-2006 [1128] | In parse rules the :word means something different (not in the code blocks in parens). |
Gordon 29-Jun-2006 [1129] | Thanks Tomc and BrianH. I'll chew on it for a while. Meanwhile I'm working on building some test data for the first problem. |
Tomc 29-Jun-2006 [1130] | in the parse rule (not the paren) it means "be here now" |
Gordon 29-Jun-2006 [1131x2] | okay so in the parse rules (except in a parenthesized code block) it means "be here now"? |
but what does 'be here now' mean? | |
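Editor's note: in a parse rule, a set-word such as mark: records the current input position, and a get-word such as :mark moves the input position back (or forward) to whatever position that word holds. That is the "be here now" behaviour. A minimal sketch, assuming REBOL 2; mark and rest are hypothetical names.

```rebol
REBOL []

parse/all "abcdef" [
    3 skip              ; advance past "abc"
    mark:               ; set-word: remember the current position ("def")
    2 skip              ; advance past "de"
    :mark               ; get-word: jump back to the remembered position
    copy rest to end    ; copying now starts at "def" again
]
print rest              ; def
```

In Izkata's rule the :X sits inside the paren, which is ordinary REBOL code, so there it is the normal get-word and simply fetches the saved position for index?.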