World: r3wp
[Parse] Discussion of PARSE dialect
Geomol 6-Aug-2007 [2201x2] | Sorry about any confusion! :-) |
And I guess tab can be specified in the rule block more simply, as in: ["word1" "word2" tab "word3" newline "word4"] | |
Gabriele 6-Aug-2007 [2203] | you need parse/all to be able to parse spaces and tabs. |
[unknown: 5] 7-Aug-2007 [2204] | Thanks everyone for your posts on this |
Geomol 7-Aug-2007 [2205] | Paul, it could be good to know, if you got it to work!? |
Chris 7-Aug-2007 [2206] | G: #"^(tab)" |
Geomol 7-Aug-2007 [2207] | Probably. |
PatrickP61 20-Aug-2007 [2208x2] | Hi all, |
Are there any good references to learn PARSE? | |
Henrik 20-Aug-2007 [2210] | There is a parse page on the Wikibook. |
PatrickP61 20-Aug-2007 [2211] | Found it -- Thanks |
Geomol 20-Aug-2007 [2212] | Also this about parsing: http://www.rebol.com/docs/core23/rebolcore-15.html |
[unknown: 5] 24-Aug-2007 [2213] | parse/all and used "^-" for tab |
[unknown: 5] 31-Aug-2007 [2214x2] | Ok ran into an issue. Is there an easy way to parse a string that has doublequotes in it together. Such as {some chars "" some more chars"" and more} |
I need the quotes to be single just one set and not two together and the parse to keep intact the string section because often it is a part of an html tag. | |
Robert 1-Sep-2007 [2216x2] | Paul, do a search & replace upfront. Much simpler than to create complex parse rules. |
I often use this pattern: do some basic processing on the parse input, parse a first round, then do some other processing and use parse again. Much simpler and faster to get where you want to go. | |
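A minimal sketch of the preprocessing step Robert describes, assuming the goal is to collapse each pair of adjacent double quotes into a single one before any parsing happens. The word `cleaned` is only an illustrative name, and the sample string is the one Paul posted.

```rebol
; Sample input from the discussion above.
data: {some chars "" some more chars"" and more}

; Work on a copy so the original string is left untouched,
; then replace every doubled quote with a single quote.
cleaned: replace/all copy data {""} {"}

probe cleaned
```

This only normalizes the quotes. The cleaned string would still need to be split in a separate step.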
Tomc 1-Sep-2007 [2218] | paul what rule are you using for your parse |
[unknown: 5] 1-Sep-2007 [2219x3] | Thanks Robert, I'll look into that further. I did play with replace, but because they were quotes it seemed that parse/all still wanted to break apart at a quote even though I told it only tabs. |
Tom, for parse I only want to parse/all data tab. Problem is that parse will break apart html tags and more. I don't want to parse out tags because they need to be left intact to some extent. | |
It just seems to me that parse/all data tab doesn't ONLY parse out the tabs but breaks at these doublequotes together. | |
Tomc 2-Sep-2007 [2222x3] | Paul how are you defining tab? it seems to work for me. |
str == {some chars "" some more chars"" and more} >> parse/all str "^-" == [{some chars "" some more chars"" and more}] | |
there are no breaks at the double quotes. | |
[unknown: 5] 2-Sep-2007 [2225x11] | It looks like it breaks on html tags that might be broken. For example, I was testing parse on a tab-delimited file and performing the following parse: |
parse data "^-" | |
The problem is that some of these broken html tags cause parse to not work correctly. The tags will contain double quotes (The result of an export from oscommerce). | |
sorry i was using parse/all data "^-" | |
Doesn't even have to be broken tags it appears | |
Just when a quote is preceding the tag | |
data: {my string^-"<span style="font: 12px arial;>some text</span>"} | |
>> parse/all data "^-" == ["my string" "<span style=" {font: 12px arial;>some text</span>"}] | |
Notice you get it breaking the string even where there is NOT a tab. | |
Is this a bug? | |
I've looked at this some more and it only seems to be a problem if the quote is preceding the <span> tag. If you move the quote around you get the expected parsing. | |
btiffin 2-Sep-2007 [2236x2] | Your example still doesn't seem to jive with the documentation. Reading the docs, I would expected two strings in the output block. "my string" and the rest, in braces. It has something to do with a double quote starting a parse sequence. {"abc"def} parses as ["abc" "def"] { "abc"def"} parses as a single string as expected [{ "abc"def}] |
Typos: expected = expect; the second example was supposed to be { "abc"def}. The space after the brace seems to trigger different behaviour than {" with no space after the brace. Any character actually; the bad behaviour is only with brace immediately followed by double quote. | |
[unknown: 5] 3-Sep-2007 [2238] | btiffin use this example: data: {my string^-"<span style="font: 12px arial;>some text</span>"} |
btiffin 3-Sep-2007 [2239] | Yeah, I think the weird parsing behavior is due to the fact that the tab separator is followed immediately by a token that begins with double quote. If you change the data to ... ^- "<span... (note the space after the tab), the behaviour changes, giving >> parse/all data2 to string! tab == ["my string" { "<span style="font: 12px arial;>some text</span>"}] As I would expect. You've uncovered something here. parse seems dependent on quote as the first symbol in a token. |
[unknown: 5] 3-Sep-2007 [2240x2] | Yeah but inserting the extra space is a crude workaround that still requires extra processing to then remove the space that was added. You think this is a bug with parse? |
I would add it to Rambo but not sure if it is one just yet. | |
RobertS 3-Sep-2007 [2242] | is there something I could test on 2.7.5 ? |
[unknown: 5] 3-Sep-2007 [2243x2] | sure test this: |
parse/all data: {my string^-"<span style="font: 12px arial;>some text</span>"} "^-" | |
btiffin 3-Sep-2007 [2245] | Paul; re; Inserting extra space... No sorry, didn't mean to imply that. Just pointing out that you've discovered a bug; afaik. |
[unknown: 5] 3-Sep-2007 [2246] | Yeah, I messaged Gabriele - want him to take a look at it. |
Gabriele 4-Sep-2007 [2247] | it's not a bug - parse without a rule is meant for csv parsing, and quotes delimit a field. it's not as useful as it was intended to be, but it's intentional behavior. you need to provide your own rule if you don't want quotes to be parsed. |
btiffin 4-Sep-2007 [2248] | Umm, that's not quite what is happening here. imho. parse/all {"abc"def} to string! tab should return [{"abc"def}] should it not? ["abc" {def}] seems wrong. parse/all { "abc"def} to string! tab returns [{ "abc"def"}] as expected. The quote being in the first position affects the parse behaviour that much? |
[unknown: 5] 4-Sep-2007 [2249] | Yeah, it definitely seems like odd behavior to me. Also, isn't the TAB string the rule? Maybe I don't get what you're saying, Gabriele. |
PeterWood 4-Sep-2007 [2250] | Paul: I don't think the TAB string counts as a rule. It is a parameter supplying a specified delimiter when using parse for splitting strings (paraphrasing the User Guide). |
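One way to follow Gabriele's suggestion of supplying your own rule is sketched below for REBOL 2. It is an assumption about what such a rule could look like, not code posted in the thread: `split-on-tab`, `fields` and `field` are hypothetical names. The rule block copies everything up to each tab, so double quotes get no special treatment.

```rebol
; Hypothetical helper: split a string on tab characters only.
; Double quotes are treated as ordinary characters.
split-on-tab: func [data [string!] /local fields field] [
    fields: copy []
    parse/all data [
        any [
            ; Copy up to the next tab, then skip the tab itself.
            ; In REBOL 2 an empty copy yields none, hence the fallback.
            copy field to tab tab (append fields any [field copy ""])
        ]
        ; Whatever remains after the last tab is the final field.
        copy field to end (append fields any [field copy ""])
    ]
    fields
]

data: {my string^-"<span style="font: 12px arial;>some text</span>"}
probe split-on-tab data
```

With Paul's sample data this should give two fields, split at the tab only, with the quoted span left in one piece.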