World: r3wp
[Parse] Discussion of PARSE dialect
Brett 12-Mar-2005 [44x2] | Hi Graham. Line ending in Internet protocols = Apples; Line/Paragraph representation in text files = Oranges. :-) Best not to compare them for this task. As far as I've seen, Internet protocols have an on-the-wire line ending of CRLF. So yes this is very likely a problem with your parse rules. |
With the port in binary mode - you must use CRLF to identify/transmit line endings. | |
Graham 12-Mar-2005 [46] | Hi Brett. Do you then see a problem with my parse rules? |
Tomc 12-Mar-2005 [47] | seems like, to get all possible raw line endings, something like copy line thru [ 1 2 ["^J" | "^M"]] might work |
Graham 12-Mar-2005 [48x2] | my one-line-rule seems to work already .. |
copy header thru [ 2 [ "^J" | "^M"] ] ... ? | |
Tomc 12-Mar-2005 [50] | unix would be a single "^J" |
Graham 12-Mar-2005 [51x2] | header is separated from body by one blank line. |
So, I need to look for two consecutive line endings to find the end of the header | |
Tomc 12-Mar-2005 [53] | yes, my rule was for a single line, as in your first code sample |
Graham 12-Mar-2005 [54] | is ^J the line feed ? |
Tomc 12-Mar-2005 [55x2] | copy header to ["^J^J" | "^M^M" | "^M^J^M^J"] |
^J same as ^/ | |
Graham 12-Mar-2005 [57] | >> header-rule: [copy header thru ["^J^J" | "^M^M" | "^M^J^M^J"] (write-client header)] == [copy header thru ["^/^/" | "^M^M" | {^M ^M }] (write-client header)] >> parse m header-rule ** Script Error: Invalid argument: | ** Near: parse m header-rule |
Tomc 12-Mar-2005 [58x2] | try {^M^/^M^/} |
not that it should matter | |
Graham 12-Mar-2005 [60] | same problem |
Tomc 12-Mar-2005 [61x3] | it is interpreting the third newline ... odd buggy odd |
parse all? | |
or use | 2 CRLF | |
Graham 12-Mar-2005 [64] | >> unix: [ copy header thru "^J^J" ] == [copy header thru "^/^/"] >> msdos: [ copy header thru "^/^/" ] == [copy header thru "^/^/"] >> parse m [ [ unix | msdos ] ( write-client header ) ] |
Tomc 12-Mar-2005 [65] | ms dos not correct |
Graham 12-Mar-2005 [66] | is that correct rule for unix ? |
Tomc 12-Mar-2005 [67] | yes |
Graham 12-Mar-2005 [68] | what's msdos ? |
Tomc 12-Mar-2005 [69x3] | mac ^M^M dos ^M^J unix ^J |
mac is a single ^M for a single line | |
[2 "^/" | 2 "^M" | 2 CRLF] | |
Graham 12-Mar-2005 [72x2] | >> parse m [ [2 "^/" | 2 "^M" | 2 crlf ] ( write-client header ) ] == false |
dos is : join crlf crlf .. isn't it ? | |
Tomc 12-Mar-2005 [74] | ... copy header thru ... |
Graham 12-Mar-2005 [75] | parse m [ copy header thru [2 "^/" | 2 "^M" | 2 crlf ] ( write-client header ) ] ** Script Error: Invalid argument: 2 | 2 crlf |
Tomc 12-Mar-2005 [76x2] | copy header [thru 2 "^/" | thru 1 "^M" | thru 2 crlf] |
it annoys me that to/thru does not distribute over the block of ORs | |
Graham 12-Mar-2005 [78] | me too ... |
Tomc 12-Mar-2005 [79] | off to the coast for the weekend, got to pack |
Graham 12-Mar-2005 [80x3] | ok. |
This rule seems to now work for me ... header-rule: [ [ copy header thru {^M^J^M^J} | copy header thru {^/^/} ] (write-client header )] | |
not sure if the second rule is needed ... | |
Chris 12-Mar-2005 [83x2] | ;I tend to use charsets as a way of skipping 'thru while looking for multiple possibilities: line-end: charset [#"^/" #"^M"] ; etc line: complement newlines parse m [copy header [some [some line line-end]]] |
a way of skipping -> "an alternative to" | |
Graham 12-Mar-2005 [85] | is it faster? |
Chris 12-Mar-2005 [86] | It's hard to compare when 'thru doesn't fork. |
Graham 12-Mar-2005 [87] | or is it slower ? |
Chris 12-Mar-2005 [88] | For what it's worth, my benchmarks show it to be quick, but my benchmarks tend to be crude... |
Graham 12-Mar-2005 [89] | line: complement line-end |
Chris 12-Mar-2005 [90] | That means any character that isn't a line-ending... |
Graham 12-Mar-2005 [91] | yes, you have complement newlines |
Chris 12-Mar-2005 [92x2] | Sorry, changing names as I go :o) |
line-end: charset [#"^/" #"^M" #"^J"] line: complement line-end parse m [copy header [some [some line line-end]] to end] ; Note that 'line-end in the parse line should be replaced with permutations of what a line-ending can be, without describing any permutation of a double line-ending. | |
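Chris's closing note can be sketched roughly as follows. This is only an illustration, assuming REBOL 2 parse semantics and that m is a string! rather than a binary! value. The names eol, non-eol and split-header are made up for this sketch and do not come from the thread. The idea is to spell out each single line ending as its own alternative (CRLF, LF, CR), never a doubled one, and let the grammar find the blank line.

```rebol
REBOL []

; Hypothetical names: eol, non-eol, split-header. None come from the thread.
; One single line ending. CRLF is tried first so it is not read as a bare CR.
eol: [crlf | lf | cr]

; Any character that is not part of a line ending.
non-eol: complement charset "^/^M"

split-header: func [
    "Split a message into header and body at the first blank line."
    msg [string!]
    /local header body
][
    header: body: none
    parse/all msg [
        copy header some [some non-eol eol]  ; one or more non-empty lines
        eol                                   ; the blank line itself
        copy body to end                      ; whatever follows
    ]
    reduce [header body]
]

; The same rule handles CRLF and bare LF input
probe split-header "Subject: hi^M^/To: g^M^/^M^/body text"
probe split-header "Subject: hi^/To: g^/^/body text"
```

Because the rule walks the header line by line, it stops at the first blank line in the input. A rule such as [thru {^M^J^M^J} | thru {^/^/}] tries its alternatives in order, so it could jump past an LF-only header if a CRLF pair happens to appear later in the body.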