World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
5-Nov-2005
[768]
Anton, I used to use a character rather than a string too, because 
of the memory issue. But it turned out to be slower that way. I think 
parse only matches on strings, and single characters have to be converted 
to one-character strings before they can be passed to the matcher. 
At least that would explain the speed discrepancy.
Romano
5-Nov-2005
[769x2]
Anton, for me it is a wrong behaviour of parse.
I also have some doubts about the correctness of this:
var: 123 probe parse/all "" [copy var ""] var; false == 123
sqlab
7-Nov-2005
[771]
Graham: two points of trouble
1; you lose the order of the segments with your approach, 
eg NTE can be at different positions in the message, set-id is not 
always used in the same way.

2; are you sure that the values of different laboratories can be 
compared? 
are they using the same test kits, do they do ring tests ?
JaimeVargas
7-Nov-2005
[772]
I would agree with Romano. Anton, please report this inconsistent 
behaviour to RAMBO.
Graham
7-Nov-2005
[773x2]
sqlab, doesn't my last parse account for the possibility of the NTE segment 
being anywhere in the message ?
currently I only have data from one laboratory - I'll have to contact 
some others to get theirs to see what their messages are like.
sqlab
8-Nov-2005
[775]
Yes, you get the NTEs, but you do not retain the information about which 
OBR or which OBX they comment on, or whether they even follow the PID.


That's why I use just one block for all segments retaining the order.
Graham
8-Nov-2005
[776x2]
Perhaps I can get around that by instantiating a new hl7object with 
each MSH segment...
Or not ...
Anton
15-Nov-2005
[778x3]
Brian, yes! You remind me of that finding, long ago... I now remember 
thinking how it was good because of less syntax.
Romano, I don't think it is an error anymore. Note my examples contain 
the pipe |, which allows the rule to succeed on none input. eg, these 
rules are equivalent:
[copy var "a" | none]
[copy var "a" | ]
Your example, however, looks like an edge case... Probably parse 
checks for the end of the input before trying to match a string.
Geomol
1-Dec-2005
[781x2]
The 'to position' thing we talked about in the RAMBO group also works 
with blocks:
>> parse ["string" 123 [me-:-server-:-dagobah]] [to 2 mk: (print mk)]
123 [me-:-server-:-dagobah]
== false


TO also works with datatypes, as explained in the Core documentation:

>> parse ["string" 123 [me-:-server-:-dagobah]] [to email! mk: (print mk)]
[me-:-server-:-dagobah]
== false
(Right brackets are left out just after the email address by Altme.)
Henrik
1-Dec-2005
[783]
geomol, out of curiosity, have you tried those incidents where parse 
will lock up and you need to quit REBOL? Have you tried to keep it 
running for a few minutes and see what happens?
Geomol
1-Dec-2005
[784]
nope
Henrik
1-Dec-2005
[785]
just wondering if that is a known behaviour. it returns to the console 
on its own
Geomol
1-Dec-2005
[786]
Strange and funny. Like it has its own personality. ;-)
Henrik
1-Dec-2005
[787x2]
after about 5-7 minutes or so...
it's in one of the examples in the wikibook group
Geomol
1-Dec-2005
[789x2]
It's an almost infinite loop then. Takes all the cpu.
Yes, it replies == false after some time here too.
Henrik
1-Dec-2005
[791x2]
good, then I'm not crazy :-)
wondering if there is some sanity limit on a billion loops or something
Geomol
1-Dec-2005
[793]
:-) Maybe integer overflow?
Henrik
1-Dec-2005
[794]
possibly if there is something that needs counting...
Geomol
1-Dec-2005
[795x3]
Anyway, parse works in a certain (and you could say special) way, 
when dealing with strings:
>> parse "a" [char!]
== false
>> parse "a" [string!]
== false
>> parse "a" [#"a"]
== true
>> parse "a" ["a"]
== true
So parsing a string for [any string!] maybe doesn't make much sense.
And parsing strings and chars within blocks makes more sense when 
looking for datatypes:
>> parse ["a"] [char!]
== false
>> parse ["a"] [string!]
== true
>> parse [#"a"] [string!]
== false
>> parse [#"a"] [char!]
== true
Henrik
1-Dec-2005
[798]
using datatypes as rules only works with blocks, I think, not strings, 
which is why it returns false in the first two cases with "a"
Chris
1-Dec-2005
[799x2]
Sadly, in string mode you can't use TO with a charset!
Be great if you could:
>> chars: charset "ab"
== make bitset! 64#{AAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAA=}
>> parse "1234ab" [to chars]

** Script Error: Invalid argument: make bitset! 64#{AAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAA=}
** Near: parse "1234ab" [to chars]
Geomol
1-Dec-2005
[801]
You can with a little trick:

>> no-chars: complement charset "ab"
== make bitset! #{
FFFFFFFFFFFFFFFFFFFFFFFFF9FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
>> parse "1234ab" [some no-chars mk: (print mk) to end]
ab
== true
Chris
1-Dec-2005
[802]
Yep, that's the workaround I use, but it can get complex...
Anton
2-Dec-2005
[803]
I seem to remember Carl S. saying something about adding a safety 
check to parse to jump out of infinite loops...
Joe
2-Jan-2006
[804x3]
Hi, I am checking the example in the parse chapter, section 7.3, copying 
the output, but I don't get it!
parse {<h1>title</h1>} [copy heading ["H" ["1" | "2" | "3"]] (print heading)]
the text seems to imply that heading should contain the "title" string, 
but it doesn't work for me. any ideas ?
Geomol
2-Jan-2006
[807]
It's not a good example. You could do something like:


>> parse {<H1>A heading</H1>} [copy heading ["<H" ["1" | "2" | "3"] ">" to "</"]]
== false
>> heading
== "<H1>A heading"
Joe
2-Jan-2006
[808x2]
yes,  this makes sense but it doesn't handle opening and closing 
tags as the example seems to imply. I'll report to rambo so the docs 
get updated
thanks
Ammon
8-Jan-2006
[810x2]
Something I will never understand about parse:  

digit: charset [#"0" - #"9"]
parse "123" [ any [ digit | end ] ]

I hit the end of the damn string, SO QUIT!  *&(%@$(*&($#*%
NM...  I got it.  Parse is tricksy.
Henrik
8-Jan-2006
[812]
solution?
Ammon
8-Jan-2006
[813]
parse "123" [ any [ digit | thru end ] ]
Henrik
8-Jan-2006
[814x2]
yeah, each rule stops at a position, not going past it. that way you 
wouldn't be able to reach the tail of the series. THRU will get you 
to the tale
tail
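One way to sidestep the issue discussed above is to keep END out of the repeated rule altogether. The following is only a sketch, assuming the same `digit` charset Ammon defined; it is not what anyone in the thread settled on.

```rebol
digit: charset [#"0" - #"9"]

; Loop only over the digits, then test for the tail once, outside the loop.
; END matches without consuming input, so keeping it out of ANY avoids
; repeating a rule that never advances.
parse "123" [any digit end]

; SOME requires at least one digit. PARSE itself only succeeds when the
; rules leave the input at its tail, so the explicit END is optional here.
parse "123" [some digit]
```

Whether the explicit END is worth writing is a matter of taste; it documents the intent that the whole input must be digits.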
Ammon
8-Jan-2006
[816]
The problem wasn't so much hitting the end as it was figuring out 
how to keep all of my data.  I'm breaking 
up some text using a semi-dynamic set of recursive rules and I kept 
losing my last bit of data or hanging the interpreter...
Anton
8-Jan-2006
[817]
Ah, recursive rules. :) I pushed my variables onto a stack when recursing, 
then popped them off when returning.

That tends to bloat the code a fair bit. ( push-vars [a b c] recursive-rule 
pop-vars [a b c] )

so then you get to thinking about generating this code automatically with 
a make-recursive-rule function, which takes a parse rule, looks for 
recursion in it, then surrounds it with push-vars and pop-vars for you 
(kind of macro expansion).
Or I did something like that, anyway.
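The push/pop idea Anton describes could look roughly like the sketch below. The names `push-vars` and `pop-vars` come from his message; their bodies and the `var-stack` block are assumptions made for illustration, not his actual code.

```rebol
var-stack: copy []   ; hypothetical name: one saved block per recursion level

; Save the current values of the given words on the stack.
; Assumes every word already has a value (GET fails on an unset word).
push-vars: func [words [block!] /local values] [
    values: copy []
    foreach word words [append/only values get word]
    append/only var-stack values
]

; Restore the most recently saved values into the same words.
pop-vars: func [words [block!] /local values] [
    values: last var-stack
    remove back tail var-stack
    set words values
]

; Plain usage outside of PARSE:
a: 1 b: 2
push-vars [a b]
a: 10 b: 20
pop-vars [a b]      ; a and b are 1 and 2 again

; Inside a rule the calls go in parens around the recursive reference:
;   (push-vars [a b c]) recursive-rule (pop-vars [a b c])
```

One thing to watch with this sketch: if `recursive-rule` fails, the trailing paren is never evaluated, so the stack would need to be rebalanced on the alternate branch.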