World: r3wp


[Parse] Discussion of PARSE dialect

Graham 4-Nov-2005 [753x2]	I then have to map the results so that different laboratories' fields can be made equivalent.
Graham 4-Nov-2005 [753x2]	And the data can then be codified.
sqlab 4-Nov-2005 [755]	What do you mean by "different laboratory's fields can be made equivalent" and "the data can be codified"?
Graham 4-Nov-2005 [756]	Rather than storing the HL7 result as free text, to store each sub test in a database. So, a Hb result will be stored as a Hb record. Another laboratory might call that "haemoglobin", so I need to map these two together.
sqlab 4-Nov-2005 [757]	Do you want to do your mapping in the database or with Rebol?
Graham 4-Nov-2005 [758x2]	in the database ... Here's my new parser using your approach.
Graham 4-Nov-2005 [758x2]	hl7msg: make object! [
	    msh: [] msa: [] pid: [] obr: [] obx: [] nte: []
	]
	datafile: %hl7data.txt
	parse-hl7msg: func [datafile [string!] /local segment segbl v] [
	    hl7: make hl7msg []
	    trim/head/tail datafile
	    append datafile {^/}
	    line-rule: [
	        copy segment to "^/" 1 skip (
	            segbl: parse/all segment "|"
	            either segbl/1 = "OBX" [
	                insert/only tail hl7/obx skip segbl 1
	            ] [
	                v: to-word segbl/1
	                insert hl7/:v skip segbl 1
	            ]
	        )
	    ]
	    parse/all datafile [some line-rule]
	    hl7
	]
	test: parse-hl7msg read datafile
BrianH 4-Nov-2005 [760]	Again, if you are just matching a single character or a fixed string, it is better and faster to just match it instead of matching a charset of that character. You don't need the caret, non-caret, pipe and nonpipe charsets you have above - the strings "^^" and "|" will do just as well.
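A minimal sketch of the point about matching a fixed string directly, assuming REBOL 2 and a made-up sample segment (the words `segment`, `seg-id` and `fields` are hypothetical, not from the discussion). Because "|" is matched as a literal, TO can be used and no pipe or non-pipe charset is needed:

```rebol
REBOL []

; Hypothetical sample segment, not taken from the discussion.
segment: "OBX|1|NM|Hb"

; Match the literal "|" directly instead of a one-character charset.
parse/all segment [
    copy seg-id to "|" skip     ; seg-id is "OBX"; SKIP steps over the pipe
    copy fields to end          ; fields is the rest: "1|NM|Hb"
]

print seg-id
print fields
```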
Anton 5-Nov-2005 [761x7]	Graham, I agree with BrianH. It should speed up your parse, and make it easier to read because you can use TO and THRU again. caret: #"^^" etc
	(and using a character instead of a string will save a tiny bit of memory too, I think)
	I have a strange issue of my own:
	var: 123 parse/all "a" [copy var "b" | (?? var)] ; ---> var: none
	var: 123 parse/all "a" [[copy var "b"] | (?? var)] ; ---> var: 123
	var: 123 rule: [copy var "b"] parse/all "a" [rule | (?? var)] ; ---> var: 123
	I want to know what happens when COPY fails to match the input. In the first case, it modifies VAR, changing it from 123 to NONE. In the second case, it leaves VAR alone.
	Oh! I think I know what's happening !
	Yep, understand it now. It's like this:
	var: 1
	parse "" [copy var "a" |] ;== true
	var ;== none
BrianH 5-Nov-2005 [768]	Anton, I used to use a character rather than a string too, because of the memory issue. But it turned out to be slower that way. I think parse only matches on strings, and single characters have to be converted to one-character strings before they can be passed to the matcher. At least that would explain the speed discrepancy.
Romano 5-Nov-2005 [769x2]	Anton, for me it is a wrong behaviour of parse.
Romano 5-Nov-2005 [769x2]	I also have some doubts about the correctness of this: var: 123 probe parse/all "" [copy var ""] var ; false == 123
sqlab 7-Nov-2005 [771]	Graham: two points of trouble. 1: you lose the order of the segments with your approach, e.g. NTE can be at different positions in the message, and set-id is not always used in the same way. 2: are you sure that the values of different laboratories can be compared? Are they using the same test kits, do they do ring tests?
JaimeVargas 7-Nov-2005 [772]	I would agree with Romano. Anton, please report this inconsistent behaviour to RAMBO.
Graham 7-Nov-2005 [773x2]	sqlab, doesn't my last parse account for the possibility of an NTE segment being anywhere in the message?
Graham 7-Nov-2005 [773x2]	currently I only have data from one laboratory - I'll have to contact some others to get theirs to see what their messages are like.
sqlab 8-Nov-2005 [775]	Yes, you get the NTEs, but you do not retain the information about which OBR or which OBX they comment on, or whether they even follow the PID. That's why I use just one block for all segments, retaining the order.
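The single-block idea can be sketched roughly as follows. This is only an illustration, assuming REBOL 2; the function name `parse-hl7-ordered` and the sample message are hypothetical and are not sqlab's actual code. Each segment becomes one sub-block of fields, and the sub-blocks stay in message order, so an NTE remains next to the segment it follows:

```rebol
REBOL []

; Hypothetical helper: keeps every segment, in message order, in one block.
parse-hl7-ordered: func [msg [string!] /local segments line] [
    segments: copy []
    foreach line parse/all msg "^/" [       ; split the message into lines
        if not empty? line [
            ; split the line into fields and keep it as one sub-block
            append/only segments parse/all line "|"
        ]
    ]
    segments
]

; Made-up message for illustration only.
msg: "MSH|LAB^/PID|1|Smith^/OBR|1^/OBX|1|NM|Hb^/NTE|1|checked twice"

foreach seg parse-hl7-ordered msg [probe seg]
```

One caveat with this sketch: the string-splitting form of PARSE is general-purpose, so real HL7 data (for example fields containing double quotes) may need a hand-written rule instead.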
Graham 8-Nov-2005 [776x2]	Perhaps I can get around that by instantiating a new hl7object with each MSH segment...
Graham 8-Nov-2005 [776x2]	Or not ...
Anton 15-Nov-2005 [778x3]	Brian, yes! You remind me of that finding, long ago... I now remember thinking how it was good because of less syntax.
	Romano, I don't think it is an error anymore. Note my examples contain the pipe |, which allows the rule to succeed on none input. eg, these rules are equivalent: [copy var "a" | none] [copy var "a" | ]
	Your example, however, looks like an edge case... Probably parse checks for the end of the input before trying to match a string.
Geomol 1-Dec-2005 [781x2]	The 'to position' thing, we talk about in the RAMBO group also works with blocks:
	>> parse ["string" 123 [me-:-server-:-dagobah]] [to 2 mk: (print mk)]
	123 [me-:-server-:-dagobah]
	== false
	TO also works with datatypes, as explained in the Core documentation:
	>> parse ["string" 123 [me-:-server-:-dagobah]] [to email! mk: (print mk)]
	[me-:-server-:-dagobah]
	== false
Geomol 1-Dec-2005 [781x2]	(Right brackets are left out just after the email address by Altme.)
Henrik 1-Dec-2005 [783]	geomol, out of curiosity, have you tried those incidents where parse will lock up and you need to quit REBOL? Have you tried to keep it running for a few minutes and see what happens?
Geomol 1-Dec-2005 [784]	nope
Henrik 1-Dec-2005 [785]	just wondering if that is a known behaviour. it returns to the console on its own
Geomol 1-Dec-2005 [786]	Strange and funny. Like it has its own personality. ;-)
Henrik 1-Dec-2005 [787x2]	after about 5-7 minutes or so...
Henrik 1-Dec-2005 [787x2]	it's in one of the examples in the wikibook group
Geomol 1-Dec-2005 [789x2]	It's an almost infinite loop then. Takes all the cpu.
Geomol 1-Dec-2005 [789x2]	Yes, it replies == false after some time here too.
Henrik 1-Dec-2005 [791x2]	good, then I'm not crazy :-)
Henrik 1-Dec-2005 [791x2]	wondering if there is some sanity limit on a billion loops or something
Geomol 1-Dec-2005 [793]	:-) Maybe integer overflow?
Henrik 1-Dec-2005 [794]	possibly if there is something that needs counting...
Geomol 1-Dec-2005 [795x3]	Anyway, parse works in a certain (and you could say special) way, when dealing with strings:
	>> parse "a" [char!]
	== false
	>> parse "a" [string!]
	== false
	>> parse "a" [#"a"]
	== true
	>> parse "a" ["a"]
	== true
	So parsing a string for [any string!] maybe doesn't make much sense.
	And parsing strings and chars within blocks give more meaning, when looking for datatypes:
	>> parse ["a"] [char!]
	== false
	>> parse ["a"] [string!]
	== true
	>> parse [#"a"] [string!]
	== false
	>> parse [#"a"] [char!]
	== true
Henrik 1-Dec-2005 [798]	using datatypes as rules, I think only work with blocks, not strings, which is why it returns false in the first two cases with "a"
Chris 1-Dec-2005 [799x2]	Sadly, in string mode you can't use -- to charset!
Chris 1-Dec-2005 [799x2]	Be great if you could:
	>> chars: charset "ab"
	== make bitset! 64#{AAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAA=}
	>> parse "1234ab" [to chars]
	Script Error: Invalid argument: make bitset! 64#{AAAAAAAAAAAAAAAABgAAAAAAAAAAAAAAAAAAAAAAAAA=}
	Near: parse "1234ab" [to chars]
Geomol 1-Dec-2005 [801]	You can with a little trick:
	>> no-chars: complement charset "ab"
	== make bitset! #{ FFFFFFFFFFFFFFFFFFFFFFFFF9FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF }
	>> parse "1234ab" [some no-chars mk: (print mk) to end]
	ab
	== true
Chris 1-Dec-2005 [802]	Yep, that's the workaround I use, but it can get complex...
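One way to keep the workaround manageable is to wrap it in a small rule builder. The sketch below assumes REBOL 2; `upto` and `to-chars` are hypothetical names, not built-ins. It also uses ANY rather than SOME, so the rule still succeeds when the input already starts with one of the wanted characters:

```rebol
REBOL []

; Hypothetical helper (not a built-in): returns a rule that skips every
; character NOT in cs, stopping at the first character that is in cs.
; Unlike TO, the rule also succeeds when it reaches the end of the input.
upto: func [cs [bitset!]] [
    compose [any (complement cs)]
]

chars: charset "ab"
to-chars: upto chars

; Same input as in the session above; mk ends up at "ab".
parse/all "1234ab" [to-chars mk: (print mk) to end]
```

Because the built rule never fails, a caller that needs TO's "fail if not found" behaviour would still have to check that the position after the rule is not at the end of the input.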