World: r3wp
[Parse] Discussion of PARSE dialect
BrianH 4-Nov-2005 [703] | Oh wait, the error in his parse statement isn't because of escaping. It's the thru bitset that isn't supported. Try this: caret: charset "^^" non-caret: complement caret parse obx ["OBX" "|" digits "|" "ST" "|" any non-caret caret to end ] |
Graham 4-Nov-2005 [704] | does that return true ? |
Volker 4-Nov-2005 [705] | should work like thru. "any non-caret" is like "to caret", then skipping it is like "thru caret". Without a caret in the input it would go to the end, and then the final skip would fail. |
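A minimal sketch of the equivalence Volker describes, for REBOL 2; the string and word names below are illustrative, not from the thread:

    caret: charset "^^"               ; bitset holding just the caret character
    non-caret: complement caret
    s: "abc^^def"                     ; the source string abc^def
    parse s [any non-caret caret copy rest to end]   ; behaves like [thru "^^" copy rest to end]
    print rest                        ; def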
Graham 4-Nov-2005 [706] |
>> obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
== "OBX|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
>> caret: charset "^^"
== make bitset! #{0000000000000000000000400000000000000000000000000000000000000000}
>> non-caret: complement caret
== make bitset! #{FFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF}
>> parse obx [ "OBX" copy test some non-caret caret to end ]
== false
>> test
== "|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
|
Volker 4-Nov-2005 [707x3] | Yes, so it works |
remember, the "^" in your string are not real "^", else they would be molded as "^^" | |
I would really get the data from outside: put it in a file and use obx: read %data.txt, or use the clipboard. | |
Graham 4-Nov-2005 [710] | ok, I'll give that a go later on. |
Volker 4-Nov-2005 [711x2] | obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/} |
(or copy from altme ;) | |
Graham 4-Nov-2005 [713] | obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/} |
Volker 4-Nov-2005 [714] | Hu? Does altme escape too? There should be double "^^" everywhere. |
Graham 4-Nov-2005 [715x2] |
>> parse read clipboard:// [ copy test some non-caret caret to end ]
== true
>> test
== "obx: {OBX|1|ST|Hb"
|
ok so that works. | |
BrianH 4-Nov-2005 [717] | Of course since you are looking for one character, you can use to or thru like this: parse read clipboard:// [ copy test thru "^^" to end ] |
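Applied to the properly escaped string Volker posted, BrianH's one-liner can also be tried without the clipboard; a small usage sketch, not from the thread:

    obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/}
    parse obx [copy test thru "^^" to end]   ; == true
    probe test                               ; == "OBX|1|ST|Hb^^" -- up to and including the first caret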
Volker 4-Nov-2005 [718] | Meoww, that's too simple for me.. good catch :) |
Graham 4-Nov-2005 [719x3] | well, if RT want to use Rebol in medical applications, they should look at making it easier to work with medical data ! |
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/ | |
Just using Altme as a clipboard. | |
BrianH 4-Nov-2005 [722] | Is there a standard we can read to get the syntax? |
Graham 4-Nov-2005 [723] | HL7 - yeah, hundreds of pages of specs. |
BrianH 4-Nov-2005 [724] | Aaak. |
Graham 4-Nov-2005 [725] | http://www.hl7.org/ |
BrianH 4-Nov-2005 [726] | For instance, how many fields does that data you posted have? Are they separated by | or is it a length thing? |
Graham 4-Nov-2005 [727] | I don't know .. I am just looking at sample data and trying to reverse engineer the format as I don't have time to read 100s of pages of specs. |
BrianH 4-Nov-2005 [728] | What does the ^ mean in context? |
Graham 4-Nov-2005 [729x2] | but each OBX record is one blood result |
It seems to be a delimiter to divide a record into parts | |
BrianH 4-Nov-2005 [731] | Really? By the format it looks like they are using | for that. |
Graham 4-Nov-2005 [732x3] | so, | separates fields, and ^ sub divides a field |
OBR|1|3CHI|05-556701-MHA-0^VDL|MHA^MASTER HAEM PANEL^L|R|200511021006|200511021006|""|""|||||200511021006||10761^CHIU&G|||10761^CHIU&G|10761^CHIU&G|3CHI^chiu|200511021152|||F
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F
OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F
OBX|3|ST|mV^ MCV:^L||95|fL|81 - 97|N|||F
OBX|4|ST|mh^ MCH:^L||32.3|pg|26.5 - 33.0|N|||F
OBX|5|ST|pl^ Platelets:^L||224|x 10*9/L|150 - 450|N|||F
OBX|6|ST|es^ ESR:^L||23|mm/hr|1 - 27|N|||F
OBX|7|ST|wc^ WCC:^L||7.7|x 10*9/L|3.8 - 10.0|N|||F
OBX|8|ST|Nt^ Neutrophils:^L||4.5|x10*9/L|1.9 - 7.1|N|||F
OBX|9|ST|Ly^ Lymphocytes:^L||2.6|x10*9/L|0.6 - 3.6|N|||F
OBX|10|ST|Mo^ Monocytes:^L||0.5|x10*9/L|0.2 - 1.0|N|||F
OBX|11|ST|Eo^ Eosinophils:^L||0.1|x10*9/L|< 0.6|N|||F
OBX|12|ST|Ba^ Basophils:^L||0.05|x10*9/L|0.00 - 0.10|N|||F
OBX|13|FT|bf^Comments^L||COMMENT: RBC parameters normochromic normocytic.|||N|||F
NTE|1|L|CC Drs: MALIK, CHIU.
| |
I've omitted the MSH MSA and PID lines which identify the patient. | |
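A small sketch of the splitting Graham describes (fields on "|", components on "^"), using REBOL 2's simple parse; the line below is taken from the sample above:

    line: "OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F"
    fields: parse/all line "|"              ; split on the field separator, keeping empty fields
    probe fields/4                          ; == "Hb^^ Hb:^^L" -- the observation identifier field
    components: parse/all fields/4 "^^"     ; split that field on the component separator
    probe components                        ; == ["Hb" " Hb:" "L"]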
BrianH 4-Nov-2005 [735x2] | So the first field specifies the record format, the number of fields and such. The other fields are data. |
Thanks for that by the way, I'd rather not know. | |
Graham 4-Nov-2005 [737x2] | no, I think the numbers indicate a sequence |
so, there are 13 OBX records for the OBR result. | |
BrianH 4-Nov-2005 [739] | I mean OBX is the record type, and OBX records have 11 additional fields to them. |
Graham 4-Nov-2005 [740x3] | Yes, I think so. |
I presume ST stands for sub test. | |
The HL7 org want to move to using XML instead ... | |
BrianH 4-Nov-2005 [743] | Do you want to do a full rule-based parse here, or will simple parse do?
data: read/lines %data
foreach rec data [
    rec: parse/all rec "|"
    switch rec/1 [
        "OBX" [ ... do stuff...
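One hedged way to flesh that sketch out against the sample data above; the field positions are guesses from the posted records, and %hl7data.txt is an assumed file name:

    data: read/lines %hl7data.txt
    foreach rec data [
        rec: parse/all rec "|"
        switch rec/1 [
            "OBX" [
                ; guessed positions: rec/4 test id, rec/6 result, rec/7 units, rec/8 reference range
                print [rec/4 rec/6 rec/7 rec/8]
            ]
        ]
    ]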
sqlab 4-Nov-2005 [744] | OBX is the segment type. Segments are separated by #"^M". An OBX segment can have up to 24 fields according to version 2.4, and empty fields at the end of a segment need not be transferred. Fields are delimited by #"|" normally, but all delimiters except the segment delimiter can be defined per message. Fields can be divided by #"^^" into components, components can be divided into subcomponents, and so on. |
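Since sqlab points out that the non-segment delimiters are declared inside the message, a cautious sketch can read them from the MSH segment instead of hard-coding "|" and "^". This assumes a newline-separated sample file that starts with the usual MSH|^~\&| header; neither assumption comes from this thread:

    segs: read/lines %hl7data.txt            ; assumed file; real feeds end segments with #"^M"
    field-sep: copy/part skip segs/1 3 1     ; the character right after "MSH", normally "|"
    comp-sep: copy/part skip segs/1 4 1      ; the first encoding character, normally "^"
    foreach seg segs [
        fields: parse/all seg field-sep
        if fields/1 = "OBX" [
            ; split the observation-identifier field into its components
            probe parse/all fields/4 comp-sep
        ]
    ]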
Graham 4-Nov-2005 [745x5] | I'm working on a full rule based parse. |
sqlab, you've been doing this stuff for years. | |
Ok, my parser is able to get all the data out of all the records now in the test result above. | |
pipe: charset "|"
nonpipe: complement charset "|"
caret: charset "^^"
non-caret: complement caret
digits: charset [ #"0" - #"9" ]
labsupplier: hl7level: datetime: patient: labno: labno2: none

nte-rule: [ "NTE" pipe digits pipe 1 skip pipe copy notes to newline (append txt notes) ]
oru-rule: [ "ORU" pipe copy labno some digits pipe ]
datetime-rule: [ copy datetime some digits ]
msh-rule: [
    "MSH" pipe some nonpipe pipe copy labsupplier some nonpipe pipe some nonpipe pipe
    copy hl7level some nonpipe 2 skip datetime-rule 2 skip oru-rule thru newline
]
msa-rule: [ "MSA" pipe 3 skip copy labno2 some digits thru newline ]
pid-rule: [
    "PID" 2 skip some nonpipe pipe some digits pipe some nonpipe pipe
    copy patient some nonpipe 2 pipe copy dob some digits pipe
    copy gender [ #"F" | #"M" ] thru newline
]
obr-rule: [
    "OBR" pipe 1 digits pipe copy drcode some nonpipe pipe some nonpipe pipe
    copy panelcode some nonpipe pipe 1 skip pipe copy bleeddate some digits pipe
    copy reportdate some digits pipe 2 skip pipe 2 skip 5 pipe
    copy bleeddate some digits 2 pipe copy requestdr some nonpipe 3 pipe
    copy nzcouncilcode some nonpipe thru newline
]
txt: copy ""
cnt: 1
obx-rule: [
    "OBX" pipe copy cntr some digits (
        if cnt <> to-integer cntr [ print "halted as out of sequence" halt ]
        cnt: cnt + 1
    ) pipe [ st-rule | ft-rule ]
]
ft-rule: [
    "FT" pipe some non-caret any caret copy comm some non-caret any caret 3 skip
    copy comments some nonpipe (repend txt [ comm " " comments newline ]) thru newline
]
st-rule: [
    "ST" pipe some non-caret any caret copy testtype some non-caret any caret 3 skip
    copy testresult some nonpipe pipe [ pipe | copy units some nonpipe pipe ]
    copy range some nonpipe thru newline
    ( repend txt [ testtype " " testresult " " units " " range newline ] )
]
record-rule: [
    ( cnt: 1 txt: copy "" )
    msh-rule msa-rule pid-rule obr-rule obx-rule [ some obx-rule ] nte-rule
]
parse read %hl7data.txt record-rule
print [
    labsupplier hl7level datetime patient labno labno2 dob gender panelcode
    bleeddate reportdate requestdr nzcouncilcode newline txt
]
| |
That's my rough working parser. | |
sqlab 4-Nov-2005 [750] | I have seen just too many exceptions to the rules in real messages. So I just parse the message into an internal structure like
mssg: [
    MSH [ field1 field2 .. ]
    PID [ field1 field2 .. .. ]
    OBR [ .. .. ]
    ..
]
Then I can access the data either with mssg/OBX/3, for example, or use set. I use checking as an optional second step. |
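A rough sketch of the structure sqlab describes, with "|" hard-coded as the field delimiter and the %hl7data.txt file name used earlier; note that path access such as mssg/OBX finds only the first of the repeated OBX segments, so a real version would need to collect repeats differently:

    mssg: copy []
    foreach seg read/lines %hl7data.txt [
        if not empty? seg [
            fields: parse/all seg "|"
            ; store the segment name as a word, followed by the rest of its fields
            repend mssg [to-word fields/1 copy next fields]
        ]
    ]
    probe mssg/OBR        ; the fields of the OBR segment
    probe mssg/OBX/3      ; third field of the first OBX segment, as in sqlab's mssg/OBX/3 example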
Graham 4-Nov-2005 [751x2] | I used to parse HL7 messages differently ... splitting them into fields as well. But this time I thought I'd try a rule-based approach. |
I admit it is likely to be easier to do it your way. | |