World: r3wp


[Parse] Discussion of PARSE dialect

Graham 4-Nov-2005 [695x3]	the data is medical HL7 formatted data.
	It's being read in from a file.
	so, it shouldn't be escaped in the data itself.
BrianH 4-Nov-2005 [698]	So if you copy your test data to the clipboard, you would assign it to a variable like this: obx: read clipboard:// If you are reading it from a file with the read or open functions, there is no escaping.
Volker 4-Nov-2005 [699]	then it should be no problem. only when load sees it in strings. if you 'read it, no escaping is needed. and mold auto-escapes.
Graham 4-Nov-2005 [700]	Ok, I am testing on the command line. Maybe that is the problem.
BrianH 4-Nov-2005 [701x3]	Try reading from the clipboard directly like I suggested for that.
	Note: The form native doesn't escape on output like mold does.
	Oh wait, the error in his parse statement isn't because of escaping. It's the thru bitset that isn't supported. Try this: caret: charset "^^" non-caret: complement caret parse obx ["OBX" "|" digits "|" "ST" "|" any non-caret caret to end ]
Graham 4-Nov-2005 [704]	does that return true ?
Volker 4-Nov-2005 [705]	should work like thru. "any non-caret" is like "to caret", then skipping it is like "thru caret". without a caret it would go to the end, and then the final skip would fail.
Graham 4-Nov-2005 [706]	>> obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
	== "OBX|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
	>> caret: charset "^^"
	== make bitset! #{ 0000000000000000000000400000000000000000000000000000000000000000 }
	>> non-caret: complement caret
	== make bitset! #{ FFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF }
	>> parse obx [ "OBX" copy test some non-caret caret to end ]
	== false
	>> test
	== "|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
	>>
Volker 4-Nov-2005 [707x3]	Yes, so it works
	remember, the "^" in your string are not real "^" characters, else they would be molded as "^^"
	i would really get the data from outside. put them in a file and use obx: read %data.txt, or use clipboard.
Graham 4-Nov-2005 [710]	ok, I'll give that a go later on.
Volker 4-Nov-2005 [711x2]	obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/}
	(or copy from altme ;)
Graham 4-Nov-2005 [713]	obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
Volker 4-Nov-2005 [714]	Hu? Does altme escape too? there should be double "^^" everywhere.
Graham 4-Nov-2005 [715x2]	>> parse read clipboard:// [ copy test some non-caret caret to end ]
	== true
	>> test
	== "obx: {OBX|1|ST|Hb"
	ok so that works.
BrianH 4-Nov-2005 [717]	Of course since you are looking for one character, you can use to or thru like this: parse read clipboard:// [ copy test thru "^^" to end ]
Volker 4-Nov-2005 [718]	Meoww, that's too simple for me.. good catch :)
Graham 4-Nov-2005 [719x3]	well, if RT want to use Rebol in medical applications, they should look at making it easier to work with medical data !
	OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/
	Just using Altme as a clipboard.
BrianH 4-Nov-2005 [722]	Is there a standard we can read to get the syntax?
Graham 4-Nov-2005 [723]	HL7 - yeah, hundreds of pages of specs.
BrianH 4-Nov-2005 [724]	Aaak.
Graham 4-Nov-2005 [725]	http://www.hl7.org/
BrianH 4-Nov-2005 [726]	For instance, how many fields does that data you posted have? Are they separated by | or is it a length thing?
Graham 4-Nov-2005 [727]	I don't know .. I am just looking at sample data and trying to reverse engineer the format as I don't have time to read 100s of pages of specs.
BrianH 4-Nov-2005 [728]	What does the ^ mean in context?
Graham 4-Nov-2005 [729x2]	but each OBX record is one blood result
	It seems to be a delimiter to divide a record into parts
BrianH 4-Nov-2005 [731]	Really? By the format it looks like they are using | for that.
Graham 4-Nov-2005 [732x3]	so, | separates fields, and ^ subdivides a field
	OBR|1|3CHI|05-556701-MHA-0^VDL|MHA^MASTER HAEM PANEL^L|R|200511021006|200511021006|""|""|||||200511021006||10761^CHIU&G|||10761^CHIU&G|10761^CHIU&G|3CHI^chiu|200511021152|||F
	OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F
	OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F
	OBX|3|ST|mV^ MCV:^L||95|fL|81 - 97|N|||F
	OBX|4|ST|mh^ MCH:^L||32.3|pg|26.5 - 33.0|N|||F
	OBX|5|ST|pl^ Platelets:^L||224|x 109/L|150 - 450|N|||F
	OBX|6|ST|es^ ESR:^L||23|mm/hr|1 - 27|N|||F
	OBX|7|ST|wc^ WCC:^L||7.7|x 109/L|3.8 - 10.0|N|||F
	OBX|8|ST|Nt^ Neutrophils:^L||4.5|x109/L|1.9 - 7.1|N|||F
	OBX|9|ST|Ly^ Lymphocytes:^L||2.6|x109/L|0.6 - 3.6|N|||F
	OBX|10|ST|Mo^ Monocytes:^L||0.5|x109/L|0.2 - 1.0|N|||F
	OBX|11|ST|Eo^ Eosinophils:^L||0.1|x109/L|< 0.6|N|||F
	OBX|12|ST|Ba^ Basophils:^L||0.05|x10*9/L|0.00 - 0.10|N|||F
	OBX|13|FT|bf^Comments^L||COMMENT: RBC parameters normochromic normocytic.|||N|||F
	NTE|1|L|CC Drs: MALIK, CHIU.
	I've omitted the MSH MSA and PID lines which identify the patient.
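A minimal sketch of the split described here, written in Python rather than REBOL so it can be checked independently. It assumes "|" separates fields and "^" subdivides a field, as Graham infers from the sample data; the function name split_segment is made up for illustration and is not part of any HL7 library.

```python
# Sample OBX segment copied from the data posted in the chat.
segment = "OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F"


def split_segment(segment, field_sep="|", component_sep="^"):
    """Split one segment into fields, and each field into its components.

    Both separators are assumptions taken from the sample data.
    """
    return [field.split(component_sep) for field in segment.split(field_sep)]


fields = split_segment(segment)
print(fields[0])    # the record type
print(fields[3])    # the fourth field, subdivided on "^"
print(len(fields))  # the record type plus the remaining fields
```

Empty fields (two adjacent separators) come back as a single empty component, so field positions stay stable even when a value is missing.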
BrianH 4-Nov-2005 [735x2]	So the first field specifies the record format, the number of fields and such. The other fields are data.
	Thanks for that by the way, I'd rather not know.
Graham 4-Nov-2005 [737x2]	no, I think the numbers indicate a sequence
	so, there are 13 OBX records for the OBR result.
BrianH 4-Nov-2005 [739]	I mean OBX is the record type, and OBX records have 11 additional fields to them.
Graham 4-Nov-2005 [740x3]	Yes, I think so.
	I presume ST stands for sub test.
	The HL7 org want to move to using XML instead ...
BrianH 4-Nov-2005 [743]	Do you want to do a full rule-based parse here, or will simple parse do? data: read/lines %data foreach rec data [ rec: parse/all rec "|" switch rec/1 [ "OBX" [ ... do stuff...
sqlab 4-Nov-2005 [744]	OBX is the segment type. Segments are separated by #"^M". An OBX segment can have up to 24 fields according to version 2.4; empty fields at the end of a segment need not be transferred. Fields are normally delimited by #"|", but all delimiters except the segment delimiter can be defined for each message. Fields can be divided by #"^^" into components, components can be divided into subcomponents, etc.
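The structure sqlab describes can be sketched roughly as follows, again in Python rather than REBOL. This is only an illustration under stated assumptions: the delimiters are passed in as parameters because they can vary per message, the names parse_message and get_field are hypothetical, subcomponents are not handled, and the sample lines are taken from the data Graham posted.

```python
def parse_message(message, segment_sep="\r", field_sep="|", component_sep="^"):
    """Split a message into segments, fields and components.

    Delimiters are parameters because, per the chat, everything except the
    segment delimiter can be redefined for each message.
    """
    segments = []
    for raw in message.split(segment_sep):
        if not raw:
            continue  # ignore the empty piece after a trailing separator
        fields = raw.split(field_sep)
        segments.append({
            "type": fields[0],
            "fields": [f.split(component_sep) for f in fields[1:]],
        })
    return segments


def get_field(segment, index, default=""):
    """Return the 1-based field as a list of components.

    Trailing empty fields may be omitted by the sender, so a missing
    field is reported as a single default component.
    """
    fields = segment["fields"]
    if 1 <= index <= len(fields):
        return fields[index - 1]
    return [default]


# A few segments from the posted sample, joined with carriage returns.
message = "\r".join([
    "OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F",
    "OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F",
    "NTE|1|L|CC Drs: MALIK, CHIU.",
]) + "\r"

segments = parse_message(message)
for seg in segments:
    if seg["type"] == "OBX":
        name = get_field(seg, 3)[1].strip()
        value = get_field(seg, 5)[0]
        units = get_field(seg, 6)[0]
        print(name, value, units)
```

Dispatching on the segment type mirrors the switch in BrianH's simple-parse suggestion; a full rule-based parse would only be needed if the per-message delimiters or subcomponents have to be honoured.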