World: r3wp

[Parse] Discussion of PARSE dialect

Volker 4-Nov-2005 [690]	any non-caret caret ;?
Graham 4-Nov-2005 [691x2]	It's ^spaceH
Graham 4-Nov-2005 [691x2]	ok, parse enhancement proposal .. allow one to change the escape character.
BrianH 4-Nov-2005 [693]	If you are loading your data the way you do when you type it on the interpreter command line, carets are treated as escape characters. If you are getting it from the read command, they aren't.
Volker 4-Nov-2005 [694]	that you have to escape all the ^ to ^^ in strings is true too, but not the error-reason. the rule should fail, not error.
Graham 4-Nov-2005 [695x3]	the data is medical HL7 formatted data.
	It's being read in from a file.
	so, it shouldn't be escaped in the data itself.
BrianH 4-Nov-2005 [698]	So if you copy your test data to the clipboard, you would assign it to a variable like this: obx: read clipboard:// If you are reading it from a file with the read or open functions, there is no escaping.
Volker 4-Nov-2005 [699]	then it should be no problem. only when load sees it in strings. if you 'read it, no escaping is needed. and mold auto-escapes.
Graham 4-Nov-2005 [700]	Ok, I am testing on the command line. Maybe that is the problem.
BrianH 4-Nov-2005 [701x3]	Try reading from the clipboard directly like I suggested for that.
	Note: The form native doesn't escape on output like mold does.
	Oh wait, the error in his parse statement isn't because of escaping. It's the thru bitset that isn't supported. Try this: caret: charset "^^" non-caret: complement caret parse obx ["OBX" "|" digits "|" "ST" "|" any non-caret caret to end ]
Graham 4-Nov-2005 [704]	does that return true ?
Volker 4-Nov-2005 [705]	should work like thru. "any non-caret" is like "to caret", then skipping it is like "thru caret". without caret it would go to end, and then the final skip fail.
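For readers more at home with regular expressions, the equivalence Volker describes can be sketched outside REBOL. This is only a rough Python analogue, not the PARSE dialect itself; the helper name `thru_caret` and the sample string are made up for illustration.

```python
import re

# Rough regex analogue of the PARSE rule [any non-caret caret]:
# zero or more non-caret characters, then exactly one caret.
THRU_CARET = re.compile(r"[^\^]*\^")

def thru_caret(text):
    """Return the index just past the first caret, or None if there is no caret."""
    m = THRU_CARET.match(text)
    return m.end() if m else None

sample = "OBX|1|ST|Hb^ Hb:^L||135"
print(thru_caret(sample))      # position just past the first "^"
print(thru_caret("no caret"))  # None: the run reaches the end, then the caret match fails
```

As in the PARSE rule, the run of non-caret characters stops at the first caret, and an input without any caret makes the whole match fail rather than raise an error.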
Graham 4-Nov-2005 [706]	>> obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/} == "OBX|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/" >> caret: charset "^^" == make bitset! #{ 0000000000000000000000400000000000000000000000000000000000000000 } >> non-caret: complement caret == make bitset! #{ FFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF } >> parse obx [ "OBX" copy test some non-caret caret to end ] == false >> test == "|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/" >>
Volker 4-Nov-2005 [707x3]	Yes, so it works
	remember the "^" in your string are not real "^" characters, else they would be molded as "^^"
	i would really get the data from outside. put them in a file and use obx: read %data.txt, or use clipboard.
Graham 4-Nov-2005 [710]	ok, I'll give that a go later on.
Volker 4-Nov-2005 [711x2]	obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/}
Volker 4-Nov-2005 [711x2]	(or copy from altme ;)
Graham 4-Nov-2005 [713]	obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
Volker 4-Nov-2005 [714]	Hu? Does altme escape too? there should be double "^^" everywhere.
Graham 4-Nov-2005 [715x2]	>> parse read clipboard:// [ copy test some non-caret caret to end ] == true >> test == "obx: {OBX|1|ST|Hb"
Graham 4-Nov-2005 [715x2]	ok so that works.
BrianH 4-Nov-2005 [717]	Of course since you are looking for one character, you can use to or thru like this: parse read clipboard:// [ copy test thru "^^" to end ]
Volker 4-Nov-2005 [718]	Meoww, that's too simple for me.. good catch :)
Graham 4-Nov-2005 [719x3]	well, if RT want to use Rebol in medical applications, they should look at making it easier to work with medical data !
	OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/
	Just using Altme as a clipboard.
BrianH 4-Nov-2005 [722]	Is there a standard we can read to get the syntax?
Graham 4-Nov-2005 [723]	HL7 - yeah, hundreds of pages of specs.
BrianH 4-Nov-2005 [724]	Aaak.
Graham 4-Nov-2005 [725]	http://www.hl7.org/
BrianH 4-Nov-2005 [726]	For instance, how many fields does that data you posted have? Are they separated by | or is it a length thing?
Graham 4-Nov-2005 [727]	I don't know .. I am just looking at sample data and trying to reverse engineer the format as I don't have time to read 100s of pages of specs.
BrianH 4-Nov-2005 [728]	What does the ^ mean in context?
Graham 4-Nov-2005 [729x2]	but each OBX record is one blood result
Graham 4-Nov-2005 [729x2]	It seems to be a delimiter to divide a record into parts
BrianH 4-Nov-2005 [731]	Really? By the format it looks like they are using | for that.
Graham 4-Nov-2005 [732x3]	so, | separates fields, and ^ sub divides a field
	OBR|1|3CHI|05-556701-MHA-0^VDL|MHA^MASTER HAEM PANEL^L|R|200511021006|200511021006|""|""|||||200511021006||10761^CHIU&G|||10761^CHIU&G|10761^CHIU&G|3CHI^chiu|200511021152|||F OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F OBX|3|ST|mV^ MCV:^L||95|fL|81 - 97|N|||F OBX|4|ST|mh^ MCH:^L||32.3|pg|26.5 - 33.0|N|||F OBX|5|ST|pl^ Platelets:^L||224|x 109/L|150 - 450|N|||F OBX|6|ST|es^ ESR:^L||23|mm/hr|1 - 27|N|||F OBX|7|ST|wc^ WCC:^L||7.7|x 109/L|3.8 - 10.0|N|||F OBX|8|ST|Nt^ Neutrophils:^L||4.5|x109/L|1.9 - 7.1|N|||F OBX|9|ST|Ly^ Lymphocytes:^L||2.6|x109/L|0.6 - 3.6|N|||F OBX|10|ST|Mo^ Monocytes:^L||0.5|x109/L|0.2 - 1.0|N|||F OBX|11|ST|Eo^ Eosinophils:^L||0.1|x109/L|< 0.6|N|||F OBX|12|ST|Ba^ Basophils:^L||0.05|x10*9/L|0.00 - 0.10|N|||F OBX|13|FT|bf^Comments^L||COMMENT: RBC parameters normochromic normocytic.|||N|||F NTE|1|L|CC Drs: MALIK, CHIU.
	I've omitted the MSH MSA and PID lines which identify the patient.
BrianH 4-Nov-2005 [735x2]	So the first field specifies the record format, the number of fields and such. The other fields are data.
BrianH 4-Nov-2005 [735x2]	Thanks for that by the way, I'd rather not know.
Graham 4-Nov-2005 [737x2]	no, I think the numbers indicate a sequence
Graham 4-Nov-2005 [737x2]	so, there are 13 OBX records for the OBR result.
BrianH 4-Nov-2005 [739]	I mean OBX is the record type, and OBX records have 11 additional fields to them.
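The field layout worked out above (the first field is the record type, | separates fields, ^ subdivides a field) can be checked with a few lines of code. This is a minimal Python sketch based only on the sample line posted in this thread, not on the HL7 specification; the function name `split_record` and its parameters are hypothetical.

```python
def split_record(line, field_sep="|", component_sep="^"):
    """Split one record into (record_type, fields).

    Assumption taken from the sample data only: field_sep separates
    fields and component_sep subdivides a field into components.
    """
    fields = line.split(field_sep)
    record_type, rest = fields[0], fields[1:]
    # Every remaining field becomes a list of its components.
    return record_type, [field.split(component_sep) for field in rest]

obx = "OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F"
record_type, fields = split_record(obx)
print(record_type, len(fields))  # record type and count of additional fields
print(fields[2])                 # components of the "Hb^ Hb:^L" field
```

For the sample line this yields the record type OBX followed by 11 additional fields, which matches BrianH's count, and the third additional field splits into three components.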