World: r3wp

[Parse] Discussion of PARSE dialect

BrianH
4-Nov-2005
[689]
Because it looks like the carets in your data are registering as 
escape characters. That ^H is registering as a backspace character, 
not the sequence "^H".
Volker
4-Nov-2005
[690]
any non-caret caret  ;?
Graham
4-Nov-2005
[691x2]
It's ^spaceH
ok, parse enhancement proposal .. allow one to change the escape 
character.
BrianH
4-Nov-2005
[693]
If you are loading your data the way the interpreter does when you 
type it on the command line, carets are treated as escape characters. 
If you are getting it from the read command, they aren't.
Volker
4-Nov-2005
[694]
that you have to escape all the ^ to ^^ in strings is true too, but 
not the error-reason. the rule should fail, not error.
Graham
4-Nov-2005
[695x3]
the data is medical HL7 formatted data.
It's being read in from a file.
so, it shouldn't be escaped in the data itself.
BrianH
4-Nov-2005
[698]
So if you copy your test data to the clipboard, you would assign 
it to a variable like this:
obx: read clipboard://

If you are reading it from a file with the read or open functions, 
there is no escaping.
Volker
4-Nov-2005
[699]
then it should be no problem. only when load sees it in strings. 
if you 'read it, no escaping is needed. and mold auto-escapes.
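A minimal console-style sketch of the escaping described above, assuming REBOL 2 string-literal rules (the variable names are illustrative only):

```rebol
REBOL []

; In source text a caret starts an escape sequence, so the literal "^H"
; holds a single backspace character rather than a caret plus "H".
one-char: "^H"
print length? one-char     ; 1

; Doubling the caret in source yields one real caret in the string.
two-chars: "^^H"
print length? two-chars    ; 2

; mold writes the string back out in source form, so the caret is doubled
; again; print (like form) shows the raw characters.
print mold two-chars       ; "^^H"
print two-chars            ; ^H
```

Data obtained with read never passes through this literal syntax, which is why a file or the clipboard needs no doubling.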
Graham
4-Nov-2005
[700]
Ok, I am testing on the command line.  Maybe that is the problem.
BrianH
4-Nov-2005
[701x3]
Try reading from the clipboard directly like I suggested for that.
Note: The form native doesn't escape on output like mold does.
Oh wait, the error in his parse statement isn't because of escaping. 
It's the thru bitset that isn't supported. Try this:
caret: charset "^^"
non-caret: complement caret

parse obx ["OBX" "|" digits "|" "ST" "|" any non-caret caret to end]
Graham
4-Nov-2005
[704]
does that return true ?
Volker
4-Nov-2005
[705]
should work like thru.  "any non-caret" is like "to caret", then 
skipping it is like "thru caret". without a caret it would go to the 
end, and then the final caret match fails.
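The equivalence described here can be checked with a short sketch. The sample string is an assumption (carets are doubled because it is written as a source literal), and parse/all is used so that whitespace is not skipped:

```rebol
REBOL []

caret: charset "^^"
non-caret: complement caret

; Source literal: each ^^ becomes one real caret in the string.
sample: "OBX|1|ST|Hb^^ Hb:^^L||135"

; Bitset form: consume everything that is not a caret, then one caret.
print parse/all sample [any non-caret caret to end]           ; true

; String form: thru accepts a string, so this matches the same way.
print parse/all sample [thru "^^" to end]                     ; true

; No caret at all: any non-caret runs to the end, the caret match
; then fails, and parse returns false instead of raising an error.
print parse/all "OBX|1|ST|Hb" [any non-caret caret to end]    ; false
```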
Graham
4-Nov-2005
[706]
>> obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
== "OBX|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
>> caret: charset "^^"
== make bitset! #{
0000000000000000000000400000000000000000000000000000000000000000
}
>> non-caret: complement caret
== make bitset! #{
FFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
}
>> parse obx [ "OBX" copy test some non-caret caret to end ]
== false
>> test
== "|1|ST|Hb Hb:^L||135|g/L|120 - 155|N|||F^/"
>>
Volker
4-Nov-2005
[707x3]
Yes, so it works
remember, the "^" you typed are not real "^" characters in the string, 
else they would be molded as "^^"
i would really get the data from outside. put them in a file and 
use obx: read %data.txt, or use clipboard.
Graham
4-Nov-2005
[710]
ok, I'll give that a go later on.
Volker
4-Nov-2005
[711x2]
obx: {OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F^^/}
(or copy from altme ;)
Graham
4-Nov-2005
[713]
obx: {OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/}
Volker
4-Nov-2005
[714]
Huh? Does altme escape too? there should be double "^^" everywhere.
Graham
4-Nov-2005
[715x2]
>> parse read clipboard:// [ copy test some non-caret caret to end ]
== true
>> test
== "obx: {OBX|1|ST|Hb"
ok so that works.
BrianH
4-Nov-2005
[717]
Of course since you are looking for one character, you can use to 
or thru like this:
parse read clipboard:// [ copy test thru "^^" to end ]
Volker
4-Nov-2005
[718]
Meoww, that's too simple for me.. good catch :)
Graham
4-Nov-2005
[719x3]
well, if RT want to use Rebol in medical applications, they should 
look at making it easier to work with medical data !
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F^/
Just using Altme as a clipboard.
BrianH
4-Nov-2005
[722]
Is there a standard we can read to get the syntax?
Graham
4-Nov-2005
[723]
HL7 - yeah, hundreds of pages of specs.
BrianH
4-Nov-2005
[724]
Aaak.
Graham
4-Nov-2005
[725]
http://www.hl7.org/
BrianH
4-Nov-2005
[726]
For instance, how many fields does that data you posted have? Are 
they separated by | or is it a length thing?
Graham
4-Nov-2005
[727]
I don't know .. I am just looking at sample data and trying to reverse 
engineer the format as I don't have time to read 100s of pages of 
specs.
BrianH
4-Nov-2005
[728]
What does the ^ mean in context?
Graham
4-Nov-2005
[729x2]
but each OBX record is one blood result
It seems to be a delimiter to divide a record into parts
BrianH
4-Nov-2005
[731]
Really? By the format it looks like they are using | for that.
Graham
4-Nov-2005
[732x3]
so, | separates fields, and ^ subdivides a field
OBR|1|3CHI|05-556701-MHA-0^VDL|MHA^MASTER HAEM PANEL^L|R|200511021006|200511021006|""|""|||||200511021006||10761^CHIU&G|||10761^CHIU&G|10761^CHIU&G|3CHI^chiu|200511021152|||F
OBX|1|ST|Hb^ Hb:^L||135|g/L|120 - 155|N|||F
OBX|2|ST|pV^ PCV:^L||0.397||0.340 - 0.470|N|||F
OBX|3|ST|mV^ MCV:^L||95|fL|81 - 97|N|||F
OBX|4|ST|mh^ MCH:^L||32.3|pg|26.5 - 33.0|N|||F
OBX|5|ST|pl^ Platelets:^L||224|x 10*9/L|150 - 450|N|||F
OBX|6|ST|es^ ESR:^L||23|mm/hr|1 - 27|N|||F
OBX|7|ST|wc^ WCC:^L||7.7|x 10*9/L|3.8 - 10.0|N|||F
OBX|8|ST|Nt^ Neutrophils:^L||4.5|x10*9/L|1.9 - 7.1|N|||F
OBX|9|ST|Ly^ Lymphocytes:^L||2.6|x10*9/L|0.6 - 3.6|N|||F
OBX|10|ST|Mo^ Monocytes:^L||0.5|x10*9/L|0.2 - 1.0|N|||F
OBX|11|ST|Eo^ Eosinophils:^L||0.1|x10*9/L|< 0.6|N|||F
OBX|12|ST|Ba^ Basophils:^L||0.05|x10*9/L|0.00 - 0.10|N|||F

OBX|13|FT|bf^Comments^L||COMMENT: RBC parameters normochromic normocytic.|||N|||F
NTE|1|L|CC Drs: MALIK, CHIU.
I've omitted the MSH MSA and PID lines which identify the patient.
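As a rough sketch of the layout described above (| between fields, ^ between the parts of a field), a segment can be split in two passes. split-on is a hypothetical helper written for this illustration, not a REBOL built-in, and the sample is the first OBX line with its carets doubled for use as a source literal:

```rebol
REBOL []

split-on: func [
    "Hypothetical helper: split TEXT at each DELIM, keeping empty pieces"
    text [string!] delim [string!]
    /local parts start stop
][
    parts: copy []
    parse/all text [
        ; one piece per delimiter found
        any [start: to delim stop: delim (append parts copy/part start stop)]
        ; whatever follows the last delimiter
        start: to end stop: (append parts copy/part start stop)
    ]
    parts
]

segment: "OBX|1|ST|Hb^^ Hb:^^L||135|g/L|120 - 155|N|||F"

fields: split-on segment "|"
print length? fields                  ; 12
print pick fields 2                   ; 1
probe split-on pick fields 4 "^^"     ; ["Hb" " Hb:" "L"]
```

This only mirrors what the sample data suggests; the HL7 specification remains the authority on delimiters and escaping.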
BrianH
4-Nov-2005
[735x2]
So the first field specifies the record format, the number of fields 
and such. The other fields are data.
Thanks for that by the way, I'd rather not know.
Graham
4-Nov-2005
[737x2]
no, I think the numbers indicate a sequence
so, there are 13 OBX records for the OBR result.