World: r3wp

[Parse] Discussion of PARSE dialect

Graham
16-May-2009
[3686x2]
If I have a document with headings eg. a: b: .. z: and text optionally 
under each heading ... would it be possible to use parse to collect 
all the text from each heading if the headings are in any order and 
some headings with no text are optionally missing?
Each heading can only occur once in the document.
Maxim
16-May-2009
[3688]
sure
Graham
16-May-2009
[3689]
Ok, let me rephrase that .. sure it's possible, but I can imagine 
it would be quite complicated
Maxim
16-May-2009
[3690x2]
now was that a question of the "can you give me the solution" kind?
actually it can be done quite simply... depends on the headers themselves...
Graham
16-May-2009
[3692]
It's a little complicated because the headers can have spaces in 
them.
Maxim
16-May-2009
[3693x2]
spaces add no complication to the system, as long as the headers 
can be identified without doubt.
so the rule is :

headers start on new line, stop at first ":" 
all the rest is content?
Graham
16-May-2009
[3695]
now if you have a rule
copy text [ to "a:" | to "b:" .... ] 

but if b: occurs before a: in the text, then you will include a header 
in copied text
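
A minimal sketch of that pitfall, assuming a made-up two-header document in which b: comes before a: (the names text and chunk are chosen for illustration and are not from the thread):

```rebol
REBOL []

; hypothetical input: the "b:" section appears before the "a:" section
text: "b: two^/a: one"

; the first alternative, to "a:", succeeds on its own, so everything
; in front of "a:" is copied, including the "b:" header line
parse/all text [copy chunk [to "a:" | to "b:"] to end]

probe chunk
```

In REBOL 2 this should leave chunk holding the whole b: line rather than stopping at it, because the alternatives are tried in order and the second one is never reached.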
Maxim
16-May-2009
[3696]
forget to and thru... they are not proper parsing.
Graham
16-May-2009
[3697]
yes, headers start on a newline and terminate in ":"
Maxim
16-May-2009
[3698]
and there can be no ":" within the content?
Graham
16-May-2009
[3699x2]
No, there can be a ":" in the content
but you know what the headers are ... so that's not a big problem.
Maxim
16-May-2009
[3701x2]
ok, so they are explicit... then it's very easy.
can you give the names of some of the headers... or an example.... so 
far it looks like a really simple rule to me.
Graham
16-May-2009
[3703]
eg. "social history:"
Maxim
16-May-2009
[3704x2]
and you want the output in neat blocks I guess.
give me 1 minute
Graham
16-May-2009
[3706x3]
so I guess we can make masks for each possible header
^/social history:
or apply the rule recursively until it is false
Maxim
16-May-2009
[3709]
I can assume it starts at a header?
Graham
16-May-2009
[3710x2]
might be leading newlines
or white spaces
Maxim
16-May-2009
[3712]
ok, but no content or stray letters?
Graham
16-May-2009
[3713x2]
shouldn't be yet.
So, I am trying to create an object from a semi structured document 
where the object elements are in any order or missing.
Maxim
16-May-2009
[3715x3]
almost done...
ok, so we replace the spaces in the headers by "-"  and create an 
object out of all the code...
all the content... rather
Graham
16-May-2009
[3718]
I guess I can do it without using parse .. just replace all the headers 
with a mark, that allows me to split off all the sections, and then 
i can match the sections with all the section headers.
Maxim
16-May-2009
[3719]
I'm almost done... I like these little parse tests.. It keeps my 
mind sharp on using parse  ;-)
Graham
16-May-2009
[3720]
But I don't need parse!  :)
Steeve
16-May-2009
[3721]
are you asleep ? :-)
Maxim
16-May-2009
[3722]
it's working but it's skipping the first tag for some reason.
Graham
16-May-2009
[3723]
Huh?  just dozing ...
Maxim
16-May-2009
[3724x2]
aaahh there is no newline on the start of the text hehehe
Graham, obviously the simplest solution is to read/lines.
Graham
16-May-2009
[3726]
read/lines doesn't work on text in memory AFAIK
Maxim
16-May-2009
[3727]
and just see if the line starts with one of the headers.
Steeve
16-May-2009
[3728]
what does the content look like?
Can't you just post an example, Graham?
Maxim
16-May-2009
[3729]
parse text "^/"
Graham
16-May-2009
[3730x2]
CC:
Patient complains of sore throat.

HPI:
ONSET: Sudden, TIMING: Constant, DURATION: 3 days

INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position

CURRENT MEDICATIONS:
TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain"
cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm"

MEDICAL HISTORY:
Rheumatic heart disease, unspec. 391.9
Eczema, atopic dermatitis 691.8
dyslipidemia

ALLERGIES:
Penicillin - allergy: Allergy
Penicillin - allergy: Allergy
Penicillin - anaphylactic reaction
lovastatin - allergy: allergic
macrodantin - 1 po BID

SURGERIES:


HOSPITALIZATIONS:


FAMILY HISTORY:


SOCIAL HISTORY:


ROS:


VITALS:


EXAMINATION:
General: Appears non-toxic

HEENT: TONSILS hypertrophic, and erythematous. MOUTH buccal mucosa, 
moist. PHARYNX indurated, and angry. NOSE turbinates, with no obstruction.

Neck: NECK Supple, with no lymphadenopathy, thyromegaly, or masses.
CVS: HEART RRR s M
Chest: ANTERIOR LUNGS clear bilat


ASSESSMENTS:
391.9 Rheumatic heart disease, unspec.


TREATMENT:


PROCEDURES:


IMMUNIZATIONS:


IMAGING:


LABORATORY:


EDUCATION:
None.

REFERRALS:
Non contributory.

FOLLOWUP:


SUPERBILL:
That was sent to me today as an example
Steeve
16-May-2009
[3732]
Hmm...
Maxim
16-May-2009
[3733x3]
implementing the latter solution... this is easier
here you go  :-)


data: {CC:
Patient complains of sore throat.

HPI:
ONSET: Sudden, TIMING: Constant, DURATION: 3 days

INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position

CURRENT MEDICATIONS:
TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain"
cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm"

MEDICAL HISTORY:
Rheumatic heart disease, unspec. 391.9
Eczema, atopic dermatitis 691.8
dyslipidemia

ALLERGIES:
Penicillin - allergy: Allergy
Penicillin - allergy: Allergy
Penicillin - anaphylactic reaction
lovastatin - allergy: allergic
macrodantin - 1 po BID

SURGERIES:
}

data: parse/all data "^/"


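; known headers; ONSET and INTENSITY are listed too, so those lines
; become their own fields rather than content of HPI
header-lbl: ["CC" | "HPI" | "ONSET" | "INTENSITY" | "CURRENT MEDICATIONS"
	| "MEDICAL HISTORY" | "ALLERGIES" | "SURGERIES"]

spec: copy []
foreach line data [
	unless parse/all line [
		copy hdr header-lbl ":"
		here:
		(
			; header line: "CURRENT MEDICATIONS" becomes the set-word CURRENT-MEDICATIONS:
			append spec to-set-word replace/all hdr " " "-"
			; whatever follows the ":" on the same line starts the content
			append spec copy here
		)
		; consume the rest of the line, so a header with trailing text
		; still matches and is not appended a second time below
		to end
	][
		; not a header line: add it to the content of the current header
		if string? item: last spec [
			unless empty? item [append item newline]
			append item line
		]
	]
]

probe context spec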
ok for you?