World: r3wp

[Parse] Discussion of PARSE dialect

Maxim
16-May-2009
[3693x2]
spaces add no complication to the system, as long as the headers 
can be identified without doubt.
so the rule is :

headers start on new line, stop at first ":" 
all the rest is content?
Graham
16-May-2009
[3695]
now if you have a rule
copy text [ to "a:" | to "b:" .... ] 

but if b: occurs before a: in the text, then you will include a header 
in copied text
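
A minimal sketch of the pitfall described above, assuming REBOL 2 and a made-up two-line input (the header names "a:" and "b:" are only placeholders):

```rebol
REBOL [Title: "TO alternation pitfall (illustrative sketch)"]

; Hypothetical input in which "b:" appears before "a:"
text: "b: two^/a: one"

; TO "a:" is tried first and succeeds by scanning past "b:",
; so the alternative TO "b:" is never attempted
parse/all text [copy chunk [to "a:" | to "b:"] to end]

; the copied text begins with the "b:" header instead of stopping before it
probe chunk
```

The alternation picks the first alternative that matches at all, not the one that matches nearest to the current position.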
Maxim
16-May-2009
[3696]
forget TO and THRU... they are not proper parsing.
Graham
16-May-2009
[3697]
yes, headers start on a newline and terminate in ":"
Maxim
16-May-2009
[3698]
and there can be no ":" within the content?
Graham
16-May-2009
[3699x2]
No, there can be a ":" in the content
but you know what the headers are ... so that's not a big problem.
Maxim
16-May-2009
[3701x2]
ok, so they are explicit... then it's very easy.
can you give the names of some of the headers... or an example.... so 
far it looks like a really simple rule to me.
Graham
16-May-2009
[3703]
eg. "social history:"
Maxim
16-May-2009
[3704x2]
and you want the output in neat blocks I guess.
give me 1 minute
Graham
16-May-2009
[3706x3]
so I guess we can use masks for each possible header
^/social history:
or apply the rule recursively until it is false
Maxim
16-May-2009
[3709]
I can assume it starts at a header?
Graham
16-May-2009
[3710x2]
might be leading newlines
or white spaces
Maxim
16-May-2009
[3712]
ok, but no content or stray letters?
Graham
16-May-2009
[3713x2]
shouldn't be yet.
So, I am trying to create an object from a semi structured document 
where the object elements are in any order or missing.
Maxim
16-May-2009
[3715x3]
almost done...
ok, so we replace the spaces in the headers by "-"  and create an 
object out of all the code...
all the content... rather
Graham
16-May-2009
[3718]
I guess I can do it without using parse .. just replace all the headers 
with a mark that allows me to split off all the sections, and then 
I can match the sections with all the section headers.
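
A rough sketch of that mark-and-split idea, assuming REBOL 2. The header list, the sample text, and the mark character are all assumptions made for illustration; the mark must be a character that never occurs in the real document:

```rebol
REBOL [Title: "Mark-and-split sketch (illustrative)"]

headers: ["CC" "HPI" "ALLERGIES"]   ; hypothetical subset of the known headers
text: {CC:
Patient complains of sore throat.
HPI:
ONSET: Sudden
ALLERGIES:
Penicillin - anaphylactic reaction
}

mark: "^(01)"                        ; assumed never to appear in the text
marked: rejoin [newline text]        ; so the first header also follows a newline

; replace "newline + header + colon" with "mark + header + colon"
foreach h headers [
    replace/all marked rejoin [newline h ":"] rejoin [mark h ":"]
]

; split on the mark, then cut each section at its first ":"
result: copy []
foreach section parse/all marked mark [
    if pos: find section ":" [
        append result copy/part section pos
        append result trim/head/tail copy next pos
    ]
]
probe result
```

Because only known headers that follow a newline are marked, a ":" inside the content (such as "ONSET: Sudden") stays part of its section.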
Maxim
16-May-2009
[3719]
I'm almost done... I like these little parse tests.. It keeps my 
mind sharp on using parse  ;-)
Graham
16-May-2009
[3720]
But I don't need parse!  :)
Steeve
16-May-2009
[3721]
are you asleep ? :-)
Maxim
16-May-2009
[3722]
it's working but it's skipping the first tag for some reason.
Graham
16-May-2009
[3723]
Huh?  just dozing ...
Maxim
16-May-2009
[3724x2]
aaahh there is no newline at the start of the text hehehe
graham, obviously the simplest solution is to read/lines.
Graham
16-May-2009
[3726]
read/lines doesn't work on text in memory AFAIK
Maxim
16-May-2009
[3727]
and just see if the line starts with one of the headers.
Steeve
16-May-2009
[3728]
what does the content look like?
Can't you just post an example, Graham?
Maxim
16-May-2009
[3729]
parse text "^/"
Graham
16-May-2009
[3730x2]
CC:
Patient complains of sore throat.

HPI:
ONSET: Sudden, TIMING: Constant, DURATION: 3 days

INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position

CURRENT MEDICATIONS:
TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain"
cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm"

MEDICAL HISTORY:
Rheumatic heart disease, unspec. 391.9
Eczema, atopic dermatitis 691.8
dyslipidemia

ALLERGIES:
Penicillin - allergy: Allergy
Penicillin - allergy: Allergy
Penicillin - anaphylactic reaction
lovastatin - allergy: allergic
macrodantin - 1 po BID

SURGERIES:


HOSPITALIZATIONS:


FAMILY HISTORY:


SOCIAL HISTORY:


ROS:


VITALS:


EXAMINATION:
General: Appears non-toxic

HEENT: TONSILS hypertrophic, and erythematous. MOUTH buccal mucosa, 
moist. PHARYNX indurated, and angry. NOSE turbinates, with no obstuction.

Neck: NECK Supple, with no lymphadenopathy, thyromegaly, or masses.
CVS: HEART RRR s M
Chest: ANTERIOR LUNGS clear bilat


ASSESSMENTS:
391.9 Rheumatic heart disease, unspec.


TREATMENT:


PROCEDURES:


IMMUNIZATIONS:


IMAGING:


LABORATORY:


EDUCATION:
None.

REFERRALS:
Non contributory.

FOLLOWUP:


SUPERBILL:
That was sent to me today as an example
Steeve
16-May-2009
[3732]
Hmm...
Maxim
16-May-2009
[3733x3]
implementing the latter solution... this is easier
here you go  :-)


data: {CC:
Patient complains of sore throat.

HPI:
ONSET: Sudden, TIMING: Constant, DURATION: 3 days

INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position

CURRENT MEDICATIONS:
TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain"
cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm"

MEDICAL HISTORY:
Rheumatic heart disease, unspec. 391.9
Eczema, atopic dermatitis 691.8
dyslipidemia

ALLERGIES:
Penicillin - allergy: Allergy
Penicillin - allergy: Allergy
Penicillin - anaphylactic reaction
lovastatin - allergy: allergic
macrodantin - 1 po BID

SURGERIES:
}

data: parse/all data "^/"


header-lbl: ["CC" | "HPI" | "ONSET" | "INTENSITY" | "CURRENT MEDICATIONS"
	| "MEDICAL HISTORY" | "ALLERGIES" | "SURGERIES"]

spec: []
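foreach line data [
	unless parse/all line [
		copy hdr [header-lbl ":"]
		here:
		(
			; turn e.g. "CURRENT MEDICATIONS:" into the set-word current-medications:
			append spec to-set-word head remove back tail replace/all hdr " " "-"
			; keep whatever follows the header on the same line
			append spec copy/part here tail line
		)
		; consume the rest of the line so parse returns true for header lines;
		; otherwise the whole line would be appended a second time below
		to end
	][
		; not a header line: add it to the content of the current section
		if string? item: last spec [
			append item line
		]
	]
]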

probe context spec
ok for you?
Steeve
16-May-2009
[3736]
Assuming SRC: contains the source text, it seems to work too:

header-char: complement charset "^/:"
EOL2: rejoin [newline newline]
parse/all src [
	some [
		some [pos: #" " (change pos #"-") | header-char]
		#":" pos: newline (change/part pos " {" 1)
		[to EOL2 | to end] pos: (change pos "} ") skip skip
	]
]
probe construct to block! src
Graham
16-May-2009
[3737x2]
Yes ... but I'm going to have to study Steeve's
to see why it doesn't work yet
Steeve
16-May-2009
[3739]
it will not work if you have CRLF instead of newlines in the source.
Is that the case ?
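
If the source does contain CRLF line endings, one possible workaround is to normalise them before parsing. A minimal sketch, assuming REBOL 2 and a made-up sample string held in SRC:

```rebol
REBOL [Title: "Normalise CRLF before parsing (illustrative sketch)"]

; Hypothetical source text using CR LF line endings
src: "CC:^M^/Patient complains of sore throat.^M^/^M^/HPI:^M^/"

; turn every CR LF pair into a single newline, modifying SRC in place
replace/all src "^M^/" "^/"

probe src
```

After this step, two consecutive line breaks are two adjacent newline characters again, which is what a rule looking for a doubled newline expects.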
Graham
16-May-2009
[3740]
I just copied it from here.
Steeve
16-May-2009
[3741]
I mean for your source data, not for my code
Graham
16-May-2009
[3742]
that's what I meant .. I just copied the source data from here.