World: r3wp
[Parse] Discussion of PARSE dialect
Maxim 16-May-2009 [3701x2] | ok, so they are explicit... then it's very easy. |
can you give the name of some of the headers... or an example.... so far it looks like a really simple rule to me. | |
Graham 16-May-2009 [3703] | eg. "social history:" |
Maxim 16-May-2009 [3704x2] | and you want the output in neat blocks I guess. |
give me 1 minute | |
Graham 16-May-2009 [3706x3] | so I guess we can use masks for each possible header |
^/social history: | |
or apply the rule recursively until it is false | |
Maxim 16-May-2009 [3709] | I can assume it starts at a header? |
Graham 16-May-2009 [3710x2] | might be leading newlines |
or white spaces | |
Maxim 16-May-2009 [3712] | ok, but no content or stray letters? |
Graham 16-May-2009 [3713x2] | shouldn't be yet. |
So, I am trying to create an object from a semi structured document where the object elements are in any order or missing. | |
Maxim 16-May-2009 [3715x3] | almost done... |
ok, so we replace the spaces in the headers by "-" and create an object out of all the code... | |
all the content... rather | |
Graham 16-May-2009 [3718] | I guess I can do it without using parse .. just replace all the headers with a mark that allows me to split off all the sections, and then I can match the sections with all the section headers. |
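A minimal sketch of the mark-and-split idea described above, assuming REBOL 2. The header list, the sample text, and the "|" marker are all hypothetical; the marker only works if that character never occurs in the real document.

```rebol
REBOL []

; hypothetical headers and text, for illustration only
headers: ["SOCIAL HISTORY:" "ALLERGIES:"]
text: copy "ALLERGIES: Penicillin SOCIAL HISTORY: non-smoker"

; put a marker character in front of every known header
; (assumes "|" never appears in the document itself)
mark: "|"
foreach hdr headers [
    replace/all text hdr join mark hdr
]

; split on the marker; /all stops spaces from also acting as delimiters
sections: parse/all text mark

; match each section against the known headers and show its content
foreach section sections [
    foreach hdr headers [
        if find/match section hdr [
            print [mold hdr "->" mold trim copy skip section length? hdr]
        ]
    ]
]
```

Splitting with a string rule still goes through PARSE, but only as a delimiter splitter, so no grammar rules are needed. Headers that are a suffix of another header would need extra care with this approach.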
Maxim 16-May-2009 [3719] | I'm almost done... I like these little parse tests.. It keeps my mind sharp on using parse ;-) |
Graham 16-May-2009 [3720] | But I don't need parse! :) |
Steeve 16-May-2009 [3721] | are you asleep ? :-) |
Maxim 16-May-2009 [3722] | it's working but it's skipping the first tag for some reason. |
Graham 16-May-2009 [3723] | Huh? just dozing ... |
Maxim 16-May-2009 [3724x2] | aaahh there is no newline on the start of the text hehehe |
graham, obviously the simplest solution is to read/lines. | |
Graham 16-May-2009 [3726] | read/lines doesn't work on text in memory AFAIK |
Maxim 16-May-2009 [3727] | and just see if the line starts with one of the headers. |
Steeve 16-May-2009 [3728] | what's the content look like ? Can't you just post an example Graham ? |
Maxim 16-May-2009 [3729] | parse text "^/" |
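A short sketch of the one-liner above, assuming REBOL 2 and a hypothetical in-memory string `text`. When the rule is a string rather than a block, PARSE splits the input at the given delimiter characters, which gives a line list without going through a file; `/all` keeps spaces from also being treated as delimiters.

```rebol
REBOL []

; hypothetical sample, not the actual document from the thread
text: "CC:^/sore throat^/HPI:^/3 days"

; split the in-memory string at every newline
lines: parse/all text "^/"

foreach line lines [print mold line]
```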
Graham 16-May-2009 [3730x2] | CC: Patient complains of sore throat. HPI: ONSET: Sudden, TIMING: Constant, DURATION: 3 days INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position CURRENT MEDICATIONS: TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain" cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm" MEDICAL HISTORY: Rheumatic heart disease, unspec. 391.9 Eczema, atopic dermatitis 691.8 dyslipidemia ALLERGIES: Penicillin - allergy: Allergy Penicillin - allergy: Allergy Penicillin - anaphylactic reaction lovastatin - allergy: allergic macrodantin - 1 po BID SURGERIES: HOSPITALIZATIONS: FAMILY HISTORY: SOCIAL HISTORY: ROS: VITALS: EXAMINATION: General: Appears non-toxic HEENT: TONSILS hypertrophic, and erythematous. MOUTH buccal mucosa, moist. PHARYNX indurated, and angry. NOSE turbinates, with no obstuction. Neck: NECK Supple, with no lymphadenopathy, thyromegaly, or masses. CVS: HEART RRR s M Chest: ANTERIOR LUNGS clear bilat ASSESSMENTS: 391.9 Rheumatic heart disease, unspec. TREATMENT: PROCEDURES: IMMUNIZATIONS: IMAGING: LABORATORY: EDUCATION: None. REFERRALS: Non contributory. FOLLOWUP: SUPERBILL: |
That was sent to me today as an example | |
Steeve 16-May-2009 [3732] | Hmm... |
Maxim 16-May-2009 [3733x3] | implementing the latter solution... this is easier |
here you go :-)
data: {CC: Patient complains of sore throat. HPI: ONSET: Sudden, TIMING: Constant, DURATION: 3 days INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position CURRENT MEDICATIONS: TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain" cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm" MEDICAL HISTORY: Rheumatic heart disease, unspec. 391.9 Eczema, atopic dermatitis 691.8 dyslipidemia ALLERGIES: Penicillin - allergy: Allergy Penicillin - allergy: Allergy Penicillin - anaphylactic reaction lovastatin - allergy: allergic macrodantin - 1 po BID SURGERIES: }
data: parse/all data "^/"  ; split the text into lines
header-lbl: ["CC" | "HPI" | "ONSET" | "INTENSITY" | "CURRENT MEDICATIONS" | "MEDICAL HISTORY" | "ALLERGIES" | "SURGERIES"]
spec: []
foreach line data [
    unless parse/all line [
        copy hdr [header-lbl ":"] here: (
            ; "CURRENT MEDICATIONS:" becomes the set-word current-medications:
            append spec to-set-word head remove back tail replace/all hdr " " "-"
            ; the rest of the header line starts that section's text
            append spec copy/part here tail line
        )
        to end  ; let a header line match completely, so it is not appended again below
    ][
        ; not a header line: append it to the previous section's text
        if string? item: last spec [
            append item line
        ]
    ]
]
probe context spec | |
ok for you? | |
Steeve 16-May-2009 [3736] | Assuming SRC: contains the source text, it seems to work too:
header-char: complement charset "^/:"
EOL2: rejoin [newline newline]
parse/all src [
    some [
        some [pos: #" " (change pos #"-") | header-char]
        #":" pos: newline (change/part pos " {" 1)
        [to EOL2 | to end] pos: (change pos "} ")
        skip skip
    ]
]
probe construct to block! src |
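The last line of the code above is what turns the rewritten text into an object. A minimal sketch of that final step on its own, assuming REBOL 2 and a hypothetical, already rewritten string (set-words followed by brace-delimited strings, which is the shape the PARSE rule aims for):

```rebol
REBOL []

; hypothetical text in the shape the rule above produces:
; "header-name: {section content}" pairs
src: {social-history: {non-smoker} allergies: {Penicillin}}

; TO BLOCK! loads the string as REBOL values, and CONSTRUCT
; builds an object from the set-word/value pairs without evaluating them
obj: construct to block! src

print obj/allergies
```

Using CONSTRUCT rather than CONTEXT seems the safer choice here, since the section content comes from an external document and is never evaluated as code.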
Graham 16-May-2009 [3737x2] | Yes ... but I'm going to have to study Steeve's |
to see why it doesn't work yet | |
Steeve 16-May-2009 [3739] | it will not work if you have CRLF instead of newlines in the source. Is that the case ? |
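If the source did contain CRLF line endings, one possible way to handle it (an assumption, not something proposed in the thread) is to normalize them before parsing. A minimal sketch in REBOL 2, with a hypothetical `src` string:

```rebol
REBOL []

; hypothetical input with Windows-style line endings (^M is carriage return)
src: copy "CC:^M^/sore throat^M^/^M^/HPI:^M^/3 days"

; turn every CR LF pair into a single newline, in place
replace/all src crlf "^/"

probe src
```

After this step a rule that looks for `newline` and for two consecutive newlines sees the same layout it would see in LF-only text.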
Graham 16-May-2009 [3740] | I just copied it from here. |
Steeve 16-May-2009 [3741] | I mean for your source data, not for my code |
Graham 16-May-2009 [3742] | that's what I meant .. I just copied the source data from here. |
Steeve 16-May-2009 [3743x2] | ok, it works for me |
I'll retry | |
Graham 16-May-2009 [3745x3] | working now. |
Actually yours appears to be the better solution because you don't specify the headers | |
and just pick it up from the formatting of the text | |
Steeve 16-May-2009 [3748] | yep |
Graham 16-May-2009 [3749] | well, I'm impressed :) |
Steeve 16-May-2009 [3750] | you should not |