World: r3wp

[Parse] Discussion of PARSE dialect

Maxim 16-May-2009 [3693x2]	spaces add no complication to the system, as long as the headers can be identified without doubt.
Maxim 16-May-2009 [3693x2]	so the rule is : headers start on new line, stop at first ":" all the rest is content?
Graham 16-May-2009 [3695]	now if you have a rule copy text [ to "a:" | to "b:" .... ] but if b: occurs before a: in the text, then you will include a header in copied text
Maxim 16-May-2009 [3696]	forget to and thru... they are not proper parsing.
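[Editor's illustration of Graham's point about `to` with alternatives. This is a minimal sketch, not code from the conversation; the sample string and the word `chunk` are made up. Because alternatives are tried in order and `to "a:"` scans forward until it finds "a:", it succeeds even though "b:" comes first, so the second alternative is never tried.]

```rebol
REBOL []

; Hypothetical sample text: the "b:" header occurs before the "a:" header.
text: "b: second a: first"

; `to "a:"` is tried first and succeeds by scanning past "b:",
; so the alternative `to "b:"` is never attempted.
parse/all text [copy chunk [to "a:" | to "b:"] to end]

; The copied text runs up to "a:" and therefore contains the "b:" header.
probe chunk
```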
Graham 16-May-2009 [3697]	yes, headers start on a newline and terminate in ":"
Maxim 16-May-2009 [3698]	and there can be no ":" within the content?
Graham 16-May-2009 [3699x2]	No, there can be a ":" in the content
Graham 16-May-2009 [3699x2]	but you know what the headers are ... so that's not a big problem.
Maxim 16-May-2009 [3701x2]	ok, so they are explicit... then it's very easy.
Maxim 16-May-2009 [3701x2]	can you give the names of some of the headers... or an example.... so far it looks like a really simple rule to me.
Graham 16-May-2009 [3703]	eg. "social history:"
Maxim 16-May-2009 [3704x2]	and you want the output in neat blocks I guess.
Maxim 16-May-2009 [3704x2]	give me 1 minute
Graham 16-May-2009 [3706x3]	so I guess we can use masks for each possible header
	^/social history:
	or apply the rule recursively until it is false
Maxim 16-May-2009 [3709]	I can assume it starts at a header?
Graham 16-May-2009 [3710x2]	might be leading newlines
Graham 16-May-2009 [3710x2]	or white spaces
Maxim 16-May-2009 [3712]	ok, but no content or stray letters?
Graham 16-May-2009 [3713x2]	shouldn't be yet.
Graham 16-May-2009 [3713x2]	So, I am trying to create an object from a semi structured document where the object elements are in any order or missing.
Maxim 16-May-2009 [3715x3]	almost done...
	ok, so we replace the spaces in the headers by "-" and create an object out of all the code...
	all the content... rather
Graham 16-May-2009 [3718]	I guess I can do it without using parse .. just replace all the headers with a mark that allows me to split off all the sections, and then I can match the sections with all the section headers.
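[Editor's illustration of the mark-and-split idea Graham describes. This is a rough sketch under stated assumptions, not Graham's actual code: the header list, the sample text, the mark character "|" (assumed never to occur in the document) and the words `hdr`, `sections`, `rest` and `result` are all hypothetical. It uses the REBOL 2 string-splitting form of PARSE for the split step only.]

```rebol
REBOL []

; Hypothetical header list and sample text, for illustration only.
headers: ["CC:" "HPI:" "ALLERGIES:"]
text: copy "CC: sore throat HPI: sudden onset ALLERGIES: penicillin"

; A mark assumed never to occur in the document itself.
mark: "|"

; Step 1: put the mark in front of every known header.
foreach hdr headers [replace/all text hdr join mark hdr]

; Step 2: split the text into sections at the mark.
sections: parse/all text mark

; Step 3: pair each section with the header it starts with.
result: copy []
foreach section sections [
    foreach hdr headers [
        if rest: find/match section hdr [
            append result hdr
            append result trim copy rest
        ]
    ]
]

probe result
```

Missing headers simply produce no section, and the order of the headers in the text does not matter. A header whose text is the tail of another header would be marked twice, so the header list would need checking for that case.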
Maxim 16-May-2009 [3719]	I'm almost done... I like these little parse tests.. It keeps my mind sharp on using parse ;-)
Graham 16-May-2009 [3720]	But I don't need parse! :)
Steeve 16-May-2009 [3721]	are you asleep ? :-)
Maxim 16-May-2009 [3722]	it's working but it's skipping the first tag for some reason.
Graham 16-May-2009 [3723]	Huh? just dozing ...
Maxim 16-May-2009 [3724x2]	aaahh there is no newline on the start of the text hehehe
Maxim 16-May-2009 [3724x2]	graham, obviously the simplest solution is to read/lines.
Graham 16-May-2009 [3726]	read/lines doesn't work on text in memory AFAIK
Maxim 16-May-2009 [3727]	and just see if the line starts with one of the headers.
Steeve 16-May-2009 [3728]	what does the content look like? Can't you just post an example, Graham?
Maxim 16-May-2009 [3729]	parse text "^/"
Graham 16-May-2009 [3730x2]	CC: Patient complains of sore throat. HPI: ONSET: Sudden, TIMING: Constant, DURATION: 3 days INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position CURRENT MEDICATIONS: TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain" cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm" MEDICAL HISTORY: Rheumatic heart disease, unspec. 391.9 Eczema, atopic dermatitis 691.8 dyslipidemia ALLERGIES: Penicillin - allergy: Allergy Penicillin - allergy: Allergy Penicillin - anaphylactic reaction lovastatin - allergy: allergic macrodantin - 1 po BID SURGERIES: HOSPITALIZATIONS: FAMILY HISTORY: SOCIAL HISTORY: ROS: VITALS: EXAMINATION: General: Appears non-toxic HEENT: TONSILS hypertrophic, and erythematous. MOUTH buccal mucosa, moist. PHARYNX indurated, and angry. NOSE turbinates, with no obstuction. Neck: NECK Supple, with no lymphadenopathy, thyromegaly, or masses. CVS: HEART RRR s M Chest: ANTERIOR LUNGS clear bilat ASSESSMENTS: 391.9 Rheumatic heart disease, unspec. TREATMENT: PROCEDURES: IMMUNIZATIONS: IMAGING: LABORATORY: EDUCATION: None. REFERRALS: Non contributory. FOLLOWUP: SUPERBILL:
Graham 16-May-2009 [3730x2]	That was sent to me today as an example
Steeve 16-May-2009 [3732]	Hmm...
Maxim 16-May-2009 [3733x3]	implementing latter solution... this is easier
	here you go :-)
	data: {CC: Patient complains of sore throat. HPI: ONSET: Sudden, TIMING: Constant, DURATION: 3 days INTENSITY: Moderate, QUALITY: Burning, MODIFYING FACTORS: head position CURRENT MEDICATIONS: TYLENOL W/ CODEINE NO. 3 300MG;30MG 1-2 po q 4-6 hrs prn "pain" cyclobenzaprine Oral Tablet 10 MG 1 tab po TID prn "muscle spasm" MEDICAL HISTORY: Rheumatic heart disease, unspec. 391.9 Eczema, atopic dermatitis 691.8 dyslipidemia ALLERGIES: Penicillin - allergy: Allergy Penicillin - allergy: Allergy Penicillin - anaphylactic reaction lovastatin - allergy: allergic macrodantin - 1 po BID SURGERIES: }
	data: parse/all data "^/"
	header-lbl: ["CC" | "HPI" | "ONSET" | "INTENSITY" | "CURRENT MEDICATIONS" | "MEDICAL HISTORY" | "ALLERGIES" | "SURGERIES"]
	spec: []
	foreach line data [
	    unless parse/all line [
	        copy hdr [header-lbl ":"] here:
	        (
	            append spec to-set-word head remove back tail replace/all hdr " " "-"
	            append spec copy/part here tail line
	        )
	    ][
	        if string? item: last spec [
	            append item line
	        ]
	    ]
	]
	probe context spec
	ok for you?
Steeve 16-May-2009 [3736]	Assuming SRC: contains the source text, it seems to work too:
	header-char: complement charset "^/:"
	EOL2: rejoin [newline newline]
	parse/all src [
	    some [
	        some [pos: #" " (change pos #"-") | header-char]
	        #":" pos: newline (change/part pos " {" 1)
	        [to EOL2 | to end] pos: (change pos "} ")
	        skip skip
	    ]
	]
	probe construct to block! src
Graham 16-May-2009 [3737x2]	Yes ... but I'm going to have to study Steeve's
Graham 16-May-2009 [3737x2]	to see why it doesn't work yet
Steeve 16-May-2009 [3739]	it will not work if you have CRLF instead of newlines in the source. Is that the case ?
Graham 16-May-2009 [3740]	I just copied it from here.
Steeve 16-May-2009 [3741]	i mean for your source data, not for my code
Graham 16-May-2009 [3742]	that's what I meant .. I just copied the source data from here.
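[Editor's illustration of the CRLF concern Steeve raises. This is a minimal sketch, assuming the source text is held in a string called `src` as in Steeve's message; the sample content is made up. Rules that match `newline` or a double newline will not see a line end written as CR LF, so one hedge is to normalize the text before parsing it.]

```rebol
REBOL []

; Hypothetical sample with Windows-style CR LF line endings.
src: copy "CC:^M^/sore throat^M^/^M^/HPI:^M^/sudden onset"

; Replace every CR LF pair with a single newline (LF), in place.
replace/all src crlf "^/"

; No carriage return should remain, so FIND returns none here.
probe find src "^M"
```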