Bug in Parse result?
[1/6] from: brock:kalef:innovapost at: 7-Apr-2008 14:10
I'm using REBOL/View 1.3.2.3.1 5-Dec-2005 Core 2.6.3.
I have a text file that is tab delimited. I read/lines the data
file, which presents the data as shown below if I 'probe each line
to the console in a loop:
{7028482^-AVERY, BEVERLY^-86002437^-HILLVIEW}
{7005546^-AVERY, CONNIE^-86511102^-CRYSTAL CITY}
{7021917^-AVERY, MICHELLE^-86000868^-CATALINA}
{7008485^-AVERY, SHEILA^-86002437^-HILLVIEW}
The code snippets below have been in use in a monthly data scrub for
about a year.
reformat-name: func [name][ ;called by the foreach loop below
    if not none? name [
        replace name ", " tab ;this is the line that isn't being processed
        replace/all name {"} ""
    ]
    return name
]
foreach d data [
    parse/all d [
        thru {^-} begin: copy employee to {^-} ending:
        (
            either not none? employee [
                new-name: reformat-name employee ;call to the reformat-name function
                change/part begin new-name ending
                repend dest-data [d newline]
            ][
                repend excluded [d newline]
            ]
        )
        to end
    ]
]
Essentially, this code loops through each data record and changes the
employee names so that the first and last names are tab delimited instead
of comma delimited. However, it seems the line below is no longer working:
replace name ", " tab
Some investigation shows that a search for " " or #" " (a space
string or character) within the string returns 'none.
However, if I type replace "AVERY, BEVERLY" ", " tab at the
command line, the replace works just fine. So it seems there is a
problem with the string produced by the parse, in that a find for a
space in it fails.
Can anyone confirm this is a bug? I can provide the data file (approx.
200 KB) and the script I am using for anyone who wants to confirm
the behaviour in newer versions of View.
[2/6] from: petr:krenzelok:seznam:cz at: 7-Apr-2008 22:25
Hi Brock,
first, it seems to work here in the console. Second, I will try to show
you an alternative method. If I am correct, your process is the
following: skip thru the first tab, copy the name to employee, call a
special function which replaces the comma in the string with a tab, and
then compose the resulting newline-terminated string.
What about the following?
data: read/lines %my-file
result: copy []
foreach line data [
    blk: parse line "^-,"
    append result rejoin [blk/1 tab blk/2 tab blk/3 tab blk/4 tab blk/5]
]
write/lines %my-new-file result
Or what about a crazy one-liner? :-)
>> str: {7028482^-AVERY, BEVERLY^-86002437^-HILLVIEW}
>> replace/all replace trim/lines copy str "," "" " " tab
== {7028482^-AVERY^-BEVERLY^-86002437^-HILLVIEW}
-pekr-
[3/6] from: brock:kalef:innovapost at: 7-Apr-2008 16:44
Pekr,
That may be a viable solution. One thing I was trying to avoid was
replacing commas in the other text fields. I just did a search in
my data and don't see any, so this will likely work.
My big question is: why, all of a sudden, does this not work? I haven't
changed the code, and I don't believe I have installed a different
version of REBOL on my machine. So I am stumped. It worked before.
I will try again tonight after rebooting.
Thanks for your suggestion.
Brock
[4/6] from: tomc:cs:uoregon at: 7-Apr-2008 14:11
data: [
{7028482^-AVERY, BEVERLY^-86002437^-HILLVIEW}
{7005546^-AVERY, CONNIE^-86511102^-CRYSTAL CITY}
{7021917^-AVERY, MICHELLE^-86000868^-CATALINA}
{7008485^-AVERY, SHEILA^-86002437^-HILLVIEW}
]
alpha: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
rule: [
integer! "^-"
some alpha here: (change/part :here tab 2)
thru end
]
foreach d data [parse/all d rule probe d]
{7028482^-AVERY^-BEVERLY^-86002437^-HILLVIEW}
{7005546^-AVERY^-CONNIE^-86511102^-CRYSTAL CITY}
{7021917^-AVERY^-MICHELLE^-86000868^-CATALINA}
{7008485^-AVERY^-SHEILA^-86002437^-HILLVIEW}
--
... nice weather eh tomc-cs.uoregon.edu
[5/6] from: santilli::gabriele::gmail::com at: 8-Apr-2008 11:06
On Mon, Apr 7, 2008 at 8:10 PM, Brock Kalef <brock.kalef-innovapost.com> wrote:
> Doing some investigation shows that a search for " " or #" " (a space
> string or character) within the string returns 'none.
Maybe the space you see there is not a normal space? E.g. it could be
^(A0), a non-breaking space, assuming Latin-1 encoding.
Regards,
Gabriele.
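One way to check this hypothesis is to list the character codes of the suspect field. The following is a minimal sketch for REBOL 2, not code from the original script: the words nbsp, employee and codes are illustrative, and the sample value is built by hand with to char! 160 so that it contains a non-breaking space.

```rebol
REBOL []

nbsp: to char! 160                          ; ^(A0), non-breaking space in Latin-1
employee: rejoin ["AVERY," nbsp "BEVERLY"]  ; hypothetical sample value

probe find employee " "                     ; none: there is no code-32 space to find

; Collect the code of every character so that unexpected values stand out
codes: copy []
foreach ch employee [append codes to integer! ch]
probe codes

; A character that looks like a space but is not code 32 is the suspect
probe found? find employee nbsp
```

If a 160 shows up in the list where a 32 was expected, a search for ", " cannot match, even though the text looks identical on screen.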
[6/6] from: brock:kalef:innovapost at: 12-Jun-2008 10:23
Gabriele,
I just ran across this email again, and you nailed it. I did a
character-by-character comparison of the text and found that the space
characters were not the typical ASCII code 32. The new text file had
some code 160 characters representing spaces, so my simple find/replace
was effectively broken.
Thanks everyone for your suggestions.
Brock
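Given that diagnosis, one possible fix is to normalize the code 160 characters to ordinary spaces before the comma replacement. This is only a sketch under that assumption, for REBOL 2: it is a hypothetical variant of the reformat-name function from the first message, and the word nbsp is introduced here for illustration.

```rebol
REBOL []

nbsp: to char! 160                      ; the character code found in the new data file

; Hypothetical variant of reformat-name that normalizes spaces first
reformat-name: func [name][
    if not none? name [
        replace/all name nbsp " "       ; turn code 160 into an ordinary space
        replace name ", " tab           ; now matches again
        replace/all name {"} ""
    ]
    name
]

; Sample value built by hand with a non-breaking space after the comma
probe reformat-name rejoin ["AVERY," nbsp "BEVERLY"]
```

With the sample value, the result should be AVERY and BEVERLY separated by a single tab. Normalizing inside the function keeps the parse loop unchanged; cleaning the whole file once after read/lines would work as well.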