r3wp [groups: 83 posts: 189283]

World: r3wp

[Rebol School] Rebol School

Volker
25-Jun-2007
[452]
altme: clicking the pen.
PatrickP61
25-Jun-2007
[453]
Ahhh so much to learn and not enough time!!!  Thanks for your patience
Ok, on to another issue.


I have a text file as a printable report that contains several pages 
within it.  Each line can be as large as 132 columns wide (or less).

- The literal  "    Page " will begin in column 115 and that indicates 
the start of a printed page.


I want to write a script that will read this text file one-page-at-a-time, 
so I can do some processing on the page.


How do I write a script to load in a single "page"?  I am guessing 
that I need to open a PORT and have rebol read all the lines until 
I get "....Page." in byte position 115.

Any suggestions?
Rebolek
25-Jun-2007
[454]
Volker: OK, variables yes.
Volker
25-Jun-2007
[455x2]
you can use read/lines to have all lines in a block
(except if it's megabytes^^)
PatrickP61
25-Jun-2007
[457]
I need to load in a single page at a time, then "process" that page 
before going on to the next page and processing it.


Are you suggesting that I go ahead and read in all the lines of the 
report and then go through that block to identify a page?
Volker
25-Jun-2007
[458x2]
if parse/all line [114 skip "    Page " to end] ["it's a new page"]
(not tested)
;..
PatrickP61
25-Jun-2007
[460]
I will check out PARSE and try some examples later.

Thanks for your help.  Will be back later
Geomol
25-Jun-2007
[461x2]
Or you can do something like:

fp: open/lines %file.txt
until [
	line: first fp
	if find skip line 115 "Page" [print "new page"]
	tail? fp: next fp
]
close fp
I think, that'll fail, if the file is empty!
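The page-at-a-time idea from this exchange could be sketched roughly as below. This is only a sketch under assumptions: the helper names page-start? and split-pages are made up for illustration, the file name in the usage comment is a placeholder, and the whole file is loaded with read/lines, so it is not suited to very large reports. A line is treated as a page start when the literal "    Page " begins in column 115, as Patrick described.

```rebol
REBOL [Title: "Split a report into pages (sketch)"]

; Assumption: a line starts a new page when the literal "    Page "
; begins in column 115, i.e. right after the first 114 characters.
page-start?: func [line [string!]] [
    found? find/match skip line 114 "    Page "
]

; Group a block of lines into a block of pages.
; Each page is itself a block of lines.
split-pages: func [lines [block!] /local pages page] [
    pages: copy []
    page: none
    foreach line lines [
        ; Open a new page on a page-start line, or for any lines
        ; that come before the first page-start line.
        if any [none? page  page-start? line] [
            page: copy []
            append/only pages page
        ]
        append page line
    ]
    pages
]

; Hypothetical usage (%report.txt is a placeholder name):
; foreach page split-pages read/lines %report.txt [
;     print ["page with" length? page "lines"]
; ]
```

An empty file simply yields an empty block of pages, so the loop in the usage comment runs zero times.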
PatrickP61
26-Jun-2007
[463]
Hi everyone,


I want to write out a Ruler line to a text file for a specified length 
of bytes similar to the following for 125 bytes in length:

----+---10----+---20----+---30----+---40----+---50----+---60----+---70----+---80----+---90----+--100----+--110----+--120----+

I tried the following code, but it is not what I want:

Ruler: for Count 10 125 10 [ prin "----+---" Count ]

I got this instead:

----+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+---Ruler: 120

Any suggestions?
Volker
26-Jun-2007
[464]
make the string without numbers, then put the numbers in with 'at 
and 'change/part ruler num length? num.
Geomol
26-Jun-2007
[465x4]
str: "----+-----"

for Count 10 125 10 [prin join copy/part str 10 - length? to-string Count Count]
It's hard to read, sorry, but it's short. :-)
My version produce a lot of copies of the string. Volker's suggestion 
is better, because it don't have these copies, so doesn't disturb 
the garbage collector too much.
doesn't
PatrickP61
26-Jun-2007
[469]
doesn't what?
Geomol
26-Jun-2007
[470]
I corrected myself.
because it *doesn't* have these copies ...
PatrickP61
26-Jun-2007
[471]
Volker -- I'm a newbie so bear with me,  I don't understand your 
suggestion.  Do you mean I should do this:
Ruler: for Count 10 125 10 [ prin "----+-----"    then what?
Geomol
26-Jun-2007
[472]
You could also go for a combination with one little string, that 
you change (by putting in the number) and print.
PatrickP61
26-Jun-2007
[473]
I'll see what I do with it ...
Geomol
26-Jun-2007
[474x3]
I think, Volker meant, you should make one large ruler of 125 chars.
This is a way without copies:
str: "----+-----"

for Count 10 125 10 [change skip str either Count < 100 [8][7] Count prin str]
Do you follow the code?
PatrickP61
26-Jun-2007
[477]
Hi Geomol,  I just signed on.  Will try out the code later -- many 
thanks
Gabriele
27-Jun-2007
[478x2]
prin ["---" count]    not     prin "---" count
prin will insert a space though, so you may want to do   prin join 
"---" count   instead.
PatrickP61
27-Jun-2007
[480]
Hi All,  
Have any Rebolers dealt with UniCode files?


Here is my situation.  I work on an IBM AS400 that can "port" files 
over to the PC.  Notebook opens it up just fine, but Rebol doesn't 
see it the same way.  If I Cut & Paste the contents of the file into 
an empty notebook and save it, Rebol can see it just fine.  Upon 
further study, I noticed at the bottom of the SAVE AS window that 
Encoding was set to UNICODE for the AS400 file, while the cut & paste 
one was set to ANSI.  


Does Rebol want ANSI text files only, or can it read UNICODE files 
too?
Geomol
27-Jun-2007
[481]
I guess you have to convert it. I once built a RebXML format that 
can be transferred to/from XML. It can handle utf-8. You can 
find code to convert from utf-8 here: http://home.tiscali.dk/john.niclasen/rebxml/xml2rebxml.r
(search for unicode)

The other way can be found here: http://home.tiscali.dk/john.niclasen/rebxml/rebxml2xml.r
(search for iso2utf-8)
PatrickP61
27-Jun-2007
[482]
Thanks Geomol,  Since I am a newbie, I can easily resave the files 
as ANSI instead of UNICODE and avoid the conversion problem, at least 
in the short term.  Once I get my "Convert to Table" program working, 
then I can look at your links to convert from UNICODE.
Gregg
27-Jun-2007
[483x2]
rejoin extract my-unicode-string 2
Obviously simplistic, just throwing away the extra byte for each 
char.
PatrickP61
27-Jun-2007
[485]
Hi Gregg -- So should I do something like this:  InText: rejoin extract 
Read InFile 2
Gregg
27-Jun-2007
[486]
Try it in the console and see what you get. The console is your friend. 
:-)
PatrickP61
27-Jun-2007
[487]
It works!!!!  Code to convert UNICODE to ANSI:

InFile:	%"Test In unicode.txt"
InText: rejoin extract Read InFile 2
write OutFile InText
Geomol
27-Jun-2007
[488]
I'm not too much into unicode. Is that utf-16, where every char is 
2 bytes? I think my scripts can only handle utf-8.
PatrickP61
27-Jun-2007
[489x2]
When you try to save a document under Notebook, the encoding choices 
are UTF-8, UNICODE, ANSI among others.  UNICODE may be the same as 
UTF-16 because it does look like every single character is saved 
as two bytes.  


The code (rejoin extract read InFile 2) does eliminate the double 
characters, but I noticed that the entire file is still double spaced 
-- as if the newline is coded twice and not removed by the rejoin. 
But that extra newline may be more an annoyance than anything else.
Hello my teachers.  Is there a more elegant way to create a ruler 
than this in rebol...

Str7: Str8: ""
Ruler: rejoin [
	for Count  10  90 10 [ Str8: rejoin [ Str8 "....+..." Count ] ]
	for Count 100 250 10 [ Str7: rejoin [ Str7 "....+.."  Count ] ]
]
print Ruler
Gregg
28-Jun-2007
[491x3]
I don't know about more elegant, but here's a func, just for fun.
make-ruler: func [count /local res str-ct offset] [
    res: head insert/dup copy "" "....+....+" count
    repeat ct count [
        str-ct: form ct * 10
        offset: subtract length? str-ct 1
        change at res ct * 10 - offset str-ct
    ]
    res
]
To match your ruler, do:  make-ruler 25
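For the earlier request (a ruler of a given length in bytes, such as 125, using the dash style), a variant along the same lines might look like this. It is a sketch only: the name ruler-of is invented for illustration, and it follows Volker's suggestion of building the plain string first and then changing the numbers in place.

```rebol
REBOL [Title: "Ruler of a given length (sketch)"]

; ruler-of is a hypothetical helper name, not part of REBOL.
ruler-of: func [len [integer!] /local res n label] [
    res: copy ""
    ; Lay down ten-character segments until there are at least len chars,
    ; then trim the string to exactly len characters.
    while [len > length? res] [append res "----+----+"]
    clear skip res len
    ; Overwrite the end of each full segment with its column number,
    ; so the last digit of the number sits in that column.
    n: 10
    while [n <= len] [
        label: form n
        change skip res n - length? label label
        n: n + 10
    ]
    res
]

; Example: print ruler-of 125
```

Only one string is built and modified, so no intermediate copies are created per segment.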
PhilB
28-Jun-2007
[494]
Patrick ... on your AS400 problem .... how is the data transferred 
to the PC?  Is it directly from an AS400 file via the data transfer 
utility built into, or is it a file from the IFS ?

(I have used Rebol to read data transferred from an AS400 and didn't 
get the data as unicode.)
PatrickP61
28-Jun-2007
[495]
Hi PhilB -- The formatted text report is generated on the AS400 into 
the work spool area.  I can then use the INavigator software on the 
PC to connect to it and drag and drop it onto the PC, where I can look 
at it via Word or Notebook.  I'm not sure where the encoding to UniCode 
is happening -- I suspect the INavigator software.  But then, it may 
not be an issue, since Rebol can convert it to readable text.  Even 
with the extra newline between each line, I'm sure that annoyance 
can be overcome too.
Anton
28-Jun-2007
[496x3]
Patrick, on the double newlines. Can you inspect the result of   
read InFile ? How many newlines are present at that point ?
Useful rebol words:
	NEWLINE	; this is the newline character that rebol uses
	CR	; carriage return character
	LF	; linefeed character
	CRLF	; both CR and LF in a string
There is READ and READ/BINARY

READ is text mode and translates line terminators automatically from 
the target system into rebol's format, which is the same as unix 
(using LF).
I don't think EXTRACT is at fault, it does a very simple job, getting 
every second character.
PatrickP61
28-Jun-2007
[499x3]
Hi Anton -- This is my simulated input for a unicode text file:
	Line1...10....+...20....+...30....+...40....+...50
	Line2...10....+...20....+...30....+...40....+...50
If I run this code:
	InFile:		%"Small In unicode.txt"
	InText:		rejoin extract read InFile 2	; Convert from UNICODE to ANSI, but keeps double spacing.
	OutFile:	%"Test Out.txt"
	write OutFile InText
	print InText
I get these results
	ÿLine1...10....+...20....+...30....+...40....+...50
	
	Line2...10....+...20....+...30....+...40....+...50
	

I get them in the output file when I use the Rebol editor, and in 
notebook (when I open the file) and I get them in console when PRINT 
InText.
Notice the spanish y at the beginning of the output
At first, I thought it might just be some stray bytes coming from the 
AS400, but I was able to re-create a file using Notebook and get the 
same results.
Any of you should be able to test this out by:
1.  Open Notebook
2.  Type in some text
3.  Save the file with Encoding to UNICODE
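
A possible explanation and workaround, offered as a sketch rather than a confirmed diagnosis: a file saved as "UNICODE" this way is most likely UTF-16 little-endian, which starts with the two-byte byte-order mark FF FE. Byte FF displays as ÿ in a single-byte character set, which would account for the stray character. The double spacing plausibly comes from reading in text mode, where CR and LF are separated by a zero byte and so are not recognised as one line terminator. The function below (utf16le-to-text is a made-up name) works on READ/BINARY data instead and assumes the text contains only characters that fit in one byte.

```rebol
REBOL [Title: "UTF-16LE to single-byte text (sketch)"]

; Assumption: data is UTF-16 little-endian and every character fits
; in one byte, so the high byte of each pair is zero and can be dropped.
utf16le-to-text: func [data [binary!] /local text] [
    ; Skip the 2-byte byte-order mark FF FE, if present.
    if find/match data #{FFFE} [data: skip data 2]
    text: make string! length? data
    ; Keep the first (low) byte of every 2-byte pair.
    while [not tail? data] [
        append text to char! first data
        data: skip data 2
    ]
    ; Turn CR LF pairs into REBOL's single newline character.
    replace/all text crlf newline
    text
]

; Hypothetical usage with the file names from the messages above:
; write %"Test Out.txt" utf16le-to-text read/binary %"Small In unicode.txt"
```

Reading with read/binary keeps the raw bytes intact, so the line terminators can be handled after the zero bytes have been removed.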