World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

mhinson
17-Apr-2009
[1757]
Perhaps I should go back to trying to form a program specification 
& see if the advice I get in that context is different. 
If I have
print "hello world"

that seems to follow syntax rules shown by "source print"   are you 
saying because I could have
>> hi: [print "hello world"]
== [print "hello world"]
>> do hi
hello world

I have started using a dialect?
Henrik
17-Apr-2009
[1758x2]
No, when using DO, it will not be a dialect, just normal REBOL code.


Before you do anything with it, the block is just a chunk of data. 
A dialect involves some kind of processor that you write or exists 
in REBOL already, which you then apply to the chunk of data, but 
is not the base scanner (the main language parser).
An example where DO wouldn't work would be VID (the graphics user 
interface system, Visual Interface Dialect)

do [button "Hello world!"] ; gives an error


layout [button "Hello world!"] ; returns a meaningful result, because
                               ; the block was parsed as a dialect.
mhinson
17-Apr-2009
[1760x2]
Sorry, I am not getting it at all.
What you seem to be showing me looks like what I would call functions 
or procedures.  I can't understand the distinction yet.
sqlab
17-Apr-2009
[1762x2]
To make it short:
It is like a new language inside your programming language.

It can be just an add-on that enhances your normal language, 
or a totally different language.
So parse has its own language, which sometimes resembles REBOL and 
sometimes is different.
And you can make your own languages or dialects too.
mhinson
17-Apr-2009
[1764x2]
It sounds like a very flexible concept, but likely to add complexity.

parse seems to be well documented in terms of how a string can be 
split apart in this manner
>> probe parse {Hello world} none
["Hello" "world"]

but much less documented when trying to do complex stuff...  I was 
in my ignorance expecting it to follow some sorts of syntax rules 
that I could read about.  Have I missed a basic concept?
I am only thinking about finding & extracting data here, not parsing 
for commas or html tags etc.
sqlab
17-Apr-2009
[1766]
http://www.rebol.com/docs/core23/rebolcore-15.html
Henrik
17-Apr-2009
[1767]
I know the following sounds basic, but it's _crucial_ to understanding 
how REBOL works, otherwise you will not "get" REBOL:


You must understand the concept that data is code and code is data. 
That means that anything you write, can be considered pure data. 
It's what you do with that data that becomes important.


It's like speaking a sentence, but only paying attention to the placement 
of letters. That makes the sentence pure data. If you then pay attention 
to the words objectively, they can form a sentence, so you can validate 
its syntax. If you use the sentence in a context, you can apply meaning 
to it, subjectively. If you switch the context, the sentence can 
mean something entirely different.

This is very important. Context and meaning.

For REBOL:

[I have a cat]


This is a block with 4 words. It's pure data that can be stored in 
memory, but at that level it doesn't make any sense to REBOL.


If you then apply a function to that data, you can process it. DO 
processes that data as REBOL code. It will be evaluated as REBOL 
code. Here it will produce an error, because it's not valid REBOL 
code. If you produce your own dialect, for example with PARSE, you 
can make that block make sense.


When typing in the console, REBOL evaluates it as normal REBOL code 
by using DO internally. That means:

>> now
== 17-Apr-2009/18:13:14+2:00

is the same as:

do [now]

But this block:

[now]

is just pure innocent data with no meaning.
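
A minimal sketch of the point above, assuming REBOL 2 behaviour. The rule 
and the word "animal" are made up for illustration and are not part of 
any existing dialect:

```rebol
sentence: [I have a cat]          ; pure data: a block of four words

; DO treats the block as REBOL code and fails, because the word I has no value
print error? try [do sentence]    ; true

; a tiny made-up dialect: PARSE gives the same block a meaning
rule: ['I 'have 'a set animal word!]
print parse sentence rule         ; true
print animal                      ; cat
```

The block never changes; only the processor applied to it (DO or PARSE 
with a rule) decides what it means.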
mhinson
17-Apr-2009
[1768]
you are all very kind to spend so much time helping me with this.
Brock
17-Apr-2009
[1769x3]
7.4 Marking Input:

  in the link provided by sqlab explains the use of set-words in the 
  parse dialect.  I needed to use this technique to strip out large 
  comments from web logs that I was parsing and passing into a database. 
   I was able to remove the large comment and replace it with a default 
  string indicating "comments have been removed"
One of the methods I used to parse lines of data was as follows:
lines: read/lines %file.txt
foreach line lines [
	parse line [parse rule here]
]
This way you are only dealing with a small amount of data for each 
parse and might make it easier to visualize for you.
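
A small sketch of that line-by-line approach, assuming REBOL 2. The 
sample block stands in for the result of read/lines, and the rule is 
only an illustration, not the one Brock used:

```rebol
; stand-in for: lines: read/lines %file.txt
lines: ["hostname pig" "interface Null0" " description test1"]

found: copy []
foreach line lines [
    ; each PARSE call sees one short string, so the rule stays small
    parse/all line ["interface " copy name to end (append found name)]
]
probe found    ; ["Null0"]
```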
Pekr
17-Apr-2009
[1772]
mhinson: now my explanations to some of your questions, as I think 
not everything was explained to you properly:


1) parse/all - /all refinement means, that string is parsed "as-is", 
because without the /all, white-space is skipped:

>> parse "this white dog" ["this" "white" "dog"]
== true
>> parse/all "this white dog" ["this" "white" "dog"]
== false
>> parse/all "this white dog" ["this" " " "white" " " "dog"]
== true

I prefer to always use /all refinement for string parsing ...


2) i don't understand why there is | before "some", that code will 
not work imo ...


3) "ifa:" is a marker. Think about parse in the following terms ... you 
have your data, here a string. Parse is the matching engine, which 
tries to match your input string against the given rules. Inside the 
parse dialect you have no means of manipulating the input 
string, except COPY. So markers are usually used when you want 
to mark some position, then do something in parens, and then get 
back to that position, or simply mark start: .... then somewhere later 
end: and in the paren (copy/part start end) to copy the text between 
the two marked positions ...


4) "skips till one of the OR conditions are met" - very well understood 
...


5) Here's a slight modification for the append/only stuff. Type "help append" 
in the console. /only appends a block value as a block. You will understand 
that once you need such behaviour; so far it can look kind 
of academic to you :-) I put parens there to make it more obvious 
which parameters are consumed by which function ....

>> wanted: copy []
== []

>> append  (append wanted (copy/part "12345" 3))  interf:  copy ["abc"]
== ["123" "abc"]
>> wanted: copy []
== []

>> append/only  (append wanted (copy/part "12345" 3))  interf:  copy ["abc"]
== ["123" ["abc"]]
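
The marker technique from point 3 could look roughly like this, assuming 
REBOL 2. The words "start", "finish" and "word" are illustrative names 
("finish" is used instead of "end:" only to avoid confusion with the END 
keyword of the parse dialect):

```rebol
text: "ab hello cd"
parse/all text [
    thru " " start:      ; remember the position just after the first space
    to " " finish:       ; remember the position of the next space
    (word: copy/part start finish)
    to end
]
probe word    ; "hello"
```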
mhinson
17-Apr-2009
[1773]
Thanks very much again for so much help, I am very grateful for 
the time you have spent helping me with this.

A bit of a light is beginning to come on.. so outside of the parse 
dialect we have this syntax
result: copy "hello"
but inside parse we have a different syntax for copy 


Once I realised that I felt much less confused & set about experimenting 
with to & thru in the context of copy within parse.

Perhaps these results will be of interest to other noobs, although 
I must say actually typing them in helped me appreciate what was 
happening.

  parse {ab hello cd} [copy result "a" "o"]   ;; returns "a"
  parse {ab hello cd} [copy result to "a" "o"]   ;; returns none
  parse {ab hello cd} [copy result "a" to "o"]   ;; returns "a"
  parse {ab hello cd} [copy result to "a" to "o"]   ;; returns none
  parse {ab hello cd} [copy result thru "a" "o"]   ;; returns "a"
  parse {ab hello cd} [copy result "a" thru "o"]   ;; returns "a"
  parse {ab hello cd} [copy result thru "a" thru "o"]   ;; returns "a"

  parse {ab hello cd} [copy result "h" "o"]   ;; returns []
  parse {ab hello cd} [copy result to "h" "o"]   ;; returns "ab "
  parse {ab hello cd} [copy result "h" to "o"]   ;; returns []
  parse {ab hello cd} [copy result to "h" to "o"]   ;; returns "ab "
  parse {ab hello cd} [copy result thru "h" "o"]   ;; returns "ab h"
  parse {ab hello cd} [copy result "h" thru "o"]   ;; returns []
  parse {ab hello cd} [copy result thru "h" thru "o"]   ;; returns "ab h"

  parse {ab hello cd} [copy result ["a" "o"]]   ;; returns []
  parse {ab hello cd} [copy result [to "a" "o"]]   ;; returns []
  parse {ab hello cd} [copy result ["a" to "o"]]   ;; returns "ab hell"
  parse {ab hello cd} [copy result [to "a" to "o"]]   ;; returns "ab hell"
  parse {ab hello cd} [copy result [thru "a" "o"]]   ;; returns []
  parse {ab hello cd} [copy result ["a" thru "o"]]   ;; returns "ab hello"
  parse {ab hello cd} [copy result [thru "a" thru "o"]]   ;; returns "ab hello"

  parse {ab hello cd} [copy result ["h" "o"]]   ;; returns []
  parse {ab hello cd} [copy result [to "h" "o"]]   ;; returns []
  parse {ab hello cd} [copy result ["h" to "o"]]   ;; returns []
  parse {ab hello cd} [copy result [to "h" to "o"]]   ;; returns "ab hell"
  parse {ab hello cd} [copy result [thru "h" "o"]]   ;; returns []
  parse {ab hello cd} [copy result ["h" thru "o"]]   ;; returns []
  parse {ab hello cd} [copy result [thru "h" thru "o"]]   ;; returns "ab hello"

  parse {ab hello cd} [copy result [thru "a" to "o"]]   ;; returns "ab hell"
  parse {ab hello cd} [copy result [thru "h" to "o"]]   ;; returns "ab hell"
  parse {ab hello cd} [copy result thru "a" to "o"]   ;; returns "a"
  parse {ab hello cd} [copy result thru "h" to "o"]   ;; returns "ab h"

  parse {ab hello cd} [copy result [to "a" thru "o"]]   ;; returns "ab hello"
  parse {ab hello cd} [copy result [to "h" thru "o"]]   ;; returns "ab hello"
  parse {ab hello cd} [copy result to "a" thru "o"]   ;; returns none
  parse {ab hello cd} [copy result to "h" thru "o"]   ;; returns "ab "

  parse {ab hello cd} ["h" copy result "o"]   ;; returns []
  parse {ab hello cd} [to "h" copy result "o"]   ;; returns []
  parse {ab hello cd} ["h" copy result to "o"]   ;; returns []
  parse {ab hello cd} [to "h" copy result to "o"]   ;; returns "hell"
  parse {ab hello cd} [thru "h" copy result "o"]   ;; returns []
  parse {ab hello cd} ["h" thru copy result "o"]   ;; returns []
  parse {ab hello cd} [thru "h" copy result thru "o"]   ;; returns "ello"

  parse {ab hello cd} ["a" copy result "o"]   ;; returns []
  parse {ab hello cd} [to "a" copy result "o"]   ;; returns []
  parse {ab hello cd} ["a" copy result to "o"]   ;; returns "b hell"
  parse {ab hello cd} [to "a" copy result to "o"]   ;; returns "ab hell"
  parse {ab hello cd} [thru "a" copy result "o"]   ;; returns []
  parse {ab hello cd} ["a" copy result thru "o"]   ;; returns "b hello"
  parse {ab hello cd} [thru "a" copy result thru "o"]   ;; returns "b hello"

  parse {ab hello cd} [thru "a" copy result to "o"]   ;; returns "b hell"
  parse {ab hello cd} [thru "h" copy result to "o"]   ;; returns "ell"

  parse {ab hello cd} [to "a" copy result thru "o"]   ;; returns "ab hello"
  parse {ab hello cd} [to "h" copy result thru "o"]   ;; returns "hello"
Oldes
17-Apr-2009
[1774x2]
parse/all {ab hello cd} [3 skip copy result 5 skip to end]     ;; returns "hello"

parse/all {ab hello cd} [thru #" " copy result to #" " to end]  ;; returns "hello"
(you don't have to use the [to end] in my versions if you don't care 
that parse returns false instead of true)
mhinson
18-Apr-2009
[1776]
I have written my first bit of code that is starting to do something 
useful.

All the bad bits are mine & all the good bits are from the help I 
have been given here.

My main intention is to start off with code that I can understand 
& develop so any criticism would be most welcome.


My next step is to remove the debug code & replace it with code that 
stores all the information in a structured form for searching & further 
analysis.
Thanks for all your help with this.

filename: copy %/c/temp/cisco.txt   ;; cisco config file
host: copy []
interface: copy []
intDescription: copy []
intIpAddress: []
ipRoute: []
IntFlag: false
spacer: charset " ^/"
name-char: complement spacer

lines: read/lines filename

foreach line lines [          ;; move through lines
    parse/all line [
        ;; evaluated if "interface" found preceded by nothing else
        copy temp ["interface" to end] (
            interface: copy temp
            print interface           ;; debug code
            IntFlag: true
        )
        ;; evaluated if " desc" found preceded by nothing else
        | copy temp2 [" desc" to end] (
            if IntFlag [print temp2]     ;; debug
        )
        | copy temp3 [" ip address" to end] ( ;; " ip address"
            print temp3    ;; debug
        )
        | copy temp4 ["hostname" to end] ( ;; "hostname"
            print temp4      ;; debug
        )
        ;; any char except space or newline. this must be last
        | copy temp5 [name-char to end] (
;           if IntFlag [print temp5]      ;; debug
            if IntFlag [print "!"]      ;; debug
            IntFlag: false
        )
    ]
]

;######################################

 the input file contains these lines which are extracted (except the 
 !) plus it has a load more lines that are ignored at the moment.

hostname pig
interface Null0
!
interface Loopback58
 description SLA changed this
!
interface ATM0
!
interface ATM0.1 point-to-point
!
interface FastEthernet0
 description my first port
!
interface FastEthernet1
 description test1
!
interface FastEthernet2
 description test2
!
interface FastEthernet3
!
interface Dot11Radio0
!
interface Vlan1
 description User vlan (only 1 vlan allowed)
!
interface Dialer0
 description $FW_OUTSIDE$
 ip address negotiated
!
interface BVI1
 description $FW_INSIDE$
 ip address 192.168.0.1 255.255.255.0
!
!########### end ##########
Steeve
18-Apr-2009
[1777]
What a waste...
Are you sure you understand well the idea behind parsing ?

It's not specific to REBOL; parsing exists in many computer languages.
First you have to understand the theory behind it...
If not, you will just produce trash code like that.
[unknown: 5]
19-Apr-2009
[1778]
mhinson, don't be discouraged by Steeve's lack of politeness.  I 
assure you that we are not all this way.  Just be sure to ask questions.
Pekr
19-Apr-2009
[1779]
Steeve - why a waste? REBOL's parse allows even lamers like me to 
produce a result which in the end does what I want it to do, but 
you surely would not like to see my parse rules :-) I can't write 
a single piece of regexp, yet REBOL's parse is useful to me.
Henrik
19-Apr-2009
[1780x2]
it would be interesting if the config file could be loaded, by making 
unloadable parts like $FW_OUTSIDE$ loadable using simple string replacable. 
Then you could just 'load the file into a block and it would be considerably 
easier to parse.
string replacable = string replacement
Steeve
19-Apr-2009
[1782]
Pekr, I'm just disappointed by what mhinson produced after getting 
so much advice from Rebolers like you.

The read/lines trick is pretty useless; why doesn't he use the standard 
way of traversing newlines with parse? 
And why use code (inside parens) to manage optional rules?

Aren't commands like SOME, ANY, OPT enough to manage simple rules 
like that?
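
For comparison, a single-pass rule of the kind described above might look 
roughly like this. This is only a sketch assuming REBOL 2; the sample 
text and the word "names" are invented for illustration and this is not 
Steeve's own code:

```rebol
config: {hostname pig
interface Loopback58
 description SLA changed this
!
interface ATM0
!
}

names: copy []
parse/all config [
    any [
        ; a line that starts with "interface ": keep the rest of the line
        "interface " copy name to newline (append names name) thru newline
        ; any other line: skip past its newline
        | thru newline
    ]
]
probe names    ; ["Loopback58" "ATM0"]
```

Here ANY walks the whole string line by line, so no read/lines loop is 
needed; whether that is easier to follow than one PARSE per line is a 
matter of taste.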
Henrik
19-Apr-2009
[1783]
I'm just admiring that mhinson wasn't scared of jumping into PARSE 
so soon. :-)
mhinson
19-Apr-2009
[1784]
Thanks for the feedback, all is most welcome. I will try to avoid 
read/lines if it is bad. Is there a list of things I can't expect 
to load?  Should I convert them to some symbolic value & then convert 
them back again for the final output?

I don't yet understand why a block would be easier to parse than 
lines, by easier do you mean more efficient or easier to create the 
code?

The optional rules (inside parens) are there to change the behaviour based 
on lines read previously, so I don't yet understand any concept that 
would let me avoid those. 

I need the code to be very simple (like me) so I can understand how 
it is operating. I know my implementation goes against the Rebol 
ethos of  small & efficient but perhaps in time I can understand 
enough to make it so & also start using relative expressions properly 
so it can be simple to understand.
Henrik
19-Apr-2009
[1785]
I don't yet understand why a block would be easier to parse than 
lines, by easier do you mean more efficient or easier to create the 
code?


Yes, it's easier, because REBOL is based around this concept. Without 
this concept, dialects wouldn't make much sense. Your configuration 
file shown above is a good candidate for a dialect with some tweaks.


I suggest, you read again what I wrote above about the basics of 
words, context and meaning. I can't emphasize enough how important 
this is to REBOL and especially for block parsing. It's important 
to understand this before moving on, or REBOL will remain difficult 
to use for you. Or drop your parse project for now and stick with 
the basics until you understand more of REBOL.

is there a list of things I can't expect to load?


The LOAD function will error out, if a string (such as a file you 
read) can't be loaded into a block of data. 

Try these in the console, and see which ones work:

load "hello"

load "hello,"

load "hello."

load "%"

load "1 2 3 4"


load "hostname pig interface Null0 ! interface Loopback58 description SLA changed this"

load %/c/temp/cisco.txt
sqlab
19-Apr-2009
[1786x2]
I am against loading these configuration files. Why?
- you cannot control what is inside 
- we know already that there are elements unloadable by REBOL
- and the description almost always needs string parsing 

What would I change?

I would either use only one temp variable in the parse rule and afterwards 
just set it to the new variable, as there is already a copy involved, 
or I would use a meaningful variable name in the first place.
Another reason against loading:
- we cannot determine if "interface" is at the start of a line
mhinson
19-Apr-2009
[1788]
Good point about my temp1 temp2 etc. that was sloppy.

It is true I cannot control what is inside the config files. They 
can contain any printable chars (eg in encrypted password fields 
or remarks/descriptions or embedded TCL code) and sometimes I am 
going to want to capture that text. 

I don't mind trying both methods as it will help me learn. 
Since I can't load the file directly I am thinking I will need to 
do a read %file.txt & replace the /\,[]%$()@:? with %xx etc.  then 
load the result?   I can't find a list of all the chars that I would 
need to treat like this yet.

Henrik, I do continue to read what you have written, it is helpful 
& I think I am beginning to appreciate the concepts. I am probably 
not as clear as I should be about the specification of what I am 
trying to do so the code has tended towards listing the requirements 
rather than being elegant.  Thanks /\/\
sqlab
19-Apr-2009
[1789]
further enhancements
I would try to extract the rule parts into single rule e.g.
interfacerule: [--]
descriptionrule: [--]
etc. (names are debatable.)


better than [copy ..   [" ip address" to end]] is probably  [to "ip 
address" copy temp to end], 
unless you know, that there is always one space
Sunanda
19-Apr-2009
[1790]
'read/lines is not bad.
It enables you to easily split the problem into smaller phases.
It makes it harder to solve the problem in one huge 'parse.

But maybe one big 'parse is not the best approach -- especially if 
you need (say) backtracking for error recovery, or line-numbers of 
points of failure.


You are getting several people's views on how to tackle your problem 
here. Take their advice seriously, but remember you are the domain 
specialist, so you get to choose which solution fits best. Not us 
:-)
mhinson
19-Apr-2009
[1791]
ok. I follow your extraction of rules idea & this is what you had 
in your original suggestion.  Now I am getting more familiar with 
what I am looking at I can understand the benefit of that & will 
start to work that way now.

[copy .. [" ip address" to end]]  was to get the interface address 
in the interface section of the file. It is identified by
1) some line after the line containing "interface"

2) at the beginning of the line, always starting with one space before 
the words "ip address"

3) before any line with a non-blank first char unless it is a new 
instance of "interface" (hence my IntFlag, though Steeve didn't like 
the way I used it)


I found from testing that [to "ip address" copy temp to end] or [to 
" ip address" copy temp to end] found the string anywhere in the 
line, but [copy .. [" ip address" to end]] only finds the string 
if it is at the start of the line which is what I was trying to achieve. 
Have I made a mistake here & need to retest my assumptions perhaps?


I always appreciate lots of different views on issues so I am loving 
the multiple responses. 

Sunanda you have reminded me about line numbers. I will tackle them 
after the extraction of rules I think, as I want them in my output 
for data output quality & validation checking.


I have been looking at your parse-ini.r to see how you have read 
a file into a Rebol block, but I may stick with read/lines for a bit 
longer while I get my head round parsing each line in turn.  I get 
the impression that once I have a final block of code there will 
be someone who can turn it into 2 short lines including a built in 
Easter egg game.
sqlab
19-Apr-2009
[1792]
No, you are right. 

If there is always one leading space after newline identifying a 
valid ip address, your approach  is the best. I just don't know anything 
about the stringent syntax rules of your config files, hence my suggestion.
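
The difference between the two rule shapes discussed above can be checked 
with a sketch like the following, assuming REBOL 2. The second sample 
line ("  no ip address") and the words "anchored" and "scanning" are 
invented for illustration:

```rebol
lines: [" ip address 192.168.0.1 255.255.255.0" "  no ip address"]

results: copy []
foreach line lines [
    ; the literal must match at the very start of the line
    anchored: parse/all line [" ip address" to end]
    ; TO scans ahead, so the text may sit anywhere in the line
    scanning: parse/all line [to "ip address" to end]
    append results reduce [anchored scanning]
]
probe results    ; [true true false true]
```

Only the first line satisfies the anchored rule, while the scanning rule 
accepts both, which matches what mhinson found in testing.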
mhinson
19-Apr-2009
[1793x2]
Thanks.  These files are getting better with more recent versions 
of Cisco IOS but sometimes trial and error is the only way to find 
the formats used.
I am still new and confused.   where can I read about how to do this 
please?
file: "%file.txt"
host: "Router1"
interface: "fa0"
i: 25
description: []
ipaddr: "1.1.1.1 255.255.255.0"


write/append/string %/c/temp/result.log [file tab i tab host tab 
interface tab description tab  ipaddr newline]


I want the output file to be a tab-separated set of values but all 
I get is the text 
filetabitabhosttabinterfacetabdescriptiontabipaddrnewline
Oldes
19-Apr-2009
[1795x3]
write/append/string %/c/temp/result.log rejoin [file tab i tab host 
tab interface tab description tab  ipaddr newline]
and/or just a simple REDUCE instead of REJOIN should be enough in this 
case
Also you must use MOLD for the values, if you want to keep the type 
(for example the block as the description)
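
A quick console-style sketch of the three functions mentioned above, 
assuming REBOL 2 and reusing the sample values from mhinson's question. 
Judging by the output mhinson reported, WRITE joins the items of the 
block exactly as given, so the words have to be evaluated first:

```rebol
host: "Router1"
i: 25
description: []

; REDUCE evaluates each word and returns a block of the values
probe reduce [host tab i]                  ; ["Router1" #"^-" 25]

; REJOIN evaluates and then joins everything into one string
probe rejoin [host tab i]                  ; "Router1^-25"

; MOLD keeps the REBOL-readable form, so the empty block stays visible
probe rejoin [host tab mold description]   ; "Router1^-[]"
```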
mhinson
19-Apr-2009
[1798x2]
Thanks very much. I have seen reduce & rejoin & mold but didn't realise 
they were relevant to writing a file.. this is the first time I have 
ever written to a file.
I have tried to understand & take on what I have been told, thanks. 
Is this worse or better. It does what I was looking to do & I know 
how to extend it in the same structure.  I am sure it would be educational 
for me if anyone has time to tear it to shreds please.


Should I stop using read/lines now?  Would I get the benefit still? 
Or is the requirement too fragmented for this approach now?
Should I use functions anywhere instead?
Have I initialised my variables in the right & appropriate way?

filename: copy %/c/temp/cisco.txt   ;; cisco config file
outFile: copy %/c/temp/outFile.log  ;; tab separated output
hostname: copy []
interface: copy []
intDesc: copy []
intIpaddr: []
ipRoute: []
IntFlag: false
spacer: charset " ^/"
name-char: complement spacer

lines: read/lines filename


outInterface: [ write/append outFile reduce 
    [filename tab i tab hostname tab interface tab intDesc tab intIpaddr newline]
]

clearInterface: [
    interface: copy []
    intDesc: copy []
    intIpaddr: []
]


interfaceRule: [ ["interface " copy temp-interface to end] (   ;; captures point-point as well
        if IntFlag outInterface           ;; start of new interface section so output data collected previously.
        if IntFlag clearInterface        
        interface: copy temp-interface
        print ["! found at line " i]      ;; debug
        print current-line                ;; debug
        IntFlag: true
    )
]

descRule: [ [" description " copy intDesc to end] (  
        if IntFlag [print current-line]     ;; debug
    )
]

ipAddrRule: [[" ip address " copy intIpaddr to end] ( 
        print current-line                  ;; debug
    )
]

hostnameRule: [["hostname " copy hostname to end] ( ;; "hostname"
        print current-line                  ;; debug
    )
]

iprouteRule: [copy iproute ["ip route" to end] ( ;; "ip route"
        print current-line                  ;; debug
    )
]


IntFlagRule: [copy tempZZ [name-char to end] ( ;; not space or newline. this must be out of the int section
        if IntFlag outInterface      ;; end of interface section so output data collected.
        if IntFlag clearInterface
        if IntFlag [print "!"]       ;; debug
        IntFlag: false
    )
]

i: 0


foreach line lines [i: i + 1 ;; move through lines & track line number
    current-line: line       ;; for debug output
    parse/all line [         ;; parse only using rules below
        interfaceRule        ;; evaluated if "interface" found preceded by nothing else
        | descRule           ;; evaluated if " desc" found preceded by nothing else
        | ipAddrRule         ;; " ip address"
        | hostnameRule       ;; "hostname "
        | iprouteRule        ;; "ip route"
        | IntFlagRule        ;; unset interface flag if no longer in interface section (no " ^/")
    ] 
]
Graham
19-Apr-2009
[1800x7]
filename: copy %/c/temp/cisco.txt   ;; cisco config file
outFile: copy %/c/temp/outFile.log  ;; tab separated output
don't need 'copy there
what's this?
outInterface: [ write/append outFile reduce 
    [filename tab i tab hostname tab interface tab intDesc tab intIpaddr newline]
]
should that be a function and not just a block?
same with clearinterface
which needs a 'copy on the last  [ ]
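
A sketch of what those two remarks could mean in practice, assuming 
REBOL 2. The helper names "bad" and "good" are invented for the demo; 
the words inside clearInterface are the ones from mhinson's script:

```rebol
; a literal [] inside a function body is one shared block
bad: func [] [acc: [] append acc 1]
bad
probe bad     ; [1 1]  the same block was reused by the second call

; COPY makes a fresh block on every call
good: func [] [acc: copy [] append acc 1]
good
probe good    ; [1]

; clearInterface written as a function, with COPY on every block
clearInterface: func [] [
    interface: copy []
    intDesc: copy []
    intIpaddr: copy []
]
```

With the function version the call site would read 
if IntFlag [clearInterface] rather than if IntFlag clearInterface.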