r3wp [groups: 83 posts: 189283]

World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

[unknown: 5]
3-May-2009
[2068]
In my mind it should but that functionality doesn't exist as far 
as I know.
Maxim
3-May-2009
[2069x2]
when you start to add a single other rule, that will start to fall 
apart.
paul, it is part of possible R3 improvements
mhinson
3-May-2009
[2071]
Ah, so Paul wants to create in effect a "regular expression" and 
use it in the parse.
[unknown: 5]
3-May-2009
[2072]
Good, that will be very welcome.
Maxim
3-May-2009
[2073x3]
yep :-)
the problem is that the above can be extremely slow... just like 
regexp   :-)
be back later ... don't forget the assignment  ;-)
mhinson
3-May-2009
[2076x3]
Thanks again Maxim
Sunanda, I had a look at the parse visualiser yesterday, it looks 
a bit advanced for me yet. A simple version would be good for newbies. 
 I got it to work on his examples, but my examples produced no output. 
 I expect I was doing something foolish.  I will return to it when 
my basic skills are a bit better.
Maxim.  Here is my first attempt at the homework you set me. This 
builds on what you showed me & also relates to the example given 
to me by Pekr.


data: "before first tag <TAG> after 1st pointy tag [TAG] after square 
tag <TAG> after pointy tag 2"

tag-square: "[TAG]"
tag-pointy: "<TAG>"

output: func [tag here] [
	print rejoin ["we are past the " tag " : '" here "'"]
]

parse/all data [
	some [ 
		[tag-pointy here: (output tag-pointy here) ] 
		| [tag-square here: (output tag-square here) ] 
		| skip
	]
]


I thought it would make the action clearer if the output was in a 
function & the keys used variables.
Ladislav
3-May-2009
[2079x2]
mhinson: your rule:

b: [to "bb" break]


looks quite dangerous. TO means a lot of input may be skipped, which 
is usually not what you want. Moreover, BREAK is not in the right place 
in that rule (it just breaks the rule, but that is totally unnecessary).
(I am not a big fan of BREAK myself; every rule can be written without 
BREAK.)
mhinson
3-May-2009
[2081]
Ladislav, thanks for your comments.   It has been suggested that 
I should avoid TO in any backtracking parse until I know much more 
about what I am doing.
Ladislav
3-May-2009
[2082x4]
yes, in your rules you certainly wanted to look for more alternatives 
than just "bb". In that case the usage of TO is not advisable.
(you have to tell Parse what the alternatives are *before* using 
any cycle construct like TO, SOME, etc.)
so, my guess is, that you wanted something like
b: "bb"
y: "yy"
parse input [any [b | y |  skip]]

or some such, the above would find all occurrences of the parts you 
specified
if you want to find just the first occurrence, then you may use e.g.:

occurrence: [b | y]
parse input [any [occurrence break | skip]...]
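A minimal sketch of the two patterns above, assuming REBOL 2 and pasted into the console. The names text, hits and first-pos and the sample string are made up for illustration; text is used instead of input only so the built-in INPUT function is not overwritten.

```rebol
text: "xx bb yy bb"
b: "bb"
y: "yy"

; All occurrences: no BREAK, so ANY keeps looping until the end.
hits: 0
parse/all text [any [[b | y] (hits: hits + 1) | skip]]
print hits        ; 3

; First occurrence only: BREAK leaves the ANY loop after the first match.
occurrence: [b | y]
first-pos: none
parse/all text [
    any [here: occurrence (first-pos: index? here) break | skip]
    to end
]
print first-pos   ; 4
```

The HERE: marker is set on every pass, but the paren only runs once OCCURRENCE has matched, so first-pos records where the match started.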
Pekr
3-May-2009
[2086]
mhinson: there is a simple rule for how to read TO: skip everything 
until you find the target. This is not what you wanted in your original 
example, because e.g. TO "b" might also mean that "b" is the 
last char of your string. So if you are looking for the FIRST occurrence 
of [a | b | c], you really have to forget TO and use skip-based parsing, 
one char at a time. Hence some [a break | b break | c break | skip] is your 
friend ...
Ladislav
3-May-2009
[2087]
To [a | b | c] may work in the future, but it certainly does not 
work now (although it is not hard to replace using ANY or SOME)
mhinson
3-May-2009
[2088]
I am trying to formulate an example that shows why I thought TO was 
useful.  It mostly has to do with where I want to extract the data 
complete with the key I used to find it.  Without using TO it seems 
that I need to add the string I was looking for back onto the data 
I have extracted.
Ladislav
3-May-2009
[2089x2]
did you have a look at the link Sunanda mentioned?
http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse
aha, you want something like the AT command. That is easy.
[here: "string" :here]
mhinson
3-May-2009
[2091x3]
Yes, that was the first time I found that as I had not realised the 
Wiki was so extensive. It is a good source.
for this example, what would be the most simple version that returns 
exactly "B2." without using TO?
parse "A1.B2.C3." [to "B" copy result thru "."] print result
assuming the only reference points are "B" and "."
Ladislav
3-May-2009
[2094x2]
b: [here: "B" :here copy result thru "."]
parse input [any [b break | skip]]
you may need some other tricks, like how to make the rule fail, if 
no occurrence was found
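A minimal sketch of that rule applied to the "A1.B2.C3." string from the question, assuming REBOL 2. The word text and the trailing TO END are additions for illustration; result is preset to none so a missing "B" can be detected afterwards.

```rebol
text: "A1.B2.C3."
result: none

; HERE: remembers the position in front of "B"; :HERE jumps back to it
; so that COPY ... THRU "." includes the "B" itself.
b: [here: "B" :here copy result thru "."]

parse/all text [any [b break | skip] to end]

print result   ; B2.
```

If no "B" is present, every pass falls through to SKIP and result stays none, which is one simple way to notice that nothing was found.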
mhinson
3-May-2009
[2096]
I could not reply. AltME was broken. I don't understand the syntax 
of the :here statement.
Pekr
3-May-2009
[2097x3]
it's easy - there are so-called parse markers. Imagine parse working 
on strings (blocks) at the parse level. But then there is the underlying 
level, the string (block) itself. You can access the string from 
the parse level either from parens, or by setting markers.
so what code from Ladislav does is:


here:  ; mark the position in the input string. Something like AT. 
       ; Then you can use it in your parens.
"B"    ; it matches "B". 
:here  ; return to the saved position. So the parse position in the 
       ; input string moves back to just before the "B".
You can also think about it like:

start: some-matching rules end:  (copy/part start end)
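A hedged sketch of that start/end idea, assuming REBOL 2. The second marker is called stop here (an arbitrary name) to keep it apart from the END keyword of the parse dialect; text and piece are also made up for illustration.

```rebol
text: "A1.B2.C3."
piece: none

parse/all text [
    any [
        ; START: marks where the match begins, STOP: where it finished
        start: "B" thru "." stop:
        (piece: copy/part start stop)
        break
        | skip
    ]
    to end
]

print piece   ; B2.
```

Both markers are ordinary series positions into the same string, so COPY/PART between them returns exactly the matched text.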
mhinson
3-May-2009
[2100]
So is :here part of the parse dialect? I understand here: but not 
:here
Pekr
3-May-2009
[2101]
yes
mhinson
3-May-2009
[2102]
Should I understand it to move the parser "cursor" back to the position 
of index? here ?
Pekr
3-May-2009
[2103x3]
Look at my explanation above, and try to understand it. After here:, 
there is going to be "B" matched. So it means, that index is moved 
past "B". But you want to have your string copied including "B". 
So by issuing :here, you put parser to the saved position.
exactly ...
... you can have many such named markers ...
mhinson
3-May-2009
[2106x3]
I can appreciate the usefulness of that.
I am not sure why the BREAK is needed in the example from Ladislav 
above. Is it to force the rule to return true when the "B" and "." 
matches are found to prevent it carrying on looking for a second 
match further down the string?
Testing that idea out looks as if I have stumbled on the right answer. 
 Maybe there is hope for me yet.
Pekr
4-May-2009
[2109]
'break is needed in Ladislav's code imo because after the first match 
of "B" you want to escape (break from) the repetitive 'any block and 
continue your processing with further rules (which is not the case 
with Ladislav's example, but is the case with your example, where 
'copy followed). Without the 'break, the rule would still succeed 
after matching "B", because wherever there is no "B" there is always 
the 'skip option, which is valid until the end of the input. So 
without the 'break, this 'any block would 'skip until the end of the 
input string is reached ...
Ladislav
4-May-2009
[2110]
re Break: it is used to make sure the "B"..."." is processed just 
once. If you need to process many such parts, then don't use Break
mhinson
4-May-2009
[2111]
I have been working out ways to extract IP addresses from a string 
today.  Is this a good way to do it? What could catch me out?


parse to-block "junk 111.111.111.111 0.0.0.0 255.255.255.128 junk" 
[
  any [
    set tup tuple! (print tup)
    | skip
  ]
]
Oldes
4-May-2009
[2112]
It depends, what the junk can be.. in your case it must be REBOL 
loadable.
mhinson
4-May-2009
[2113]
I was hoping the TO-BLOCK would take care of that. Do I need to parse 
the junk first to remove unloadable strings, or is there another 
TO- function that will do it for me please?
Oldes
4-May-2009
[2114]
This should be safe:
use [
	ch_numbers
	ch_rest
	rl_ip
	ip-start
	ip-end
	ips
][
	ch_numbers: charset "0123456789"
	ch_rest: complement ch_numbers
	ips: copy []
	rl_ip: [
		ip-start:
		 some ch_numbers #"."
		 some ch_numbers #"."
		 some ch_numbers #"."
		 some ch_numbers
		ip-end:
		(error? try [append ips to-tuple copy/part ip-start ip-end])
	]
	set 'get-ips func [str][
		clear ips
		parse/all str [
			some [
				any ch_rest
				rl_ip
			]
		]
		ips
	]
]

get-ips "err,.;s 111.111.111.111 0.0.0.0 255.255.255.128 junk"
mhinson
4-May-2009
[2115]
Thanks for that snippet, sounds as if you have been on this trail 
before. Thanks.
Oldes
4-May-2009
[2116]
sorry.. :
some [
	any ch_rest
	rl_ip
	| skip
]
so it handles cases like:
get-ips "err,.;s 111.111.111 0.0.0.0 255.255.255.128 junk"
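For reference, a sketch of the whole function with the | skip correction folded in, assuming REBOL 2. The words digits, non-digits and ip-rule are renamed from the snippet above for readability, and returning copy ips instead of ips is an added choice, not part of the original.

```rebol
use [digits non-digits ip-rule ip-start ip-end ips] [
    digits: charset "0123456789"
    non-digits: complement digits
    ips: copy []
    ip-rule: [
        ip-start:
        some digits #"." some digits #"." some digits #"." some digits
        ip-end:
        ; TRY guards against text that TO-TUPLE cannot convert
        (error? try [append ips to-tuple copy/part ip-start ip-end])
    ]
    set 'get-ips func [str] [
        clear ips
        ; | SKIP steps over partial matches such as "111.111.111"
        parse/all str [some [any non-digits ip-rule | skip]]
        copy ips
    ]
]

probe get-ips "err,.;s 111.111.111 0.0.0.0 255.255.255.128 junk"
; [0.0.0.0 255.255.255.128]
```

Returning a copy means a second call does not clear the block a caller is still holding from the first one.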
mhinson
4-May-2009
[2117]
clever, defo better than my simple tuple search. Thanks.