World: r3wp

[Parse] Discussion of PARSE dialect

ChristianE 17-Apr-2010 [4940]	That's saying too much; I think it's more that CHANGE/PART behaves as advertised and the /PART refinement just happens to have a different meaning for INSERT or APPEND. None of /WITH, /TO, /SPAN and /RANGE communicates very well that it refers to the second argument, though, and /TAKE has the drawback of suggesting that it takes away from the second argument like TAKE instead of leaving the second argument untouched. CHANGE/FROM, however, seems to work: >> head change/from #abcdef #123456 3 == #123def >> head change/part/from #abcdef #12345 1 3 == #123bcdef All that under the assumption that, for compatibility, /PART in its current meaning will stay as it is.
BrianH 17-Apr-2010 [4941]	It's funny, I always thought INSERT/part was the weird one, and CHANGE/part the normal one. Didn't stop me from adding /part to APPEND though, in the INSERT style.
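For reference, a minimal sketch of the two meanings of /PART being compared here, using only the existing CHANGE and INSERT functions. The word names are illustrative, and the proposed CHANGE/FROM is not used because it is only a suggestion in this thread.

```rebol
REBOL [Title: "Two meanings of /PART (illustrative sketch)"]

; CHANGE/part: the length applies to the TARGET (first argument).
; The first 3 characters of the target are replaced by all of "XY".
changed: head change/part copy "abcdef" "XY" 3
print changed        ; XYdef

; INSERT/part: the length applies to the SOURCE (second argument).
; Only the first 3 characters of "12345" are inserted.
inserted: head insert/part copy "abcdef" "12345" 3
print inserted       ; 123abcdef
```

With CHANGE there is currently no refinement that limits how much of the second argument is used, which is the gap the /FROM, /SPAN and /RANGE suggestions try to name.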
Maxim 17-Apr-2010 [4942x2]	I agree with Christian, except that /from doesn't convey the proper meaning... another refinement name might be better... something like change/part/only to from 3 4, or change/part/amount to from 3 4 ?
Maxim 17-Apr-2010 [4942x2]	except that /only is already used... but I'm just suggesting in the lexical sense... something closer to the meaning of the refinement.
BrianH 17-Apr-2010 [4944]	That's why I suggested /span or /range :)
Steeve 17-Apr-2010 [4945x2]	i suggest /?
Steeve 17-Apr-2010 [4945x2]	it's short
Maxim 17-Apr-2010 [4947]	/? !?!!??!! and meaningless ;-)
Steeve 17-Apr-2010 [4948]	meaningall
Tomc 18-Apr-2010 [4949]	change/interval ? (spelled corectly if necessary)
Gregg 19-Apr-2010 [4950]	Comment added: http://curecode.org/rebol3/ticket.rsp?id=1570&cursor=5#comments
Steeve 19-Apr-2010 [4951x3]	Gregg, I used to use append/part to avoid the memory overhead of copy/part in many cases. Instead of doing as in Ladislav's example: >> change/part something copy/part something-else range part I used to do: >> change/part something append/part clear #{} something-else range part It's not faster, but it saves memory. So I don't know if it's a good idea to discard this use case from append and insert.
	Esp in R3
	(It saves memory, if the same code is called many times, indeed)
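A minimal sketch of the buffer-reuse pattern Steeve describes, under some assumptions: it uses strings instead of binaries, INSERT/part instead of APPEND/part (so it does not depend on a build where APPEND has /part), and the words `buffer`, `source`, `t1` and `t2` are hypothetical. It only shows that both variants produce the same result; whether the second one actually saves memory is the point debated below.

```rebol
REBOL [Title: "Reusing a scratch buffer (illustrative sketch)"]

source: "0123456789"
buffer: make string! 64          ; hypothetical scratch buffer, allocated once

; Allocating variant: COPY/part creates a new string on every call.
t1: copy "abcdef"
change/part t1 copy/part source 3 2

; Reusing variant: CLEAR empties the existing buffer and INSERT/part
; refills it with the first 3 characters of the source.
t2: copy "abcdef"
insert/part clear buffer source 3
change/part t2 buffer 2

print t1                         ; 012cdef
print t2                         ; 012cdef
```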
Ladislav 19-Apr-2010 [4954]	Re "it saves memory" - it is not expected to save memory (the GC should handle such code "smoothly")
Steeve 19-Apr-2010 [4955x2]	Sometimes, I can't let the GC act by itself because it's too late and tens of MB would be allocated for nothing.
Steeve 19-Apr-2010 [4955x2]	But I agree these are rare cases, with intensive computations. Rare, but they exist.
Ladislav 19-Apr-2010 [4957x2]	It does not matter that it is rare: if you can find any unexpected of the GC, you should put it to CureCode as a major bug
Ladislav 19-Apr-2010 [4957x2]	unexpected behaviour of the GC
Steeve 19-Apr-2010 [4959]	It's not a bug to my mind, the GC never acted smoothly.
Ladislav 19-Apr-2010 [4960x2]	maybe I just misunderstood, then. If it is not a bug, then you are actually saying, that the GC collects everything as expected? If that is the case, then why the trouble to "save memory"?
Ladislav 19-Apr-2010 [4960x2]	(I just tested, and your example is much slower than the code allocating and GC-ing the new string)
Steeve 19-Apr-2010 [4962]	Yeah it's true, it's slower. But if your app contains many loops with many copy/part at different locations, the memory may grow insanely before the recycle. I saw that in many graphic apps with Rebol.
Ladislav 19-Apr-2010 [4963]	Re "I saw that in many graphic apps with Rebol" - are you sure it was "before the recycle"?
BrianH 19-Apr-2010 [4964x2]	Sometimes you don't want to put too much pressure on the GC, and sometimes you don't want to increase the total size of the pool too much, because that pool doesn't always get returned to the OS very quickly or at all. This is the motivation for additions like the /into option.
BrianH 19-Apr-2010 [4964x2]	We'll see how many optimizations like that need to be undone once we have to adjust for task safety :(
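A small sketch of the /into idea BrianH mentions, assuming a build in which REDUCE has the /into refinement (R3, and as far as I know the later R2 releases). The word `out` is a hypothetical preallocated buffer.

```rebol
REBOL [Title: "REDUCE/into with a reused block (illustrative sketch)"]

out: make block! 8               ; hypothetical result buffer, allocated once

; Plain REDUCE would allocate a new block for its results on each call.
; REDUCE/into writes the results into the block it is given instead.
reduce/into [1 + 2 3 * 4] clear out

probe out
```

The design choice is the same as in the CLEAR-and-refill pattern above: the caller owns the buffer, so repeated calls do not leave a trail of short-lived series for the GC.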
Maxim 19-Apr-2010 [4966x2]	the GC doesn't return the pool... only image data is ever returned AFAIK.
Maxim 19-Apr-2010 [4966x2]	And the GC doesn't kick in too quickly, or it would be really slow (just try recycle/torture to see ;-), so when you're doing serious work memory REALLY grows... although it stabilizes. For example, stats often shows 10 MB while my OS tells me that it's actually using 24 MB. That will never shrink back down.
florin 24-May-2010 [4968]	Is there a place for the newbie questions on parsing?
Terry 24-May-2010 [4969]	You've come to the right place.
florin 24-May-2010 [4970x2]	I've created my very first script. The script loops through a list of email (Kerio) log files, extracts the IP addresses, compiles them in a list and adds them to a (Peerblock) list in order to limit incoming spam. I find rebol perfect for this.
florin 24-May-2010 [4970x2]	So an entry in the log file starts like this: "[15/May/2010 17:59:56] IP address 190.101.1.10 found in DNS blacklist SpamHaus SBL-XBL..."
Terry 24-May-2010 [4972]	aye
florin 24-May-2010 [4973x3]	I improved the script to read only the latest entries in the log, and I parse the date like this: parse/all txt [thru "[" copy found to "]" ]
	So I get the job done. This is the question: If I use parse/all so that spaces are not automatically treated as delimiters, how do I include the space in my parse rule?
	A rule can be: "=," etc. How do I "escape" the space character so that I can include it in my rule?
Terry 24-May-2010 [4976x2]	I've always used the spaces as delimiters
Terry 24-May-2010 [4976x2]	parse k [thru "[" copy date to " "]
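A minimal sketch of the two rules above applied to the sample log entry, assuming REBOL 2 (where PARSE has the /all refinement). The words `line`, `stamp` and `day` are illustrative, and a trailing `to end` is added so that PARSE consumes the whole input.

```rebol
REBOL [Title: "Extracting the timestamp from a log entry (illustrative sketch)"]

line: {[15/May/2010 17:59:56] IP address 190.101.1.10 found in DNS blacklist SpamHaus SBL-XBL...}

; florin's rule: copy everything between "[" and "]".
parse/all line [thru "[" copy stamp to "]" to end]

; Terry's rule, with /all: stop at the first space to get the date only.
parse/all line [thru "[" copy day to " " to end]

print stamp      ; 15/May/2010 17:59:56
print day        ; 15/May/2010
```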
NickA 24-May-2010 [4978]	I've used parse/all, and then used 'trim on the results.
florin 24-May-2010 [4979]	Yes, that is exactly what I did and it works. However, for the sake of learning, how do I use the space character as part of my rule?
Steeve 24-May-2010 [4980]	don't see your point, show us the annoying rule...
florin 24-May-2010 [4981x5]	Ok, I will. The point is that I want to include the space in my rule. Here's the example:
	digits: charset "0123456789" ip: [some digits "." some digits "." some digits "." some digits ]
	This finds the IP in the log entry. What if I have two ip addresses and I want to pick them at the same time: ip: [some digits "." some digits "." some digits "." some digits __space__ some digits ...etc]
	And the IP addresses are separated by a space?
	My question really is, how do I escape the space character as one would in regular expressions?
Steeve 24-May-2010 [4986]	you need parse/all
florin 24-May-2010 [4987]	correct, and then, how do you place the space in the rule: {} ?
Steeve 24-May-2010 [4988x2]	#" " or " " or { }
Steeve 24-May-2010 [4988x2]	'space works too
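Putting the answer together with florin's `ip` rule, a minimal sketch assuming REBOL 2 with parse/all. The words `two-ips`, `first-ip`, `second-ip` and `ok`, and the second address in the input, are made up for illustration. The space is simply written as a character literal; no escaping is needed.

```rebol
REBOL [Title: "Matching a literal space in a parse rule (illustrative sketch)"]

digits: charset "0123456789"
ip: [some digits "." some digits "." some digits "." some digits]

; With /all, whitespace is ordinary input, so #" " (or " ") matches one space.
two-ips: [copy first-ip ip #" " copy second-ip ip]

ok: parse/all "190.101.1.10 10.0.0.1" two-ips

print ok             ; true
print first-ip       ; 190.101.1.10
print second-ip      ; 10.0.0.1
```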