r3wp [groups: 83 posts: 189283]
World: r3wp

[Core] Discuss core issues

Fork
26-Sep-2010
[18491]
/INTO is kind of novel and catchy, in terms of some of the optimization 
scenarios it allows for.  Making variants like "/NO-COPY" winds up 
looking like Rebol is doing a bad job of conventions like REDUCE 
vs REDUCE! ... as opposed to playing a whole different game.
Steeve
26-Sep-2010
[18492]
It occurs to me that reduce/into should not behave like inserting 
data but like changing it.

Changing data at the end of the container will simply add it.

But at the head of the container, it will replace the existing data instead.
Probably more useful than the current behavior.

so that 
>> reduce/into data data

would simply do the job we all expect: reuse the same block. 
A reduce in place (rest in peace)
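[Editor's sketch] The difference between the insert behavior /into currently has and the change behavior Steeve proposes can be illustrated roughly as follows. This assumes a REBOL build that has REDUCE/into (R3, or a recent R2); the variable name `data` is only illustrative, and the CHANGE lines emulate the proposal rather than show any existing /into option.

```rebol
data: copy [a b c]
reduce/into [1 + 1] data          ; current behavior: the result is inserted at the head
probe data                        ; [2 a b c]

data: copy [a b c]
change data reduce [1 + 1]        ; proposed behavior, emulated: the head value is overwritten
probe data                        ; [2 b c]

data: copy [a b c]
change tail data reduce [1 + 1]   ; at the tail, changing simply adds
probe data                        ; [a b c 2]
```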
BrianH
26-Sep-2010
[18493x2]
We could only add one behavior for the /into option, and the insert 
behavior was the most basic.
Fork, your method would add the additional series allocation to the 
series itself, temporarily doubling its size (at most). And then 
the reallocated series would stay that size in memory even after 
DEDUPLICATE returns. So overall, worse.
Fork
26-Sep-2010
[18495x4]
Didn't you say that  UNIQUE has to copy the series anyway?  What's 
the difference?
Would doing the /INTO at the end of the series make it easier for 
the memory allocator to reclaim the space after the remove?
(I may have misunderstood what you meant about the algorithm needing 
the original array and a temporary buffer the size of the output, 
as well as the output.)
Oh well, I dunno.  It's too hot here right now to think at all.  
92 indoors in the shade.
Ladislav
26-Sep-2010
[18499]
{See UNIQUE. This function is not widely used in my apps, just because 
of that. Useless, because when we deal with huge series, we don't 
want to pay the COPY cost.} - while this looks like a reasonable 
argument at first sight, it actually isn't. The reason is 
based on the way UNIQUE is (and has to be) implemented. You cannot 
gain any speed worth the name by allowing an implementation to do 
the "UNIQUE job" in place.
Graham
26-Sep-2010
[18500x4]
Regarding the above, is there a way to reassign and take existing 
references with you?   Or, is this a lack of pointer manipulation?
I guess the argument about existing references then also applies 
to 'unique
Looks like the python FBP is http://www.kamaelia.org/Home.html
ooops ... wrong group
Fork
26-Sep-2010
[18504]
Ladislav/BrianH: so doesn't that imply that the UNIQUE/INTO variant 
would be more useful as a native than UNIQUE/NO-COPY?  (I'm not sure 
how a temporary state of a series of length 2N which has N discarded 
is worse than 2 series of size N in which one of those series is 
discarded.)
Ladislav
27-Sep-2010
[18505x2]
these reduce/into b b and unique/into b b make me wonder, whether 
the people suggesting it really find such crazy code useful?
(I never felt like using such things)
Fork
27-Sep-2010
[18507]
Rebol looks crazy to most people regardless.  To my mind, if  Rebol 
has so far avoided a convention like /NO-COPY on these other routines 
in favor of /INTO, then consistent and learnable novelty trumps some 
organic evolution of refinements and wording in the core.
Gregg
27-Sep-2010
[18508]
I've never had a need for reduce/into or unique/into. The optimizer 
in my brain says reduce/into could be useful, but the rest of my 
brain thinks the lure of possible optimizations will create internal 
strife with said optimizer. :-) I do have an /into refinement on my 
COLLECT func, and the standard COLLECT does as well, but my goal 
there was convenience in building up results, not optimization.
Ladislav
27-Sep-2010
[18509]
but the craziest thing is not that reduce/into b b "looks crazy", but 
that it is unneeded, and *very* inefficient unless implemented using 
copying
Gregg
27-Sep-2010
[18510x2]
It doesn't make sense to reduce/into the same series, but that's 
not the intended purpose. Or am I missing something? Of course, people 
might still do it, thinking they're optimizing. :-\
I guess I just don't care enough about how hard I make the GC work.
Ladislav
27-Sep-2010
[18512]
Of course, people might still do it, thinking they're optimizing.
 - you nailed it
Geomol
27-Sep-2010
[18513]
Anton, I think, it's a "ground rule" in Carl's design of the language, 
that everything entered into the parser are datatypes (or series 
of datatypes). I can't think of anything with semantic meaning, that 
is not a datatype, when we talk REBOL. The language is recognized 
by its minimalistic syntax. That's why I call it a "ground rule".


I think, it's legal to call REBOL a sequence of datatypes. It's maybe 
more precise than calling it a programming language (also because 
it's so different from traditional programming languages).


And then, yes, he has added newline markers to e.g. blocks. But they 
have no semantic consequence.
Anton
27-Sep-2010
[18514x8]
Geomol, I thought about it a bit more and realized what you said 
made good sense.
It makes me a bit sad that we haven't found a way to get what I wanted.
But oh yeah, I wrote this experimental function last night.
sforpath: func ["Evaluate a path similar to the builtin path evaluation, except number elements SELECT (rather than PICK)."
	path [path!] action [function! block!] /local e v c
][
	v: get path/1 ; The path is assumed to begin with a word, so get its value.
	while [not tail? path: next path][ ; Step through the path to inspect each of its elements.
		c: v ; Store the current value before SELECTing into it.
		e: pick path 1 ; The current element.
		;print [mold :e mold type? :e]
		if get-word? :e [e: get e]
		case [
			number? e [v: select v e] ; SELECT this number element. (Paths normally PICK number elements.)
			word? e [v: select v e] ; SELECT this word element (as paths normally do).
		]
	]
	;?? e ?? v ?? c
	; Process the last element.
	if block? :action [action: func [c e] action]
	action c e
]
; Test
values: [1 [dos [new 0]]]

sforpath 'values/1/dos/new [c/:e: c/:e + 1]  ; <- DideC's INC-COUNTER function could be implemented simply using this.
Just to remind, DideC's example had:

values: [
	1 [dos [new 0 modified 0 deleted 0] fic [new 0 modified 0 deleted 0]]
	2 [dos [new 0 modified 0 deleted 0] fic [new 0 modified 0 deleted 0]]
]
And here's an idea which may make the usage simpler:
Write a function USEPATH, used like this:

	values: [1 [dos [new 0]]]
	p: 'values/1/dos/new

	usepath p [p: p + 1]  ;=== (values/1/dos/new: values/1/dos/new + 1)
	values ;== [1 [dos [new 1]]]
	
or perhaps, achieving the same result:

	usepath p 'v [v: v + 1]


where v is the value SELECTed by the last element in the path (so 
you can choose a different variable name).
  
or even:

	usepath p 4 'v [v: v + 1]


where 'v is set to refer to the value of the 4th element ('new) in 
the path.

so we can refer to any path element by its number, and so this would 
achieve the same:


	usepath p 3 'v [v/new: v/new + 1]  ;=== (values/1/dos/new: values/1/dos/new + 1)
  
and even crazier:

	usepath p 'root/pid/fs/flag [flag: flag + 1]


where the second path elements (all word!s) reference the values 
actually SELECTed in p,
so these are all equivalent:
	usepath p 'values/pid/fs/flag [fs/dos: flag + 1]
	usepath p 'values/pid/fs/flag [fs/dos: fs/dos + 1]
	usepath p 'values/pid/fs/flag [root/pid/fs/flag: flag + 1]
All good fun.
(Oops, the last three examples should begin:   usepath p 'root/.... )
BrianH
27-Sep-2010
[18522x5]
Ladislav, REDUCE/into is used to eliminate a temporary series allocation 
in the common INSERT REDUCE and APPEND REDUCE patterns. It is used 
in mezzanine code, and does in fact reduce memory allocation overhead. 
Hate to break it to you, but REDUCE/into and COMPOSE/into were actually 
worth adding. Sorry.
REDUCE/into and COMPOSE/into cut down 2 copies to 1. They don't eliminate 
copies altogether.
The main reason that UNIQUE isn't used is because it is not very 
useful: It is easier to make sure that the series is unique in the 
first place. The other "set" functions like INTERSECT, DIFFERENCE 
and EXCLUDE are used more often, even though they do the same copies 
as UNIQUE, using the same hashing method.
I suggested REDUCE/into and COMPOSE/into, and found them to be immediately 
useful. I don't know why UNIQUE/into is being suggested, its use 
case.
Henrik had a use case for DEDUPLICATE, though that was when he thought 
he could eliminate the copy, and actually wanted to change UNIQUE 
to be modifying instead. It's not a common enough use case to make 
it into the core functions, but there is a mezzanine for it in the 
CureCode ticket.
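[Editor's sketch] The APPEND REDUCE pattern BrianH mentions, and its /into replacement, look roughly like this. It assumes a REBOL build with REDUCE/into; the name `out` is illustrative only.

```rebol
out: copy [x]

; Classic pattern: REDUCE allocates a temporary block,
; then APPEND copies its contents into OUT.
append out reduce [1 + 2 3 * 4]

; With /into the results are inserted at the given position
; (here the tail of OUT), with no intermediate block.
reduce/into [1 + 2 3 * 4] tail out

probe out   ; [x 3 12 3 12]
```

Either way the values end up copied into `out` once; what /into saves is the throwaway block in between.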
Ladislav
27-Sep-2010
[18527x3]
REDUCE/into and COMPOSE/into were actually worth adding

 - no problem with that, what I was saying, though, was something 
 else
I specifically had objections against the

    reduce/into b b

expression, which cannot be done efficiently without copying
...nor is it really useful, in my opinion
BrianH
29-Sep-2010
[18530]
I definitely agree with that. I can't figure out a use for that code 
pattern, at least at first glance.
Maxim
30-Sep-2010
[18531]
when we generate blocks on the fly, we sometimes need evaluation 
to occur at a later stage.  this allows to save on an unneeded additional 
copy.  (which can be taxing on the GC when blocks are large)
Ladislav
1-Oct-2010
[18532]
this allows to save on an unneeded additional copy.
 - if you are referring to the

    reduce/into b b


expression, Max, then your assumption is wrong, this cannot save 
an unneeded additional copy
Maxim
1-Oct-2010
[18533]
well IIRC Steeve had done tests and it was much faster... so something 
is being saved.
Izkata
1-Oct-2010
[18534]
it was so that new memory didn't need to be allocated in this expression, 
and the old didn't need to be garbage collected:

b: reduce b
Steeve
1-Oct-2010
[18535x3]
About the /into refinement.

I don't mind whether a copy is done internally, as long as it's faster 
and memory safe (the block must be freed immediately, not by the GC) 

The GC used to have some memory leaks and was reacting too late to 
be useful (especially in apps with a GUI).
I now it's better now (thanks to recycle/ballast).
* I know
about reduce/into b b

It allows to keep several references of the same block without the 
need to reassign them when the block is reconstructed.

But currently, we can't avoid the block expansion because the data 
are inserted instead of being replaced (which would be more useful 
to my mind).

Then, if internally it's faster to do a copy, I don't care, as long as it's 
not recycled by the GC (for the reasons I mentioned previously).
BrianH
1-Oct-2010
[18538x2]
Sounds interesting. I know that CHANGE/part is the real fundamental 
operation, but that needs an additional parameter and would require 
renaming the option - /into isn't an appropriate name for that operation. 
But I don't think it would have been implemented in that case, and 
having it be a direct CHANGE would definitely not have been accepted 
because /into is mostly used to replace chained INSERT and APPEND 
operations.
The downside of the /into option is that you have to be careful when 
using it or your code won't be task-safe. Your scenario of multiple 
references to a block that can change is an example that can quite 
easily lead to task-unsafe code, for instance. But /into is great 
for micro-optimization and local code simplification, as long as 
you are careful not to modify shared structures without coordinating 
tasks.
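[Editor's sketch] The "several references to the same block" scenario, and the CHANGE/part operation mentioned above, can be sketched as follows. The names `b` and `alias` are illustrative. Note that `reduce b` still builds a temporary block here, so this only keeps the references intact; it does not avoid the copy.

```rebol
b: copy [1 + 1 2 + 2]
alias: b                          ; a second reference to the same block

b: reduce b                       ; reassigning: b now refers to a new block
probe alias                       ; [1 + 1 2 + 2], the alias is left behind

b: alias                          ; point b back at the shared block
change/part b reduce b tail b     ; replace the whole content in place
probe alias                       ; [2 4], every reference sees the result
```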
Steeve
1-Oct-2010
[18540]
yeah indeed