World: r3wp

[Rebol School] Rebol School

Steeve
3-Jan-2009
[1212]
(no need for to integer!)
BrianH
3-Jan-2009
[1213x9]
I think the original, functional-style FOLD would be more useful 
for REBOL, as the REBOL-style one has been covered by foreach. Perhaps 
it should be done as a function that takes a block and words and 
returns a function though. I am not familiar with the Haskell equivalent, 
but I'd be surprised if it doesn't return a function.
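For reference, a minimal sketch of what a functional-style left fold could look like in REBOL 2. The name fold, its argument order and its doc strings are assumptions for illustration only; this is not the DevBase version, and unlike the suggestion above it takes a function rather than returning one.

```rebol
; Hypothetical left fold: the name and argument order are assumptions.
fold: func [
    "Combines all values of a series into one value, left to right."
    f [any-function!] "Function of two arguments: accumulator and item"
    acc "Initial accumulator value"
    data [series!] "The series to traverse"
] [
    ; Feed the running result and each item back into f.
    foreach item data [acc: f :acc :item]
    :acc
]

print fold :add 0 [1 2 3 4]    ; prints 10
```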
Here's a version of MAP with the /into semantics and no variable 
capture problems. I had to add a local function, though it is only 
created once per call. Binding issues prevent the use of a literal 
function. While I was at it I added unset! handling like the R3 version.

map: func [
    [throw]

    "Evaluates a block for each value(s) in a series and returns them as a block."

    'word [word! block!] "Word or block of words to set each time (local)"
    data [block!] "The series to traverse"
    body [block!] "Block to evaluate each time"
    /into "Collect into a given series, rather than a new block"
    output [series!] "The series to output to"
] [

    unless into [
        output: make block! either word? word [length? data] [
            divide length? data length? word
        ]
    ]
    foreach :word data reduce [func [val [any-type!]] [
        if value? 'val [output: insert output :val]
    ] to paren! body]
    either into [:output] [head :output]
]
Note that the tail is returned with /into, for chaining purposes.
I forgot an /only though - the insert is supposed to be insert/only 
for the right behavior.
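A short illustration of why /only matters here, using plain INSERT outside of MAP (the word out is just a scratch variable):

```rebol
out: copy []
insert out [1 2]          ; without /only the block's contents are spliced in
probe out                 ; prints [1 2]

out: copy []
insert/only out [1 2]     ; with /only the block is inserted as a single value
probe out                 ; prints [[1 2]]
```

Without /only, a body that evaluates to a block would have its elements merged into the output instead of being collected as one result per iteration.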
I forget /only more than I would like :(
Note that the copy performed by the FOREACH is a bind/copy, not a 
copy/deep. This means that only any-words, blocks and parens are 
copied, all other types are just referenced, including the function.
I think path! types are also copied by bind/copy.
Definitely not list! types.
Janko, the REBOL equivalent of Haskell's filter is REMOVE-EACH.
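A small sketch of the difference in polarity: a filter keeps the values for which the predicate is true, while REMOVE-EACH removes them, and it modifies the series in place. The word data is just an example.

```rebol
data: copy [1 2 3 4 5 6]
remove-each n data [odd? n]    ; removes every value for which the body is true
probe data                     ; prints [2 4 6]
```

To keep the values matching a predicate, as a filter would, negate the test in the body, and copy the series first if the original is still needed.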
Janko
3-Jan-2009
[1222]
I read from where I left off yesterday. Very interesting chat. BrianH, 
thanks for the info on remove-each.
BrianH
3-Jan-2009
[1223]
I added the last MAP with the insert/only fix to DevBase. We'll see 
if it gets accepted. All we're missing now is fold.
Janko
3-Jan-2009
[1224]
great! :)
Steeve
3-Jan-2009
[1225x2]
Just one thing, Brian...

I don't like how map evolved. It lost its simplicity and inner speed.
Some gains (like either vs to-block) have been overrated.

Some other changes that bring a major speed regression have been underrated.


I prefer throwing an error during initialisation (i.e. if find 
word 'output) instead of using the trick of the embedded built 
function.
Perhaps we should have 2 distinct foreach blocks or 2 distinct functions 
(map and map-into).
BrianH
4-Jan-2009
[1227x4]
We would have had to add the function anyway, for the unset! 
screening we need for compatibility. And using the word 'output is 
not an error, so treating it as an error is inappropriate.
This was a backport mezzanine, and for backports the compatibility 
with R3 is key. You have to do some extra work when writing mezzanine 
functions that you don't have to do with one-off functions.
The /into option only ended up adding the overhead of 2 compares 
(the unless and either, no overhead in the foreach), and one head 
when taken. The rest of the added code either reduced overhead (preallocation 
of the output), dealt with errors (like treating any word as an error), 
or handled compatibility fixes (the if value? stuff).
What is the major speed regression?
Gregg
4-Jan-2009
[1231]
I don't think it's good that /INTO changes the result to return the 
tail.
BrianH
4-Jan-2009
[1232x5]
I am still debating that. It doesn't return the tail, actually, it 
returns the point after the insert, which if not the tail you otherwise 
can't know from the outside. The /into option is meant to be usable 
for chaining. You already have the point of insertion in the form 
of the reference you passed. You can get the tail and head easily. 
The only bit of info you can't get is the position after the insertion.
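To make the "position after the insert" concrete, here is how plain INSERT behaves. The words buf, pos and out are scratch variables for this sketch.

```rebol
buf: copy "ad"
pos: insert next buf "bc"    ; insert at the second position
probe head pos               ; prints "abcd"
probe pos                    ; prints "d", the position just after the insert
probe index? pos             ; prints 4

; Because the returned position is where the next insert should go,
; calls can be chained:
out: copy ""
insert insert insert out "a" "b" "c"
probe out                    ; prints "abc"
```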
The /into option is not just for MAP. It is intended to be added 
to a lot of series generating functions. The point of it is to allow 
those functions to optionally use INSERT semantics, so that we can 
make it easier to use buffers in a lot of places for memory savings. 
It's part of an overall strategy to reduce the memory overhead of 
REBOL code. INSERT semantics were chosen because INSERT is the most 
basic function here. CHANGE and APPEND can be implemented with INSERT 
(and REMOVE, HEAD and TAIL), but not so easily the other way.
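A rough sketch of that claim, with simplified stand-ins built only from INSERT, REMOVE, HEAD and TAIL. The names my-append and my-change are hypothetical, they ignore the refinements of the real functions, and my-change only accepts a series as the new value.

```rebol
; Hypothetical stand-in for APPEND: insert at the tail, return the head.
my-append: func [series [series!] value] [
    head insert tail series :value
]

; Hypothetical stand-in for CHANGE: remove as many items as the new
; value holds, then insert it; returns the position after the change.
my-change: func [series [series!] value [series!]] [
    insert remove/part series length? value value
]

probe my-append copy [1 2] 3    ; prints [1 2 3]

s: copy "abcd"
my-change next s "XY"
probe s                         ; prints "aXYd"
```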
If you are going to retrofit one series function onto all of the 
series generation functions, INSERT is the most general.
You would tend to use FOO/into with a different usage pattern than 
FOO.
The real benefits come from adding the /into option to REDUCE, COMPOSE, 
MOLD and FORM. We might add LOAD/into as well, but it remains to 
be seen whether it provides enough benefit there.
Gregg
4-Jan-2009
[1237x2]
I understand. I just think it's bad design to have the same function 
return a different location in the series based on a refinement. 
Whether they return the head, tail, or point of change, consistency 
is the thing.
Unless, of course, the refinement is called /TAIL or something. :-)
BrianH
4-Jan-2009
[1239x4]
It doesn't return the tail.
Strangely enough, the consistency here will be consistency of the 
/into option across multiple functions.
The /into option follows INSERT semantics. It doesn't return the 
tail, it returns the point after the insert, for chaining.
You already have the start of the insert, and you can get the head 
and tail.
Gregg
4-Jan-2009
[1243]
Again, I understand. And it may be that having /INTO available everywhere--with 
consistent behavior--will not be a problem. Carl's call.
BrianH
4-Jan-2009
[1244]
Carl called for this already - that's why I'm doing it :)
Gregg
4-Jan-2009
[1245]
My point is that it's often hard to remember subtle details like 
this, and it can screw you up.
BrianH
4-Jan-2009
[1246x2]
Well, if it works consistently for a dozen functions, you'll find 
it easier to remember :)
And then you can just say "use /into" and you will know how it works, 
regardless of the function.
Graham
4-Jan-2009
[1248x2]
Looks like this group has been hijacked!
Ok, I am going to ask my question too.   I have to run a report where 
I collect data from a number of different functions.  Each of the 
functions runs asynchronously.  So, one might return data before 
another.  Not that the order matters.  But the user can select from 
1 to say 6 data functions/sources.

Now since these functions are async, I have to use callbacks to deal 
with the data once it arrives.
What would be the best way of programming this?  

At the end of this, I then need to do something with the collected 
data, i.e. generate a graph.
Steeve
4-Jan-2009
[1250]
Brian, your last function is 2 times slower (on big series) just to 
avoid the collision with the 'result var and because you don't want 
2 distinct foreach blocks (one for /into, one for default).

You also forgot to preallocate space when using the /into option, 
so your implementation is inconsistent with your initial statement 
about the need to avoid memory overhead.

Actually, these are your choices (not the best ones, to my mind), so I 
don't follow you. 

Finally, I don't see the point of not having the fastest implementation 
of map in R2 just to have strict compatibility with R3.
Graham
4-Jan-2009
[1251x2]
Each of those functions I am referring to has their own parameter 
list.
I could construct a block of functions, a block of parameters, and 
a data variable, and feed them to the first function, gradually 
removing them from the blocks and passing them in the callbacks?
Steeve
4-Jan-2009
[1253]
Graham, your functions could append their results to a global result 
block, so that you just need to loop and wait on that one?
Graham
4-Jan-2009
[1254x4]
What happens if the functions fail?  I could end up waiting for a 
while ....
Ordinarily if I had a couple such functions, I would just call one 
in the callback of the other.
It's just that here I don't know in advance how many functions I 
need to call.
OTOH, I have done what you have suggested before .. which is basically 
turning an async function into a synchronous one by waiting till 
it finishes by use of a flag of some type.
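One way to avoid both the chained callbacks and the busy wait is a pending counter: every source reports exactly once, and the last one to report triggers the final step. The sketch below assumes REBOL 2 and simulates the sources synchronously. All names (results, pending, on-result, make-graph, run-report) are hypothetical, and it assumes each source takes a callback and calls it once, with none on failure.

```rebol
; Hypothetical names throughout; the sources are simulated synchronously.
results: copy []
pending: 0

; Stand-in for the final step (e.g. generating the graph).
make-graph: func [data [block!]] [
    print ["all sources done:" mold data]
]

; Each source calls this once, with its data or with none on failure.
on-result: func [value] [
    append/only results :value
    pending: pending - 1
    if zero? pending [make-graph results]
]

run-report: func [sources [block!] "Block of functions taking a callback"] [
    clear results
    pending: length? sources       ; set before any source can call back
    foreach src sources [src :on-result]
]

run-report reduce [
    func [callback] [callback [1 2 3]]
    func [callback] [callback none]    ; a failed source still reports
]
```

This works for any number of selected sources, since the count is taken from the block at run time. A source that never calls back would still stall the report, so a real version would also need a timeout.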
BrianH
4-Jan-2009
[1258]
Steeve, the function was not added for the /into option, it was 
added for unset! skipping, which needs to be done all the time.
Steeve
4-Jan-2009
[1259]
The result stack could receive some messages too, to signal the current 
status (your flag). You could put errors on the result stack too.
BrianH
4-Jan-2009
[1260]
Good idea about preallocating for the /into option though.
Steeve
4-Jan-2009
[1261]
Brian, does the R3 map function deal with unset! values too?