World: r3wp

[!REBOL3-OLD1]

BrianH
1-Sep-2006
[1292x6]
I'm trying out a new control flow technique that is really fun to 
use, and devilishly efficient.
As an example, here is delimit. I'll post conjoin soon.
delimit: func [
    "Put a value between the values in a series."
    data [series!] "The series to delimit"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /copy "Change a copy of the series instead."
] [
    while either copy [
        copy: make data 2 * length? data
        either empty? data [[false]] [[
            copy: insert/only copy pick data 1
            not empty? data: next data
        ]]
    ] [
        copy: data
        [not empty? copy: next copy]
    ] either only [[
        copy: insert/only copy delimiter
    ]] [[
        copy: insert copy delimiter
    ]]
    head copy
]
Note that the options build the while statement without using compose, 
that no blocks are rebound, and that delimiter can be a thunk.
This is an excellent example of the kind of code that would be really 
difficult to compile, but is really fast to interpret. :)
It could be tightened up a little in the empty data case, so I'm 
going to do that and post it again when I post conjoin.
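For readers who don't read REBOL, here is a rough analogue of the idea in Python: the options are checked once, before the loop, and they pick the step that the loop then repeats. It is only a sketch of the technique, not a translation of the function above; the Python name `delimit`, its keyword arguments `only` and `copy`, and the helper `add_delimiter` are assumptions made for illustration.

```python
def delimit(data, delimiter, only=False, copy=False):
    """Put delimiter between the items of the list `data`."""
    # Decide once, outside the loop, how a delimiter gets added.
    if only or not isinstance(delimiter, list):
        def add_delimiter(out):
            out.append(delimiter)      # added as a single value
    else:
        def add_delimiter(out):
            out.extend(delimiter)      # a list delimiter is spliced in

    out = []
    for index, item in enumerate(data):
        if index:                      # no delimiter before the first item
            add_delimiter(out)
        out.append(item)

    if copy:
        return out                     # the caller's list stays untouched
    data[:] = out                      # otherwise change the list in place
    return data


print(delimit([1, 2, 3], 0, copy=True))
print(delimit([1, 2], ["a", "b"], copy=True))
print(delimit([1, 2], ["a", "b"], only=True, copy=True))
```

The loop itself never looks at `only` again, which is the point of building the loop from the options up front.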
JaimeVargas
1-Sep-2006
[1298x4]
I would classify this into obscure Rebol. But it does the job ;)
Some benchmarking should be done to see how much speed is actually 
gained in exchange for the loss of clarity.
BTW, without /copy it is not changing the series directly.
Sorry, bad testing on my part.
BrianH
1-Sep-2006
[1302x6]
Well, it will be a little hard to check against your delimit since 
that will be more comparable to my conjoin. Still, building the while 
statement this way is no slower than calling one of several different 
while statements depending on options, and it is a lot less redundant. 
Plus it's fun.
Here's a slightly faster version:
delimit: func [
    "Put a value between the values in a series."
    data [series!] "The series to delimit"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /copy "Change a copy of the series instead."
] [
    while either copy [
        if empty? data [return make data 0]
        copy: make data 2 * length? data
        [
            copy: insert/only copy first data
            not empty? data: next data
        ]
    ] [
        copy: data
        [not empty? copy: next copy]
    ] pick [
        [copy: insert/only copy delimiter]
        [copy: insert copy delimiter]
    ] only
    head copy
]
And here's a conjoin function:
conjoin: func [
    "Join the values in a block together with a delimiter."
    data [any-block!] "The series to join"
    delimiter "The value to put into the series"
    /only "Inserts a series delimiter as a series."
    /quoted "Puts string values in quotes."
    /local
] [
    if empty? data [return make data 0]

    local: either series? local: first data [copy local] [form local]

    while [not empty? data: next data] either any-string? local [pick [
        [local: insert local reduce [delimiter {"} first data {"}]]
        [local: insert insert local delimiter first data]
    ] quoted] [pick [
        [local: insert insert/only local delimiter first data]
        [local: insert insert local delimiter first data]
    ] only]
    head local
]
For safety, change

    local: either series? local: first data [copy local] [form local]
to

    local: either series? local: first data [copy local] [form :local]
Anton
2-Sep-2006
[1308x3]
Brian, delimit function: 

- For long-term readability, I would avoid reusing 'copy as a variable. 
I suggest 'result, even if it means using another word.

- I understand with the /copy refinement you are able to get more 
speed in creating the result block, but I think I would prefer just 
letting the user copy the data before passing to delimit. This would 
give a simpler implementation, easier to read again.

I don't wish to devalue your effort in getting this version - I did 
a similar thing optimizing conjoin - made it harder to read.
I agree that conjoin should keep the type of series, as rejoin does, 
not just create strings.
Brian, I think Jaime was making the same point, as I do above, about 
speed vs clarity, with regards to /copy. Some benchmarking is needed 
comparing:

- delimit/copy data    ; <--- delimit with /copy refinement implemented
- delimit copy data    ; <--- delimit without /copy refinement
BrianH
5-Sep-2006
[1311]
I was just using the same refinement /copy that bind uses, but I 
agree that its reuse as a local variable isn't very readable. I should 
use /local like my conjoin does. Speaking of conjoin, what do you 
think of the one above? The only speedup I can see to do is to replace 
the insert reduce with four inserts, but otherwise it seems useful.
Anton
5-Sep-2006
[1312x3]
I'll have to check it out more after some sleep...
A function which joins values into a string might be named CONFORM.
Brian, I've read carefully through your conjoin (but haven't tested 
yet), and I like it, except for *one* thing - I would reverse the 
order of the data and delimiter arguments. (Actually, I'm searching 
now for a better word than "delimit". It doesn't quite seem right.)
Anton
6-Sep-2006
[1315x10]
Some synonyms:   abutment, articulation, bond, buttress, *coupling, 
junction, ligament, *pad, *prop, rafter, rib, *seam, splice, *splint, 
strut, *wedge, weld
(* = favourites)
I think my favourite is PAD.
Actually, I don't think the implied REDUCE issue has quite gone away 
for me. Who has never been annoyed by having to write REDUCE in expressions 
such as:

	; Here, 'tab would not reduce to #"^-" unless REDUCE was used.
	switch event/key reduce [tab [print "tab key pressed"] #" " [print "space"]]
How many times have we written:
	select reduce ...
So I'm thinking of readding the /LITERAL refinement and using DO/NEXT 
when it is not used.
Brian, your last version of CONJOIN, a minor problem:

** Script Error: pick expected index argument of type: number logic pair
** Where: conjoin
** Near: pick [
    [local: insert local reduce [delimiter {"} first data {"}]]
    [local: insert insert local delimiter first ...
I think just use either again instead of pick.
Brian, if you are going to use a get-word for safety with the first 
value, ie.  [form :local] , then it's probably consistent to be safe 
with the rest of the data too ? eg. instead of
	insert local first data
use:
	insert local pick data 1
Forget that, FIRST is fine.
; Brian H's version corrected by Anton:
; - LOCAL starts at its tail
; - PICK converted to EITHER (PICK doesn't work with NONE)
; - /QUOTED applied to first value
conjoin: func [
	"Join the values in a block together with a delimiting PAD value."
	data [any-block!] "The series to join"
	pad "The value to put into the series"
	/only "Inserts a series PAD as a series."
	/quoted "Puts string values in quotes."
	/local ; <- used to track tail of the result as we build it
] [
	if empty? data [return make data 0]
	local: tail either series? local: first data [copy local] [form :local]
	; quote the first value
	if all [quoted any-string? local] [local: insert tail insert head local {"} {"}]
	; <- (local should be at its tail at this point)
	while [not empty? data: next data] either any-string? local [
		either quoted [
			[local: insert insert insert insert local pad {"} first data {"}]
		][
			[local: insert insert local pad first data]
		]
	] [
		either only [
			[local: insert insert/only local pad first data]
		][
			[local: insert insert local pad first data]
		]
	]
	head local
]

; test
conjoin [] ""
conjoin [] ","
conjoin [1 2 3] '|
conjoin [[1] 2 3] '|
conjoin ["one" 2 3] ", "
conjoin [["one"] 2 3] '|
conjoin [1 2 [3]] [pad]
conjoin [[1] 2 [3]] [pad]
conjoin/only [[1] 2 [3]] [pad]
conjoin/only [[1] 2 [3]] 'pad

conjoin/quoted [1 2 3] '|
conjoin/quoted [[1] 2 3] '|
conjoin ["one" 2 3] ", "
conjoin [1 2 [3]] [pad]
conjoin/only [1 2 [3]] [pad]
conjoin/only [1 2 [3]] 'pad
; Anton's enhanced version:
; - /quote is applied to first value, if a string
; - reorders PAD and DATA arguments so PAD is first (being likely always short)
; - distinguishes /only and /pad-only
; - renames /quoted -> /quote
conjoin: func [
	"Join the values in a block together with a delimiting PAD value."
	pad "The value to put into the series"
	data [any-block!] "The series to join"
	/only "Inserts a series value in DATA as a series."
	/pad-only "Inserts a series PAD as a series." ; <-- this might not be used much in practice (easy to add extra brackets around PAD)
	/quote "Puts string values in quotes."
	/local ; <- used to track tail of the result as we build it
] [
	if empty? data [return make data 0]
	local: tail either series? local: first data [copy local] [form :local]
	; quote the first value
	if all [quote any-string? local] [local: insert tail insert head local {"} {"}]
	; <- (local should be at its tail at this point)
	while [not empty? data: next data] either any-string? local [
		either quote [
			[local: insert insert insert insert local pad {"} first data {"}]
		][
			[local: insert insert local pad first data]
		]
	] [
		either only [
			either pad-only [
				[local: insert/only insert/only local pad first data]
			][
				[local: insert/only insert local pad first data]
			]
		][
			either pad-only [
				[local: insert insert/only local pad first data]
			][
				[local: insert insert local pad first data]
			]
		]
	]
	head local
]

; test
conjoin "" []
conjoin "," []
conjoin '| [1 2 [3]]
conjoin '| [[1] 2 [3]]
conjoin ", " [{one} 2 [3]]
conjoin '| [["one"] 2 [3]]
conjoin/only '| [["one"] 2 [3]]

conjoin/only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/only [pad] [[1] 2 [3]]

conjoin/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/pad-only [pad] [[1] 2 [3]]

conjoin/only/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/only/pad-only [pad] [[1] 2 [3]]

conjoin/quote "" []
conjoin/quote "," []
conjoin/quote '| [1 2 [3]]
conjoin/quote '| [[1] 2 [3]] ; QUOTE doesn't work in block mode
conjoin/quote ", " [{one} 2 [3]]
conjoin/quote '| [["one"] 2 [3]]
conjoin/quote/only '| [["one"] 2 [3]]

conjoin/quote/only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/only [pad] [[1] 2 [3]]

conjoin/quote/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/pad-only [pad] [[1] 2 [3]]

conjoin/quote/only/pad-only [pad] [1 2 [3]] ; ONLY and PAD-ONLY make no difference in string mode
conjoin/quote/only/pad-only [pad] [[1] 2 [3]]
BrianH
6-Sep-2006
[1325x2]
Anton, I put the data argument first on purpose, to make the function 
fit in with the standard function argument order of series functions 
in REBOL.
I'll look at the rest later. Good catch on the pick.
Anton
7-Sep-2006
[1327x2]
Years ago, I successfully argued to Carl that SWITCH's VALUE argument 
should go before the CASES argument. My reasoning today is the same 
- it is easier to parse visually when the smaller or less frequently 
changing parts of an expression go together. As you can see above, 
all the conjoins with the same PAD argument are easy to see, and 
the more likely to vary DATA blocks begin sometimes at the same horizontal 
position (thus, easier to compare). Just scroll up and compare with 
the tests for your version; look at each line and try to see what 
the differences between them are.

The reasoning that a standard argument order is a good memory guide 
isn't strong enough for me; there is always HELP, and I think the 
particularities of each function are more important when determining 
the order of arguments.
Anyway, I knew I would encounter some resistance to the argument 
order in this version. The argument order is less important than 
all the other features, even though I feel strongly about it. If 
I have to reverse argument order to get it through, I will. (But 
I will try to make you rebut my argument first.) I keenly await your 
analysis of both functions. Maybe there are some cases I haven't 
considered?
Ladislav
7-Sep-2006
[1329]
What are your preferences for: abs -9223372036854775808 (64-bit integer) 
or abs -2147483648x-2147483648 (32-bit integers)? As far as I have found 
out, it looks like Python as well as Java use the C "standard" behaviour, 
which yields the negative numbers.
Anton
7-Sep-2006
[1330]
I think it should be an overflow error.
Ladislav
7-Sep-2006
[1331]
Could somebody confirm my guess about the Python and Java behaviour?
Volker
7-Sep-2006
[1332x2]
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
>>> abs( -9223372036854775808)
9223372036854775808L
whatever "L" is..
Pekr
7-Sep-2006
[1334]
:-)
Volker
7-Sep-2006
[1335x7]
public class TestAbs {
	public static void main(String[] args) {
		System.out.println(Math.abs(-9223372036854775808L));
	}
}
-9223372036854775808
Without a range check, the C behavior is typical of two's complement. 0 is 
"positive", so there is one positive number less. Interesting that 
Python can handle it.
Seems Python uses bignums for long:
>>> abs( -9223372036854775808 ** 2)
85070591730234615865843651857942052864L
http://docs.python.org/lib/typesnumeric.html
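The two results above can be reproduced side by side. This is a minimal sketch in Python, whose integers never overflow, with the 64-bit wraparound simulated by hand; the helper name `wrap_int64` is made up for this illustration.

```python
INT64_MIN = -2**63   # -9223372036854775808


def wrap_int64(n):
    """Reduce n to the signed 64-bit two's-complement range."""
    return (n + 2**63) % 2**64 - 2**63


# Python integers are arbitrary precision, so abs simply gives the
# positive number, even though it no longer fits in 64 bits.
print(abs(INT64_MIN))               # 9223372036854775808

# Forced back into 64 bits, the result wraps around to the input again,
# which matches what the Java program above printed.
print(wrap_int64(abs(INT64_MIN)))   # -9223372036854775808
```

Note that in `abs( -9223372036854775808 ** 2)`, `**` binds tighter than the unary minus, so Python squares the positive number first and negates the result afterwards.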
BTW, I would rethink the name of 'decimal!. To me it means base-10; float 
or something similar would be better for floats, IMHO. That is, if it does 
not break too much, but it should just be a global replace.
Re argument order: To me, a big inline block comes last, vars first. 
Otherwise the standard: the important thing first. With conjoin I am unsure; 
it looks to me as if it rarely has inline data. If I pad things together, 
I usually have a list, 
  conjoin list-of-things  ","

It's not like 'reduce or 'rejoin, where I mix inline data with variables, 
which can span several lines of code.
If I am wrong and it's used like
  conjoin "," ["I" "who writes this" "has more to think about it"]
I am with Anton: small thing first.