Mailing List Archive: 49091 messages

[REBOL] Re: Limitation coming from the "initialize" refinement used with the "Arr

From: joel::neely::fedex::com at: 27-Jun-2002 7:46

Hi, Gerard,

Gerard Cote wrote:
> Hello,
>
> As I tried to generate a dynamic line of code to create a way
> to circumvent the way REBOL interprets the ARRAY notation when
> used with variable indexes instead of numeric constants...
>
Having beat my head against that particular brick wall, may I suggest living with it, instead of circumventing it! ;-) See samples below, which show a much faster (and easier to read IMHO) version. I'll address the initialization issue at the end.
> ...(nothing really hard with REBOL), I found some limitation with
> the initialize refinement as following : It seems only possible
> to use the same constant (a scalar value or any other datatype
> seems to be accepted) as the initial value used by the Array
> word itself.
>
> Here is my example code ( I tried it directly at the console) :
>
> tab_nbr: array/initial [3 2] 0
> L: 2
> C: 1
> tab_nbr/2/1: 10
> print tab_nbr/:L/:C
>
> tab_nbr/:L/:C: 20
>
As you observed, this is not valid REBOL syntax...
> do join join join join "tab_nbr/" L join "/" C ": " 20
>
...and that is entirely too complicated (especially for a language that is supposed to be designed for humans instead of programmers! ;-)

By adding one level of indirection, we can revive the notion of place that allows (almost!) simple syntax again, and does wonders for performance.

Let's build a little function that times the use of the above scheme:

8<----
table0: array/initial [3 2] 0

tally0: func [n [integer!] /local t row col] [
    t: now/time/precise
    table0: array/initial [3 2] 0
    row: col: 1
    loop n [
        ; build the assignment as a string, then LOAD and DO it
        do join join join join join
            "table0/" row "/" col ": " 1 + table0/:row/:col
        row: row // 3 + 1
        col: col // 2 + 1
    ]
    t: now/time/precise - t
    print mold table0
    print t
]
8<----

The little trick with ROW and COL inside the loop just makes them sweep through the collection of paired values

    1,1  2,2  3,1  1,2  2,1  3,2

so that all combinations occur equally frequently (and there's a minimum of index manipulation overhead relative to the actual work of incrementing the array value at the specified row and column, which is what we really want to time).

Notice that all of the occurrences of JOIN can come at the front of that expression, since you're just constructing a string from a collection of values that can be stated one after the other. That said, this cries out for a single REJOIN expression (only the inside of the loop changes):

8<----
tally0a: func [n [integer!] /local t row col] [
    t: now/time/precise
    table0: array/initial [3 2] 0
    row: col: 1
    loop n [
        do rejoin [
            "table0/" row "/" col ": " 1 + table0/:row/:col
        ]
        row: row // 3 + 1
        col: col // 2 + 1
    ]
    t: now/time/precise - t
    print mold table0
    print t
]
8<----

Even so, this requires a great deal of block and string flogging; a new block is constructed per pass through the loop, which is then turned into a string, which is LOADed and then DOne. We can get better performance by flushing all of that string flogging:

8<----
tally0b: func [n [integer!] /local t row col path] [
    t: now/time/precise
    table0: array/initial [3 2] 0
    row: col: 1
    loop n [
        ; build the path as a block, then convert it; no strings involved
        path: compose [table0 (row) (col)]
        do compose [(to-set-path path) 1 + (to-path path)]
        row: row // 3 + 1
        col: col // 2 + 1
    ]
    t: now/time/precise - t
    print mold table0
    print t
]
8<----

Finally (the end is in sight! ;-) let's get back to the suggestion about adding a level of indirection. As you know, we can't alter a number in REBOL, but if we know "where" it is we can get to that place and put a different number "there". Instead of
>> array/initial [3 2] 0
== [[0 0] [0 0] [0 0]]

(which gives us a 3-by-2 structure of integers) let's build the pseudo-array as
>> array/initial [3 2 1] 0
== [[[0] [0]] [[0] [0]] [[0] [0]]]

so that we now have a 3-by-2 structure of *containers* whose content can be accessed and replaced easily. This strategy gives us the following design for our tallying test:

8<----
table1: array/initial [3 2 1] 0

tally1: func [n [integer!] /local t row col] [
    t: now/time/precise
    table1: array/initial [3 2 1] 0
    row: col: 1
    loop n [
        ; table1/:row/:col is a one-element block; CHANGE replaces its content
        change table1/:row/:col 1 + table1/:row/:col/1
        row: row // 3 + 1
        col: col // 2 + 1
    ]
    t: now/time/precise - t
    print mold table1
    print t
]
8<----

Finally, for comparison purposes, here's one that just does the looping and index management, without any "array" manipulation. Its time provides a baseline to be deducted from all of the other versions if we really want fair comparisons:

8<----
tallyx: func [n [integer!] /local t row col] [
    t: now/time/precise
    row: col: 1
    loop n [
        row: row // 3 + 1
        col: col // 2 + 1
    ]
    t: now/time/precise - t
    print mold table1
    print t
]
8<----

The punch line of all this is the times; using N of 120000 we get

    [[20000 20000] [20000 20000] [20000 20000]]

for the TALLY0* series and

    [[[20000] [20000]] [[20000] [20000]] [[20000] [20000]]]

for TALLY1. The times for two runs each are:

    tally0   0:01:03.88  0:01:04.09
    tally0a  0:00:35.26  0:00:35.15
    tally0b  0:00:17.68  0:00:17.52
    tally1   0:00:06.27  0:00:06.09
    tallyx   0:00:02.03  0:00:02.03

Averaging the two times for each, deducting the baseline from TALLYX, and normalizing to TALLY1, we get relative times of:

    tally0   14.9
    tally0a   8.0
    tally0b   3.8
    tally1    1.0

(rounded to 1 fractional digit -- we shouldn't assume more precision with such short run times).

In *very* rough terms, replacing the run of JOINs with a single REJOIN cuts out about half of the time, going to a COMPOSEd block instead of a constructed string cuts about half of the remaining time, and going to an additional level of structure cuts the remaining time to between a third and a fourth.
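
For anyone who wants to check the arithmetic behind those relative times, here is a minimal sketch of the averaging, baseline deduction, and normalization. The timings are the ones reported above, converted to seconds; the names RUNS, BASELINE and UNIT are made up for this illustration and appear nowhere in the original functions.

8<----
; timings in seconds, two runs each, as reported above
runs: [
    tally0  63.88 64.09
    tally0a 35.26 35.15
    tally0b 17.68 17.52
    tally1   6.27  6.09
]

baseline: 2.03    ; tallyx, identical on both runs

; average of the tally1 runs, net of the loop overhead
unit: ((6.27 + 6.09) / 2) - baseline

foreach [name t1 t2] runs [
    ; average, deduct the baseline, normalize to tally1
    print [name (((t1 + t2) / 2) - baseline) / unit]
]
8<----

The printed values are unrounded, so they should only come out close to the one-digit figures quoted above.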
> But now that I can use real variable indexes with my array, am I
> supposed to use loops too just to get any cell value initialized
> with something other than some constant like the series of values:
> 10, 20, 30, 40, 50 and 60.
>
Yes. But it can be done simply, in several ways (keeping with the three-level structure for sake of illustration).

One brute-force approach would be to create the series of numbers and copy chunks into the table:

    rawdata: copy []
    for i 10 60 10 [append/only rawdata reduce [i]]
    table: copy []
    loop 3 [
        append/only table copy/part rawdata 2
        rawdata: skip rawdata 2
    ]
    table

which gives

    == [[[10] [20]] [[30] [40]] [[50] [60]]]

and several variations thereof. But let's take advantage of REBOL's nice management of containers (blocks) instead:

    table: array/initial [3 2 1] 0
    counter: 0
    foreach row table [foreach col row [change col counter: counter + 10]]
    table

which also gives

    == [[[10] [20]] [[30] [40]] [[50] [60]]]

without so much effort.
> Would it not be simpler to have something like this :
>
> tab_nbr: array/initial/series [3 2] 10 60 10 where the start,
> stop and increment values would be respectively 10 60 and 10.
>
Using either of the above strategies, you could write such a thing yourself quite easily. (Notice the redundancy between saying that you have [3 2] for the shape of the structure and saying that the initialization data runs from 10 to 60 in steps of 10. I'll drop that redundancy in the sample.)

    array-series: func [
        dims [block!] lo [integer!] by [integer!]
        /local stripe
    ][
        lo: lo - by
        do stripe: func [blk [block!]] [
            either 1 = length? blk [
                ; a one-element block is a container: fill it
                change blk lo: lo + by
            ][
                foreach subblk blk [stripe subblk]
            ]
            blk
        ] array/initial append copy dims 1 0   ; COPY leaves the caller's DIMS alone
    ]

which behaves as:
>> array-series [3 2] 10 10
== [[[10] [20]] [[30] [40]] [[50] [60]]]
>> array-series [2 3 2] 5 5
== [[[[5] [10]] [[15] [20]] [[25] [30]]] [[[35] [40]] [[45] [50]] [[55] [60]]]]

This is such a special-case initialization, and so easy to write, that I suggest it should be a user-written function, rather than being added to the core language.

Hope this helps!

-jn-

--
; Joel Neely joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ]
{ | e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]