
World: r3wp

[Core] Discuss core issues

Pekr
9-Feb-2012
[2794]
REBOL parse is a gem, a treasure. Me, the coding lamer, did a few things using it. The guys coding C++ first went: meh, well, an interpreter. Then: how is it possible it is faster than a C++ app? Later on they came with new requests: well, you know, you have that parser, we need to do the following stuff ...
ddharing
9-Feb-2012
[2795]
Well said, Oldes.
james_nak
9-Feb-2012
[2796]
Guys, with all this said (and I agree), perhaps this is the one thing that needs to be the focal point for Rebol and eventually the #Not Rebol languages. I know there are some tutorials out there, but do any of them do justice to parse? I keep going back to the Codeconscious one: http://www.codeconscious.com/rebol/parse-tutorial.html and the ones at reboltutorial, but there doesn't seem to be a lot considering how much one can do with it.
Maxim
9-Feb-2012
[2797]
I learnt parse using the 2.3 REBOL Core guide... I thought it did a pretty good job of launching one in the right direction. Parse HAS evolved since then, but for the basic semantics and principles of parsing I think it's pretty good.

You can also look at this tutorial by Nick Antonaccio:
http://musiclessonz.com/rebol_tutorial.html#section-9.3

IIRC Nick has a good sense of tutoring, so it may be a good first step... he also gives links to other parse resources at the end of that part of his (short) tutorial.
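For anyone following along, here is a minimal sketch of a parse rule in the R2 style (the rule and input are made up for illustration, not taken from any of the tutorials):

>> digit: charset "0123456789"
>> parse "2012-02-09" [copy y 4 digit "-" copy m 2 digit "-" copy d 2 digit]
== true
>> y
== "2012"

The rule spells out what each piece of the input must look like, and copy captures the matched sub-strings into words.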
Pekr
9-Feb-2012
[2798]
Max - are you using R2 parse, or R3 enhanced one?
Maxim
9-Feb-2012
[2799]
R2.   


Since we compile just about all the rules from other datasets and simplified user-data, the R3 advantage is much less significant (because we can simulate all the R3 improvements by using R2 idioms, though it's sometimes tricky).


Using R3, it probably would be a few percent faster since some of 
the rules we have would be simpler and those tricks would be managed 
natively by parse rather than by *more* parse rules.
james_nak
9-Feb-2012
[2800]
Thanks Maxim. I appreciate the info.
Maxim
9-Feb-2012
[2801x2]
The problem with R3 right now is that it isn't yet compiled in 64 bits, so we still have the 1.6GB RAM limit per process, which is the biggest issue right now. I have blown that limit a few times already, so it makes things a bit more complex and it doesn't allow me to fully optimize speed by using more pre-generated tables and unfolded state rules.
Our datasets are huge and we optimise for performance by unfolding and indexing a lot of stuff into rules... for example, instead of parsing by a list of words, I parse by a hierarchical tree of characters. It's much faster, since the speed is linear in the length of the word instead of in the number of items in the table, i.e. the typical O*n vs. O*O*n type of scenario. Just switching to parse was already 10 times faster than using hash! tables and using find on them....

In the end, we had a 100-fold speed improvement from before parse to compiled parse datasets. This means going from 30 minutes to less than 20 seconds... but it comes at a huge cost in RAM... a 400MB overhead, to be precise.
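Not Maxim's actual rules, but a toy sketch of the unfolding idea with a made-up word list: instead of one flat block of alternatives, the words are split by their leading character, so parse only walks the branch whose prefix matches.

; flat rule: each alternative is tried in turn, so cost grows with the table size
flat-rule: ["apple" | "apricot" | "banana" | "berry"]

; unfolded by first character: only the matching branch is descended
tree-rule: [
    "a" ["pple" | "pricot"]
    | "b" ["anana" | "erry"]
]

parse "apricot" flat-rule   ; == true, after first trying "apple"
parse "apricot" tree-rule   ; == true, only the "a" branch is walked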
ddharing
9-Feb-2012
[2803x2]
Memory is cheap. It's the 32-bit limit that is the real problem -- 
as you stated.
I'm confused. Why is REBOL limited to 1.6GB? I've seen that myself 
too, but that is nowhere near the 4GB limit.
Maxim
9-Feb-2012
[2805x3]
Yeah... I've got a server that has 64GB of RAM. I want to use it !!! :-)
It's the MS Windows limit: it can only address 1.6GB of memory in 32-bit mode.
It may be higher on Linux; I've never tested it.
ddharing
9-Feb-2012
[2808]
I see. What about Linux?
Maxim
9-Feb-2012
[2809x2]
(BTW, that 1.6GB limit used to be a real problem when I was doing 3D stuff... 3D animation apps are memory hogs, and in some cases we could only work 15 minutes before high-end apps would crash.

Which is a problem when a 3D scene takes 30 minutes to save to disk over the network ;-)
can anyone explain a single use for this R2 path conversion?

>> to-string first [path/item]
== "pathitem"


I know I can use mold... it's just that I wonder why to-string doesn't 
use the molded string equivalent as well?
Oldes
9-Feb-2012
[2811]
funny.. I was thinking about it today as well.. but I don't know
Sunanda
9-Feb-2012
[2812]
Something inconsistent in the way paths are handled:
    to-string load "path/item"
    == "pathitem"
    to-string to-path "path/item"
    == "path/item"
Steeve
9-Feb-2012
[2813]
You can use FORM as well.

And having alternatives should not be something to complain about. 
:)
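For example (assuming standard R2 behaviour; worth re-checking in a console):

>> mold first [path/item]
== "path/item"
>> form first [path/item]
== "path/item"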
Maxim
9-Feb-2012
[2814]
I'm not complaining, I just find absolutely no use-case for the default :-D

My question was: can anyone give me a reason for the current behaviour of to-string?

Can you? ;-P
Ladislav
9-Feb-2012
[2815x2]
What is "O*O*n"?
Use case for default:

to-string [1 "." 2] ; == "1.2"
Maxim
9-Feb-2012
[2817x2]
O*O*n == a typo :-)

I guess I really meant something like O(n*n).

It's the kind of dramatic linear vs. logarithmic scaling difference we see when we unfold our datasets into parse.

But it's not exactly that kind of scaling, since the average topology of the sort tree has a lot of impact on the end result. For example, in my system, when I try to index more than the first 5 characters, the speed gain is so insignificant that the ratio is quickly skewed when compared to the difference the first 3 letters give.

It's 100% related to the actual dataset I use. In some, going past 2 is already almost useless; in others I will have to go beyond 5 for sure. In some other datasets we unfold them using hand-picked algorithms per branch of data, and in others it's a pure, brute-force, huge RAM gobbler.
I always love it when I realize that I write things like this in Rebol:

-*&*&*&*-: "a pretty impossible-to-guess variable name :-)"
Steeve
9-Feb-2012
[2819x2]
Max, I think you're actually comparing O(1) vs. O(n) parsing algorithms (random access vs. linear).

(The indexing part is probably O(n log n), because it involves sorting data, but that should be kept apart from the parsing cost.)

Just wandering around, uhuh.
Anyway, O(n*n) is by far too dramatic ;-)
Pekr
10-Feb-2012
[2821x2]
Where should I put DLLs in order for REBOL to find them? I mean, I have one DLL which is dependent on some other. Even if I put that DLL into the same directory, it complains it can't find it. Win Vista here ...
Or should I register them somehow using regsvr or something like that?
Oldes
10-Feb-2012
[2823]
I don't know how it is on Vista, but on W7 or XP you can place it anywhere... Today I updated my old zlib script to do late initialisation; you can find it here: https://github.com/Oldes/rs/tree/88291b8c720e9026978a080ca40100c3f2fb780f/projects-dll/zlib/latest
Endo
10-Feb-2012
[2824]
Pekr: Registration (regsvr) is required only if they are ActiveX DLLs, but I think they are not, because you cannot use ActiveX DLLs in REBOL.

Normally they should be somewhere in your PATH. Try to see what's happening with the FileMon tool from Sysinternals.
Maxim
10-Feb-2012
[2825]
It also looks in the current dir... but that path will depend on how you launched REBOL.

Use WHAT-DIR just before you try to load your DLL to know where the current dir is at that time, and put your DLL there.

You can also add a path to the user or system PATH environment variable and place the DLL there.
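A minimal sketch of that check in R2 (the DLL name and directory here are hypothetical; load/library needs a build with the /Library component, see below):

>> what-dir
== %/C/projects/my-app/
>> my-lib: load/library %example.dll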
Pekr
11-Feb-2012
[2826x3]
I'll continue here for now, as /library is now a free part of Core, and DLL.SO is not web-public.
My observation is that if there are one or more dependent DLLs, REBOL will load the first one, but then the path somehow doesn't take the present directory into account. Here are a few points:

- you can't do: do %my-dir/my-dll-script.r
- nor can you do so after a change-dir

But it works when you launch REBOL from the directory where those DLLs are present.
There are several various paths in the R2 structure; dunno if it is just a weird R2 implementation, or natural OS-level functionality ...
PeterWood
11-Feb-2012
[2829]
/library is not a free part of Core, only of View.
Geomol
17-Feb-2012
[2830x3]
If datatypes equaled words, like word! = 'word!, then maybe the refinement in type?/word wouldn't be needed? But what are the consequences? The next two examples would return the same:

>> find [integer! 42] integer!
== [42]
>> find [integer! 42] 'integer!
== [integer! 42]

I came to think of this, because I find myself writing things like the following all the time now:

	either find [block! paren!] type?/word value [ ...
and
	switch type?/word value [ ...

If datatypes equaled words, only type? without the refinement would be needed.
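For context, a quick sketch of the current R2 behaviour behind that distinction (type? returns a datatype!, type?/word returns a word!):

>> type? 42
== integer!
>> word? type? 42
== false
>> word? type?/word 42
== true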
I know that today I can write things like

	either find [#[datatype! block!] #[datatype! paren!]] type? value [ ...

but I don't do that, because it has too much syntax for my taste, and therefore isn't very readable.
Maybe the question should be put the other way around: are there cases (in real scripts) where it would be a problem if datatypes equaled words?
Ladislav
17-Feb-2012
[2833]
FYI - datatypes were never words, but they were examples of specific 
datatypes in R1
Geomol
17-Feb-2012
[2834x2]
refinement! is a member of the any-word! typeset together with word!, set-word!, get-word! and lit-word!. My thoughts above lead to asking whether none!, logic! and datatype! should also be part of any-word!. Examples from R2:

>> /ref = 'ref
== true
>> find [/ref] 'ref
== none		; this is strange to me

Maybe all of the following should succeed?

>> find [true] true
== none
>> find [none] none
== none
>> find [integer!] integer!
== none
It's funny that the following succeeds, but for another reason:

>> find [word!] word!
== [word!]
Andreas
17-Feb-2012
[2836x2]
none! and logic! are simply not word types, so it makes no sense 
to have them in the any-word! typeset. none/true/false being words 
conveniently pre-bound to values of the corresponding datatypes does 
not change that.
Note that we also have a literal syntax for none! and logic! values 
now, which makes all your finds succeed even without reducing:

>> find [#[true]] true
== [true]

Etc.
Geomol
17-Feb-2012
[2838x2]
Integers are not decimals, but they're both numbers, and we can compare them:

>> 1 = 1.0
== true

Refinements are not words, but they're both any-words.

Why not let datatypes (and none and logic) be any-words just the same? If the benefit from doing so is big, then it could be a good idea.
w> /refinement = 'refinement
== true

The question is whether the following would lead to a disaster:

w> integer! = 'integer!
== true
Andreas
17-Feb-2012
[2840]
In any case, I wouldn't conflate that question with the question 
of #[true] = 'true.
Gregg
17-Feb-2012
[2841]
There is a big difference between having datatypes be word values and having them fall under the any-word pseudotype. The latter seems OK, but not the former. If I understand you, it would cause things like [datatype? integer!] to fail, because type? would return word!. That is, we lose them as an identifiable datatype. I use them for reflective tools and dialects. While the change wouldn't make that impossible, I like them being first-class citizens as they are today.
Geomol
17-Feb-2012
[2842x2]
No, let me clarify. I want integer! to represent a datatype, like 1 represents an integer. So datatype? integer! should return true, and word? integer! should return false, just like decimal? 1 returns false.

I simply suggest that equal? return true when comparing a datatype with a word of the same spelling, just like this is true:

>> equal? 1 1.0
== true
Technically speaking, it's an expanded coercion for the equality operator, = (and so also for the equal? function).