World: r3wp

Join the discussions in the REBOL3 world...

[Core] Discuss core issues

Maxim 7-Feb-2012 [2783x4]	I've been working a lot lately, and haven't had a lot of spare time. I'm actually working with REBOL full time at a company which is using it to get a significant competitive advantage over the competition.
	(eek that was redundant, sorry ;-)
	I think people don't realize just how much power lies in parse. Even I'm impressed with it right now. I've been doing tests with really crazy stuff like two-cursor parse rules and run-time auto-recompilation of 400MB parse rules. I've been doing things like parsing 100MB word documents and pushing the interpreter to the limit ... reaching the 32-bit 1.6 GB RAM limit, 6 hour loop tests, etc. :-)
	in my last test I was doing natural language extraction of concepts at a rate of 25000 words a second within multi-megabyte text files. :-)
GrahamC 8-Feb-2012 [2787]	Who is this guy?
Maxim 8-Feb-2012 [2788]	hum... either that was sarcasm or you mean, what is the company I now work for?
GrahamC 8-Feb-2012 [2789]	Not sarcasm .. just humor :)
Maxim 8-Feb-2012 [2790]	hehe, I wasn't sure ;-)
james_nak 8-Feb-2012 [2791]	That's incredible Maxim. Good work. With what you do with parse, is the knowledge available online in the form of the present parse documentation, or did you have to discover new techniques? I have to admit I just barely use it when I need to. Anyway, thanks for sharing your experience.
Maxim 8-Feb-2012 [2792]	learning parse requires baby steps and at some point, the decision to solve a real problem with it and force yourself to learn it. I didn't use parse for almost a decade until I started using it more and more, to the point that currently I do more parse than any other coding in REBOL (but that's just because it's ideally suited for this). some little tricks accumulate with experience and eventually, we discover pretty wacky things, which allow us to use parse almost like a VM.
Oldes 8-Feb-2012 [2793]	Parse is REBOL's heart... I cannot imagine living without it.
Pekr 9-Feb-2012 [2794]	REBOL parse is a gem, a treasure to follow. Me, the coding lamer, did a few things using it. Guys coding C++ first came with meh, well, interpreter. Then - how is it possible it is faster than a C++ app? Later on, they came with new requests asking - well, you know, you have that parser, we need to do the following stuff ...
ddharing 9-Feb-2012 [2795]	Well said, Oldes.
james_nak 9-Feb-2012 [2796]	Guys, with all this said (and I agree), perhaps this is the one thing that needs to be the focal point for Rebol and eventually the #Not Rebol languages. I know there are some tutorials out there but do any of them do justice to parse? I keep going back to the Codeconscious one: http://www.codeconscious.com/rebol/parse-tutorial.html and the ones at reboltutorial, but there doesn't seem to be a lot considering how much one can do with it.
Maxim 9-Feb-2012 [2797]	I learnt parse using the 2.3 rebol core guide... I thought it did a pretty good job of launching one in the right direction. parse HAS evolved since then, but for the basic semantics and principles of parsing I think it's pretty good. you can also look at this tutorial by Nick Antonaccio: http://musiclessonz.com/rebol_tutorial.html#section-9.3 IIRC nick has a good sense of tutoring, so it may be a good first step... he also gives links to other parse resources at the end of that part of his (short) tutorial
Pekr 9-Feb-2012 [2798]	Max - are you using R2 parse, or R3 enhanced one?
Maxim 9-Feb-2012 [2799]	R2. since we compile just about all the rules from other datasets and simplified user-data, the R3 advantage is much less significant (because we can simulate all the R3 improvements by using R2 idioms, though it's sometimes tricky). Using R3, it probably would be a few percent faster since some of the rules we have would be simpler and those tricks would be managed natively by parse rather than by more parse rules.
james_nak 9-Feb-2012 [2800]	Thanks Maxim. I appreciate the info.
Maxim 9-Feb-2012 [2801x2]	The problem with R3 right now is that it isn't yet compiled in 64 bits, so we still have the 1.6GB RAM limit for a process, which is the biggest issue right now. I have blown that limit a few times already, so it makes things a bit more complex and it doesn't allow me to fully optimize speed by using more pre-generated tables and unfolded state rules.
	Our datasets are huge and we optimise for performance by unfolding and indexing a lot of stuff into rules... for example instead of parsing by a list of words, I parse by a hierarchical tree of characters. it's much faster since the speed is linear to the length of the word instead of to the number of items in the table. i.e. the typical On vs. OO*n type of scenario. just switching to parse already was 10 times faster than using hash! tables and using find on them.... In the end, we had a 100 times speed improvement from before parse to compiled parse datasets. this means going from 30 minutes to less than 20 seconds.... but this comes at a huge cost in RAM... a 400MB overhead to be precise.
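A minimal sketch of the "tree of characters" idea described above, assuming R2 and a made-up three-word vocabulary. The names flat-rule and tree-rule are hypothetical and not taken from Maxim's code, and real rules would be generated from the dataset rather than written by hand. The flat rule tries each word in turn, while the unfolded rule matches a shared prefix once and then branches on the next character:

```rebol
REBOL []

; Flat alternatives: parse tries each word in order until one matches.
flat-rule: ["cat" | "car" | "dog"]

; The same (hypothetical) vocabulary unfolded into a tree of characters.
; The shared prefix "ca" is matched once, then the rule branches.
tree-rule: [
    #"c" #"a" [#"t" | #"r"]
    | #"d" #"o" #"g"
]

; Print each test word with the result of both rules.
foreach word ["cat" "car" "dog" "cab"] [
    print [word  parse/all word flat-rule  parse/all word tree-rule]
]
```

Both rules should accept the three vocabulary words and reject "cab". The difference is only in how much work is done per word: the cost of the tree form grows with the length of the word, while the cost of the flat form grows with the number of alternatives.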
ddharing 9-Feb-2012 [2803x2]	Memory is cheap. It's the 32-bit limit that is the real problem -- as you stated.
	I'm confused. Why is REBOL limited to 1.6GB? I've seen that myself too, but that is nowhere near the 4GB limit.
Maxim 9-Feb-2012 [2805x3]	yeah... I've got a server that has 64GB of RAM I want to use it !!! :-)
	it's the MS Windows limit. it can only address 1.6GB of memory in 32-bit mode.
	it may be higher on Linux, I've never tested it.
ddharing 9-Feb-2012 [2808]	I see. What about Linux?
Maxim 9-Feb-2012 [2809x2]	(btw that 1.6GB limit used to be a real problem when I was doing 3D stuff... 3D animation apps are memory hogs, and in some cases, we could only work 15 minutes before high-end apps would crash. which is a problem when a 3D scene takes 30 minutes to save to disk over the network ;-)
	can anyone explain a single use for this R2 path conversion? >> to-string first [path/item] == "pathitem" I know I can use mold... it's just that I wonder why to-string doesn't use the molded string equivalent as well?
Oldes 9-Feb-2012 [2811]	funny.. I was thinking about it today as well.. but I don't know
Sunanda 9-Feb-2012 [2812]	Something inconsistent in the way paths are handled: to-string load "path/item" == "pathitem" to-string to-path "path/item" == "path/item"
Steeve 9-Feb-2012 [2813]	You can use FORM as well. And having alternatives should not be something to complain about. :)
Maxim 9-Feb-2012 [2814]	I'm not complaining, I just find absolutely no use-case for the default :-D my question was can anyone give me a reason for the current use of to-string? can you? ;-P
Ladislav 9-Feb-2012 [2815x2]	What is "OOn"?
	Use case for default: to-string [1 "." 2] ; == "1.2"
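The conversions discussed above can be compared side by side. This is a sketch for the R2 console, using the results already quoted in the thread plus FORM and MOLD as suggested by Steeve and Maxim. The variable name p is only for illustration, and R3 may behave differently:

```rebol
REBOL []

p: first [path/item]

print to-string p           ; R2 joins the path elements: pathitem
print form p                ; keeps the separator: path/item
print mold p                ; keeps the separator: path/item
print to-string [1 "." 2]   ; same joining applied to a block: 1.2
```

Seen this way, the path result looks like the block behaviour carried over to another block-like series, which is where the separator gets lost.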
Maxim 9-Feb-2012 [2817x2]	OOn == a typo :-) I guess I really meant something like O(n*n). It's the kind of dramatic linear vs logarithmic scaling difference when we unfold our datasets into parse. but it's not exactly that kind of scaling, since the average topology of the sort tree will have a lot of impact on the end-result. for example in my system, when I try to index more than the first 5 characters, the speed gain is so insignificant that the ratio is quickly skewed, when compared to the difference which the first 3 letters give. It's 100% related to the actual dataset I use. in some, going past 2 is already almost useless, in others I will have to go beyond 5 for sure. in some other datasets we unfold them using hand-picked algorithms per branch of data, and in others it's a pure, brute-force, huge RAM gobbler.
	I always love when I realize that I write things like this in Rebol: -&&&-: "a pretty impossible to guess variable name :-)"
Steeve 9-Feb-2012 [2819x2]	Max, although I think you're comparing O(1) vs O(n) parsing algorithms (random access vs linear) (The indexing part is probably meant to be O(n.log n) because it involves sorting data, but should be taken apart from the parsing cost) just wandering around, uhuh
	Anyway O(n*n) is by far too dramatic ;-)
Pekr 10-Feb-2012 [2821x2]	where should I put DLLs, in order for REBOL to find them? I mean - I have one DLL, which is dependent on some other. Even if I put that DLL into the same directory, it complains it can't find it. Win Vista here ...
	or should I register them somehow using regsvr or something like that?
Oldes 10-Feb-2012 [2823]	I don't know how it is on Vista, but on W7 or XP you can place it anywhere... Today I updated my old zlib script to do late initialisation, you can find it here: https://github.com/Oldes/rs/tree/88291b8c720e9026978a080ca40100c3f2fb780f/projects-dll/zlib/latest
Endo 10-Feb-2012 [2824]	Pekr: Registration (regsvr) is required only if they are ActiveX DLLs, but I think they are not because you cannot use ActiveX DLLs in REBOL. Normally they should be somewhere in your PATH. Try to see what's happening with FileMon tool from Systeminternals.com.
Maxim 10-Feb-2012 [2825]	it also looks in the current-dir... but that path will depend on how you launched rebol. use WHAT-DIR just before you try to load your dll to know where the current-dir is at that time and put your dll there. you can also add a path in the user or system path environment and place the dll there.
Pekr 11-Feb-2012 [2826x3]	I'll continue here for now, as /library is now a free part of Core, and DLL.SO is not web-public.
	My observation is that if there are one or more dependent DLLs, REBOL will load the first one, but then the path is somehow not taking the present directory into account. Here are a few points: - you can't do: do %my-dir/my-dll-script.r - nor can you do so after: change-dir But it works when you launch REBOL from the directory where those DLLs are present.
	There are several different paths in the R2 structure, dunno if it is just a weird R2 implementation, or natural OS-level functionality ...
PeterWood 11-Feb-2012 [2829]	/library is not a free part of Core, only of View.
Geomol 17-Feb-2012 [2830x3]	If datatypes equal words, like word! = 'word!, then maybe the refinement in type?/word isn't needed? But what are the consequences? The next two examples would return the same: >> find [integer! 42] integer! == [42] >> find [integer! 42] 'integer! == [integer! 42] I came to think of this, because I find myself writing things like the following all the time now: either find [block! paren!] type?/word value [ ... and switch type?/word value [ ... If datatypes equal words, only type? without the refinement would be needed.
	I know that today I can write things like either find [#[datatype! block!] #[datatype! paren!]] type? value [ ... but I don't do that, because it has too much syntax for my taste, and therefore isn't very readable.
	Maybe the question should be put the other way around: Are there cases (in real scripts) where it would be a problem if datatypes equal words?
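For reference, the two idioms Geomol mentions look roughly like this when written out in full. A small sketch, assuming R2; the function name describe and the sample values are hypothetical:

```rebol
REBOL []

describe: func [value] [
    ; type?/word returns the word block! rather than the datatype itself,
    ; so it can be compared against plain words stored in a block.
    either find [block! paren!] type?/word value [
        switch type?/word value [
            block! [print "a block"]
            paren! [print "a paren"]
        ]
    ][
        print "something else"
    ]
]

describe [1 2 3]
describe first [(1 + 2)]
describe 42
```

Without the /word refinement, the lookup block would have to hold datatype values instead of words, which is the extra syntax Geomol is trying to avoid.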