r3wp [groups: 83 posts: 189283]

World: r3wp

[Dialects] Questions about how to create dialects

Steeve
27-Jun-2010
[575]
Can't you use single-character names for those?
OR AND NOT = | & !
Fork
27-Jun-2010
[576x9]
Empirically those are not good candidates for one character optimization.
i (if-true-mu) and e (either-true-mu), which can be remapped but 
are by default mapped to their equivalents "IT" and "ET" (there's 
also WT for while-true) are pretty good one-character ones, but even 
those can be questionable.  But they are reasonable defaults.  The 
point of one-character-choices in Rebmu is to anticipate the need 
for flexibility for redefining them, to use as variables or whatever.
Originally I had "r" and "w" for "readin-mu" and "writeout-mu" but 
found that w for while was more frequent than the number of writes, 
and in fact I also found that my writeout was not as useful as print's 
default behavior, so I remapped w.  These one-character decisions 
are in flux as I look at how some of my ideas are panning out in 
real solutions.
This "mushing" I've come up with is delicate, because of course I 
can't break Rebol's parser... so I have to work within it, and the 
language is beholden to what forms can be naturally combined without 
spaces vs. those that require them.
Anyway, the various crack-smoking mods I'm doing (such as if-true-mu 
being able to take constants in the code block and evaluate to those 
constants without the brackets) are intended to be minor things... 
something one could quickly unlearn in Rebol programming and say 
"Oh, right, you need a block around that."  Increasingly I'm backing 
off anything which is overly focused on character optimization at 
the price of creating a pattern that is too far off from a reasonable 
Rebol programming practice.
That applies, for instance, to not Huffman-encoding the names for the 
sheer sake of saving characters.  The abbreviation has to line up 
with the Rebol word in some vaguely reasonable way.
Another issue of using single character symbolic things for operators 
is that because they lack a "case", they don't play well with mushing.
[A&b] unmushes to [a: & b], [a&b] unmushes to [a&b], [a&B] 
unmushes to [a& b], and [A&B] unmushes to [a&b:].
Consequently, you're better off with a two character A~ for AND~, 
because you'd end up having to throw in spaces on the left and right 
of the & all the time otherwise, which gets you three characters, 
and you'd probably have to throw one on the left half the rest of 
the time.
Is there a "weirdest dialect" award I'm going to win with this?  
:)
Steeve
27-Jun-2010
[585]
Well, you could prevent non-alphabetic characters from being part of any 
word. It would make sense.
Fork
27-Jun-2010
[586]
It doesn't make sense if trying to retain compatibility with lowercase 
Rebol code.  I want to be able to paste "to-string" in the middle 
of a Rebmu program and have it "just work"
Steeve
27-Jun-2010
[587]
hum ok
Fork
27-Jun-2010
[588x4]
Again, that is the baby in this bathwater.
To superset Rebol when coded natively, in lowercase.
I'm glad you tackled the connecting dots problem, though I don't 
think your solution worked... but it was in the spirit of my desire 
to see Rebol step up to these challenges and prove itself.  Because 
if it can win at anything, it can win at this.
And like I say about my Hourglass solution vs. the "winning" golfscript 
solution... it broke the rules, it has no reusable parts, it exploits 
symmetries and gimmicks in the problem such that if the problem specification 
changed only slightly you'd have to rewrite the whole thing.
Steeve
27-Jun-2010
[592]
Well, I saw the remaining problems in my code; correcting them would 
not lengthen the code much.
Fork
27-Jun-2010
[593]
In solving it I found that like with many code golf problems there 
are some little "nuances" which emerge as you look at it...
Steeve
27-Jun-2010
[594x2]
Bit I don't think we can beat the shortest solution
*But
Fork
27-Jun-2010
[596x2]
Specifically, there are cases when lower numbered dots can connect 
and cross over the second digit of a larger numbered dot
That forces you to basically build a map of the numbers before you 
start drawing the lines :(
Steeve
27-Jun-2010
[598]
Yes I finally understood that point
Fork
27-Jun-2010
[599x2]
A lot of these problems have gotchas like that where you think "oh, 
easy I'll just do it like this..."
Then you get there and realize "oh crap" and suddenly you've added 
like 50 characters
Steeve
27-Jun-2010
[601]
:-)
Fork
27-Jun-2010
[602]
Did you go through my solution at all?
Steeve
27-Jun-2010
[603]
Sorry, not at all. If you could decipher your code for me...
Fork
27-Jun-2010
[604x16]
It's simple at heart.
Well, if what BrianH says is true I may need to write "unrebmu" sooner 
rather than later
So this line here: "w [j: d ++ n] [ro g [x j y j]]" is the building 
of the initial coordinate map.  It keeps incrementing n and passing 
it to the "d" function, which returns either the coordinate pair where 
that digit resides or none if the digit could not be found.
while [j: d (++ n)] [repend/only g [x (j) y (j)]]
Er, no, the d function returns the series position
And it is the x and y functions which translate those into coordinates.
So the issue we discussed about needing to build the map ahead of 
time is taken care of right there.  After this step, G is an ordered 
series of two-element blocks... each with two integers, the coordinate 
pair of where that dot is.
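For anyone who does not read Rebol, here is a rough Python restatement of that map-building step. It is only a sketch under assumptions: the grid is a made-up fixed-width string, the dot labels are single digits, and find_dot, col and row are hypothetical stand-ins for the d, x and y functions, not the actual Rebmu definitions.

```python
# Hypothetical puzzle input: a 5-column grid flattened into one string.
WIDTH = 5
GRID = (
    "1...2"
    "....."
    "..3.."
)

def find_dot(n):
    """Stand-in for `d`: index of dot n's label in GRID, or None if absent."""
    pos = GRID.find(str(n))
    return pos if pos >= 0 else None

def col(pos):
    """Stand-in for `x`: zero-based column of a string index."""
    return pos % WIDTH

def row(pos):
    """Stand-in for `y`: zero-based row of a string index."""
    return pos // WIDTH

def build_map():
    """Collect [x, y] pairs for dots 1, 2, 3, ... until a label is missing."""
    pairs = []
    n = 1
    while True:
        pos = find_dot(n)
        if pos is None:
            break
        pairs.append([col(pos), row(pos)])
        n += 1
    return pairs

G = build_map()
```

With this sample grid, G holds one [x, y] pair per dot, in dot order. Multi-digit labels would need a smarter lookup than a plain substring search, which this sketch does not attempt.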
Then clearly we loop through G.  On each loop iteration, B is the 
coordinates of the start of our pen and F is the location of the 
finish of our pen.
We keep bumping B until we reach F.  How much do we bump the X coordinate 
of B and how much do we bump the Y coordinate of B on each iteration 
step?  That is stored in H and V, each of which is in the set {-1, 0, 1}.
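The bumping can be sketched in Python like this. It is an illustration, not the Rebmu code: the function names are made up, and it assumes every segment is horizontal, vertical or an exact 45-degree diagonal, so that a fixed (h, v) bump really does land on F. It also assumes only the cells strictly between the two dots get drawn.

```python
def sign(d):
    """Return -1, 0 or 1 according to the sign of d."""
    return (d > 0) - (d < 0)

def walk(b, f):
    """List the cells strictly between start b and finish f.

    b and f are (x, y) pairs. The per-step bumps h and v are each -1, 0 or 1.
    """
    dx, dy = f[0] - b[0], f[1] - b[1]
    if dx and dy and abs(dx) != abs(dy):
        # Assumption of this sketch: a fixed bump must be able to reach f.
        raise ValueError("segment is not horizontal, vertical or 45-degree")
    h, v = sign(dx), sign(dy)
    x, y = b
    cells = []
    while True:
        x, y = x + h, y + v          # bump the pen one step
        if (x, y) == (f[0], f[1]):   # reached the finish dot, stop
            break
        cells.append((x, y))
    return cells
```

For example, walk((0, 0), (3, 0)) visits (1, 0) and (2, 0), and walk((4, 0), (2, 2)) visits only (3, 1).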
; string containing the (l)etters used for walls
l: {-|\/}
Which letter we use to draw the wall depends on h and v, right?  
If our vertical bump is zero, we know we want the first element. 
If our horizontal bump is zero we know we want the second.  If the 
bump is [1 1] or [-1 -1] we know we want the "\" (third element), 
otherwise the fourth ("/").  This line is really rather boring:
ch c b pc l ez v 1 [ez h 2 [ee h v 3 4]]
change c b pick L either-zero v 1 [either-zero h 2 [either-equal h v 3 4]]
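The same selection, restated as a small Python sketch (wall_char is a made-up name; Python indexes from 0 where Rebol's pick counts from 1). With the string {-|\/}, position 3 is the backslash, so equal bumps pick "\" and opposite bumps pick "/":

```python
# Same four wall letters as l: {-|\/}, in the same order.
WALLS = "-|\\/"

def wall_char(h, v):
    """Pick the wall letter for a horizontal bump h and vertical bump v."""
    if v == 0:
        return WALLS[0]      # Rebol position 1: "-"
    if h == 0:
        return WALLS[1]      # Rebol position 2: "|"
    if h == v:
        return WALLS[2]      # Rebol position 3: "\"
    return WALLS[3]          # Rebol position 4: "/"
```

As in the Rebmu line, the zero test on v comes first, so a (0, 0) bump would fall into the "-" case.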
Hopefully you get the idea.  It's really simple.  I'm not trying 
to beat the winners based on libraries or tricks, I'm not trying 
to write clever code, I'm writing REALLY boring straightforward code 
with function definitions and everything which can be adapted easily 
if the problem spec is adapted.  And still within a stone's throw 
of the winners, or even beating them.  In this case I got 218 characters 
compared to the winning "perl" at 222.
And that perl is garbage.  No functions, no reusability, no intelligence.
Meaningless letters, far from being able to handle N-digit numbers... 
y'know, the usual.  I'm standing on the shoulders of giants here, 
but that's why I can see farther.
BrianH
27-Jun-2010
[620x2]
Your unrebmu proposal would be a compiled dialect, with the target 
being DO dialect code. This is a good technique for many dialects, 
and can be very fast. That makes it a good idea for a demo.
I wonder how rebmu would compare to COMPRESS for space savings? Compressed 
scripts/modules are planned for R3 (actually, implemented but not 
included yet). This includes binary! syntax encoded compressed data.
Fork
27-Jun-2010
[622]
It's a good question, and regarding that, I noted that many of the 
solutions to Code Golf problems utilize base-64.
Rebolek
27-Jun-2010
[623]
Janko wanted some script obfuscator and as I see, Rebmu is already 
less readable than Brainfuck, so I think it's a good candidate.
Fork
27-Jun-2010
[624]
@Rebolek: For the sake of the psychological testing, these Code Golf 
problems I'm solving... I'm actually solving *in Rebmu*, mushed form.