r3wp [groups: 83 posts: 189283]

World: r3wp

[Core] Discuss core issues

Gregg
16-Oct-2010
[46]
I don't think we can throw away the meta info about it being a heredoc 
string, unless we want to use Oldes's idea of adding a refinement 
to MOLD. And then you have to choose when to use it. While we can 
say that it's easy for MOLD to add escapes to curly braces, that 
doesn't mean it will be easier for humans to read generated strings 
that contain them.


I can't say I've ever needed it personally. The { } syntax works 
well, and is as much support as TCL has for heredoc strings it seems 
(we even have TRIM/AUTO to help with indenting issues).  


I want to like the idea a lot, but I only like it a little so far. 
Mainly I wonder if it's worth adding for the sake of just the curly-brace 
chars. I understand the usefulness for the curly-language and test 
dialect scenarios, I'm just not sure of the cost/benefit ratio.
BrianH
16-Oct-2010
[47x3]
No refinement to MOLD needed. MOLD should know nothing about heredocs. 
Use a separate formatter function, like this:
; Format VALUE as heredoc source text delimited by TAG; MOLD itself stays unchanged.
mold-heredoc: func [value tag [string!]] [
    ajoin ["#[[" tag "^/" either string? :value [value] [mold :value] "^/" tag "]]"]
]
And once it is loaded, it is a string. There is no escaping in memory.
We *really* don't want to add another string type, since that would 
lead to conversion overhead. Another syntax for the existing, unchanged 
string! type is fine though.
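
A minimal usage sketch of the formatter idea above. It assumes the `#[[tag ... tag]]` notation proposed in this discussion (a proposal only, not something LOAD is known to accept) and an interpreter that provides AJOIN. The sample string and the END tag are made up for illustration.

```rebol
REBOL [Title: "mold-heredoc usage sketch"]

; The formatter from the post above, repeated so this example is self-contained.
mold-heredoc: func [value tag [string!]] [
    ajoin ["#[[" tag "^/" either string? :value [value] [mold :value] "^/" tag "]]"]
]

; In memory this string holds one caret; ^^ is only the escape in the literal.
source: "if (x) { y = a ^^ b; }"

; The string is emitted as-is between the tag lines, with no escaping applied.
print mold-heredoc source "END"
; #[[END
; if (x) { y = a ^ b; }
; END]]

; Non-string values are molded first; an empty tag is allowed as well.
print mold-heredoc [1 2 3] ""
; #[[
; [1 2 3]
; ]]
```

Keeping this in a separate function means MOLD output stays loadable by existing code, and the caller decides when the heredoc form is wanted.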
Ladislav
16-Oct-2010
[50x3]
Yes, my point is exactly the same, even without any new datatype, 
etc., having just a new syntax to specify strings will be of advantage.
I hope, that we all agree, that, as opposed to other string syntaxes, 
there will be no escaping in heredoc, meaning, that the two strings 
below will be equal:

{^^}

#[[
^
]]
The proof is in the pudding - even now we do have two syntaxes for 
strings, and no string contains any information specifying which 
syntax was used (it is even possible that neither was, since a string 
read from a file was not defined using either of the two).
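
A small sketch of the point above, using only the two string syntaxes that exist today. The variable names are illustrative.

```rebol
REBOL [Title: "string syntaxes load to the same value"]

; The same text written with the double-quoted and the curly-braced syntax.
a: "line one^/line two"
b: {line one
line two}

; Once loaded, both are plain string! values; the syntax used is not recorded.
print equal? a b        ; true
print type? b           ; string!

; Escaping exists only in source text: {^^} loads as a single caret character.
print length? {^^}      ; 1
```

Under the proposal discussed here, a heredoc containing just one caret would load to that same one-character string, without needing the doubled caret.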
BrianH
16-Oct-2010
[53]
Interesting, I was assuming that we were going to make the tag required. 
Mention that it is optional (or possibly an empty string, if you 
prefer) in the ticket.
Ladislav
16-Oct-2010
[54x2]
Whether the tag is optional or not - well, I am not firmly at either 
side. Nevertheless, the empty tag can be considered a tag as well, 
just a special one.
Will mention that in the ticket
BrianH
16-Oct-2010
[56]
As long as the constraints I mentioned in the first comment to that 
ticket are kept, I'm all for it. Plus, no MOLD refinements.
Ladislav
16-Oct-2010
[57]
That is OK with me
BrianH
16-Oct-2010
[58x2]
Let's also stick to text. No binary data in our strings.
Some languages have raw binary heredocs, but that doesn't work well 
with Unicode strings, nor does it post in text mode well.
Ladislav
16-Oct-2010
[60]
Agreed, I do not see any need to have binary heredocs
james_nak
16-Oct-2010
[61]
maxim - regarding namespaces: Are you kidding? I'm not even going 
to touch those. My excuse is if xml-object.r can't handle it, neither 
will I. :^(
Oldes
16-Oct-2010
[62x2]
Because it's complicated? Heredoc is enough for me and I'm glad 
I'm not the only one who found it missing.
oh.. my message is a reply to Graham's:

@Ladislav ... the { } only have special significance because the 
interpreter is so written ...  so why not make it user definable 
?
I hadn't noticed I wasn't at the end of the chat :/
BrianH
16-Oct-2010
[64]
Three reasons:

- The { and } are not chosen at random, they are the consequence 
of using those characters to delimit strings themselves.

- There aren't many characters or character sequences that can be 
optional without conflicting with other stuff in the grammar.

- Syntax processors with user-defined stuff in them are much slower 
than ones without them.
james_nak
16-Oct-2010
[65]
Hi, I have a client whose FTP username has an at sign (@) in it. 
I think that's causing a problem accessing it. Is there a way around 
that or should that work?
Sunanda
16-Oct-2010
[66]
Try read (write etc) like this:

     read [ scheme: 'ftp host: "domain.com" user: "user@domain.com" pass: 
     "password" ]

Or see what other people have tried in the past:
    http://www.rebol.org/ml-topic-index.r?i=ftp
james_nak
16-Oct-2010
[67]
Thanks Sunanda. that was it.
Ladislav
17-Oct-2010
[68x3]
Regarding the "why the heredoc syntax was proposed" question, here 
is another reason, that is important for me, proving, that even users 
not planning to use other languages than REBOL can take advantage 
of it:


In my REBOL script files, I usually write code examples, or code 
tests, that are meant to demonstrate the newly defined functions, 
or to test whether they work as expected. That code is in no way 
meant to be run every time the script is run. Therefore, I use the 
COMMENT function to ensure the example/test code does not run every 
time the script is run.


Since it is a comment, I prefer to use a string to be able to write 
it "free-form" not being bound by the requirement of loadability 
of the text. However, when the code examples in the comment contain 
special "escaped characters", this would look ugly, therefore I rather 
give up the free-formness of the comment gaining the advantage of 
not being forced to escape the special characters, but being forced 
to keep the comment REBOL-loadable. The proposed heredoc syntax can 
solve this, and similar problems nicely and naturally.
(this property just "mimics" the property of the ; single-line comments, 
which does not need any character escaping as well)
Regarding the syntax name - if we want to use a scientific one, we 
can call the syntaxes "single-line syntax with escaping", "multi-line 
syntax with escaping", and "multi-line syntax without escaping".
Gregg
17-Oct-2010
[71x2]
I do the same thing as Ladislav with tests and comments, and having 
a separate HEREDOC func makes much more sense than a MOLD refinement. 
I'm still not keen on the name, but the scientific options are a 
bit long for func names. :-)

I'm fine with the tag being optional as well.


The discussion here, and comments on curecode, have addressed my 
current questions and concerns. Thanks to all involved for that, 
particularly Ladislav. I think it's absolutely worth a trial run 
to see if anything comes up in actual use that isn't easily addressed 
with docs.
This also gives me more to think about with regard to how and why 
location markers might be done.
Ladislav
18-Oct-2010
[73x2]
Syntax names:

single-line syntax with escaping
 == "double-quoted"
multi-line syntax with escaping
 == "curly-braced"
multi-line syntax without escaping
 == "heredoc"
Actually, the "heredoc" name pretty well explains what is an area 
where it can be successfully used.
Maxim
18-Oct-2010
[75]
As the general consensus stands I think this will be a GREAT addition 
to the syntax.   I've missed this notation in REBOL from the day 
I used other languages which have it.
Ladislav
18-Oct-2010
[76]
:-) I surely missed it before using other languages which have it. 
(since I did not use such a language yet)
Gregg
18-Oct-2010
[77]
- Single-line
- Multiline (at least that's the term used up to now)
- Heredoc, Raw, Unescaped  ???
Maxim
18-Oct-2010
[78]
what's the term generally used for <pre> </pre> tags in html?
Sunanda
18-Oct-2010
[79]
pre-formatted?
Gregg
18-Oct-2010
[80]
Preformatted text.

I like it.
Maxim
18-Oct-2010
[81x2]
yeah, I think it's a bit more human understandable... text appears 
"as-is" .
it also seems natural to say that when files are read, they are pre-formatted.
Oldes
18-Oct-2010
[83]
The functionality is different. If the PRE tag worked like a heredoc, 
then it would display all tags without the need to escape < as &lt;
Maxim
18-Oct-2010
[84x3]
yes, but this is a string format, not a rich-text format.  so the 
only formatting you can give it is lines, tabs & spaces.
pre tag should always have had a  tag=""  parameter which allowed 
it to skip html content until it found a closing <pre> tag with the 
same tag attribute (my 2 cents).
also, historically R2 view had the "as-is" identifier which meant 
to preserve the formatting, but it was still limited to the language's 
lexical parser... this could be used instead. 


meaning roughly... don't interpret the stream of bytes as containing 
any codes, just use it as a stream of characters  "as-is"..
Ladislav
18-Oct-2010
[87]
Nevertheless, Oldes is right that <pre> uses escaping, so it is 
not analogous to heredoc in this respect
Maxim
18-Oct-2010
[88x2]
yes... I am trying to say we are doing  <pre> ... </pre>   since 
it's roughly equivalent to  { ... }
*not trying*
Ladislav
18-Oct-2010
[90]
(or, I should say to the heredoc syntax I am proposing, since it 
seems, that some heredocs use some escaping)
Maxim
18-Oct-2010
[91]
I'm just saying that "pre-formatted" is a nice differentiating term:

single-line, multi-line & pre-formatted.
Andreas
18-Oct-2010
[92x3]
single-line, multi-line, heredocs
Please stop fussing over the name and just stick with heredoc, which 
is a widely used and well-established notion.
Heck, there's even a Wikipedia page on it: "A here document (also 
called a here-document, a heredoc, a hereis, a here-string or a here-script) 
is a way of specifying a string literal"
ChristianE
18-Oct-2010
[95]
Seconded, there is exactly *no* reason I can think of to give 
heredoc-like strings a name other than "heredoc" string. Even 
stackoverflow.com features a heredoc tag.