Documentation for: webprint.r

Usage document for %webprint.r

1. Introduction to %webprint.r

webprint.r is an introduction to the ease and simplicity of accessing internet URLs and HTML.
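
As a minimal sketch of what the script does, assuming its body is essentially the print read one-liner quoted later in this document, the same steps can be spelled out one at a time. The words page and html are illustrative names chosen here, not taken from the script.

    REBOL [Title: "webprint sketch (illustrative)"]

    ; A url! value is written without quotes.
    page: http://www.rebol.com

    ; READ fetches the resource over HTTP and returns its text.
    ; TRY traps a network failure so the sketch does not halt with an error.
    either error? try [html: read page] [
        print ["Could not read" page]
    ][
        ; PRINT writes the HTML source to the console.
        print html
    ]

The one-liner form skips the intermediate words and the error check, which is reasonable for a demonstration script.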
2. webprint At a Glance

No setup is required; just run it.

    >> do %webprint.r

3. Using %webprint.r

Requires REBOL/Core or REBOL/View console mode.

3.1. Running %webprint.r

From the library with:

    >> do http://www.rebol.org/cgi-bin/cgiwrap/rebol/download-a-script.r?script-name=webprint.r

or locally with:

    >> do %webprint.r

4. See also

There is another rebol.org script, very similar to this one,

5. What you can learn

5.1. Powerful built-in Internet Access

REBOL has fantastically simple built-in procedures for accessing the internet.

5.2. URLs

http://www.rebol.com is actually a value with a special datatype. In REBOL this is a url!. Very powerful. No quotes needed. REBOL just knows.

5.3. Web Server defaults

Web servers have default files that are returned. A request for http://www.rebol.com is actually answered with http://www.rebol.com/index.html. This is not always the case. Some sites return default.htm, or index.php, or index.cgi. No need to worry: the REBOL read function and the web server will work that all out for you. If REBOL Technologies ever changes its web server setup, a different file may be returned and this script will still work.

5.4. Changing the page printed

Changing the website or page that is printed is as easy as changing the text after the http:// part following the read command.

5.5. Getting REBOL

Both REBOL/Core and REBOL/View are available

5.6. Compare the complexity to the simplicity

Please compare the 31-character sequence print read http://www.rebol.com to this D language program.

5.6.1. D Language sample for printing HTML as text

    /*
        HTMLget written by Christopher E. Miller
        This code is public domain.
        You may use it for any purpose.
        This code has no warranties and is provided 'as-is'.
    */

    //debug = HTMLGET;

    import std.string, std.conv, std.stream;
    import std.socket, std.socketstream;

    int main(char[][] args)
    {
        if(args.length < 2)
        {
            printf("Usage:\n htmlget <web-page>\n");
            return 0;
        }
        char[] url = args[1];
        int i;

        i = std.string.find(url, "://");
        if(i != -1)
        {
            if(icmp(url[0 .. i], "http"))
                throw new Exception("http:// expected");
        }

        i = std.string.find(url, '#');
        if(i != -1) // Remove anchor ref.
            url = url[0 .. i];

        i = std.string.find(url, '/');
        char[] domain;
        if(i == -1)
        {
            domain = url;
            url = "/";
        }
        else
        {
            domain = url[0 .. i];
            url = url[i .. url.length];
        }

        uint port;
        i = std.string.find(domain, ':');
        if(i == -1)
        {
            port = 80; // Default HTTP port.
        }
        else
        {
            port = std.conv.toUshort(domain[i + 1 .. domain.length]);
            domain = domain[0 .. i];
        }

        debug(HTMLGET)
            printf("Connecting to " ~ domain ~ " on port " ~ std.string.toString(port) ~ "...\n");

        auto Socket sock = new TcpSocket(new InternetAddress(domain, port));
        Stream ss = new SocketStream(sock);

        debug(HTMLGET)
            printf("Connected!\nRequesting URL \"" ~ url ~ "\"...\n");

        if(port != 80)
            domain = domain ~ ":" ~ std.string.toString(port);
        ss.writeString("GET " ~ url ~ " HTTP/1.1\r\n"
            "Host: " ~ domain ~ "\r\n"
            "\r\n");

        // Skip HTTP header.
        char[] line;
        for(;;)
        {
            line = ss.readLine();
            if(!line.length)
                break;

            const char[] CONTENT_TYPE_NAME = "Content-Type: ";
            if(line.length > CONTENT_TYPE_NAME.length &&
                !icmp(CONTENT_TYPE_NAME, line[0 .. CONTENT_TYPE_NAME.length]))
            {
                char[] type;
                type = line[CONTENT_TYPE_NAME.length .. line.length];
                if(type.length <= 5 || icmp("text/", type[0 .. 5]))
                    throw new Exception("URL is not text");
            }
        }

    print_lines:
        while(!ss.eof())
        {
            line = ss.readLine();
            printf("%.*s\n", line);

            //if(std.string.ifind(line, "</html>") != -1)
            //    break;
            size_t iw;
            for(iw = 0; iw != line.length; iw++)
            {
                if(!icmp("</html>", line[iw .. line.length]))
                    break print_lines;
            }
        }

        return 0;
    }

What would you rather type? The above, or something like

    print read join http:// ask "Web site? "

What will be easier to remember 6 months from now?

6. What can break

You will need access to the internet, and rebol.com will have to be up and running for this script to work. Don't worry, http://www.rebol.com is always up and running.

7. Credits