Data retrieval
[1/8] from: gisle:dankel:no at: 30-Jan-2004 14:10
Hi fellow rebols,
I'm writing a script to collect data from various web pages and send the results in an
email.
Most web pages are ok but a few cause some problems.
Here's an example:
www.tradesports.com
This page allows you to place bets on various topics.
I want to extract this kind of information, but the links trigger javascript functions,
so there appear to be no direct URLs to the resulting pages.
I figure I need to open an HTTP port and post whatever the javascript function sends
myself.
Here's an example of a javascript function call in a link:
<a href="#" onClick="javascript:trade(88190);return false;">Trade This Now!</a>
And here's the trade function:
function trade(conID)
{
    document.POSTGSX.request_operation.value = "trade";
    document.POSTGSX.request_type.value = "action";
    document.POSTGSX.selConID.value = conID;
    document.POSTGSX.location.value = "TradeCentre";
    document.POSTGSX.submit();
}
I don't know much about javascript and HTTP post.
From what I can gather I need to send the same information to the server as the submit()
function above, but I have absolutely no idea of what information is sent or how to send
it. Also, how do I receive the result? Can I open an http port to achieve this?
Any help greatly appreciated!
Cheers,
Gisle
[2/8] from: hallvard:ystad:oops-as:no at: 30-Jan-2004 19:45
Dixit Gisle Dankel (15.10 30.01.2004):
>www.tradesports.com
>Here's an example of a javascript function call in a link:
<<quoted lines omitted: 7>>
> document.POSTGSX.submit();
>}
POSTGSX is the name of a form on the web page. request_operation etc. are the different
parameters in the form. Their values are set with the javascript. You need to post something
like this to the server:
request_operation=trade
request_type=action
selConID=conID ; remark: replace conID with whatever value the variable represents
(i.e. whatever number is passed to the javascript function, e.g. 88190)
location=TradeCentre
As for how to POST via the HTTP protocol, cf. http://www.w3.org/Protocols/rfc2616/rfc2616.html.
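
The field list above can be turned into a POST request along these lines. This is a minimal sketch in Python rather than REBOL, and only an illustration: the function name build_trade_post is made up, and the target URL is an assumption (the real target is whatever the POSTGSX form's action attribute says, which is not shown here). The real form may also carry further hidden fields that the server expects.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trade_post(con_id, url="http://www.tradesports.com/"):
    """Build (but do not send) the POST that trade(con_id) would submit.

    The URL default is an assumption; check the form's action attribute.
    """
    fields = {
        "request_operation": "trade",
        "request_type": "action",
        "selConID": str(con_id),
        "location": "TradeCentre",
    }
    # Forms are sent as name=value pairs joined by "&", URL-encoded.
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_trade_post(88190)
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen(req) and calling read() on the result would return the response page, which is how the result comes back: it is simply the body of the HTTP response to the POST.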
[3/8] from: hallvard:ystad:oops-as:no at: 30-Jan-2004 20:05
Oh, and here are two more links:
Martin Johannesson has written a script to do the http posting for you:
http://www.reboltech.com/library/html/http-post.html
And you might find some of these scripts useful:
http://www.oops-as.no/rix?q=HTTP+post&ct=rebol
HY
Dixit Gisle Dankel (15.10 30.01.2004):
[4/8] from: tomc:darkwing:uoregon at: 30-Jan-2004 10:58
Hi,
To know what a browser is saying to a server I wrote a
'remote echo proxy' several years back and am slowly improving it.
I recently changed it to echo the request into a frame above the page
and have not tested it very much at all, but you are welcome to try.
To try the current version, set your browser's HTTP proxy to:
Proxy: bionix.cs.uoregon.edu
Port: 3776
It is not really ready for prime time but often gets the job done.
Now that I have started playing with 'view I am tempted
to make a 'local echo proxy' people could run on their own machines which
would be an awful lot easier to make robust.
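
For a rough idea of what a local tool of this kind might look like, here is a minimal sketch in Python (not REBOL, and not the script described above). It is a plain echo server rather than a true proxy: it does not forward anything, it only sends back the request line, headers and body of whatever is POSTed to it. All names (EchoHandler, start_echo_server, echo_once) are made up for this illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Answer every POST with the request line, headers and body received."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("latin-1")
        echoed = f"{self.requestline}\n{self.headers}\n{body}".encode("latin-1")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(echoed)))
        self.end_headers()
        self.wfile.write(echoed)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_echo_server(port=0):
    """Serve on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def echo_once(body):
    """Start the server, POST body to it, return the echoed text, stop it."""
    server = start_echo_server()
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/"
        req = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/x-www-form-urlencoded"},
        )
        # Bypass any proxy configured in the environment for this local call.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=5) as resp:
            return resp.read().decode("latin-1")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(echo_once(b"selConID=88190"))
```

To inspect a real form this way, one would save the page locally, point the form's action at the echo server, and submit it from the browser.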
When directed to your page and link, it returns:
POST http://www.tradesports.com/ HTTP/1.0
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows XP) Opera 6.05 [en]
Host: www.tradesports.com
Accept: text/html, image/png, image/jpeg, image/gif, image/x-xbitmap, */*
Accept-Language: en
Accept-Charset: windows-1252;q=1.0, utf-8;q=1.0, utf-16;q=1.0, iso-8859-1;q=0.6, *;q=0.1
Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0
Referer: http://www.tradesports.com/
Proxy-Connection: close
Content-type: application/x-www-form-urlencoded
Content-length: 151
request_operation=trade&request_type=action&contractBook=none&mdView=0&incpage=&isLive=true&selConID=6227&eventSelect=&location=TradeCentre&updateLis
On Fri, 30 Jan 2004, Gisle Dankel wrote:
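
The body of the captured request above can be split back into its fields to see what the browser sends beyond the four values set by trade(). This is a small sketch in Python, given only as an illustration. The string below is copied from the capture up to location=TradeCentre; the final field is cut off in the capture and is left out here rather than guessed.

```python
from urllib.parse import parse_qsl

# Copied from the captured POST body; the truncated last field is omitted.
captured = (
    "request_operation=trade&request_type=action&contractBook=none"
    "&mdView=0&incpage=&isLive=true&selConID=6227&eventSelect="
    "&location=TradeCentre"
)

# keep_blank_values=True keeps fields sent with an empty value (incpage=, eventSelect=).
fields = dict(parse_qsl(captured, keep_blank_values=True))

# Fields the javascript trade() function sets explicitly.
set_by_trade = {"request_operation", "request_type", "selConID", "location"}

# Everything else comes from other inputs of the same form.
extra_fields = {k: v for k, v in fields.items() if k not in set_by_trade}

if __name__ == "__main__":
    for name, value in fields.items():
        print(f"{name} = {value!r}")
    print("not set by trade():", sorted(extra_fields))
```

Whether the server needs the extra fields is not something the capture can tell us; the cautious approach is to send them all with the same values the browser used.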
[5/8] from: gchiu:compkarori at: 31-Jan-2004 18:38
Tom Conlin wrote.. apparently on 30-Jan-2004/10:58:47-8:00
>Hi,
>
>To know what a browser is saying to a server I wrote a
>'remote echo proxy' several years back and am slowly improving it.
>
Sterling also wrote a http proxy that you could use I guess
http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=proxy.r
--
Graham Chiu
http://www.compkarori.com/cerebrus
[6/8] from: hallvard:ystad:oops-as:no at: 30-Jan-2004 23:48
Dixit Graham Chiu (21.44 30.01.2004):
>Sterling also wrote a http proxy that you could use I guess
>
>http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=proxy.r
Which reminds me that I too once wrote a cacheserver that you could take a look at:
http://www.oops-as.no/roy/rebol-scripts/cacheserver.r
HY
[7/8] from: gisle:dankel:no at: 1-Feb-2004 23:34
Thanks for all replies!
I managed to do what I wanted using Sterling's proxy script.
Now, there is one remaining problem: Retrieving information from https
pages.
Since REBOL doesn't support https I assume this will be rather difficult...
Cheers,
Gisle
[8/8] from: gchiu:compkarori at: 2-Feb-2004 13:04
Gisle Dankel wrote.. apparently on 1-Feb-2004/23:34:43
>Thanks for all replies!
>I managed to do what I wanted using Sterling's proxy script.
>
>Now, there is one remaining problem: Retrieving information from https
>pages.
>Since REBOL doesn't support https I assume this will be rather difficult...
>
REBOL/Command supports https, but the trouble is that all the data going through the proxy
will be encrypted ...
--
Graham Chiu
http://www.compkarori.com/cerebrus
Notes
- Quoted lines have been omitted from some messages.