World: r3wp

[!REBOL2 Releases] Discuss 2.x releases

BrianH
14-Apr-2010
[1431]
MonetDB looks nice, and it has ODBC drivers for a start. Column store 
is good for analytics.
Graham
14-Apr-2010
[1432]
But from Linux?
BrianH
14-Apr-2010
[1433]
There is apparently a TCP interface, so drivers could be written 
that would take advantage of its strengths better than ODBC would.
Graham
14-Apr-2010
[1434]
Which reminds me, what's stopping RT from supporting unixodbc?  I 
think Carl said there was a proliferation of odbc methods for Linux, 
but as far as I can tell they've now standardized on unixodbc
BrianH
14-Apr-2010
[1435x2]
And you can use unixodbc on Linux (but can you use it from R2 on 
Linux?).
Sorry, missed your message :)
Graham
14-Apr-2010
[1437]
I had a quick search for the TCP/IP interface docs .. couldn't find 
any yet
BrianH
14-Apr-2010
[1438]
To answer your question, my guess would be time and money. R2 native 
enhancements that RT doesn't need itself need to be funded nowadays.
Graham
14-Apr-2010
[1439]
http://monetdb.cwi.nl/SQL/Documentation/Programming-Interfaces.html
BrianH
14-Apr-2010
[1440]
OK, text protocol over TCP or SSL. That sounds doable.
Graham
14-Apr-2010
[1441x3]
Presumably that is handled by their library
ODBC is there to avoid having to write drivers to every db under 
the sun ...
Carl says the Windows Rebol ODBC source is only two pages of C code 
...
BrianH
14-Apr-2010
[1444x2]
Nice. Have him post it to DevBase in an appropriate directory so 
we can improve it :)
Looks like the best approach to MonetDB support for R2 would be to 
look at the source of the Python driver, which is pure Python, no 
C.
Graham
15-Apr-2010
[1446]
Links to how to program unixODBC http://www.unixodbc.org/doc/ProgrammerManual/Tutorial/
TomBon
15-Apr-2010
[1447]
graham, I am using a 'prototype multi cli connector' for different
databases (currently mysql/postgresql/monetdb/sqlite; more are in
progress). I will provide you with a source link in a couple of days
if you like. I would never ever use odbc: it's too unstable and always
causes problems when the db load is highest. but ok, I also don't like
synthetic benchmarks or theoretical feature lists. real-life
experiences are best...
Graham
15-Apr-2010
[1448x2]
interesting ... please
Got any more information?
TomBon
15-Apr-2010
[1450]
like this?
the cli connector uses the cli component that nearly all major
databases deliver. the connection is made via rebol's
call/wait/info/output/error, followed by a simple parse of the
resultset. I am using this prototype mainly for a quick and dirty
connect to mysql/postgresql/monetdb/sqlite. on my list are also
connectors for
firebird/oracle/greenplum/sybase/ingres/infobright/frontbase and
cassandra.
pros:

1. very fast for single requests
2. no rewrite of code needed if a new version or protocol is out
3. easy 'data migration' between the db's
4. adding a new db is a matter of hours only (see the cli spec, that's
all)
5. fast prototyping and testing for new db's
6. robust, never had any trouble with cli's, even with bigger resultsets
7. should also be perfect for traditional cgi (the process startup
overhead is minimal, unless your name is facebook)
8. very small footprint (~120 lines for connecting to 4 db's, could
be half that)

with a nice tcp-server component like rebservice, the
cli multi connector could be very useful as a c/s connector.
I made a test with 2.000 concurrent calls (simple select)
on a 4 gig quadcore. the cpu was only close to 50%, a good value.

cons:

1. slow if you have very many serial inserts (unless you shape them
into one sql query)
2. needs to start a cli process for every request
3. needs a tcp server for non-local connections
4. some more, but who cares ;-)

with a solution to keep the cli open from rebservice,
these cons could disappear and the speed overhead compared to a
memory based lib could be marginal.
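
A minimal sketch of the CLI-connector idea described above, written in Python rather than REBOL and not TomBon's actual code: start the database's command-line client as a child process, wait for it, capture its output, and parse the output into rows. The function name `run_cli_query`, the tab separator, and the stand-in `FAKE_CLI` command are assumptions made for illustration; a real setup would pass the argv of the database's own client and whatever output switches that client documents.

```python
import subprocess
import sys


def run_cli_query(argv, sep="\t", timeout=30):
    """Run one CLI process, wait for it, and parse stdout into rows.

    `argv` is the full command line of the client (hypothetical here).
    Each non-empty output line becomes one row, split on `sep`.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        # Surface the client's own error text instead of parsing garbage.
        raise RuntimeError(f"cli failed ({proc.returncode}): {proc.stderr.strip()}")
    return [line.split(sep) for line in proc.stdout.splitlines() if line.strip()]


# Stand-in for a real database client: a Python one-liner that prints
# two tab-separated rows, so the sketch runs without any database.
FAKE_CLI = [sys.executable, "-c", "print('1\\talice'); print('2\\tbob')"]

rows = run_cli_query(FAKE_CLI)
print(rows)
```

Because the connector only depends on the client's text output, adding another database mostly means supplying a different argv and separator, which is the point made in the pros list.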
Graham
15-Apr-2010
[1451]
2'000 concurrent calls ??
TomBon
15-Apr-2010
[1452]
yes
Graham
15-Apr-2010
[1453]
How did you do that?
TomBon
15-Apr-2010
[1454]
call without /wait in a loop
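
A rough Python analogue of "call without /wait in a loop", as a sketch only: start every process first without blocking, then collect the results afterwards. The helper names `launch_all` and `collect` and the stand-in command are hypothetical, and the count is kept small here because thousands of simultaneous children can hit per-user process or file-descriptor limits.

```python
import subprocess
import sys


def launch_all(argv, count):
    """Start `count` processes without waiting for any of them."""
    return [
        subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        for _ in range(count)
    ]


def collect(procs):
    """Wait for every process and return (returncode, stdout) pairs."""
    results = []
    for proc in procs:
        out, _err = proc.communicate()  # blocks until this child exits
        results.append((proc.returncode, out.strip()))
    return results


# Stand-in for a CLI client running a simple select.
FAKE_CLI = [sys.executable, "-c", "print('ok')"]

results = collect(launch_all(FAKE_CLI, 8))
print(len(results), "processes finished")
```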
Graham
15-Apr-2010
[1455x2]
ok
to keep the cli open, using telnet into localhost ?
TomBon
15-Apr-2010
[1457x2]
well if this is working the connector will be great. this weekend 
I can post the source so far...
the test I made was against a really big table with 50+ million
records. no problem at all.
Graham
15-Apr-2010
[1459]
I wonder how it works inserting large text ( blobs )
TomBon
15-Apr-2010
[1460x2]
I don't see any problem. cli's are also used for very large dumps 
or restores.
as said I am impressed by the robustness of this approach.
Graham
15-Apr-2010
[1462]
Ok, I'm waiting for the report on Saturday
TomBon
15-Apr-2010
[1463]
sure captain ;-)
Graham
15-Apr-2010
[1464]
I'm getting my red pencil ready :)
TomBon
15-Apr-2010
[1465x2]
:-))
will post the link saturday into the db group.
Pekr
15-Apr-2010
[1467x2]
Is the CLI connector done in REBOL? Or does it call some external
command-line cross-db access tool and REBOL just parses the data?
Theoretically it could be done all in REBOL. E.g. SQLite has sqlite.exe,
MS SQL has some executable for querying the db too. I just thought
that calling a command-line tool is going to be orders of magnitude
slower than ODBC access ... at least under Windows ...
TomBon
15-Apr-2010
[1469x2]
pekr, can't confirm this (under linux, where I am using this connector).
a standard hot wildcard query (select * from db limit 1000) takes
on average 320 ms with cli and 560 ms with doc's cool mysql driver,
which I am using daily. I think it looks different if you compare
e.g. cli against sqlite connected via native lib access, but all
connectors working via tcp shouldn't be faster than cli, I guess.
but please don't nail me down on these numbers.
this cli connector is currently a prototype idea with some nice
potential, at this moment nothing more. at least it works very
smoothly for migration tasks.
the best is if you make your own tests and see if it's useful for
your demands.
one addition: increasing/decreasing the resultset makes the difference
much bigger in both directions. selecting 5000 records: cli/620 ms
and scheme/3340 ms, but selecting 10 records: cli/316 ms and
scheme/35 ms.
so it looks like the overhead for starting the cli process is around
300 ms.
as mentioned before, a concept that keeps the cli alive in a stable
way could save this overhead.
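
The arithmetic behind the ~300 ms estimate can be sketched by fitting a straight line (fixed cost plus per-row cost) through the two measurements quoted for each method. This is only a two-point model under the assumption that cost grows linearly with rows; it does not reproduce the 1000-row sample exactly, and the function name `fit_line` is made up for this note.

```python
def fit_line(p1, p2):
    """Return (per_row_ms, fixed_ms) for the line through two (rows, ms) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


# Measurements quoted in the message: (records, milliseconds).
cli_per_row, cli_fixed = fit_line((10, 316), (5000, 620))
drv_per_row, drv_fixed = fit_line((10, 35), (5000, 3340))

# Row count at which both lines predict the same time.
break_even = (cli_fixed - drv_fixed) / (drv_per_row - cli_per_row)

print(f"cli:    {cli_fixed:.0f} ms fixed + {cli_per_row:.3f} ms/row")
print(f"scheme: {drv_fixed:.0f} ms fixed + {drv_per_row:.3f} ms/row")
print(f"break-even near {break_even:.0f} rows")
```

Under this model the cli carries a fixed cost of about 315 ms but a much lower per-row cost, so it only wins once a result set grows past a few hundred rows.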
Graham
15-Apr-2010
[1471x3]
2'000 concurrent calls .. 50% of 1 cpu or of all 4 cores ?
If this works out, it might be cool to write it as a port scheme 
so that we can just replace the 'open
Many of these isql/cli tools are open source ... I wonder how easy 
it would be to modify them to allow greater interactivity
TomBon
15-Apr-2010
[1474x2]
the cli is currently using the original utilities from the db
manufacturer to ensure max. performance and robustness.
there are already many switches to modify output & input. for example,
take a look here for the monetdb switches:
http://monetdb.cwi.nl/XQuery/Documentation/The-Mapi-Client-Utility.html#The-Mapi-Client-Utility
there are 3 extensions which would be very cool:
1. a tcp server for easy remote requests without the need for the cli
on the client side (e.g. rebservice)
2. a smart sql-syntax mapper for interactive migration (you can't
read e.g. a mysql dump directly into postgresql)
3. a stable cli alive holder to eliminate the startup overhead for
each request.
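
One possible shape for the "cli alive holder" in point 3, as a hedged Python sketch and not anything from the thread's code: start the client once with pipes on stdin and stdout, write each statement to it, and read lines back until a marker line appears. The class name `PersistentCli`, the sentinel convention, and the stand-in `FAKE_REPL` (which just echoes each input line in upper case) are all assumptions; a real client would need a statement that prints a known marker, and its output buffering behaviour would have to be checked.

```python
import subprocess
import sys


class PersistentCli:
    """Keep one CLI process open and reuse it for many statements."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,  # line-buffered text pipes
        )

    def query(self, statement, sentinel_cmd, sentinel):
        """Send `statement`, then `sentinel_cmd`; read until `sentinel` is seen."""
        self.proc.stdin.write(statement + "\n" + sentinel_cmd + "\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if line == "":  # child exited unexpectedly
                raise RuntimeError("cli closed its output")
            line = line.rstrip("\r\n")
            if line == sentinel:
                return lines
            lines.append(line)

    def close(self):
        self.proc.stdin.close()  # EOF lets the child exit on its own
        self.proc.wait()
        self.proc.stdout.close()


# Stand-in for an interactive client: echoes every input line upper-cased.
FAKE_REPL = [
    sys.executable, "-u", "-c",
    "import sys\nfor line in sys.stdin:\n    print(line.strip().upper(), flush=True)",
]

cli = PersistentCli(FAKE_REPL)
first = cli.query("select 1", "--end--", "--END--")
second = cli.query("select 2", "--end--", "--END--")  # same process, no restart
cli.close()
print(first, second)
```

Both queries go through the same child process, so the startup cost is paid once instead of per request.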
Graham
15-Apr-2010
[1476x2]
Something along the lines of the xml-odbc server that Easysoft sells 
.. but native and not odbc.
This waits for incoming xml requests and sends data back
TomBon
15-Apr-2010
[1478x2]
yes, exactly but without the word odbc :)
btw, the cpu load was ~50% over all cores for approx. 5 sec. on an
ubuntu desktop 8.10 also running 2 vbox vm's.
Graham
15-Apr-2010
[1480]
Interbase have a developer release that is multicore aware.  I'd 
be interested to test this once you do the first release.  The developer 
release is same as the commercial one but stops receiving new connections 
after 48 hours.