r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[SQLite] C library embeddable DB .

Pekr
16-Feb-2006
[93x5]
and somehow - I liked the ability of RebDB to define dbs in a simple 
way - you just named columns and did not have to care about types 
even ....
I am also used to rebol blocks. Somehow using string mode for constructing 
sql query is one level off for me, but otoh maybe better, because 
with block mode you sometimes don't cover all syntax options or you 
simply has to learn how to use it, whereas with string you simply 
compose default sql query and you can cut and paste examples or queries 
from docs, visual tools, etc.
as for visual tools - just suggest me one. I run thru installation 
of most of them. I did not find ANYTHING in a quality of mySQL Admin 
and mySQL Query. Thwo of them I liked are commercial, so imo sqlite, 
as a database of the year, is really badly supported here ....
Now at work - one more note, Ashley - my typical app design is as 
follows - application directory containing exe or modules, .cfg files, 
\system subdirectory, which contains other subdirectories as \cyphre-styles, 
\rebgui, \rebdb, \sqlite, according to usage, and \data subdirectory 
... so I am not sure that by default the driver should create the 
directory, as in the contrary when I want simply to have my data 
in \data dir, I don't want the driver to automatically create \my-db-name 
subdir ....
what is more - subdire has some meaning, if sqlite would use one 
table per file design, not sure here, I am opened to thoughts ....
Ashley
16-Feb-2006
[98]
I've given up on the one table per database idea as the default attachment 
limit is 10 files. On the positive side, the ATTACH command seems 
to work properly under 3.0+ and table names unique to an attached 
database do not have to be prefixed in queries. ;) My 'connect function:

>> help connect
USAGE:
    CONNECT database /create /attach databases /blocked /direct

DESCRIPTION:
     Open a SQLite database.
     CONNECT is a function value.

ARGUMENTS:
     database -- (Type: file url)

REFINEMENTS:
     /create -- Create database if non-existent

     /attach -- Attach up to 10 databases (of type file!) to this connection
         databases -- Database files (Type: block)
     /blocked -- Return each row as a block
     /direct -- Do not mold/load REBOL values

lets you do this:


 connect/attach %/c/app/data/system.db [%/c/app/data/users.db %/c/app/data/reports.db]
	sql "select * from a-system-table"
	sql "select * from a-user-table"
	sql "select * from a-report-table"


which could be quite useful in many situations. The default, however, 
is now one database.


BTW, my rewrite (even after adding a lot more functionality) is about 
twice as fast as the original protocol version.
Pekr
16-Feb-2006
[99x3]
huh - when can we expect beta? I am first one who wants to test :-)
I have no opinion on molded values issue .... maybe we could have 
something like /custom, with a dialect, or not so complicated, just 
/molded refinement, where you define which columns to mold - if that 
makes sense and would actually speed anything up?
other thing is, if we should support /object as original scheme did? 
Even  with odbc, some time ago, I simply created map-record function, 
which mapped record to object, for easier access (block position 
independent) .... dunno if you find that possibility usefull though 
....
Sunanda
16-Feb-2006
[102]
<<SQLite "databases" act more like traditional "tablespaces">>
That's a nice flexible approach.

It may add an apparent unnecessary level of complexity for small 
databases, but the tablespace approach is intended to scale almost 
unlimitedly.

Think to when REBOL has taken over the world, and we have tables 
that exceed a single disk drive. Tablespaces exist (in part) to handle 
that sort of issue.
Pekr
16-Feb-2006
[103]
what is tablespace?
Sunanda
16-Feb-2006
[104]
It's a data space that contains tables.

A tablespace can be split across disk volumes (or servers or machines)
And a table is defined as residing in one or more tablespaces.


The unit of back up is a tablespace -- or a database (which consists 
of 1 or more tablespaces)

It adds a lot of flexibility for large systems. But can be overkill 
for smaller ones....Where you probably just have one tablespace that 
lives in one folder.
Ashley
16-Feb-2006
[105]
OK, initial beta release can be found at: http://www.dobeash.com/files/SQLite.r
Pekr
16-Feb-2006
[106x6]
ok, please inform us here, if you do some changes/fixes :-) Thanks 
a lot ...
Ashley - maybe I vote for functions having sql- or sqlite- prefix 
.... sql then could become sql-query .... because ... standalone 
"disconnect" in rebol script is a bit strangely looking .... in rebol 
you can disconnect from many things ... it does not look obvisous 
at first sight .... what do others think?
also open, close vs connect, disconnect ... evne rebdb has db-open 
.... looks good ... maybe db- would be nice, telling and short prefix 
...
hmm, now I found oconnect refinements uncomfort ....
I would prefer setting /blocked or not, /direct or not, as a result 
fo query, so maybe those refinements should be available with /sql 
function instead?
could we mark this group [web-public]? Would anyone be offended?
Ashley
16-Feb-2006
[112]
Done. I've also updated SQLite.r to correct its mold/all handling 
and changed the /blocked refinement to /flat (the driver now returns 
rows as blocks by default).
Ashley
17-Feb-2006
[113x2]
Bumped the version to 0.1.2 with three enhancements:

1) Automatic NULL -> none! conversion (and vice versa)

2) 'sql now handles a SQLITE_BUSY return code by retrying the failing 
sqlite3_step 5 times at one second intervals

3) About value (number of retries) can be set with a new connect/timeout 
refinement
About -> Above
Pekr
17-Feb-2006
[115x7]
Ashley - why did you name somehow not traditionally functions as 
connect and disconnect? To not clash with open and close natives? 
:-)
Ashley - why can't we support multiple name-spaces?
can we have opened only one db at a time? And if we want to open 
other, do we have to attach, no other chance?
Would it also be possible to extend e.g. 'headings to return non-string 
 values? (but I wonder if heading can contain space, if so, that 
is bad) .... because it would allow for following contstructs:

cols: headings
== [ID Name Address]

values: [1 "Petr" "Czech Republic"]

set cols values
I also don't agree with exposing functions as you did it, without 
prefix of sqlite or db .... that aproach is flat ... and unless it 
is not forbidden by sqlite itself, I don't see a reason to limit 
ourselves to have just one openened database .... I will shortly 
try it ...
I assigned whole anonymous context to sqlite: , to have access to 
sqlite/dbid, which can be submitted to other library wrapper functions 
to try some stuff. In current state, 'connect returns true or false, 
I want handler though ...
maybe it would be better to introduce one more function - if connect? 
db1: connect %my-db.db
sqlab
17-Feb-2006
[122]
It seems you want cursors.

Then you can separate preparing of an sql statement and fetching 
from the cursor.

This would allow to reset an already compiled statement and to redo 
 do it without a new compilation.
Or even to cache compiled statements as Caché does it.
Pekr
17-Feb-2006
[123x2]
I am not sure I want cursors - I just want to be able to work with 
multiple databases, without the need to close prior one ....
the problem imo is, that current driver does not return db handler 
anymore, but only true or false ...
Ashley
17-Feb-2006
[125]
1) why did you name somehow not traditionally functions as connect 
and disconnect?


open and close are already taken and I wanted to avoid pathing (sqlite/open) 
or prefixes (sqlite-open) as the number of words added to the global 
context is relatively few and I wanted them to be meaningful words 
in their own right (same reason why VID doesn't have vid-layout, 
vid-center-face, etc). Connect and disconnect are the commands used 
to login/logout of another user in many databases; so while the context 
is not strictly correct, they are recognisable DB words.

2) why can't we support multiple name-spaces?


You can via the connect/attach refinement (and this lets you transparently 
reference tables across multiple "databases"). Why would you want 
to concurrently access multiple databases that are *not* related 
to each other in any way?

3) can we have opened only one db at a time?


Yes, bar /attach as above. The benefit of this approach is that you 
minimise the number of file locks a single process obtains, and you 
don't have to track state information (i.e. you don't need to tell 
'sql which DB you are referring to).


4) Would it also be possible to extend e.g. 'headings to return non-string 
 values?

No, as it may contain strings like "count(*)"

5) don't agree with exposing functions as you did


I tend to write CONNECT, SQL, etc in upper-case so db references 
embedded in my code stand out. Come up with a better naming scheme 
that everyone likes though and I'll adopt it. ;)
Pekr
17-Feb-2006
[126]
ad 1) I was referring to the context of foreign script reader understanding 
- you used db-open even for rebDB - as I said, it is a detail, but 
makes sometimes things more clearer ...


ad 2) I was imagining simply kind of server product, which can open 
multiple unrelated databases ....


ad 3) kind of disadvantage here to not refer by pointer. We are used 
to it, no? Look at /Command or other schemes - db: open some-db://..... 
conn1: first db conn2: first db ... db2: open some-db://..../other-db

ad 4) OK


ad 5) db- or sqlite- prefix or let it the way it is, I will assign 
it to sqlite: context [] directly in %sqlite.r
Ashley
17-Feb-2006
[127]
you used db-open even for RebDB

 ... that's because it needed to define about 20 low-level SQL statement 
 functions like 'select, 'delete, ''show, 'close, 'insert, etc that 
 conflicted with existing REBOL words and could not be easily renamed 
 (folks expect a SELECT statement, an UPDATE statement, etc or something 
 that resembles it). With the SQLite driver, all these statements 
 are handled by the SQLite library itself so we only need to provide 
 a few high-level accessor functions like SQL and DESCRIBE; these 
 types of words don't conflict with REBOL's lower-level atomic action 
 type words so there isn't the same need to group and isolate them 
 by prefix and/or context.


kind of disadvantage here to not refer by pointer. We are used to 
it, no?

 But is it *needed*? If you *must* work with multiple databases concurrently 
 and *cannot* have them attached, then SQLite is probably not the 
 right solution for you anyway ... it's an embedded C library with 
 limited concurrency and no user rights management after all. ;)
Pekr
17-Feb-2006
[128]
:-)
Ashley
22-Feb-2006
[129]
New revision available at: http://www.dobeash.com/SQLite/sqlite.r
Pekr
22-Feb-2006
[130]
thanks a lot :-)
Oldes
28-Feb-2006
[131]
It's designed to create the database in a Rebol's root dir? (if the 
path is not fully specified) I would expect to create the database 
in the folder where I'm (and where is the SQLite)
Pekr
28-Feb-2006
[132x3]
hehe, forget that ....
not sure ... it was really strange ... it created dbs at path of 
dll or so, dunno now ...
but I think that it is easy by simply setting db-path: your-dir-here 
... and then connect join db-path %my-db-name.db
Ingo
1-Mar-2006
[135x2]
While testing I started a script using sqlite several times, at about 
the 17th call I get this error:

** Script Error: Library error: Max Callbacks
** Where: context
** Near: sqlite-trace: make routine! [
    db [integer!]
    clb [callback! [int string!]]
    ptr [integer!]
] SQLite3lib

I'm using the following version:

	Title:		"SQLite driver"
	Owner:		"Ashley G. Trüter"
	Version:	0.1.3
	Date:		22-Feb-2006
And another question ... I have a table which conains string data 
...

sql {select * from person where firstname = "Ingo"}


does not find the row (tried with singel quotes, too). What am I 
doing wrong? It goes without saying, that the data is actually there, 
and I can find it using the following SQL

sql {select * from person where firstname like "%ingo%"}
Ashley
1-Mar-2006
[137]
REBOL supports a maximum of 16 callbacks; so to avoid this error 
don't do %sqlite.r more than once within a script (and there is no 
sensible reason to do so anyway).


As for strings, remember that the driver mold's them; so they are 
actually stored as "string" (inclusive of quotes). You can reference 
them in one of two ways:

	sql {select * from table where col = '"string"'}

or

	sql ["select * from table where col = ?" "string"]


The second form is preferred as the binding is handled for you by 
the driver. I'm in the process of writing a SQLite Driver Guide (that 
covers this and other tricks) but it's a week or two away at the 
moment.
Ingo
2-Mar-2006
[138]
Thanks Ashley,

I found out the first point by myself, and just added a check whether 
'sql has already been set before doing sqlite.r.

But so far I had no idea about the molding of strings, I could have 
tried for months, I guess ;-)
Ingo
4-Mar-2006
[139]
Hi Ashlsy, I found a bug in sqlite.r. 'sql reuses its own return 
value on subsequent calls. so ...

>> all: sql "select * from persons"
== [["Ivo" "Hohmann" ...] ....]
>> me: sql {select * from persons where firstname = '"ingo"'}
== [["Ingo" "Hohmann" ...] ...]
>> all =? me
== true
Ashley
4-Mar-2006
[140]
Deliberate design that. The last line of 'sql is simply:

	buffer

not:

	copy/deep buffer


This is important when dealing with a large number of values as you 
want to pass a reference not double the amount of memory used with 
a redundant copy/deep! I'll add this "gotcha" to the documentation 
I'm writing.
Ingo
5-Mar-2006
[141]
Actually, there is no need to copy/deep buffer.
Just change 
	clear buffer
to
	buffer: copy []

there is no problem with integer, decimal, and none values regarding 
sharing.
Blob data is debased, which implicitly creates a new string.

Strings are normally loaded, which creates a new string. only when 
you use /raw, you are dependend on the sqlite.dll having a sane interface 
and not reusing the returned string data. You could add this as a 
possible gotcha.
Ashley
5-Mar-2006
[142]
clear buffer

 is also an optimization as it defaults to 32K values (make block! 
 1032 * 32) and I don't won't to reallocate it each SQL call. The 
 following benchmarks (Transactions Per Second) give some backround 
 to the design decisions I made:

	buffer				1744718
	copy buffer			282
	copy/deep buffer		76

	clear buffer			1144733
	buffer: copy []			824352
	buffer: make block! 32768	387


So the approach I took optimizes for large result sets by allocating 
a large buffer once up-front and then just referencing it thereafter.