r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[SQLite] C library embeddable DB .

Pekr
15-Feb-2006
[86]
anyway - that's all I can do about it - to dislike it :-) I expect 
they went with one file because of locking issues in FS ....
Ashley
15-Feb-2006
[87]
As I mentioned near the beginning of this thread, SQLite supports 
multiple database files each containing one or more tables - in fact 
they go so far as recommending that you separate multiple high-access 
tables out into different databases for concurrency reasons. In this 
sense, SQLite "databases" act more like traditional "tablespaces". 
So, if we wanted we could write our REBOL front-end so that it created/accessed 
each table in a database of the same name thus ensuring a one-to-one 
mapping between table names and database names. The advantages of 
this approach are:

	backups (only those tables that change need be backed up)

 external table administration (you can drop a table by deleting its 
 database file)

 concurrency (you spread your file locking across a greater number 
 of physical files)

Disadvantages:


 Administering your database is more cumbersome (you can't use the 
 sqlite3 admin tool to administer all tables in one session)

 Value of sqlite_master is diminished (you can't "select * from sqlite_master" 
 to report on all your tables in one query)

 Query references need to add a database prefix when referring to 
 a table not in their own database

 Name conflicts (all tables in one file means multiple databases can 
 use the same table names - the solution with multiple files would 
 be to segregate at the directory level)

 Multiple database files means you need to zip them prior to some 
 operations such as email attachment, etc


On balance, I actually prefer the one file / one database approach.


Pekr's other comments in relation to schema implementation also have 
merit (I've agreed with Pekr twice today - a new record!); I see 
the value of an ftp schema, an http schema, etc; but what value in 
a sqlite schema? Given that the entire schema can be written in a 
much more concise fashion as an anonymous context that exports a 
couple of key access functions to the global context; I can't see 
what the functional differences between the two implementations would 
be?


So, bar any good reasons to the contrary, these are the features 
of the implementation I am currently working on (a rough design spec 
if you like):

	Implemented as an anonymous context

 "Database" is a directory (which is specified when a database is 
 opened with 'open-db)

 Each table resides in a "tablespace" (aka SQLite database file) of 
 the same name
	File is automatically opened on first reference

 The /blocked refinement of 'db-open specifies that rows will be returned 
 in their own block (default is a single block of values)

 Non-numeric values (which SQLite stores natively as INTEGER and REAL) 
 will be subject to 'mold/all on insert and 'load on retrieval

 The /native refinement of 'open-db will turn this behaviour off (see 
 comments below)

 SQLite binding will be supported allowing statements such as ["insert 
 into table values (?,?,?)" 1 [bob-:-mail-:-com] "Some text"] and ["select 
 * from table where email = ?" [bob-:-mail-:-com]]


Whether to store values (including string!) as molded values in SQLite 
is an interesting question; on the one hand it gives you transparent 
storage and access to REBOL values – but at the performance cost 
of having to mold and load every TEXT value returned; and the storage 
cost of the overhead of a molded representation. On the other hand, 
if I only want to store numbers and strings anyway then I don't want 
this overhead. I think the only practical solution is a /native type 
option as detailed above.
Pekr
16-Feb-2006
[88x10]
OK, need to leave to work, just a note - I think we can forget having 
one-table-per-file, as it is simply a hack with sqlite. Version prior 
to 3.0 even had problems with transactions in that regard and even 
3.0 has stupid compile limitation (like we have with callbacks) to 
something like 16 separate table-files, so ...
as for your new version - looking forward to it. Maybe we could vote 
a bit for what is default behavior and fo naming conventions .... 
when I saw /blocked, I first thought something about blocking copy, 
waiting behavior ....
e.g. Bobik prefers 'blocked mode as a default, as he can directly 
pass it to grid (Henrik and Cyphre's too IIRC)
for 'foreach loops, flat result is probably better. I also liked 
the ability of /names, which returned records as objects, so someone 
could do foreach rec results [print [rec/name rec/last-name rec/age]]
I would probably think of /as-blocks /flat, /as-objects, or /res-blocks, 
/res-objects, /res-flat, dunno ... open to discussion ...
and somehow - I liked the ability of RebDB to define dbs in a simple 
way - you just named columns and did not have to care about types 
even ....
I am also used to rebol blocks. Somehow using string mode for constructing 
sql query is one level off for me, but otoh maybe better, because 
with block mode you sometimes don't cover all syntax options or you 
simply has to learn how to use it, whereas with string you simply 
compose default sql query and you can cut and paste examples or queries 
from docs, visual tools, etc.
as for visual tools - just suggest me one. I run thru installation 
of most of them. I did not find ANYTHING in a quality of mySQL Admin 
and mySQL Query. Thwo of them I liked are commercial, so imo sqlite, 
as a database of the year, is really badly supported here ....
Now at work - one more note, Ashley - my typical app design is as 
follows - application directory containing exe or modules, .cfg files, 
\system subdirectory, which contains other subdirectories as \cyphre-styles, 
\rebgui, \rebdb, \sqlite, according to usage, and \data subdirectory 
... so I am not sure that by default the driver should create the 
directory, as in the contrary when I want simply to have my data 
in \data dir, I don't want the driver to automatically create \my-db-name 
subdir ....
what is more - subdire has some meaning, if sqlite would use one 
table per file design, not sure here, I am opened to thoughts ....
Ashley
16-Feb-2006
[98]
I've given up on the one table per database idea as the default attachment 
limit is 10 files. On the positive side, the ATTACH command seems 
to work properly under 3.0+ and table names unique to an attached 
database do not have to be prefixed in queries. ;) My 'connect function:

>> help connect
USAGE:
    CONNECT database /create /attach databases /blocked /direct

DESCRIPTION:
     Open a SQLite database.
     CONNECT is a function value.

ARGUMENTS:
     database -- (Type: file url)

REFINEMENTS:
     /create -- Create database if non-existent

     /attach -- Attach up to 10 databases (of type file!) to this connection
         databases -- Database files (Type: block)
     /blocked -- Return each row as a block
     /direct -- Do not mold/load REBOL values

lets you do this:


 connect/attach %/c/app/data/system.db [%/c/app/data/users.db %/c/app/data/reports.db]
	sql "select * from a-system-table"
	sql "select * from a-user-table"
	sql "select * from a-report-table"


which could be quite useful in many situations. The default, however, 
is now one database.


BTW, my rewrite (even after adding a lot more functionality) is about 
twice as fast as the original protocol version.
Pekr
16-Feb-2006
[99x3]
huh - when can we expect beta? I am first one who wants to test :-)
I have no opinion on molded values issue .... maybe we could have 
something like /custom, with a dialect, or not so complicated, just 
/molded refinement, where you define which columns to mold - if that 
makes sense and would actually speed anything up?
other thing is, if we should support /object as original scheme did? 
Even  with odbc, some time ago, I simply created map-record function, 
which mapped record to object, for easier access (block position 
independent) .... dunno if you find that possibility usefull though 
....
Sunanda
16-Feb-2006
[102]
<<SQLite "databases" act more like traditional "tablespaces">>
That's a nice flexible approach.

It may add an apparent unnecessary level of complexity for small 
databases, but the tablespace approach is intended to scale almost 
unlimitedly.

Think to when REBOL has taken over the world, and we have tables 
that exceed a single disk drive. Tablespaces exist (in part) to handle 
that sort of issue.
Pekr
16-Feb-2006
[103]
what is tablespace?
Sunanda
16-Feb-2006
[104]
It's a data space that contains tables.

A tablespace can be split across disk volumes (or servers or machines)
And a table is defined as residing in one or more tablespaces.


The unit of back up is a tablespace -- or a database (which consists 
of 1 or more tablespaces)

It adds a lot of flexibility for large systems. But can be overkill 
for smaller ones....Where you probably just have one tablespace that 
lives in one folder.
Ashley
16-Feb-2006
[105]
OK, initial beta release can be found at: http://www.dobeash.com/files/SQLite.r
Pekr
16-Feb-2006
[106x6]
ok, please inform us here, if you do some changes/fixes :-) Thanks 
a lot ...
Ashley - maybe I vote for functions having sql- or sqlite- prefix 
.... sql then could become sql-query .... because ... standalone 
"disconnect" in rebol script is a bit strangely looking .... in rebol 
you can disconnect from many things ... it does not look obvisous 
at first sight .... what do others think?
also open, close vs connect, disconnect ... evne rebdb has db-open 
.... looks good ... maybe db- would be nice, telling and short prefix 
...
hmm, now I found oconnect refinements uncomfort ....
I would prefer setting /blocked or not, /direct or not, as a result 
fo query, so maybe those refinements should be available with /sql 
function instead?
could we mark this group [web-public]? Would anyone be offended?
Ashley
16-Feb-2006
[112]
Done. I've also updated SQLite.r to correct its mold/all handling 
and changed the /blocked refinement to /flat (the driver now returns 
rows as blocks by default).
Ashley
17-Feb-2006
[113x2]
Bumped the version to 0.1.2 with three enhancements:

1) Automatic NULL -> none! conversion (and vice versa)

2) 'sql now handles a SQLITE_BUSY return code by retrying the failing 
sqlite3_step 5 times at one second intervals

3) About value (number of retries) can be set with a new connect/timeout 
refinement
About -> Above
Pekr
17-Feb-2006
[115x7]
Ashley - why did you name somehow not traditionally functions as 
connect and disconnect? To not clash with open and close natives? 
:-)
Ashley - why can't we support multiple name-spaces?
can we have opened only one db at a time? And if we want to open 
other, do we have to attach, no other chance?
Would it also be possible to extend e.g. 'headings to return non-string 
 values? (but I wonder if heading can contain space, if so, that 
is bad) .... because it would allow for following contstructs:

cols: headings
== [ID Name Address]

values: [1 "Petr" "Czech Republic"]

set cols values
I also don't agree with exposing functions as you did it, without 
prefix of sqlite or db .... that aproach is flat ... and unless it 
is not forbidden by sqlite itself, I don't see a reason to limit 
ourselves to have just one openened database .... I will shortly 
try it ...
I assigned whole anonymous context to sqlite: , to have access to 
sqlite/dbid, which can be submitted to other library wrapper functions 
to try some stuff. In current state, 'connect returns true or false, 
I want handler though ...
maybe it would be better to introduce one more function - if connect? 
db1: connect %my-db.db
sqlab
17-Feb-2006
[122]
It seems you want cursors.

Then you can separate preparing of an sql statement and fetching 
from the cursor.

This would allow to reset an already compiled statement and to redo 
 do it without a new compilation.
Or even to cache compiled statements as Caché does it.
Pekr
17-Feb-2006
[123x2]
I am not sure I want cursors - I just want to be able to work with 
multiple databases, without the need to close prior one ....
the problem imo is, that current driver does not return db handler 
anymore, but only true or false ...
Ashley
17-Feb-2006
[125]
1) why did you name somehow not traditionally functions as connect 
and disconnect?


open and close are already taken and I wanted to avoid pathing (sqlite/open) 
or prefixes (sqlite-open) as the number of words added to the global 
context is relatively few and I wanted them to be meaningful words 
in their own right (same reason why VID doesn't have vid-layout, 
vid-center-face, etc). Connect and disconnect are the commands used 
to login/logout of another user in many databases; so while the context 
is not strictly correct, they are recognisable DB words.

2) why can't we support multiple name-spaces?


You can via the connect/attach refinement (and this lets you transparently 
reference tables across multiple "databases"). Why would you want 
to concurrently access multiple databases that are *not* related 
to each other in any way?

3) can we have opened only one db at a time?


Yes, bar /attach as above. The benefit of this approach is that you 
minimise the number of file locks a single process obtains, and you 
don't have to track state information (i.e. you don't need to tell 
'sql which DB you are referring to).


4) Would it also be possible to extend e.g. 'headings to return non-string 
 values?

No, as it may contain strings like "count(*)"

5) don't agree with exposing functions as you did


I tend to write CONNECT, SQL, etc in upper-case so db references 
embedded in my code stand out. Come up with a better naming scheme 
that everyone likes though and I'll adopt it. ;)
Pekr
17-Feb-2006
[126]
ad 1) I was referring to the context of foreign script reader understanding 
- you used db-open even for rebDB - as I said, it is a detail, but 
makes sometimes things more clearer ...


ad 2) I was imagining simply kind of server product, which can open 
multiple unrelated databases ....


ad 3) kind of disadvantage here to not refer by pointer. We are used 
to it, no? Look at /Command or other schemes - db: open some-db://..... 
conn1: first db conn2: first db ... db2: open some-db://..../other-db

ad 4) OK


ad 5) db- or sqlite- prefix or let it the way it is, I will assign 
it to sqlite: context [] directly in %sqlite.r
Ashley
17-Feb-2006
[127]
you used db-open even for RebDB

 ... that's because it needed to define about 20 low-level SQL statement 
 functions like 'select, 'delete, ''show, 'close, 'insert, etc that 
 conflicted with existing REBOL words and could not be easily renamed 
 (folks expect a SELECT statement, an UPDATE statement, etc or something 
 that resembles it). With the SQLite driver, all these statements 
 are handled by the SQLite library itself so we only need to provide 
 a few high-level accessor functions like SQL and DESCRIBE; these 
 types of words don't conflict with REBOL's lower-level atomic action 
 type words so there isn't the same need to group and isolate them 
 by prefix and/or context.


kind of disadvantage here to not refer by pointer. We are used to 
it, no?

 But is it *needed*? If you *must* work with multiple databases concurrently 
 and *cannot* have them attached, then SQLite is probably not the 
 right solution for you anyway ... it's an embedded C library with 
 limited concurrency and no user rights management after all. ;)
Pekr
17-Feb-2006
[128]
:-)
Ashley
22-Feb-2006
[129]
New revision available at: http://www.dobeash.com/SQLite/sqlite.r
Pekr
22-Feb-2006
[130]
thanks a lot :-)
Oldes
28-Feb-2006
[131]
It's designed to create the database in a Rebol's root dir? (if the 
path is not fully specified) I would expect to create the database 
in the folder where I'm (and where is the SQLite)
Pekr
28-Feb-2006
[132x3]
hehe, forget that ....
not sure ... it was really strange ... it created dbs at path of 
dll or so, dunno now ...
but I think that it is easy by simply setting db-path: your-dir-here 
... and then connect join db-path %my-db-name.db
Ingo
1-Mar-2006
[135]
While testing I started a script using sqlite several times, at about 
the 17th call I get this error:

** Script Error: Library error: Max Callbacks
** Where: context
** Near: sqlite-trace: make routine! [
    db [integer!]
    clb [callback! [int string!]]
    ptr [integer!]
] SQLite3lib

I'm using the following version:

	Title:		"SQLite driver"
	Owner:		"Ashley G. Trüter"
	Version:	0.1.3
	Date:		22-Feb-2006