r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[!RebDB] REBOL Pseudo-Relational Database

Cyphre
11-Feb-2010
[215]
Ashley, if you operating on binary! as a storage I think it would 
be possible to easily redirect the DB engine  to a file! port opened 
in binary mode no?
amacleod
11-Feb-2010
[216]
does that mean disk based possible?
Ashley
12-Feb-2010
[217x2]
Yes. I deliberately designed it so that the index binary (which ideally 
should always be memory resident) is as small as possible and that 
the data binary only requires seek and append operations (which are 
well suited to disk access). I need to do a few more benchmarks, 
but the next release should include a "data on disk" option.
Oops, just noticed I'd left the ";type:'module" header entry commented 
out ... fixed.
Pavel
15-Feb-2010
[219x3]
Ashley would you be so nice and write a little bit about indexing 
in RebDB? Do it work automatically  for all columns, or may the indexed 
columns be presetted? what in memory representation do you use (map, 
list, block?). Is indexing done automatically during insertion, or 
is it indexed by search? THX IA
Ie I understand from documentation basic index contains offsets to 
each respective record and is coded as binary, but where are the 
indexes used for quick search on columns?
May the hashes be used for record numbers? ie large non monotones 
integers?
Ashley
15-Feb-2010
[222]
http://www.dobeash.com/RebDB/rebdb3.html#section-4.6gives a brief 
but complete idea of how indexing works. The low-level implementation 
is quite simple:


 - The index is a binary! with each n bytes (n = idx-size, defaulting 
 to 2) representing on offset into the data binary! (start of row).

 - The index is sorted by/with a number of columns (left to right) 
 equal to the number of key columns (minimum of 1).
	- Updates and inserts that break this order set a reindex flag.

 - Lookup and seek both check the reindex flag prior to searching 
 the index binary! using a binary search ( http://en.wikipedia.org/wiki/Binary_search_algorithm
 ).

 - Lookup returns a single row (where # key values = # key columns)

 - Seek uses Lookup to return a range of values (where # key values 
 <> # key columns) and then does a linear match on each row


I'll devote a bit more time/space in the documentation to flesh this 
out with examples.
Pavel
15-Feb-2010
[223]
Thank you Ashley for brief description.
Pavel
17-Feb-2010
[224x2]
I've got strange results, don't know if intended:
do %rebdb.r
table: db-load %test.bin
db-select table [][]
== [1 1 "Name-1-1" 1 2 "Name-1-2" ...   ; this is OK

db-select table [2][]
== [1 2 3 4 ... ;this is OK

db-select table [2 3][]
== [1 "Name-1-1" 2 "Name-1-2" ... ;this is OK

db-select table [2 3][1 = c1]

== [1 1 "Name-1-1" 1 2 "Name-1-2"  ;first column should not be selected 
into result

db-select table [3][1 = c1]
== [1 #{0101} "Name-1-1" 1 #{0102} "Name-1-2" ... ;weird
Is it possible to put in more than one condition?
Ashley
18-Feb-2010
[226x2]
1) Couple of good bugs there ... fixing for 3.0.2.

2) Yes. Use all and any to group multiple conditions (e.g. [all [c1 
> 2 c2 < 7]] )
RebDB 3.0.2 uploaded (same access instructions as before).

Documentation updated ... http://www.dobeash.com/RebDB/rebdb3.html


This release renames several db-* functions more in line with REBOL 
naming conventions (e.g. db-insert renamed to db-append as rows are 
in actual fact appended not inserted).


Overall performance improvements (tested against a 1 million row 
table):

	- db-append 1.2x faster
	- db-append/nocheck option 8x faster than db-insert
	- sort-idx 2x faster
	- db-select with REBOL conditions 1.2x faster


It's going to get seriously interesting once ports are working properly 
and I can start benchmarking (and optimizing) disk and URL based 
direct access.
Pekr
18-Feb-2010
[228]
what do you mean by ports working properly?
Andreas
18-Feb-2010
[229]
pekr, i guess ashley refers to http://www.rebol.org/aga-display-posts.r?post=r3wp771x652
Pavel
18-Feb-2010
[230x5]
Great news! thanks for upgrade. What about promissed file storage, 
i.e. is ti possible to operate the DB and index binary directly to/from 
file?
ti=it :)
Anyway multiple condition is really cool I recommend to everybody 
attention, selection works as expected now thanks!
Really neat is you may create as many index columns as you need/want 
and then you can use only those ones what makes sence for you selection 
parameters. I.e. you can use multiple indexes/couners above one dataset 
and then use multiple selection for BETWEEN statement for example.
I've got only one exception. If I need different schema for each 
row (like in DevChat for example) I have to use Binary dat ecapsulation 
and sniff in binary  in second level. But multiple index columns 
helps alot in this.
Ashley
19-Feb-2010
[235]
Exactly. RebDB3 is optimized for "left to right" equality checks 
... which in my experience is the majority access pattern. When you 
have more than one key you are actually imposing a grouping heirarchy 
upon the data (e.g. a 3 column key of Country, State/Province/Region 
and City for example).
Pavel
23-Feb-2010
[236]
Ashley I think the name Index is little bit missleading the more 
appropriate would be offsets.  Anyway real indexes (ordered lists), 
are you building them on demand each new selection? Ie it is rebuilded 
on each selection or there are not used at all and each selection 
traverses thru data one of three methods?

Or maybe make the question easier is it possible to create and save 
true index on column?
Ashley
24-Feb-2010
[237]
The index [of offsets] is created by reading a number of columns 
(min 1) equal to key-cols and inserting both the composite row and 
the offset into a block which is then sorted by the composite row. 
For eaxmple:


Say we have 2 columns of integers in the index and also assume we 
have two rows.

Reading:

	3 2 ...
	2 3 ..

might create an intermediate block like:

	#{0034} #{01030102}	; note the #{01} field length indicators}
	#{0056} #{01020103}

which is sorted to produce an index of:

	#{00560034}


Index is updated as rows are inserted or deleted. Some insert and 
update operations will set a reindex flag which is acted upon the 
next time a lookup or seek is performed.


This index is maintained automatically by RebDB ... you cannot "create" 
one in the traditional sense of the word. The weakness of this scheme 
is it assumes that key access is by column 1 OR column 1 & column 
2. It doesn't handle the situation where you need key access to column 
1 OR column 2 (which in my experience is a fairly uncommon case).
Pavel
24-Feb-2010
[238x2]
OK thanks for description
Anyway I'like to ask again about direct file storage of dat/index 
parts (file based database)  is it possible?
Ashley
24-Feb-2010
[240]
Yes. It's fairly easy to access the dat file on disk via a port! 
... as long as some port! bugs are corrected. Once ports are working 
correctly I'll add file support.
Pavel
24-Feb-2010
[241x3]
I'm looking forward to it
I've tested the length? function over port after inserting the port 
and it works well now, is there other bugs in fileports?
Sorry Update on port writes cannot update on port
Sunanda
8-Nov-2010
[244]
RebDB question on REBOLforum:

    http://www.rebolforum.com/index.cgi?f=printtopic&topicnumber=46
GiuseppeC
13-Nov-2010
[245x2]
Maybe it's late and my eyes are not working properly.
I am starting to use REBOLDB for REBOL2.0.

I have read the quickstart guide and now I know how to create a table 
but.. wait: How do I open the table at the next start of my script 
?
I cannot find a DB-OPEN function.
Also, I need to store DATE and TIME together and select rows greater 
than a DATE and TIME value. Does RebolDB support this ?
GiuseppeC
14-Nov-2010
[247]
Any HELP for my problems ? I am stuck with a simple application.
Sunanda
14-Nov-2010
[248]
I'm not a rebDB expert....So take my answers as something to be tested:

-- I don't think there is an explicit OPEN command. Simply access 
the table (with SELECT, INSERT, etc) to open it.  Similarly, no CLOSE 
-- use DB-COMMIT to ensure caches are flushed.

--rebDB supports REBOL datatypes. Have you tried creating a table 
row with a datetime in it? Did you have a problem?
GiuseppeC
16-Nov-2010
[249x2]
Sunanda: you are right. DB il already available without opening it.
Also the function TO-DATE is able to convert from "15 nov 2010/11:30:15" 
to "15-nov-2010/11:30:15" without having to parse the string. Something 
not noticed from REBOL documentation.
jack-ort
15-Apr-2011
[251]
(From Ashley on 5-Feb-10:)

All new RebDB v3 released for REBOL3. To take it for a spin, try 
this:

import http://idisk.me.com/dobeash/Public/rebdb.r


when I try this I get an access error - reason: "not found or not 
valid"


Has RebDB v3 moved to a new location ?  Thanks!  Cannot wait to try 
it out.
OneTom
15-Apr-2011
[252]
Ashley: http://idisk.me.com/dobeash/Public/rebdb.rgives this error 
message: Account Error: Missing Data Volume
Ashley
15-Apr-2011
[253]
I've moved from MobileMe to DropBox since then.

import http://dl.dropbox.com/u/8269768/rebsm.r
help db-
test: db-load http://dl.dropbox.com/u/8269768/test.bin
help test
sql select test


I'm in the process of rebuilding my Dobeash site and will be adding 
this (and a few other goodies) along with documentation when done. 
No ETA, but it can live on dropbox until then.
onetom
16-Apr-2011
[254x11]
thx, Ashley
what is the advised server / client setup? where should be the source, 
the data directory and the login.sql files for the server and the 
client?
and the spool file... which is suggested to be called session.log 
but then it will be replayed by the commit * in login.sql if it's 
in the data directory..
if exists? login: join system/script/parent/path %login.sql [ client/run 
login ]

this would make more sense to me in SQL.r, so any directory with 
a login.sql could be either a server or client directory
it would also make more sense to call it init.sql and provide a client.sql 
and a server.sql as examples in the distribution
if im enabling the spool feature, i just get this error:
onetom ~/rebol/db $ rebdb
Replaying 9 change(s).
** Script Error: Out of range or past end
** Where: func [/local log][
...
the only way to restart the server is to
onetom ~/rebol/db $ rm ../rebdb/replay.bak ../rebdb/session.log
if i do a commit *, then the replay.bak won't exist, but the server 
will still try to execute the session.log, which results in fractional 
changes:
onetom ~/rebol/db $ rebdb
Replaying 4.5 change(s).
** Script Error: Out of range or past end