Compression
[1/36] from: ptretter:charter at: 17-Apr-2001 11:14
Coming soon! Imagine putting more than 3600 CDs' worth of data on a floppy disk. Or
take a gigabyte of data and compress it to under 400 bytes. I purchased /View/Pro
and will most likely purchase runtime licenses once they are available for /View and
begin distribution of NEW compression software, depending on the licensing terms available.
Imagine what 400 bytes of compressed data (representing a gig or more) could
mean to handheld devices like portable MP3 players or digital cameras: the capability
to store your entire MP3 collection in a portable device. I know you're interested, but
you will have to wait a bit longer. Thanks to the guys in the IRC REBOL channel on EFNET
for your help and cooperation (you know who you are).
Paul Tretter
[2/36] from: depotcity:telus at: 17-Apr-2001 9:38
I find this a bit hard to swallow.
2,340 Gigabytes on one floppy?
Pull that one off SUCCESSFULLY and I'll see you get the Nobel Prize.
T Brownell
[3/36] from: mat:eurogamer at: 17-Apr-2001 17:54
Heya Paul,
PT> Imagine with 400 bytes of compressed data (representing a gig or
PT> more) what this could mean to handheld devices like portable mp3
PT> players or digital cameras. The capability to store your entire mp3
PT> collection in a portable device.
Is it especially good crack where you come from?
--
Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee
http://www.eurogamer.net | http://www.eurogamer-network.com
[4/36] from: ptretter:norcom2000 at: 17-Apr-2001 12:04
Actually, I expected that kind of remark. :) I also expect more of the same,
and I have looked in the mirror several times asking similar questions.
Paul Tretter
[5/36] from: ptretter:charter at: 17-Apr-2001 12:01
Wow, that sounds like a big number. Actually, though, it's more like 65
terabytes, given some of REBOL's limitations with number size, but I feel that's
acceptable. :)
Can I hold you to that Nobel Peace Prize? ;)
Actually, economics and time play the most important part in the amount of
compression needed. For example, it would take a very long time to uncompress
2 gigs' worth of data from 400 bytes. That may not be the best means of
data handling, depending on the situation. However, I could, for example, put an
entire CD image on a website (compressed to less than 400 bytes), then download
it over a dialup line and uncompress it very easily, with very little possible
risk of corruption due to the small footprint. I would be skeptical of
this too, and I had doubts and hit many roadblocks. However, it can and will be
available soon. Thanks for the comments.
Paul Tretter
[6/36] from: ryanc:iesco-dms at: 17-Apr-2001 10:17
Paul,
I am sure you can understand that most people have a hard time believing you, as you
may have guessed. If I did not know you, I would be laughing at you. Setting aside the argument
of whether or not you are correct, how far have you gotten? How long does it
take to compress something? Generally speaking, how does it work?
BTW, make sure to fill out a patent disclosure immediately!
--Ryan
--
Ryan Cole
Programmer Analyst
www.iesco-dms.com
707-468-5400
I am enough of an artist to draw freely upon my imagination.
Imagination is more important than knowledge. Knowledge is
limited. Imagination encircles the world.
-Einstein
[7/36] from: ryan:christiansen:intellisol at: 17-Apr-2001 12:26
Well, best of luck! At least make C|Net aware of your compression utility
and give REBOL lots of credit.
:)
Paul wrote...
> However, I for example put an entire CD image on a website (compress
> to less than 400 bytes) then download it over a dialup line and
> uncompress it very easily with very little possible risk of corruption
> due to the small footprint. I would be skeptical also of this and had
> doubts and many road blocks. However, it can and will be available
> soon. Thanks for the comments.
Ryan C. Christiansen
Web Developer
Intellisol International
4733 Amber Valley Parkway
Fargo, ND 58104
701-235-3390 ext. 6671
FAX: 701-235-9940
http://www.intellisol.com
Global Leader in People Performance Software
[8/36] from: mat:eurogamer at: 17-Apr-2001 18:29
Heya Paul,
PT> Actually I expected that kind of remark. :) I also expect more of the same
PT> and looked in the mirror several times asking similar questions.
Don't worry, the effects wear off after a while.
[9/36] from: ptretter:norcom2000 at: 17-Apr-2001 12:27
I can understand that it's hard to believe, but I struggled with this for a long
time. I couldn't sleep at times. I tried talking to math professionals,
looking for an answer, but couldn't get one. Slowly, I started to understand
what couldn't be done and scrapped it immediately. I also struggled with
number size and had to find alternatives. Finally, I found the solution,
which is unique compared to current compression methods. That was the
hardest part. Now the next piece will be easy: creating the software to
perform the calculations. Once I get that piece in place, I can move
much more aggressively with a prototype.
Paul Tretter
[10/36] from: doug:vos:eds at: 17-Apr-2001 13:37
When can I buy a copy?
[11/36] from: mat:eurogamer at: 17-Apr-2001 18:42
Heya Paul,
PT> I tried talking to math professionals looking for an answer but
PT> couldn't get one.
None of them explained the laws of entropy to you then?
[12/36] from: depotcity:telus at: 17-Apr-2001 10:50
2000 megs = 400 bytes or
20 megs = 4 bytes
Even the word "vapour" as in "vapourware" is 6 bytes.
T Brownell
[13/36] from: ryanc:iesco-dms at: 17-Apr-2001 11:38
While I tend to think Paul is mistaken, keep in mind that fractal
generators may be infinitely small compared to the data that they can
produce. 3 / 7 is liberally 3 bytes; how many megs of data can it
produce?
--Ryan
Terry Brownell wrote:
> 2000 megs = 400 bytes or
> 20 megs = 4 bytes
<<quoted lines omitted: 4>>
[14/36] from: depotcity:telus at: 17-Apr-2001 11:55
That's like saying how many megs of data can Pi produce.
TBrownell
[15/36] from: ryanc:iesco-dms at: 17-Apr-2001 12:55
Terry Brownell wrote:
> That's like saying how many megs of data can Pi produce.
>
> TBrownell
Exactly my point. So far it seems to be an awful lot.
I would say the real problem is that 3 bytes can only contain about 16.8
million different combinations (256^3 = 16,777,216), so at maximum, only that
number of documents could be compressed using a truly optimum technique. Such a
technique would leave a vast number of documents in the world, and in times to
come, incompressible or requiring more than 3 bytes.
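The counting argument can be checked in a few lines. This is a minimal sketch in Python, used here purely for illustration; the function and variable names are hypothetical.

```python
def distinct_values(n_bytes: int) -> int:
    """Number of different bit patterns that fit in n_bytes bytes."""
    return 256 ** n_bytes


three_byte_codes = distinct_values(3)
print(three_byte_codes)  # 16777216

# Pigeonhole: there are 256 times as many 4-byte files as 3-byte codes,
# so no lossless scheme can give every 4-byte file its own 3-byte code.
four_byte_files = distinct_values(4)
print(four_byte_files // three_byte_codes)  # 256
```

The same reasoning applies at any size: a code of n bytes can only tell 256^n inputs apart, no matter how clever the scheme.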
Several years ago I came up with a compression scheme that
could be redundantly applied over and over, so that you could simply
keep on compressing until the document reached a minuscule size. After
writing the program I discovered the only problem was that almost
everything I compressed ended up larger than it was before I "compressed"
it. However, it did make an interesting encryption program. :^)
--Ryan
[16/36] from: joel:neely:fedex at: 17-Apr-2001 16:56
It's a good thing this thread was posted to the REBOL mailing list
instead of a hard-core tech list like cypherpunks (at least before
it got covered over with spam). Those folks had NO patience with
technical faux pas nor naiveté.
First, let's remember the difference between lossy and lossless
(de)compression.
Lossy compression schemes (e.g. JPEG) approximate original data
in a way that takes less data (i.e., increased compression ratios)
to achieve a poorer approximation. In other words, the more you
compress, the worse the reconstructed data compare with the
original. This works well (up to a point) with photos meant to
be viewed by humans, since we don't notice the noise of the
approximation as being too different from the normal background
texture of most images. But try to use JPEG on a simple "spot
color" graphic, and you'll see the effects VERY quickly.
Lossless compression schemes (e.g., RLE, LZW, etc.) operate by
finding patterns in the original data and replacing them with
what amounts to instructions that can be followed to reproduce
the patterns exactly. In general, lossless compression schemes
don't advertise the compression rates of lossy schemes, but that's
the price you pay for perfect reproduction (such as you MUST have
for executable code, for example).
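To make "instructions that can be followed to reproduce the patterns exactly" concrete, here is a minimal run-length encoding (RLE) sketch in Python. It is an illustration only, not any particular file format; the function names and the sample data are hypothetical.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Replace each run of identical bytes with a (count, byte) pair."""
    return [(len(list(run)), value) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Follow the (count, byte) instructions to rebuild the data exactly."""
    return b"".join(bytes([value]) * count for count, value in pairs)


# A "spot colour" style scanline: long runs of the same value.
scanline = b"\x00" * 40 + b"\xff" * 20 + b"\x00" * 40
pairs = rle_encode(scanline)
print(pairs)                          # [(40, 0), (20, 255), (40, 0)]
print(rle_decode(pairs) == scanline)  # True
```

Data without runs gains nothing from this scheme, since every byte becomes its own pair, which is the trade-off discussed later in the thread.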
As Ryan mentioned in another post, 3 bytes can only represent
16777216 distinct values. A quick calculation from the email I
am replying to (considering only spaces and letters) shows an
entropy of ~4.166 bits per character. That means that the set of
all possible 3-byte binary values could only code the set of all
possible ~5.76-character messages (made up of only space and
letters, conforming to the original source model).
Therefore, any lossless compression scheme over all messages in
this population will top out at about 48% savings.
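The figures above can be reproduced roughly as follows. This Python sketch assumes a zero-order model (each character independent) over letters and spaces, with case folded; that is my guess at the "quick calculation", not a statement of how it was actually done.

```python
import math
from collections import Counter


def entropy_bits_per_char(text: str) -> float:
    """Zero-order Shannon entropy, counting only letters and spaces."""
    symbols = [c for c in text.lower() if c.isalpha() or c == " "]
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


h = 4.166  # bits per character, the figure quoted in the message

# Characters of such text that 3 bytes (24 bits) can encode
print(round(24 / h, 2))     # 5.76

# Best-case saving relative to storing each character in 8 bits
print(round(1 - h / 8, 2))  # 0.48
```

Running `entropy_bits_per_char` on a different text will give a different value of `h`, and therefore a different ceiling on the savings.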
Ryan Cole wrote:
> While I tend to think Paul is mistaken, take in mind that fractal
> generators may be infinitely small compared to the data that they
> can produce. 3 / 7 is liberally 3 bytes, how many megs of data can
> it produce?
>
It doesn't matter. Although the sequence "3/7" is a valid encoding
for the infinite message
0.42857142857142857142857142857142857142857142857...
(and therefore highly efficient ;-) I challenge you to find an
equally compact encoding for the highly-similar message
0.42857142857142857142857142857142857142857412857...
(yes, they are different, if you look closely enough). If both of
these messages are in the set of possible messages I need to be able
to encode, then the average cost of an "a/b" encoding rises
as the set of possible messages grows.
What makes the whole system cost even more is that you also have to
take the size of the (de)compression algorithm itself into account.
Consider that the absolute best possible compression technique
(averaged, again, over the entire set of messages capable of being
handled) would be to use a dictionary containing every possible
message. If all messages were equally likely, the best possible
compression would be to represent each message by its position in
the dictionary (in binary, of course).
Finally, I don't claim comprehensive knowledge, but everything I've
read about "fractal compression" makes it sound like a lossy
compression scheme.
-jn-
[17/36] from: timewarp:sirius at: 17-Apr-2001 14:38
don't knock it, genius is heaven born. i happen to know
that it is possible to design and build an npmg (near
perpetual motion generator) ... such things break all the
rules and do not conform to how anyone sees anything "today".
watch, this compression utility will work and it will be
fast.
cheerfulness and have faith in the impossible,
-----EAT
[18/36] from: ryanc:iesco-dms at: 17-Apr-2001 16:32
Joel Neely wrote:
<snip>
> It doesn't matter. Although the sequence "3/7" is a valid encoding
<<quoted lines omitted: 3>>
> equally compact encoding for the highly-similar message
> 0.42857142857142857142857142857142857142857412857...
Not exactly equally compact, but how about this:
23 / 7 S 6
23 divided by 7 skipping the first 6 digits.
> (yes, they are different, if you look closely enough). If both of
> these messages are in the set of possible messages I need to be able
<<quoted lines omitted: 16>>
I know it does not work, but I could see the expression on your face from
here. lol, lol, lol, lol.
I am surprised you didn't hear my laughs. Whew! I will be getting a good
chuckle for a while, thanks! Of course, if I knew how to do such things,
Joel, you would have heard about it on the 6 o'clock news. ;^)
Cheers,
--Ryan
[19/36] from: depotcity:telus at: 17-Apr-2001 18:16
Have faith in the impossible?
Next thing you'll be telling us is XML is the answer to the semantic web!
:)
TBrownell
[20/36] from: joel:neely:fedex at: 17-Apr-2001 17:12
Ryan Cole wrote:
> Joel Neely wrote:
> <snip>
<<quoted lines omitted: 10>>
> 23 / 7 S 6
> 23 divided by 7 skipping the first 6 digits.
...
> I know it does not work, but I could see the expression on your face
> from here. lol, lol, lol, lol.
> I am surprised you didn't hear my laughs. Whew! I will be getting a good
> chuckle for a while, thanks! Of course if I knew how to do such things
> Joel, you would have heard about it on the 6 o'clock news. ;^)
>
Well, sorry to disappoint on all three counts:
1) The correct answer is
30000000000000000000000000000000000000000189
--------------------------------------------
70000000000000000000000000000000000000000000
2) Your visual imagination is better than my auditory perception. ;-)
3) I won't be on PBS, as the producers of Nova were not sufficiently
impressed with the production of the above answer. :-(
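A fraction like the one in (1) can be checked mechanically. The Python sketch below assumes that the second message differs from 3/7 only in that digits 42 and 43 after the decimal point are exchanged; that is my reading of the two expansions as printed. The helper names are hypothetical.

```python
from fractions import Fraction


def expand(frac: Fraction, n: int) -> str:
    """First n digits after the decimal point, by long division."""
    digits = []
    rem = frac.numerator % frac.denominator
    for _ in range(n):
        rem *= 10
        digits.append(str(rem // frac.denominator))
        rem %= frac.denominator
    return "".join(digits)


def swap_digits(frac: Fraction, pos: int) -> Fraction:
    """Fraction equal to frac with digits pos and pos+1 (1-based) exchanged."""
    d = expand(frac, pos + 1)
    a, b = int(d[pos - 1]), int(d[pos])
    # Moving b up one place and a down one place changes the value by
    # (b - a) * 10**-pos + (a - b) * 10**-(pos + 1) = 9 * (b - a) / 10**(pos + 1)
    return frac + Fraction(9 * (b - a), 10 ** (pos + 1))


swapped = swap_digits(Fraction(3, 7), 42)
print(expand(Fraction(3, 7), 47))
print(expand(swapped, 47))
print(swapped.numerator)
print(swapped.denominator)
```

Under that assumption the result is (3 x 10^43 + 189) / (7 x 10^43): the "a/b" form still exists, but a and b now need 44 digits each instead of one, which is the cost the exercise was meant to expose.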
The real point of this little exercise, however, may have gotten lost
in the humor. Consider the following two messages:
Dear sir, we find that we are so overwhelmed with your ability
to perform arithmetic with numbers of more than two digits that
we would be delighted to accept your gracious invitation, and will
plan to visit your residence for an interview at your earliest
convenience. To be brief, yes.
and
Dear sir, madam, or whatever the case may be: Our attorneys will
be contacting you to serve a court order requiring that you cease
and desist any further attempts to harass our receptionist or
other employees with your constant requests that we devote air
time to the fact that you can do arithmetic. To be blunt, no.
If those are the only two messages possible, I can compress each of
them to a single bit! The same would be true if each message were
composed in triplicate, with the second copy in Anglo-Saxon and the
third in Mandarin (which, for the sake of convenience, we'll assume
to be represented phonetically in ASCII).
The number of bits has nothing to do with the "length" of the message,
but with the total number of messages that are possible, and with
their relative probabilities.
Therefore, if someone claims to be able to condense the content of
3600 CDs onto a single floppy disk, I'd respond that it's entirely
possible IF THERE ARE ONLY A RELATIVELY SMALL NUMBER OF POSSIBLE
CDs. For example, if there are only 4096 possible CDs, one can
code each one by a 12-bit value, regardless of how much data may be
on each CD. Sending a 3600 CD message would then require only 5400
bytes (3600 x 12 bits = 43,200 bits). (Of course, decompressing those
5400 bytes would require that the recipient already have access to the
content of a full set of such CDs!)
OTOH, if a CD can contain 600 MB of arbitrary data, then a collection
of 3600 CDs equates to approximately 2.16 terabytes.
Any claim to be able to compress THAT collection of data down to
1.44 MB, if any CD is equally likely to appear, simply violates the
laws of physics.
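The arithmetic behind both cases can be sketched in Python. Decimal units (1 MB = 10^6 bytes) and equally likely messages are assumptions made for the illustration, and the function name is hypothetical.

```python
import math


def bits_for(n_messages: int) -> int:
    """Bits needed to give each of n equally likely messages its own code."""
    return math.ceil(math.log2(n_messages))


print(bits_for(2))     # 1: the two possible letters
print(bits_for(4096))  # 12: one CD out of a known catalogue of 4096

# 3600 CDs, each drawn from that catalogue of 4096
catalogue_bytes = 3600 * bits_for(4096) // 8
print(catalogue_bytes)  # 5400

# Arbitrary CDs instead: 3600 x 600 MB of payload against one 1.44 MB floppy
payload_bytes = 3600 * 600 * 10**6
floppy_bytes = 1_440_000
print(payload_bytes / 10**12)         # 2.16 (terabytes)
print(payload_bytes // floppy_bytes)  # 1500000
```

With arbitrary content the input is 1.5 million times larger than the floppy, so almost all possible collections would have to share a floppy image with some other collection, and a decompressor cannot tell them apart.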
It ain't gonna happen.
I mean no discouragement to Paul in his research into data compression
techniques. It's always possible to find special-case algorithms for
special-case data, and some of them have interesting applications,
such as JPEG compression of images and MP3 compression of music for
human consumption.
However, improved performance via a special-case approach always has
the cost of narrowing the range of its applicability and/or having
the effect you mentioned in an earlier post -- actually making data
outside its "target zone" grow significantly. What's interesting is
that there's actually a kind of conservation law here; the better it
gets inside the target zone, the worse it gets outside.
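The effect is easy to observe with a general-purpose compressor. The sketch below uses Python's standard zlib module as a stand-in; it is not the scheme discussed in this thread, and the sample data is invented for the illustration.

```python
import os
import zlib

patterned = b"REBOL " * 10_000      # highly repetitive: inside the target zone
random_data = os.urandom(60_000)    # no patterns to exploit: outside it

packed_patterned = zlib.compress(patterned)
packed_random = zlib.compress(random_data)

print(len(patterned), "->", len(packed_patterned))
print(len(random_data), "->", len(packed_random))

# Both results still decompress to exactly the original bytes.
print(zlib.decompress(packed_patterned) == patterned)
print(zlib.decompress(packed_random) == random_data)
```

The repetitive input shrinks dramatically, while the random input comes out slightly larger than it went in because of the container overhead.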
You can't win, you can't break even, and you can't get out of
the game.
-jn-
[21/36] from: dankelg8:cs:man:ac at: 18-Apr-2001 12:43
Strange.
Do you have a different calendar system where you live?
1st of April has long passed in this part of the world :)
Gisle
[22/36] from: jeff:rebol at: 18-Apr-2001 8:53
Hello, EAT:
> don't knock it, genius is heaven born. i happen to know
> that it is possible to design and build an npmg (near
> perpetual motion generator) ...
You've aroused my curiosity... how NEAR to perpetual motion are we
talking? Like almost perpetual? Real close to perpetual? Just short
of perpetual? I mean, how much closer, would you estimate, to
perpetual motion can you attain with the device you mention than,
say, your typical NON-perpetual motion device (like a YO-YO, for
instance)? Like MUCH MUCH closer to perpetual?
And when you say "perpetual motion generator" I have to ask, what
does it generate perpetual motion for? How does the perpetual
motion come out of the generator?
> such things break all the rules and do not conform to how anyone
> sees anything "today".
Your use of quotes around the word "today" confuses me. Usually
quoting a word indicates something with questionable definition,
like a jargon term, or something that is qualified in its usage.
For example, I might say:
This near perpetual motion generator should work "forever".
> watch, this compression utility will work and it will be
> fast.
>
> cheerfulness and have faith in the impossible,
Certainly. By definition the impossible happens all the time! If
it weren't impossible then it wouldn't not not be possible!
-jeff
[23/36] from: m:koopmans2:chello:nl at: 18-Apr-2001 20:08
Hey Jeff,
Didn't know you were a physicist too :)
ooooops, blown my cover...
--Maarten
[24/36] from: louisaturk:eudoramail at: 26-Jun-2001 21:48
Hi everybody,
This works:
write/binary %/a/data.r compress read/binary %data.r
But this produces an error message immediately when I do the program:
write/binary %data.r decompress read/binary %/a/data.r
>> do %db.r
** Syntax Error: Missing [ at end-of-block
** Near: (line 2) ]@$ָuvy'EQ$\fb#
wyW,GT
print %data.r decompress read/binary %/a/data.r
works fine.
What is causing the error message?
Louis
[25/36] from: brett:codeconscious at: 27-Jun-2001 12:27
Hi Louis,
Two possibilities.
1) Check that you have not accidentally compressed %db.r itself.
2) I note from one of your earlier scripts that the load-data function has
the following line:
data: load/all db-file
Which attempts to load without decompressing - maybe that is your problem.
Brett.
[26/36] from: louisaturk:coxinet at: 26-Jun-2001 21:37
Hi everybody,
This works:
write/binary %/a/data.r compress read/binary %data.r
But this produces an error message immediately when I do the program:
write/binary %data.r decompress read/binary %/a/data.r
>> do %db.r
** Syntax Error: Missing [ at end-of-block
** Near: (line 2) ]@$ָuvy'EQ$\fb#
wyW,GT
print %data.r decompress read/binary %/a/data.r
works fine.
What is causing the error message?
Louis
[27/36] from: louisaturk:eudoramail at: 27-Jun-2001 0:21
Hi Brett,
Thanks for the response.
It might be the second possibility. If I type the following at the command
prompt it works:
write/binary %data.r decompress read/binary %/a/data.r
If compress/decompress is used for every write, then performance will be
affected. I just want to compress the data so that more will fit on a
floppy disk for backup purposes. I don't want the database compressed
during ordinary usage. How do I get around this problem?
Louis
[28/36] from: brett:codeconscious at: 27-Jun-2001 14:55
Hi Louis,
> It might be the second possibility. If I type the following at the
> command prompt it works:
>
> write/binary %data.r decompress read/binary %/a/data.r
Ok this is the real problem - you use the same name for the uncompressed
version as you do for the compressed version. Logically it works but in
practise it is not a good idea.
This is risky. You are increasing your risks of losing data. Why? Because
devices (computers) sometimes fail. It might be a floppy (quite prone to
failure) or your hard disk or a power shortage - whatever. When you are
transforming your data so significantly as in the compression/decompression
steps you are taking, you should write the transformed data to a new file.
While on risks, the REBOL User Guide warns about storing data using
compression: if a couple of bits are changed by some stray cosmic
radiation or whatever, you're more likely to be unable to recover the data.
> If compress/decompress is used for every write, then performance will be
> affected.
There's always a trade-off. With compression you are exchanging processing
power for disk space. You need to choose what is more important to you
taking into consideration the amount of information you are dealing with,
the patterns of your use, the equipment you will use it on, etc...
> I just want to compress the data so that more will fit on a
> floppy disk for backup purposes.
Fair enough. So just do the compression using another script when you want
to perform a backup. If you are using a different name there will be no
confusion.
Floppies are small on price and offer similar protection for backing up
data. Multiple backups of the same data would be a good idea. Depends on how
critical your data is. Oh and don't forget to check that you can read a file
from a floppy once you have written to it - always good to test your backups
actually are backups and not placebos.
> I don't want the database compressed
> during ordinary usage. How do I get around this problem?
The solution is simply to leave data.r uncompressed and use it as normal.
Then as mentioned above compress the information it contains when you need
to and save to a different name and possibly a different location.
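As an illustration of that advice, here is a minimal backup sketch in Python, with zlib standing in for REBOL's compress and decompress. The function name and file names are hypothetical. It refuses to overwrite the working file and reads the backup back to confirm that it restores.

```python
import tempfile
import zlib
from pathlib import Path


def backup(source: Path, target: Path) -> None:
    """Write a compressed copy of source to target and verify it restores."""
    if source.resolve() == target.resolve():
        raise ValueError("backup must not overwrite the working file")
    original = source.read_bytes()
    target.write_bytes(zlib.compress(original))
    # Test the backup: read it back and compare with the original bytes.
    if zlib.decompress(target.read_bytes()) != original:
        raise IOError("backup verification failed")


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    working = Path(tmp) / "data.r"
    working.write_bytes(b'1234 "Ferd Burfel" 127.0.0.1 #901-555-1212\n' * 200)
    archive = Path(tmp) / "data-backup.rz"
    backup(working, archive)
    print(working.stat().st_size, "->", archive.stat().st_size)
```

The working file is only ever read, so a failure while writing the backup cannot damage it.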
Brett
[29/36] from: arolls:bigpond:au at: 27-Jun-2001 16:15
I noticed that the straight text compression
is generally better than the binary mode
compression - Try this in a directory
with text files:
foreach f read %./ [
    if not dir? f [
        print [
            f
            length? read f  length? compress read f
            length? read/binary f  length? compress read/binary f
        ]
    ]
]
I saw for most files the second number
smaller than the fourth.
I had no trouble using your method to
'do compressed scripts.
I think Brett is right, and you have
overwritten your original filename at
some point, or the floppy data is a bit
wrong.
Overwriting a file is a common way
to lose data.
I was most happy to learn that rebol's
'rename function prevents you from
overwriting a file that exists already.
Anton.
[30/36] from: joel:neely:fedex at: 27-Jun-2001 1:38
Hi, Louis,
Try using the same file you just decompressed! ;-)
Dr. Louis A. Turk wrote:
> Hi everybody,
> This works:
<<quoted lines omitted: 9>>
> works fine.
> What is causing the error message?
Your transcript shows me that
%data.r -(compress)-> %/a/data.r -(decomp)-> %data.r
after which you try to
do %db.r
which isn't any of the above files. What's in db.r???
-jn-
--
------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com
[31/36] from: joel:neely:fedex at: 27-Jun-2001 1:47
Hi, Brett,
Brett Handley wrote:
> Hi Louis,
> > It might be the second possibility. If I type the following
<<quoted lines omitted: 4>>
> uncompressed version as you do for the compressed version.
> Logically it works but in practise it is not a good idea.
That would be true only if he had actually done
change-dir %/a/
prior to the line above. If REBOL still thought the current
directory were anywhere else, then the line above uses the
same name but in a different directory. Thus it wouldn't be
the real problem.
Please see my response to Louis for what I believe to be the
real source of his difficulties.
-jn-
[32/36] from: joel:neely:fedex at: 27-Jun-2001 2:04
Hi, Louis,
Dr. Louis A. Turk wrote:
> It might be the second possibility. If I type the following
> at the command prompt it works:
<<quoted lines omitted: 4>>
> purposes. I don't want the database compressed during
> ordinary usage. How do I get around this problem?
I can do all of
>> write/binary %compd.rz compress read %compd.r
>> do decompress read/binary %compd.rz
>> timeall/run 10 10 ;a function defined in compd.r
;... normal output occurs
>> write/binary %foo.r decompress read/binary %compd.rz
without incident, after which diff shows compd.r and foo.r
to be identical.
Given a data file, as below
$ cat whatever.data
1234 "Ferd Burfel" 127.0.0.1 #901-555-1212
2345 "Joe Doaks" 127.255.255.255 #615.555.1212
3456 "Patrick Henry" 255.255.255.0 #206.555.1212
$
I can do all of the following without problems
>> write/binary %whatever.rz compress read %whatever.data
>> foo: load decompress read/binary %whatever.rz
== [1234 "Ferd Burfel" 127.0.0.1 #901-555-1212
2345 "Joe Doaks" 127.255.255.255 #615.555.1212
3456 "Patrick Henry" 255.25...
>> print foo
1234 Ferd Burfel 127.0.0.1 901-555-1212 2345 Joe Doaks
127.255.255.255 615.555.1212 3456 Patrick Henry 255.255.255.0
206.555.1212
Therefore, your core strategy could be
1) decompress the data after reading from file
2) modify the memory data as needed
3) compress the data and write back to file
Of course, this provides lots of failure modes:
a) an error could occur during (2)
b) the system could hang/crash during (2)
c) a problem during (3) could smash the data beyond repair
...etc...
Problems (a) and (b) could occur even if you were using an
uncompressed file. The consequence is loss of modifications
since last write.
Problem (c) could also occur, but you'd have more of a chance
of recovering some data by hand if the file were uncompressed.
The worst-case consequence is loss of all data.
Standard safety techniques include
i) backing up between sessions
ii) saving the data (whether compressed or not) between
modifications within a session
iii) writing a "journal file" entry for each modification,
within a session, combined with doing (i)
Option (i) is coarse-grained safety, with lowest cost. Option
(ii) is most costly in time, and even more so if compressing
with each save-to-file operation. Option (iii) is often a
reasonable compromise, but does require that you have a
recovery script that is able to take a stable image file and
replay the journal entries (add/change/delete) to bring it back
to the last checkpointed status.
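A rough idea of what option (iii) involves, sketched in Python rather than REBOL. The one-JSON-object-per-line journal format, the function names, and the sample records are all assumptions made for the illustration.

```python
import json
import tempfile
from pathlib import Path


def append_journal(journal: Path, op: str, key: str, value=None) -> None:
    """Record one modification as a single JSON line appended to the journal."""
    with journal.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")


def recover(image: dict, journal: Path) -> dict:
    """Replay add/change/delete entries on top of the last stable image."""
    data = dict(image)
    if journal.exists():
        for line in journal.read_text(encoding="utf-8").splitlines():
            entry = json.loads(line)
            if entry["op"] == "delete":
                data.pop(entry["key"], None)
            else:  # "add" and "change" both store the new value
                data[entry["key"]] = entry["value"]
    return data


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / "journal.log"
    image = {"1234": "Ferd Burfel", "2345": "Joe Doaks"}
    append_journal(journal, "change", "2345", "Joseph Doaks")
    append_journal(journal, "add", "3456", "Patrick Henry")
    append_journal(journal, "delete", "1234")
    recovered = recover(image, journal)
    print(recovered)  # {'2345': 'Joseph Doaks', '3456': 'Patrick Henry'}
```

Each modification costs one short append instead of a rewrite of the whole file, which is why this is cheaper than option (ii). A real implementation would also need to handle a half-written last line and to truncate the journal after each backup.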
YMMV, but I'd say that reinventing all of the functionality
and security of a real database engine from scratch is a
very non-trivial task. Good luck!
-jn-
[33/36] from: agem:crosswinds at: 27-Jun-2001 14:36
RE: [REBOL] Re: Compression
[louisaturk--eudoramail--com] wrote:
> Hi Brett,
> Thanks for the response.
<<quoted lines omitted: 5>>
> floppy disk for backup purposes. I don't want the database compressed
> during ordinary usage. How do I get around this problem?
add a few bytes at the start which tell whether the file is compressed.
something like:
read-db: does [
    data: read/binary %/a/data.r
    ; "thru" keeps the ":" in type, so the comparison below can match
    parse/all data [copy type thru ":" copy data to end]
    if "compressed:" = to-string type [data: decompress data]
    data
]
write-db: does [
    write/binary %data.r join "uncompressed:" data
]
write-db-compressed: does [
    write/binary %data.r join "compressed:" compress data
]
; both functions share the global word data
read-db
write-db-compressed
[34/36] from: brett:codeconscious at: 27-Jun-2001 22:40
> Please see my response to Louis for what I believe to be the
> real source of his difficulties.
Well you've got two on the list currently that I can see Joel.
(a) "Try using the same file you just decompressed!"
or (b) "...reinventing all of the functionality
and security of a real database engine from scratch is a
very non-trivial task. Good luck!"
(a) is the concrete level.
(b) is the overview level.
My response to Louis was positioned somewhere between these two.
I hope that helps you understand my message now.
Brett.
[35/36] from: joel:neely:fedex at: 27-Jun-2001 3:34
Hi, Brett,
Sorry for the lack of clarity!
Brett Handley wrote:
> > Please see my response to Louis for what I believe to be the
> > real source of his difficulties.
>
> Well you've got two on the list currently that I can see Joel.
>
> (a) "Try using the same file you just decompressed!"
>
That's the one I meant.
-jn-
[36/36] from: louisaturk:eudoramail at: 7-Jul-2001 18:11
Hi Brett, Joel, Volker, and Anton,
Many thanks for your responses. Sorry to take so long to respond. I'm
studying all that you have said. It sounds like compression greatly
increases the chance of losing data. I think I'll have to postpone using
this feature until I have more time to study it more deeply.
Louis
Notes
- Quoted lines have been omitted from some messages.