r3wp [groups: 83 posts: 189283]

World: r3wp

[All] except covered in other channels

shadwolf
4-Apr-2009
[3531]
Doing a complicated thing the complicated way is just a loss of time.
Gregg
4-Apr-2009
[3532]
If you're going to measure KLOC, you have to measure it with historical 
data (e.g. the 100 lines that became 30 that became 10). And you 
have to account for design, testing, and domain understanding. Plus, 
do you have a well known target, or are you doing R&D?
btiffin
4-Apr-2009
[3533]
Didn't Carl just post about a new metric?  http://www.rebol.net/wiki/Load_Mold_Sizes
 rebols don't count lines or semicolons, we loadmoldflatcompress.
Janko
4-Apr-2009
[3534]
Such smaller indexers can be very good to learn from, and a good basis 
if you need something more custom around data retrieval.. I searched 
a few times for anything like a text indexer written in REBOL, 
but I only saw one today :)
Sunanda
4-Apr-2009
[3535]
Continuing a thread about Skimp and the Mailing List archive from 
the REBOLweek group...

We have 40K messages, but we index threads.....So only 9500 or so 
:-)

There is a skimp index per year. Each index has 27 files (header 
+ A, B,C etc). 
That's 400-odd files. Total 4.5 meg.

Not kept in memory....We're running a CGI application, so the index is 
loaded afresh each time. (The OS may be caching for us.)
Janko
4-Apr-2009
[3536]
cool
Sunanda
4-Apr-2009
[3537]
Skimp also indexes the AltME archive on REBOL.org: 100,000+ usually 
very small messages:
http://www.rebol.org/aga-groups-index.r?world=r3wp
There are some additional data structures to handle that.
Janko
4-Apr-2009
[3538x2]
It could be a great solution for those who make web apps in REBOL; 
such things would often benefit greatly from search, and setting 
up Lucene/Solr is quite bulky for this task unless they really 
need search.
Hm.. so you are saying Skimp runs as CGI .. so you don't need to run 
a separate server (and have VPS access) for it?
Sunanda
4-Apr-2009
[3540]
Absolutely. The startup code for skimp is:
   do %skimp.r
Janko
4-Apr-2009
[3541x2]
Hm.. that is a big plus .. I implemented Solr (a RESTful server 
/ indexer on top of Lucene) for 2 clients, and having something that 
wouldn't require a VPS and all that would be much better.
If I get any other requests like this I will think of using Skimp; 
maybe I will also be able to contribute a little in that case.
Sunanda
4-Apr-2009
[3543]
Here's a very simple skimp session:
  1. start it
  2. create an index called %my-index and add two docs to it
  3. search %my-index for the word "words"

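   do %skimp.r    ; 1. start skimp
   ; 2. create %my-index by adding two docs to it
   skimp/add-words %my-index "doc1" "these are the words in doc1"
   skimp/add-words %my-index "doc2" "and these are the words in doc2"
   ; 3. search %my-index for the word "words"
   probe skimp/find-words %my-index ["words"]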


(In real life, it may be a little more complicated as you may want 
to set some config options).
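
[Editor's sketch] Conceptually, an index like the one in this session 
maps each word to the documents that contain it. The REBOL 2 sketch 
below is not skimp's code: word-index, add-doc and docs-with are 
hypothetical names, the index lives in memory rather than in files, 
and it assumes that only presence is recorded (no counts or positions).

```rebol
REBOL [Title: "Presence-only word index (illustrative sketch)"]

; Assumption: a flat block of word / doc-id-list pairs stands in for
; the real on-disk index files.
word-index: copy []

; Record that each distinct word of text occurs in doc-id.
add-doc: func [doc-id [string!] text [string!] /local ids] [
    ; In REBOL 2, parsing a string with NONE splits it on whitespace.
    foreach word unique parse lowercase copy text none [
        ids: select word-index word
        if none? ids [
            ids: copy []
            repend word-index [word ids]
        ]
        if not find ids doc-id [append ids doc-id]
    ]
]

; Return the ids of the documents that contain word (empty block if none).
docs-with: func [word [string!]] [
    any [select word-index word  copy []]
]

add-doc "doc1" "these are the words in doc1"
add-doc "doc2" "and these are the words in doc2"
probe docs-with "words"
```

The last line asks which documents contain "words", mirroring the 
search in the session above.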
Janko
4-Apr-2009
[3544]
does it already have some scoring implemented if you query for more 
than 1 word?
Sunanda
4-Apr-2009
[3545]
No. It simply records if the word is there or not.


You will see scoring if you search the Script Library -- so "relevant" 
scripts come first.
That's done by having more than one skimp index:
    --  header index
    -- comments index
    -- strings index
    -- etc

And then scoring according to how many of those indexes contained 
your search words:
   http://www.rebol.org/site-search.r
Janko
4-Apr-2009
[3546]
aha, cool .. then you can probably also simply make it so that for 
example header is 3x more important than comments etc... (like "boosting" 
in lucene)
Sunanda
4-Apr-2009
[3547]
Exactly -- we apply empirical weightings to get results that look 
good :-)
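
[Editor's sketch] One way to picture scoring across several indexes 
with weightings. The weights, index names and hit lists below are 
invented for illustration; the weightings REBOL.org actually uses are 
not given in this thread. Written for REBOL 2.

```rebol
REBOL [Title: "Weighted scoring across several indexes (illustrative sketch)"]

; Hypothetical weights: a hit in the header index counts 3x a hit elsewhere.
weights: [header 3 comments 1 strings 1]

; Hypothetical per-index results for a single query.
hits: [
    header   ["script-a.r"]
    comments ["script-a.r" "script-b.r"]
    strings  ["script-b.r"]
]

; Accumulate a flat block of doc / score pairs.
scores: copy []
foreach [name docs] hits [
    foreach doc docs [
        either pos: find scores doc [
            pos/2: pos/2 + select weights name
        ][
            repend scores [doc select weights name]
        ]
    ]
]

; Highest score first: records are 2 values wide, compared on the 2nd value.
sort/skip/compare/reverse scores 2 2
probe scores
```

Setting every weight to 1 reduces this to counting how many indexes 
contained the search words.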
Janko
4-Apr-2009
[3548]
cool, then it's not so much less powerful than Solr .. at least 
for how I used it.. I did custom stemming and synonyms outside of it 
anyway
Sunanda
4-Apr-2009
[3549]
No custom stemmings as yet. 


But the latest addition to the Library is Porter Stemming. That 
gives me some ideas about upgrading the Library's indexes:
http://www.rebol.org/view-script.r?script=porterstemming.r
Janko
4-Apr-2009
[3550]
I wouldn't build stemmer into search engine .. keep it slim :) .. 
In my case with solr I  process the docs before I index them and 
then I do the same to search queries .. it's all outside of solr..
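
[Editor's sketch] The approach of normalising text outside the indexer 
comes down to running the same function over documents before indexing 
and over queries before searching. In this REBOL 2 sketch the synonym 
table and the name canonical-text are hypothetical, and the 
normalisation is a plain synonym lookup rather than a real stemmer.

```rebol
REBOL [Title: "Normalising text outside the indexer (illustrative sketch)"]

; Hypothetical synonym table: each variant is followed by its canonical form.
synonyms: ["automobile" "car" "cars" "car"]

; Map every word of text to its canonical form and rejoin with spaces.
canonical-text: func [text [string!] /local out pos] [
    out: copy []
    foreach word parse lowercase copy text none [
        ; /skip 2 matches variants only, never the canonical forms.
        pos: find/skip synonyms word 2
        append out either pos [second pos] [word]
    ]
    form out
]

; The same function is applied on both sides, so variants meet
; on the canonical form.
probe canonical-text "Cars for sale"   ; text handed to the indexer
probe canonical-text "automobile"      ; text handed to the search
```

With skimp, the first result would be the string passed to 
skimp/add-words and the second would supply the words passed to 
skimp/find-words, so the indexer itself stays unchanged.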
Sunanda
4-Apr-2009
[3551]
Skimp already has, in effect, a plugin: make-word-list. That defines 
what a "word" is.


One way to implement stemming would be to make stemming a plugin 
to make-word-list. But I have not really thought about that yet:

http://www.rebol.org/documentation.r?script=make-word-list.r
Janko
4-Apr-2009
[3552]
I needed a specific stemmer ... the Porter stemmer fits English well 
but not my language, and I needed a lot of synonyms for the specific 
field the search was used in
[unknown: 5]
7-Apr-2009
[3553x2]
For anyone who needs a backlink to the REBOL-related pages of 
their website, I offer a REBOL links thread here: http://www.tretbase.com/forum/viewtopic.php?f=8&t=33
It does get scanned by Google, Yahoo, MSN, and other bots.
Sunanda
7-Apr-2009
[3555]
Good idea Paul.....

Another opportunity to present yourself to the world in a REBOL context 
is the members' pages at REBOL.org. Also highly friendly to Google 
and other search engines (No. 1 in Google for [I've tinkered with 
a lot of different languages]) :-)
http://www.rebol.org/lmp-index.r
[unknown: 5]
7-Apr-2009
[3556]
I need to look into that.  I think I only submitted one script to 
REBOL.org so far.
Sunanda
7-Apr-2009
[3557]
You do not have to be a script contributor to have a library member's 
page.
Brock
7-Apr-2009
[3558]
ah, slipping a little, now 'I've tinkered with a lot of different 
languages' is listed as the second page in Google!  ;-)
Sunanda
7-Apr-2009
[3559]
Google has many data centers, none quite synchronised. Results are 
semi-random :-)
[unknown: 5]
8-May-2009
[3560]
Does anyone here have a patent, and if so, where is it held?  US, elsewhere, 
etc...    Also, what concerns did you have when you filed your patent, 
or what problems did you face?
Henrik
8-May-2009
[3561]
Never did. Never will.
Geomol
8-May-2009
[3562]
Never did. Probably never will.
Dockimbel
8-May-2009
[3563]
Paul, I guess that it's about your compression methods. Maybe this 
analysis can help or even enlighten you: http://gailly.net/05533051.html
Geomol
8-May-2009
[3564]
That's quite funny! (And stupid.)
Reichart
8-May-2009
[3565x2]
I have patents, just type in the word patent and my last name...
I have helped file patents for 25 years... I'm not sure what your 
real question is though.
[unknown: 5]
8-May-2009
[3567x2]
Thanks Doc, yeah that document is tied to comp.compression group. 
 I know that group and have read their materials.  Thanks anyway 
though.
Reichart, do you have any patents through offices other than 
the US Patent Office?  I'm curious about the costs you have typically seen 
for filing, and whether your patents are utility or design patents.
Reichart
8-May-2009
[3569]
I hold patents in many fields, and around the world.  Costs are tricky. 
In general, a patent is not worth it in the big scheme of things. 
You'd best have something amazing.  Today, coming up with a patent 
in compression would not really matter, since it would just annoy 
people, e.g. the Unisys patent inside GIF.


Let's say you came up with a way to make something even 50% smaller. 
If you patent it and no one will touch it, does it really matter? 
And then, WHO would touch it, knowing that it is not open?  Is it 
really worth it?  Keep in mind, I made a lot of money specifically 
selling compression technology.  If you did much better, like 70% 
over the next best open system, then it might become worth something.


You have to weigh the value.  But figure that filing outside of America 
will cost half again what it costs in America.  That can range from 
doing it yourself (a few grand all said) to an average of $8K 
- $12K.


I personally have never paid less than $20K just in America. But 
my patents tend to be well researched (better than most people do 
for their patents).

The patent "search" is the expensive part.
[unknown: 5]
8-May-2009
[3570]
Well, the way I look at it - if I do get compression working then 
I'm willing to pay $10k for a patent and go after patents for it 
internationally.   I think there will be a market for it if it works.
Henrik
9-May-2009
[3571]
You would have to defend the patent as well, if you want to keep 
it valid. That may cost way more than the patent itself. I personally 
think that patenting an algorithm is a surefire way of avoiding widespread 
use.
Pekr
9-May-2009
[3572]
We wanted to patent our CCD camera Ethernet interface, but we were 
advised not to do so, because even if there were some patent 
violation, you have to start a court case in the given country, and 
we would have to be really rich to afford that ...
Janko
9-May-2009
[3573]
Would patenting a compression algorithm be a "software patent"?
ICarii
9-May-2009
[3574]
yes - patenting algorithms is something that only the USA seems to 
be doing.. it gets a little stupid in the end..
[unknown: 5]
9-May-2009
[3575]
Thanks for the comments.  Regarding protection of patents - this 
is an area where I believe the patent search is important.  If your 
lawyers have done a good patent search then I think the only defending 
you're doing is against those that are infringing on your patent.  In 
which case I think you stand to make MORE $$$.
Robert
9-May-2009
[3576]
Paul, forget patents. Not worth it. I hold 13; it costs a lot, takes 
endless time, and if a big player is infringing you won't have enough 
$$$ to enforce your rights.
[unknown: 5]
9-May-2009
[3577]
Well let's discuss that Robert.  How can it not be worth it if I 
hold the patent?  I would get the $$$ if I win the case against anyone 
infringing, which would be the case, so I don't follow your conclusions 
here.
Anton
9-May-2009
[3578]
if: func [cond then-block [block!]][do then-block]
TomBon
9-May-2009
[3579]
Paul, what Robert means is that a patent is worthless unless you have 
enough power to defend it. Unfortunately I have had the same experience: 
too expensive, and if a really big player wants to take it, they will 
get it, even if it takes years, your money and your nerves. There are 
enough simple and dirty tricks to dry you out. Only speed helps here, 
in my opinion.
[unknown: 5]
9-May-2009
[3580]
I guess I don't get how they can TAKE it?