World: r3wp
[Core] Discuss core issues
Anton 10-May-2007 [7846] | So I usually look for short key strings which are unlikely to change to jump ever closer to the data I need. |
TimW 10-May-2007 [7847] | Yeah, I was actually parsing on the div class sections, but I can just as easily parse on the class names. I just liked being able to get to the data by throwing the tags away like [thru <div class="x"> any tag! copy data string! to end]. |
Brock 10-May-2007 [7848x5] | I do the same as Anton. Grab the smallest unique text to bracket the content you want using parse first. I then use load/markup to get at the other bits. |
I was actually thinking of building a dialect to extract the data I want, but I'm not certain how to proceed: each row of the table I need to grab, although similar, is always missing the odd element, so using element /x will not always give me the same data. I have found approx. 10 variations in the data for the tables I'm trying to pull from, so I'm not sure whether that is the best way. Any advice would be great. | |
here's an example of the type of content I am trying to effectively parse. | |
http://www.nhl.com/nhl/app?service=page&page=Schedule | |
I was using the length of the block returned by load/markup, but I would likely be better off defining a simple parse statement to grab the contents of the rows. | |
TimW 10-May-2007 [7853] | Oh, that worked great. I just read the string, found the div tag, then loaded it from there and I didn't have to change my code. |
btiffin 10-May-2007 [7854] | Brock; Check out Daniel's rebol.org submissions... mdlparser, quickparser and rblxelparser. Might be a few hints and tips. |
Brock 10-May-2007 [7855] | will do, thanks Brian. |
Terry 12-May-2007 [7856] | I have a question.. What's the best way to iterate through a hash/dict! etc looking for values to multiple keys with the same name ie: n:[one "test1" two "test2" one "test3"] where it returns an array of all 'one' key values? |
btiffin 12-May-2007 [7857x5] | remove-each [key item] n [not key = 'one] |
Of course, you'd want a copy :) | |
iirc, newer REBOLs may have other '-each' native! words. | |
During a Sunanda ML contest, someone came up with a parse solution to compacting ...it was fairly blazing in speed, but I got busy after phase one of the contest and didn't follow as close as I would have liked to. Check http://www.rebol.org/cgi-bin/cgiwrap/rebol/ml-display-thread.r?m=rmlPCJC for some details...watch the entries by Romano, Peter and Christian. The winning code ended up in http://www.rebol.org/cgi-bin/cgiwrap/rebol/view-script.r?script=rse-ids.r It's not the same problem set, but parse seems to be a bit of a magic bullet, speed wise. | |
I'm completely new to parse...haven't timed it...it probably breaks...
    out: copy []
    key: 'one
    parse n [to key any [thru key (insert tail out first n insert tail out second n) skip]] | |
Terry 12-May-2007 [7862] | I'm thinking parse is not the way to go (I'm referring to very large hash tables). |
Chris 12-May-2007 [7863x3] | ; This probably isn't efficient enough either:
    remove-dups: func [block [any-block!] /local this-key][
        forskip block 2 [
            this-key: pick block 1
            remove-each [key val] next next block [this-key = key]
        ]
        block
    ] |
; Parse?
    remove-dups: func [block [any-block!] /locals this mk mk2][
        parse block [
            any [
                set this word! skip mk:
                any [to this mk2: (remove/part mk2 2)]
                :mk
            ]
        ]
        block
    ] | |
Sorry, read the problem first :) | |
btiffin 12-May-2007 [7866] | Chris; :) I was using the remove-each for its nativeness. Terry; Parse has proven itself to be very fast. It may be faster than
    remove-each [key item] copy n [not key = 'one]
for large sets. Otherwise
    foreach [key item] n [if key = 'one [insert tail out reduce [key item]]]
But I'll bet the parse is faster (assuming it works...that is one of my first parses) |
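The foreach approach mentioned above can be wrapped in a small function. This is only a minimal sketch, assuming REBOL 2 and a flat key/value series; the name collect-values is hypothetical and does not come from the thread, and unlike the one-liner above it collects only the values rather than the key/value pairs.

```rebol
REBOL []

; Hypothetical helper (not from the thread): walk the series two items
; at a time and keep every value whose key matches.
collect-values: func [data [any-block!] key /local out][
    out: copy []
    foreach [k v] data [
        ; append/only keeps block values intact instead of splicing them
        if k = key [append/only out v]
    ]
    out
]

n: [one "test1" two "test2" one "test3"]
probe collect-values n 'one   ; prints ["test1" "test3"]
```

Because foreach only ever compares the first item of each pair, a value that happens to equal the key is never mistaken for one. The cost is a full scan of the series on every call.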
Chris 12-May-2007 [7867] | ; Parse, this time addressing the spec:
    consolidate: func [block [any-block!] /locals key val mk dup][
        parse block [
            any [
                set key word! val: skip mk:
                opt [
                    to key dup: (change/only val compose [(val/1)])
                    any [to key dup: (append val/1 dup/2 remove/part dup 2)]
                ]
                :mk
            ]
        ]
        block
    ] |
btiffin 12-May-2007 [7868] | Terry; go with Chris...He'll not lead you wrong. :) |
Chris 12-May-2007 [7869] | B: I wonder if 'parse or other loops work faster with hashes? |
btiffin 12-May-2007 [7870] | Hmm. Good point. R3 is going to have other native! '-each' words right? |
Terry 12-May-2007 [7871] | um .. not trying to remove dupes.. trying to collect them all (you know, like pokemon cards) |
btiffin 12-May-2007 [7872] | Terry; Sunanda's contest was using the parse technique on some very large indexes...for skimp. A foreach solution was actually the fastest for a teenie window of time, until Romano posted the parse winner. But...it was a different problem set. It was a very informative contest in terms of efficient REBOL coding. |
Chris 12-May-2007 [7873] | T: Yes, that's my last function (after I reread your post) |
Terry 12-May-2007 [7874x2] | These don't feel right. .. looking for the equiv. of SQL's "select value where key = 'one'" .. Isn't rifling through a 100 MB hash table using parse similar to rifling through an un-indexed SQL table? |
The rse-ids.r file seems to be what I'm looking for .. need to have a play. | |
btiffin 12-May-2007 [7876] | Yep. You could sort, find first, find last and copy the range? But that introduces sort... There is a blog entry about hash! but we have to wait till R3. RSN. Yeah, there was some high-level optimizing going on for that rse-ids little beauty. :) |
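One way to get closer to an indexed lookup is to pay for a single pass up front and group the values per key. The following is a rough sketch only, assuming REBOL 2, keys that are not themselves blocks, and enough memory for a second structure; build-index is a hypothetical name, and this is not how rse-ids.r works.

```rebol
REBOL []

; Hypothetical helper (not from the thread): build a hash! that maps
; each key to a block holding all of its values, in one pass.
build-index: func [data [any-block!] /local index slot][
    index: make hash! []
    foreach [k v] data [
        either slot: select index k [
            append/only slot v          ; key already seen: extend its block
        ][
            repend index [k reduce [v]] ; new key: start a fresh value block
        ]
    ]
    index
]

n: [one "test1" two "test2" one "test3"]
idx: build-index n
probe select idx 'one   ; prints ["test1" "test3"]
```

After the index is built, each query is a single select on the hash! instead of a scan over the whole data set. Whether that trade is worthwhile for a 100 MB table would need timing on real data.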
Chris 12-May-2007 [7877] | ; one more :)
    select-all: func [block [any-block!] key /locals result val][
        result: copy []
        parse block [any [to key skip set val skip (append result val)]]
        result
    ] |
btiffin 12-May-2007 [7878] | That I like... :) And Terry; You may be surprised at the timings of that 'to sequence. |
Chris 12-May-2007 [7879] | (thru key) would work as well, if not better, than (to key skip) :) |
btiffin 12-May-2007 [7880] | Or thru sequence :) |
Terry 12-May-2007 [7881] | What do you use for the key? a word value .. 'one ? |
Chris 12-May-2007 [7882] | Yes -- select-all data 'one |
Terry 12-May-2007 [7883x2] | doesn't work |
sorry, does work.. | |
Chris 12-May-2007 [7885] |
    >> n: make hash! [one "test1" two "test2" one "test3"]
    == make hash! [one "test1" two "test2" one "test3"]
    >> select-all n 'one
    == ["test1" "test3"] |
Terry 12-May-2007 [7886] | (I've been doing so much javascript and php lately, I'm really starting to lose whatever rebol understanding I once had) |
Chris 12-May-2007 [7887] | Ak, js is passable, php? :) |
Terry 12-May-2007 [7888] | This sort of hash table won't work anyway due to the 'word limitations of the current core |
Chris 12-May-2007 [7889] | It will work with other keys, but has the same issue as 'select in that values can be mistaken for keys. |
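The key/value confusion described above is easy to reproduce. This is a small illustration, assuming REBOL 2; the data and the lookup helper are made up for the example and are not from the thread.

```rebol
REBOL []

; Pairs are "x" -> "one" and "one" -> "y".
data: ["x" "one" "one" "y"]

; select matches the first "one" it meets, which is a value, and returns
; the item after it, so the answer is not the intended "y".
probe select data "one"   ; prints "one"

; Hypothetical helper: compare key positions only by stepping in pairs.
lookup: func [data [any-block!] key][
    foreach [k v] data [if k = key [return v]]
    none
]
probe lookup data "one"   ; prints "y"
```

The pairwise walk gives the intended answer at the price of a linear scan, which is the same trade-off the parse version with the even? index? check is making.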
Terry 12-May-2007 [7890x2] | exactly (or as the French say... exact) |
I'm trying real hard to get my simple data into a Rebol hash table, or blocks.. whatever.. but it seems like a traditional relational DB is the way to go.. even used only as a flat file DB :( | |
Chris 12-May-2007 [7892] | ; This may slow things down a little:
    select-all: func [block [any-block!] key /locals result val fork][
        result: copy []
        parse block [
            any [
                thru key fork:
                (fork: pick [[set val skip (append result val)][]] even? index? fork)
                fork
            ]
        ]
        result
    ] |
btiffin 12-May-2007 [7893] | I'm a little confused... what word limit are you bumping into? I thought the limit was only for set-words? Can't blocks have any number of symbols inside? Until ram is gone... |
Chris 12-May-2007 [7894] | I guess the alternative is *sigh* waiting for R3 where these issues will be addressed... |
Tomc 12-May-2007 [7895] | Terry, is ordering your data when you insert it prohibitive? |