r3wp [groups: 83 posts: 189283]

World: r3wp

[Power Mezz] Discussions of the Power Mezz

the code is using a number of tricks to be "fast" (esp. expand-macros.r), 
so it's not as clean as it could be.
Hi, first thanks for making and open sourcing power-mezz. 

I am trying to use load-html and am getting some strange results. 
It seems to produce, for example, a recursive [ html [ html [ html  ..... 
]]] on my simple HTML input (and on a real one that I tried). I prepared 
two examples to make the point as clear as possible.

http://paste.factorcode.org/paste?id=2263 (notice the stack overflow)
I added 2 more cases to the paste (2 annotations). load-html seems 
quite complex since it uses many other modules (that I don't understand 
either), so I would rather see whether you find something obviously wrong 
in my approach or a bug in power-mezz.
First: you only need to import %mezz/load-html.r in your examples. 
You're not using the other modules; they will be loaded automatically 
by load-html.r - you never need to worry about dependencies.
Second: your problem is that you are trying to mold the result, which 
is a tree where each node has a reference to its parent node (much 
like faces in R2). That's why you see the "loop".
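
To illustrate why a parent reference matters, here is a minimal sketch in plain REBOL 2 that does not use Power Mezz at all. The node layout [name parent children] and the names make-node and mold-node are hypothetical, chosen only for this illustration; the real node format in %mezz/trees.r may well differ. The point is only that a printer which walks the children and never follows the parent slot is guaranteed to terminate, while any dump that visits every slot of a node is led back to the parent again.

```rebol
REBOL [Title: "Parent-linked tree sketch (hypothetical layout)"]

; ASSUMPTION: a node is a block of [name parent children].
; This is NOT the Power Mezz node format, just an illustration.
make-node: func [name [word!] parent [block! none!] /local node] [
    node: reduce [name parent copy []]
    ; register the new node in its parent's child list (by reference)
    if parent [append/only third parent node]
    node
]

; Build a string for a node by walking its children only.
; The parent slot is never followed, so the back-references
; cannot cause endless recursion.
mold-node: func [node [block!] /local out] [
    out: join "[" form first node
    foreach child third node [
        append out join " " mold-node child
    ]
    append out "]"
]

root-node: make-node 'root none
html-node: make-node 'html root-node
body-node: make-node 'body html-node

print mold-node root-node   ; prints [root [html [body]]]
```

Each child really does point back at its parent here (same? second html-node root-node is true), which is the kind of cycle described above.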
there is a mold-tree function in %mezz/trees.r if you want to mold 
the tree. Or, you could simply use form-html to pretty print the 
tree for you.
Eg. for your first example:

t: load-html p
print mold-tree t

[root [] [html [] [head [] [title [] [text [value "t"]]]] [body [] 
[h2 [] [text [value "HEADING"]]] [p [] [text [value "first para"]]] 
[p [] [text [value "second para"]]]]]]

print form-html/with t [pretty?: yes]

        <p>first para</p>
        <p>second para</p>
(the pretty? option to form-html is something i only use for debugging, 
so it's not as pretty as it should be i guess)
You can also do things like:

>> mold-tree get-node t/childs/html/childs/head/childs/title
== {[title [] [text [value "t"]]]}
get-node and set-node are also from %mezz/trees.r ; most likely you 
don't want to mess around with %mezz/macros/trees.r , that is deep 
voodoo i use to make the html filter fast.
(if you have performance problems, we'll talk about it :)
other examples:

>> get-node t/childs/html/childs/head/childs/title/childs/text/prop/value 
== "t"

>> get-node t/childs/html/childs/body/childs/h2/childs/text/prop/value 
== "HEADING"
Also note that:

>> print form-html/with load-html "<p>A paragraph!" [pretty?: yes]
        <p>A paragraph!</p>
ie. load-html tries to cope with malformed input as much as possible.
wow, thank you a lot! I knew this was too obvious a "bug" to be real 
and that I was probably doing something wrong. GREAT!

I initially imported only the needed modules but got errors (I will 
try again and report). The errors went away when I imported the others manually. 
Just a second
very good that you cope with bad html .. I will need that functionality 
because no html is perfect.
I was planning to use BeautifulSoup if you didn't, but since you 
do, that is even better
I tried now, the problem with import was that I didn't set the absolute 
path to load-module/from before.
It all works now according to your example.. and I tested and it 
handles improper html very well! Thanks!
I looked at html-rules in load-html and I am stunned by how good 
the code / dialect is
there are a few things that it still can't do (the dialect i mean), 
but it's very powerful on the things it can do :) The documentation 
is here: http://www.rebol.it/power-mezz/mezz/niwashi.html
it's one of the parts that i think is best documented, so it's 
worth reading.
Gabriele - what is that? New templating system? Reminds me of Temple 
:-) (sorry, not following the discussion)
No, though it could easily be used to reimplement Temple, this time 
with the ability to load any html.