Rebol & XML
[1/31] from: AJMartin::orcon::net::nz at: 5-Aug-2003 22:04
Thanks, Bryan and Will!
Bryan wrote:
> I've thought about doing the same, mainly cause I want to have xpath in
> Rebol, and to do that I need a decent xml parser. I'm sure you're better
> qualified than me for doing it but if you need any help on the project I'd
> be glad to help.
I've discovered that Gavin's parse-xml is based on SAX or the event model of
processing XML. At the moment, my thoughts are going towards a DOM model,
because Rebol is oriented that way, I feel, in reading and writing all of a
file at once. The DOM model builds a tree in memory. I want to access the
various values with path! values in Rebol. Here's a little XML (XMLSS from
MS Excel 2002):
XML: {<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Andrew John Martin</Author>
<LastAuthor>Andrew John Martin</LastAuthor>
<Created>2003-08-05T02:10:56Z</Created>
<LastSaved>2003-08-05T02:10:57Z</LastSaved>
<Company>Colenso High School</Company>
<Version>10.4219</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<DownloadComponents/>
<LocationOfComponents HRef="file:///\\"/>
</OfficeDocumentSettings>
</Workbook>
}
I'd like to process the above and then access the author's name with Rebol
script like:
XML/Workbook/DocumentProperties/Author
And set it with Rebol script like:
XML/Workbook/DocumentProperties/Author: "Andrew Martin"
Also we should think about several tags at the same level of nesting, like
in a table:
    row
        cell
        cell
        cell
Unfortunately, there's a problem with accessing the attributes of a tag! For
example, what's the path! value for accessing the value of the "xmlns"
attribute in the "DocumentProperties" tag?
XML/Workbook/DocumentProperties/________
Or perhaps I could use:
XML/Workbook/DocumentProperties/_Attribute/xmlns
Where "_Attribute" is the magic word for accessing attributes of a tag?
What do people think? Is there a better or simpler way that I've
overlooked?
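For illustration, here is a minimal sketch of the path! access described above, using a hand-built nested block rather than a real parser. The block layout (a word! for each tag name followed by its content, and an issue! for each attribute name) is an assumption made for this example, as is set-path! support on nested blocks in the Rebol version at hand.

```rebol
; Hand-built stand-in for a parsed document (hypothetical layout):
; each tag name is a word! followed by its content,
; each attribute name is an issue! followed by its value.
doc: [
    Workbook [
        DocumentProperties [
            #xmlns "urn:schemas-microsoft-com:office:office"
            Author "Andrew John Martin"
            Company "Colenso High School"
        ]
    ]
]

; Read an element's text with a path!
print doc/Workbook/DocumentProperties/Author

; Change it with a set-path!
doc/Workbook/DocumentProperties/Author: "Andrew Martin"
print doc/Workbook/DocumentProperties/Author

; Read an attribute through its issue! name
print doc/Workbook/DocumentProperties/#xmlns
```

With this layout no magic "_Attribute" word is needed, because attribute names and tag names are different datatypes.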
Andrew J Martin
ICQ: 26227169
http://www.rebol.it/Valley/
http://valley.orcon.net.nz/
http://Valley.150m.com/
[2/31] from: bry:itnisk at: 5-Aug-2003 15:06
> At the moment, my thoughts are going towards a DOM model,
> because Rebol is oriented that way, I feel, in reading and writing all
> of a file at once.
It definitely should be DOM; DOM is more familiar to most developers and
more popular than SAX.
[ The DOM model builds a tree in memory. I want to access the
various values with path! values in Rebol. Here's a little XML (XMLSS
from MS Excel 2002):
XML: {<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Andrew John Martin</Author>
<LastAuthor>Andrew John Martin</LastAuthor>
<Created>2003-08-05T02:10:56Z</Created>
<LastSaved>2003-08-05T02:10:57Z</LastSaved>
<Company>Colenso High School</Company>
<Version>10.4219</Version>
</DocumentProperties>
<OfficeDocumentSettings
xmlns="urn:schemas-microsoft-com:office:office">
<DownloadComponents/>
<LocationOfComponents HRef="file:///\\"/>
</OfficeDocumentSettings>
</Workbook>
}
]
>I'd like to process the above and then access the author's name with
>Rebol script like:
> XML/Workbook/DocumentProperties/Author
which is basically an xpath. I think it should probably be something
like
xpath XML "/Workbook/DocumentProperties/Author"
>And set it with Rebol script like:
> XML/Workbook/DocumentProperties/Author: "Andrew Martin"
yeah that was something I was also considering, the possibility of an
xpath setting syntax in Rebol.
>Also we should think about several tags at the same level of nesting,
>like in table:
> row
> cell
> cell
> cell
in the xpath data model of xml this would be taken care of via
position()
http://www.w3.org/TR/xpath#section-Node-Set-Functions
so that one has
row/cell[last()] returning the last cell node under row
row/cell[position() = 2] or row/cell[2] returning the second.
My idea was to have an object hierarchy that could be navigated in the
normal rebol manner, then have an xpath parser that would parse out
xpath strings to figure out the rebol path to something.
This might have problems though.
>Unfortunately, there's a problem with accessing the attributes of a
>tag! For
>example, what's the path! value for accessing the value of the "xmlns"
>attribute in the "DocumentProperties" tag?
<<quoted lines omitted: 4>>
>What do people think? Is there a better or simpler way that I've
>overlooked?
xmlns is a namespace declaration and as such not an actual attribute,
depending on what specifications your parser supports. A completely
valid xml parser supporting just the original xml specification would
consider it an attribute; however, most parsers do not, because they
also support namespaces.
Well, I think it needs to be abstracted one level,
so the information we get out is something like this (this is probably
horribly wrong since I haven't had much occasion to use make object!,
and what occasion I did have was a while ago):
xml: make object! [
    element: make object! [
        name: "Workbook"
        attributes: []
        default-namespace: "urn:schemas-microsoft-com:office:spreadsheet"
        namespaces: [
            o: "urn:schemas-microsoft-com:office:office"
            x: "urn:schemas-microsoft-com:office:excel"
            ss: "urn:schemas-microsoft-com:office:spreadsheet"
            html: "http://www.w3.org/TR/REC-html40"
        ]
        childtree: make object! [
            element: make object! [
                name: "DocumentProperties"
                .................... and so forth....................
            ]
        ]
    ]
]
Consider if this has to handle xml like the following:
<doc>
<section>hi <p att="here">text</p> some more text</section>
</doc>
There has to be a way to get hold of the various text nodes.
There are three text nodes under section (two direct children and one
inside p). So we would need something like this:
xml: make object! [
    element: make object! [
        name: "doc"
        childtree: make object! [
            element: make object! [
                name: "section"
                childtree: make object! [
                    t1: "hi"
                    element: make object! [
                        name: "p"
                        attributes: [
                            att: "here"
                        ]
                        t1: "text"
                    ]
                    t2: "some more text"
                ]
            ]
        ]
    ]
]
okay, enough of that you get the point, it could probably be better
designed, but problems here:
if the name of an element has a namespace prefix:
element: "svg:svg"
then of course the svg prefix needs to be associated somewhere with the
svg namespace.
The same if an attribute is associated with a namespace prefix (this is
very rare)
Namespaces can be tricky, people have a lot of preconceptions about them
that do not always bear out, different xml dialects have subtly
different namespace processing models. Case in point is svg processing
model which insists that if an svg namespaced element is within an
element in a namespace the processor is unfamiliar with then the svg
namespaced element is removed from the parse tree. Most xml dialects of
course have a model of ignoring the unknown namespace and forging ahead.
It might be possible to have a top-level object that holds all document
namespaces, and use this as a way to optimize namespace checking. Most
of the time namespaces are declared on the document element; if a
namespace isn't found there one can then try checking for it in the
local tree, but if it is there then one does not have to check in the
local tree.
The structure above of course means that you can't have as you wanted
before
XML/Workbook/DocumentProperties
But with this one could build an xpath interpreter on top of it, or a
lightweight one really quick that allowed you to write that and then
went through the steps.
It would then also allow for us to have functions like:
documentElement myxml
which would return "Workbook"
i.e. it would be possible to actually have something similar to a DOM
implementation for Rebol.
[3/31] from: brett:codeconscious at: 5-Aug-2003 23:50
Hi Andrew,
> I've discovered that Gavin's parse-xml is based on SAX or the event model
> of processing XML. At the moment, my thoughts are going towards a DOM model,
> because Rebol is oriented that way, I feel, in reading and writing all of
> a file at once. The DOM model builds a tree in memory. I want to access the
> ...
It has been a while since I looked at it, but I thought that the xml-object
that Gavin wrote did build a tree of objects. However I seem to recall that
the parser was not entirely complete.
> I'd like to process the above and then access the author's name with
> Rebol script like:
>
> XML/Workbook/DocumentProperties/Author
>
> And set it with Rebol script like:
>
> XML/Workbook/DocumentProperties/Author: "Andrew Martin"
It certainly looks neat, even if your underlying XML structure doesn't
support it, maybe a dialect can do the translation of your request. A
question to ponder though. How many instances of these would be literally in
your code, or would you need to build them up in other code?
> Also we should think about several tags at the same level of nesting, like
> in table:
>
> row
> cell
> cell
> cell
My XML is very limited. If we write a program to access a cell from the row
in this example are we implicitly encoding the structure of XML (DTD) into a
program?
> Unfortunately, there's a problem with accessing the attributes of a tag!
> For example, what's the path! value for accessing the value of the "xmlns"
> attribute in the "DocumentProperties" tag?
>
> XML/Workbook/DocumentProperties/________
>
> Or perhaps I could use:
>
> XML/Workbook/DocumentProperties/_Attribute/xmlns
>
> Where "_Attribute" is the magic word for accessing attributes of a tag?
>
Perhaps the attributes of each element can be held separately from the
element structures, and the same for namespaces.
The path notation then becomes a key to each of these three storage
structures. Therefore different aspects of a node could be:
Content XML/Workbook/DocumentProperties
Attributes XML/Workbook/DocumentProperties
Namespace XML/Workbook/DocumentProperties
etc. This is not well thought through - just throwing some ideas out.
By the way, do namespaces represent contracts for behaviour, or are they
like rebol objects - a way to provide separate contexts, or are they both?
> What do people think? Is there a better or simpler way that I've
> overlooked?
I wish :^)
I think that if you consider the different ways you might want to process
the XML you will find different representations that can be useful. Also if
a DTD (are these obsolete yet?) is available - the extra information might
have an impact. E.g using a DTD maybe a program could be generated that
knows how to traverse the XML implicitly (because it was generated to do
so).
Not being much help am I? :^)
Regards,
Brett.
[4/31] from: nitsch-lists:netcologne at: 5-Aug-2003 16:59
Hi Andrew and all,
how about
XML/Workbook@/Document@/Author@
change find XML/Workbook@ Author@ "Andrew Martin"
Idea is to name tag as email, attributes as issue.
then one could write XML/Workbook@/#myAtribute too.
[Workbook@ [
    #xmlns ".."
    DocumentProperties@ [#xmlns ".." Author@ "Andrew John Martin"]
    OfficeDocumentSetting@ [#xmlns LocationOfComponents@ [#HRef="file:///\\"]]
]]
the following snippet changes parse-xml to return attributes as issues.
now i can write XML/3/1/2/#from.
but tags should not be indexed, but named too IMHO.
; use issues for attribute-names, better path-syntax
xml-language: make xml-language [
    ;verbose: true
    add-attr: func [name value] [
        if none? second parent [parent/2: make block! 2]
        insert insert tail second parent to-issue name value
    ]
]
-Volker
On Tuesday, 5 August 2003 at 12:04, A J Martin wrote:
[5/31] from: bry:itnisk at: 5-Aug-2003 22:25
a propos the subject, realized after posting
earlier today that there was of course the
same problem with elements that I saw with
textnodes, i.e multiple elements as children
of a node, this will still be a problem
though, it seems to me if you just used
element names, as in
body: make object! [
hi I'm some text that's a child of the body
node
p: "hi this is a paragraph"
p: "this is another paragraph"
]
obvious problem(s) there.
what I was doing before for textnodes was
suggesting that there be a naming standard
of t{number}
so t1: "text"
t2: "more text"
then whenever one navigated to spot a path,
from there one could find out how many
textnodes there were, the same solution
could be done for elements.
also as I indicated before there is the
possibility of attributes having namespace
prefixes, so on second thought the
attributes syntax could be changed to the
following:
attributes: [
    att1: [name: "m:att" value: "here is some text" namespace: "http://www.someuri.com/m"]
]
other probs not touched on yet, processing
instructions and comments.
the question is really whether one wants to
build something good that handles one's own
problems, or something that can be built on
later to handle other problems. I don't think
it would be too much of a problem to build an
xml-to-object that just got the names of
elements and attributes and worked in most
cases. I don't think, however, that such a
tool would cover all cases (I may be wrong
about that; I'm not good enough in Rebol to
judge, but I am tolerably well informed about
xml issues). I think a totally generic tool
will not allow a straight rebol path of
xml/workbook/documentProperties; there'll
have to be a translation step.
[6/31] from: AJMartin:orcon at: 6-Aug-2003 18:00
Here's my load-XML function. Note that it doesn't get the values the right
way around. I've got something wrong in the commented out section of the
code. Can anyone figure out what I'm doing wrong?
Load-XML: function [
    [catch] "Loads XML as a Rebol compatible block of values."
    XML [string! file!] "The XML string or file."
] [
    Content Stack Attribute Value Attribute_Value^ Text^ Declaration^ Name
    Text Element^
] [
    Content: make block! 10
    Stack: make block! 10
    Attribute_Value^: [
        WS* copy Attribute [some Alpha opt [#":" some Alpha]] {="}
        copy Value to #"^"" skip (
            insert tail Content reduce [
                to issue! Attribute
                Value
            ]
        )
    ]
    Text^: complement charset #"<"
    Declaration^: [
        "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
            Content: reduce [Name Content]
        )
    ]
    Element^: [
        "<!--" thru "-->" | #"<" (
            ;push/only Stack reduce [Name Content]
            ;Content: make block! 6
        ) copy Name some Alpha any [Attribute_Value^] WS? [
            "/>" | #">" (push Stack Name) [
                some [
                    copy Text some Text^ (
                        if not empty? trim Text [
                            insert tail Content reduce [to word! Name Text]
                        ]
                    )
                    | Element^
                ]
            ] "</" (Name: pop Stack) Name WS? #">"
        ] (
            ;insert tail last first Stack reduce [Name Content]
            ;set [Name Content] pop Stack
        )
    ]
    if file? XML [
        XML: read XML
    ]
    all [
        probe parse/all/case XML [
            WS? Declaration^
            WS? Element^
            WS? end
        ]
        Content
    ]
]
Push: func [
    "Inserts a value into a series and returns the series head."
    Stack [series! port! bitset!] "Series at point to insert."
    Value [any-type!] "The value to insert."
    /Only "Insert a block value as a single value."
][
    head either Only [
        insert/only Stack :Value
    ][
        insert Stack :Value
    ]
]
Pop: function [
    "Returns the first value in a series and removes it from the series."
    Stack [series! port! bitset!] "Series at point to pop from."
][
    Value
][
    Value: pick Stack 1
    remove Stack
    :Value
]
Fail^: [to end skip] ; A rule that always fails.
Succeed^: [] ; A rule that always succeeds.
Octet: charset [#"^(00)" - #"^(FF)"]
Digit: charset "0123456789"
Digits: [some Digit]
Upper: charset [#"A" - #"Z"]
Lower: charset [#"a" - #"z"]
Alpha: union Upper Lower
Alphas: [some Alpha]
AlphaDigit: union Alpha Digit
AlphaDigits: [some AlphaDigit]
Control: charset [#"^(00)" - #"^(1F)" #"^(7F)"]
Hex: union Digit charset [#"A" - #"F" #"a" - #"f"]
HT: #"^-"
SP: #" "
LWS: charset reduce [SP HT #"^(A0)"]
LWS*: [some LWS]
LWS?: [any LWS]
LF: #"^(0A)"
WS: charset reduce [SP HT newline CR LF]
WS*: [some WS]
WS?: [any WS]
Graphic: charset [
    #"^(21)" - #"^(7E)"
    #"^(80)"
    #"^(82)" - #"^(8C)"
    #"^(8E)"
    #"^(91)" - #"^(9C)"
    #"^(9E)" - #"^(9F)"
    #"^(A1)" - #"^(FF)"
]
Printable: union Graphic charset reduce [SP #"^(A0)"]
Integer^: Digits
Decimal^: [Digits #"." Digits]
Money^: [#"$" Digits #"." 2 Digit]
; A Windows file name cannot contain any of these characters:
Forbidden: charset {\/:*?"<>|}
Line_End: [newline | end]
Blank_Line: [LWS? newline]
Blank_Lines: [any Blank_Line]
make object! [
    Zone: [[#"+" | #"-"] 1 2 Digit #":" 2 Digit]
    set 'Time^ [1 2 Digit #":" 1 2 Digit opt [#":" 1 2 Digit]]
    Long-Months: remove map Rebol/locale/Months func [Month [string!]] [
        reduce ['| copy Month]
    ]
    Short-Months: remove map Rebol/locale/Months func [Month [string!]] [
        reduce ['| copy/part Month 3]
    ]
    Month: [1 2 Digit | Long-Months | Short-Months]
    Separator: charset "/-"
    Day: [1 2 Digit]
    set 'Date^ [
        [
            [Day Separator Month Separator [4 Digit | 2 Digit]]
            | [4 Digit Separator Month Separator Day]
        ]
        opt [#"/" [Time^ opt Zone]]
    ]
]
make object! [
    Permitted: exclude Printable Forbidden
    Filename: [some Permitted]
    Folder: [Filename #"/"]
    Relative_Path: [some Folder]
    Absolute_Path: [#"/" any Relative_Path]
    set 'File^ [
        [Absolute_Path opt Filename]
        | [Relative_Path opt Filename]
        | Filename
    ]
]
make object! [
    Permitted: exclude Printable Forbidden
    Drive^: [Alpha #":"]
    Filename^: [some Permitted]
    Folder^: [Filename^ #"\"]
    Relative_Path^: [some Folder^]
    Absolute_Path^: [#"\" any Relative_Path^]
    set 'Local_File^ [Drive^ Absolute_Path^ opt Filename^]
]
Andrew J Martin
[7/31] from: AJMartin:orcon at: 6-Aug-2003 21:31
I figured out where I went wrong (I was 'push-ing too early). This function
loads XML very nicely.
Load-XML: function [
    [catch] "Loads XML as a Rebol compatible block of values."
    XML [string! file!] "The XML string or file."
] [
    Content Stack Attribute Value Attribute_Value^
    Text^ Declaration^ Name Text Element^
] [
    Content: make block! 10
    Stack: make block! 10
    Attribute_Value^: [
        WS* copy Attribute [
            some Alpha opt [#":" some Alpha]
        ] {="} copy Value to #"^"" skip (
            insert tail Content reduce [
                to issue! Attribute
                Value
            ]
        )
    ]
    Text^: complement charset #"<"
    Declaration^: [
        "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
            Content: reduce [Name Content]
        )
    ]
    Element^: [
        "<!--" thru "-->" | #"<" Z: copy Name [
            some Alpha (
                push/only Stack reduce [to word! Name Content]
                Content: make block! 6
            )
        ] any [Attribute_Value^] WS? [
            "/>" | #">" (push Stack Name) [
                some [
                    copy Text some Text^ (
                        if not empty? trim Text [
                            insert tail Content Text
                        ]
                    )
                    | Element^
                ]
            ] "</" (Name: pop Stack) Name WS? #">"
        ] (
            insert tail last first Stack reduce [
                to word! Name
                either 1 = length? Content [
                    first Content
                ] [
                    Content
                ]
            ]
            set [Name Content] pop Stack
        )
    ]
    if file? XML [
        XML: read XML
    ]
    all [
        parse/all/case XML [
            WS? Declaration^
            WS? Element^
            WS? end
        ]
        Content
    ]
]
With the example from Rebol HQ:
XML: {
<?xml version="1.0"?>
<PERSON>
<NAME>Fred</NAME>
<AGE>24</AGE>
<ADDRESS>
<STREET>123 Main Street</STREET>
<CITY>Ukiah</CITY>
<STATE>CA</STATE>
</ADDRESS>
</PERSON>
}
probe X: load-XML XML
Here's the result:
[xml [#version "1.0"] PERSON [NAME "Fred" AGE "24" ADDRESS [STREET "123 Main Street"
CITY "Ukiah" STATE "CA"]]]
And with my example XML fragment:
XML: {<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Andrew John Martin</Author>
<LastAuthor>Andrew John Martin</LastAuthor>
<Created>2003-08-05T02:10:56Z</Created>
<LastSaved>2003-08-05T02:10:57Z</LastSaved>
<Company>Colenso High School</Company>
<Version>10.4219</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<DownloadComponents/>
<LocationOfComponents HRef="file:///\\"/>
</OfficeDocumentSettings>
</Workbook>
}
Here's the results of various path! values:
>> x/workbook/documentproperties/author
== "Andrew John Martin"
>> x/workbook/officedocumentsettings/#xmlns
== "urn:schemas-microsoft-com:office:office"
:)
Andrew J Martin
[8/31] from: AJMartin:orcon at: 6-Aug-2003 21:53
> Load-XML: function [
'Load-XML can't handle multiple content and elements. For example:
XML: {<?xml version="1.0"?>
<document>
<h>Heading</h>
<p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
<p>this is a <span>stuff</span> paragraph</p>
</document>
}
probe X: load-XML XML
[xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is a"
b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]
>> x/document/h
== "Heading"
>> x/document/p
== [#class "Initial" "Hi this is a" b "bold" "paragraph."]
Access to second 'p is a bit tricky...
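One way to reach the second 'p is a small helper that walks the block with 'find and counts matches. The sketch below is only an illustration: nth-child is a hypothetical name, not part of 'Load-XML, and it assumes the block layout shown in the probe output above (a word! tag name immediately followed by its content).

```rebol
; Hypothetical helper, not part of 'Load-XML: returns the content of the
; n-th child called name, or none when there are fewer than n of them.
nth-child: func [node [block!] name [word!] n [integer!] /local pos] [
    pos: node
    while [pos: find pos name] [
        n: n - 1
        if zero? n [return second pos]
        pos: next pos    ; step past this match and keep looking
    ]
    none
]

; The block that 'Load-XML produced for the example above.
x: [
    xml [#version "1.0"]
    document [
        h "Heading"
        p [#class "Initial" "Hi this is a" b "bold" "paragraph."]
        p ["this is a" span "stuff" "paragraph"]
    ]
]

probe nth-child x/document 'p 2
; == ["this is a" span "stuff" "paragraph"]
```

Because 'find only searches the top level of the block it is given, a 'p nested inside another element is not counted.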
Andrew J Martin
[9/31] from: AJMartin:orcon at: 6-Aug-2003 22:07
Hi, Volker.
You wrote:
> Idea is to name tag as email, attributes as issue.
> then one could write XML/Workbook@/#myAtribute too.
Thanks for the idea, Volker! I managed to do without converting tag names to
email! values. Am I missing something important? ::worried look::
Andrew J Martin
[10/31] from: bry:itnisk at: 6-Aug-2003 12:09
How's it handling instances of nodes with the same name, and multiple
text nodes interlaced in the tree? Also, I've put in a processing
instruction. The idea behind processing instructions is that they can be
used to pass info to the processor, so in this case I put in some simple
rebol code as an example PI:
<?xml version="1.0"?>
<doc>
here's some text
<p>para 1</p>
<?rebol-process call: "stringvalue" print call?>
<p>para 2</p>
here's some more text
<p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
</doc>
I'd try it here but I'd have to go through and fix some formatting
errors Outlook put into the Rebol code. Hmm, I should probably use Rebol
to access mail from this list anyway.
[11/31] from: AJMartin:orcon at: 6-Aug-2003 22:20
Hi, Brett!
You wrote:
> It has been a while since I looked at it, but I thought that the
> xml-object that Gavin wrote did build a tree of objects. However I seem to
> recall that the parser was not entirely complete.
Gavin's comments in his scripts indicate that he was going for the SAX event
model approach? I didn't look any further than that (call me slack...).
> > XML/Workbook/DocumentProperties/Author: "Andrew Martin"
> How many instances of these would be literally in your code, or would you
> need to build them up in other code?
I'm uncertain about that at this point. I know that I can relatively easily
build up path! values in Rebol.
> If we write a program to access a cell from the row in this example are we
> implicitly encoding the structure of XML (DTD) into a program?
I think that's unavoidable; after all some program eventually has to
understand the structure to work with it. I feel that the goal for 'Load-XML
is to get rid of a lot of "drag" in checking for end-tags and so on;
refactoring out the common code in converting XML to Rebol values.
> By the way, do namespaces represent contracts for behaviour, or are they
> like rebol objects - a way to provide separate contexts, or are they both?
I know very little about namespaces. I think they could be modeled as a
context (or object!) in Rebol.
> I think that if you consider the different ways you might want to process
> the XML you will find different representations that can be useful.
Can anyone think of alternative representations of an XML document? There is:
* Text;
* SAX or event model;
* DOM or tree model;
* _________________;
> Also if a DTD (are these obsolete yet?) is available - the extra
> information might have an impact. E.g using a DTD maybe a program could be
> generated that knows how to traverse the XML implicitly (because it was
> generated to do so).
I think a DTD parser could be very helpful in writing a generic XML script
that does generic things.
> Not being much help am I? :^)
Thanks, Brett! :)
Andrew J Martin
[12/31] from: AJMartin:orcon at: 6-Aug-2003 22:31
Bryan wrote:
> How's it handling instances of nodes with the same name, and multiple text
> nodes interlaced in the tree,
Given this XML:
XML: {<?xml version="1.0"?>
<document>
<h>Heading</h>
<p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
<p>this is a <span>stuff</span> paragraph</p>
</document>
}
'Load-XML produces:
[xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is a"
b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]
Which is OK for text processing, I feel.
> also I've put in a processing instruction, the idea behind processing
> instructions is that they can be used to pass info to the processor, so in
> this case I put in some simple rebol code as an example PI :
> <?rebol-process call: "stringvalue" print call?>
I haven't seen this before now. Is it something you've just thought of (and
so we can change it to better suit Rebol and users)? Or is this related to
some XML standard?
If it's something we can change, I think I'd like to see it as:
<?Rebol [print 3.14159]?>
Or perhaps I've misunderstood?
I tried your example:
XML: {
<?xml version="1.0"?>
<doc>
here's some text
<p>para 1</p>
<p>para 2</p>
here's some more text
<p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
</doc>
}
and discovered that 'Load-XML doesn't handle tags in a name-space. :(
Andrew J Martin
[13/31] from: AJMartin:orcon at: 6-Aug-2003 22:44
A better version of 'Load-XML:
Load-XML: function [
    [catch] "Loads XML as a Rebol compatible block of values."
    XML [string! file!] "The XML string or file."
] [
    Content Stack Attribute Value Attribute_Value^ Namespace_Name^
    Text^ Declaration^ Name Text Element^
] [
    Content: make block! 10
    Stack: make block! 10
    Namespace_Name^: [Alpha any AlphaDigit opt [#":" Alpha any Alphadigit]]
    Attribute_Value^: [
        WS* copy Attribute Namespace_Name^ {="} copy Value to #"^"" skip (
            insert tail Content reduce [
                to issue! Attribute
                Value
            ]
        )
    ]
    Text^: complement charset #"<"
    Declaration^: [
        "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
            Content: reduce [Name Content]
        )
    ]
    Element^: [
        "<!--" thru "-->" | #"<" Z: copy Name [
            Namespace_Name^ (
                push/only Stack reduce [
                    to word! Name Content
                ]
                Content: make block! 6
            )
        ] any [Attribute_Value^] WS? [
            "/>" | #">" (push Stack Name) [
                some [
                    copy Text some Text^ (
                        if not empty? trim/lines Text [
                            insert tail Content Text
                        ]
                    )
                    | Element^
                ]
            ] "</" (Name: pop Stack) Name WS? #">"
        ] (
            insert tail last first Stack reduce [
                to word! Name
                either 1 = length? Content [first Content] [Content]
            ]
            set [Name Content] pop Stack
        )
    ]
    if file? XML [
        XML: read XML
    ]
    all [
        parse/all/case XML [
            WS? Declaration^
            WS? Element^
            WS? end
        ]
        Content
    ]
]
With Bryan's test XML (modified by me!):
XML: {
<?xml version="1.0"?>
<doc>
here's some text
<p>para 1</p>
<p>para 2</p>
here's some more text
<p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
</doc>
}
I'm getting:
[
xml [#version "1.0"] doc [
"here's some text" p "para 1"
p "para 2" "here's some more text"
p:p [#xmlns:p "http://www.uris.org/p" "not a para"]]]
Note that the "p:p" is converted to a word! value, currently. I feel it
should be a path! value, like this:
p/p [xmlns/p "http://www.uris.org/p" "not a para"]
So too should the attribute names.
What do people think?
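As a rough sketch of that conversion, a prefixed name could be split on the colon and rebuilt as a path! value. The helper name qname-to-path is hypothetical and is not part of 'Load-XML; it also assumes that names contain at most one colon.

```rebol
; Hypothetical helper: turns "p:p" into the path! p/p and leaves
; unprefixed names such as "doc" as plain word! values.
qname-to-path: func [name [string!] /local parts] [
    parts: parse/all name ":"    ; split on the colon only
    either 1 = length? parts [
        to word! first parts
    ] [
        to path! reduce [to word! first parts to word! second parts]
    ]
]

probe qname-to-path "p:p"
probe qname-to-path "xmlns:p"
probe qname-to-path "doc"
```

One trade-off to keep in mind: a path! used as a key inside the result block can no longer be selected by a single step of a longer path!, so lookups of prefixed names would need a helper of their own.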
Andrew J Martin
[14/31] from: bry:itnisk at: 6-Aug-2003 13:49
[
XML: {<?xml version="1.0"?>
<document>
<h>Heading</h>
<p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
<p>this is a <span>stuff</span> paragraph</p>
</document>
}
'Load-XML produces:
[xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is a"
b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]
Which is OK for text processing, I feel.]
I guess so, but shouldn't it give p ["this is a" span ["stuff"] "paragraph"]
though?
> <?rebol-process call: "stringvalue" print call?>
>I haven't seen this before now. Is it something you've just thought of
>(and so we can change it to better suit Rebol and users)? Or is this related
>to some XML standard?
Processing Instructions aren't used that much, according to the xml
standard a PI is supposed to be passed to the processing application,
which can then do with it what it will, ignore it, parse it for other
stuff. An application which does make use of PIs is Apache Cocoon.
A common PI is the stylesheet PI, something like <?xml-stylesheet
href="some.xsl"?> (been a while since I used that, link here
http://www.w3.org/TR/xml-stylesheet/)
Some people don't like Processing Instructions and would like to get rid
of them in the next version of XML, but it doesn't look like they will
be gotten rid of.
So a PI can be used to pass evaluatable code if that's what one wants to
do with it.
A PI is anything with a structure <?....?> not at the top of the
document, that is to say the xml declaration <?xml version="1.0"?> is
not a PI despite the structural similarity.
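To make the distinction concrete, here is a small sketch that collects the processing instructions in a string and skips the XML declaration. It is an illustration only: the variable names are made up, the PI bodies are collected as plain strings and never evaluated, and it assumes every "<?" has a matching "?>" with a non-empty body.

```rebol
xml-text: {<?xml version="1.0"?><doc><p>para 1</p><?rebol-process print 1 + 2?><p>para 2</p></doc>}

instructions: copy []
parse/all xml-text [
    any [
        to "<?" "<?" copy body to "?>" "?>" (
            ; the XML declaration looks like a PI but is not one, so skip it
            if not find/match body "xml " [append instructions body]
        )
    ]
    to end
]

probe instructions
; == ["rebol-process print 1 + 2"]
```

What the application then does with each collected string (ignore it, parse it, or evaluate it) is up to the application, as described above.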
>If it's something we can change, I think I'd like to see it as:
yeah it can be changed. As long as it has <? ?> structure it can contain
anything between the ? ?
>and discovered that 'Load-XML doesn't handle tags in a name-space. :(
yeah I thought there might be problems, another problem is in the rare
occurrence of namespace prefixed attributes.
In fact I think I've seen some Excel Workbooks with that.
[15/31] from: sqlab:gmx at: 6-Aug-2003 14:02
Hello Andrew
I would propose something like
>Get-XML x/document/p
is the same as ">Get-XML x/document/p/1"
Then we can do
>Get-XML x/document/p/2
..
..
>Get-XML x/document/p/:n
with a function similar to
Get-XML: func ['x [path ..] [...]
Or just an XML: context [
load: func [...
set: func [..
get: func [
]
AR
> > Load-XML: function [
> 'Load-XML can't handle multiple content and elements. For example:
<<quoted lines omitted: 24>>
[16/31] from: bry:itnisk at: 6-Aug-2003 15:03
>yeah I thought there might be problems, another problem is in the rare
>occurrence of namespace prefixed attributes.
Actually, on reflection, not too rare in xml coming out of
academia or certain standards, where the xml namespace will be used,
especially in examples like the following:
<?xml version="1.0"?>
<doc>
<p xml:lang="EN">Good Morning!</p>
<p xml:lang="EN-US">Howdy!</p>
</doc>
note that the xml namespace does not need to be declared anywhere.
[17/31] from: bry:itnisk at: 6-Aug-2003 15:26
Two last things that would need to be handled, in order of priority it
seems to me:
CDATA sections
Syntax: <![CDATA[ can be anything in here including malformed xml ]]>
http://www.w3.org/TR/REC-xml#sec-cdata-sect
And comments
http://www.w3.org/TR/REC-xml#sec-comments
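Both constructs are straightforward to recognise with parse rules. The sketch below is an assumption about how such rules might look, not the rules used in 'Load-XML; the names cdata-rule, comment-rule and cdata-text are made up for the example.

```rebol
cdata-text: none

; A CDATA section: everything up to the first "]]>" is literal text.
cdata-rule: ["<![CDATA[" copy cdata-text to "]]>" "]]>"]

; A comment: skipped entirely.
comment-rule: ["<!--" thru "-->"]

; Either of the two, for use inside a larger element rule.
misc-rule: [cdata-rule | comment-rule]

probe parse/all {<![CDATA[1 < 2 & <b>not a tag</b>]]>} misc-rule
probe cdata-text
probe parse/all {<!-- just a comment -->} misc-rule
```

Inside the CDATA section the "<" and "&" characters are copied as-is, which is the point of the construct.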
[18/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:06
The attached version of 'load-XML.r now handles these cases. (It doesn't
handle name spaces correctly yet, though!)
Andrew J Martin
Attendance Officer &
Information Systems Trouble Shooter
Colenso High School
Arnold Street, Napier.
Tel: 64-6-8310180 ext 826
Fax: 64-6-8336759
http://colenso.net/scripts/Wiki.r?AJM
http://www.colenso.school.nz/
[19/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:18
Bryan wrote:
> > p ["this is a" span "stuff" "paragraph"]
> ...shouldn't it give p ["this is a" span ["stuff"] "paragraph"]
though?
I deliberately did that in this line:
either 1 = length? Content [first Content] [Content]
So as to make it easier and simpler to use.
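To make the trade-off concrete, here is a minimal sketch of that rule in isolation. The names `collapse` and `as-block` are hypothetical and appear nowhere in 'Load-XML; the second helper only shows what a consumer could do if it always wants a block back.

```rebol
Rebol []

; The rule from the line above, wrapped in a hypothetical helper.
collapse: func [content [block!]] [
    either 1 = length? content [first content] [content]
]

; Hypothetical inverse for callers that always want a block.
as-block: func [value] [
    either block? :value [value] [reduce [:value]]
]

probe collapse ["stuff"]                    ; a lone string is unwrapped
probe collapse ["Hi this is a" b "bold"]    ; longer content stays a block
probe as-block collapse ["stuff"]           ; wrapped again on demand
```

The convenience comes at the cost Bryan points out: code reading the result has to check `block?` before iterating over an element's content.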
> another problem is in the rare occurrence of namespace prefixed
attributes.
> In fact I think I've seen some Excel Workbooks with that.
What do you think of using a path! value for these?
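As a rough sketch of what that could look like, the snippet below turns a prefixed name such as "xml:lang" into a path! and leaves unprefixed names as plain words. `qname-to-value` is a hypothetical helper written only for this illustration; it uses nothing beyond `parse/all`, `to word!` and `to path!`.

```rebol
Rebol []

; Hypothetical helper: "xml:lang" becomes the path xml/lang,
; while an unprefixed name such as "class" becomes the word class.
qname-to-value: func [name [string!] /local parts] [
    parts: copy []
    foreach part parse/all name ":" [append parts to word! part]
    either 1 = length? parts [first parts] [to path! parts]
]

probe qname-to-value "xml:lang"
probe qname-to-value "class"
```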
Andrew J Martin
DISCLAIMER: Colenso High School and its Board of Trustees is not responsible (or legally
liable) for materials distributed to or acquired from user e-mail accounts. You can report
any
misuse of an e-mail account to our ICT Manager and the complaint will be investigated.
(Misuse can come in many forms, but can be viewed as any material sent/received that
indicate or suggest pornography, unethical or illegal solicitation, racism, sexism, inappropriate
language and/or other issues described in our Acceptable Use Policy.)
All outgoing messages are certified virus-free by McAfee GroupShield Exchange 5.10.285.0
Phone: +64 6 843 5095 or Fax: +64 6 833 6759 or E-mail: [postmaster--colenso--school--nz]
[20/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:21
Curse that Listar!
-- Binary/unsupported file stripped by Listar --
-- Type: application/octet-stream
-- File: Load-XML.r
-- Desc: Load-XML.r
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{
789C9D55516FDB36107E967EC55945B07898A5641936C04BE778498B65483AA3
3192018252D0122311964981A4E6BACB7ED0FEE58E47C9769C3ECD0F967877BC
BBEFBB8F541A7EE40B55431A02FE3EB0151FC337378A15A33F6F6FC8F65ED468
3BEA6DB126EB5C58678E9C19D01C9175DADA4A69344FE177B865DA0AE91DFC96
897A0C54EB42E95CC958721BCB2FE47DE08B3154D636E32459AFD7B17661B1B0
C93DAB6BBE4928E88FB5E494FA8BE5B954B52A373EF7475156D6A0E752351BED
16F02F7C7F727206BB2EBE83BD6DB1DF37670BDCF503BDCF5ADD28D303320E11
3003CC770CB95A35CC8A45CD6151AB7C09EA09FE6275CB4D97EB86C9B265A523
EF9D2C6B612ACF47BE946A5DF3A2E42B2E5D939EE7205AE80D9311E0E3425829
CC32C61A9DEF5ED54BAEE183B026AF2290F41C61526B2E90B5DC81903C2E7817
FFABE6D6C26F4C164896CB89CB8B5C151C6936B950ADD925CFFC9C392F5C3347
C8D96AA524CC98361CEE3D241DC1D1AC3555ACF1A99A58FB4D57CC22BE9F9269
5BB6C6268E61B2DF736D849263388DCFE213326561D8EB650C4FADCC2D0620F6
20CD99CDABECFFD01CB8E0D4582D6439802754E500F3CC2B4E59BC1D94268F0B
CF5CBD4B252D120F779661BAA9C5A8456B3BA4BBF5275A3F92FE4DC372FEC9BD
3D86C19C7FB68F70C5F39A69E640F81870767857D3541F5FD61AC38A2D3B0003
383D09032AFEDA7C500CC731AD9B8A01931BA0B72B510A0BAAB190BE89C6111C
B80BE7CEB2303884E15416040F77DF22A1CD660FF54145F8FB6DF48F8FF17C58
056FA2C72802B3140D1C63924048C3B5058BC7177A32352FDA9C539120C03DC2
98960F7675C84E19E92D73FFC3905E9CDC1C65D862743E188D22B0956E211A8D
7E89D03DBBF69E496F9FA039F819665AE5DC1837E16B89A36E4950068E67D743
609A8328A5C2B6E26E626392921F0FE49513B74568E7288BFD597AA2B0DCE755
1DC17177FBE162E8593E94C7C3DD845AF2D46CE7DDF341CAE8ACFBA07B9DF872
AE0D4F3AC5138B8783A1FC4183673051B2DE74F2EDEB20E56BA58B01BC2AF875
0DFE482ED70C0AD5014B0F9165048D5A8912C4F78C643998AE83AEB8AB35EC66
6E54DF78B09D28EEC1E9C1B3B7A604904E0905FB73F40C38F4F4F26A3A9FA61D
07148280A22CC38267A4BCACC38FEA7B02A92CF055633713C09E57492D24377E
5BD7C1D745EA227A7FD6BD0CFDF379EFE4EEDC78999C275B1534AAF1C0879E65
C78FE3C493F8EA68D4CC58BC79B4B107B3EACFC86E6064E1C25678C59FC25BA8
B92C6D35D9369EFA34FD58217D3160FA776A7EA9B65DBF7BBA43F6DC6538A10B
D27542D7B1E6FE8BED63F0034BAEC69D91045749CE0CDFEE081CEEFD23D3DBF6
19746B2E0B948B560BDED11676CD761DFA6AF85DC8C2FF007B9F7DD773080000
}
Andrew J Martin
[21/31] from: andrew:martin:colenso:school at: 7-Aug-2003 14:07
This version handles name spaces better.
[
Rebol [
    Name: 'Load-XML
    File: %Load-XML.r
    Title: "Load XML"
    Author: "A J Martin"
    eMail: [Rebol--orcon--net--nz]
    Web: http://www.rebol.it/Valley/
    Owner: "Aztecnology"
    Rights: "Copyright (c) 2003 A J Martin, Aztecnology."
    Tabs: 4
    Purpose: "Loads XML as a Rebol compatible block of values."
    Language: 'English
    Acknowledgements: [
        "bryan" [bry--itnisk--com]
        "Volker Nitsch" [nitsch-lists--netcologne--de]
        "Brett Handley" [brett--codeconscious--com]
    ]
    Needs: [%"Common Parse Values.r" %Push.r %Pop.r]
    Date: 7/August/2003
    Version: 1.5.0
]
make object! [
    ; A prefixed name such as "p:p" becomes a path! (p/p);
    ; any other string is converted to the given Type.
    Namespace_Name: func [Name [string! word! path!] Type [datatype!]] [
        either string? :Name [
            either found? find Name #":" [
                to path! map parse/all Name ":" func [Word [string!]] [
                    to word! Word
                ]
            ] [
                to Type Name
            ]
        ] [
            :Name
        ]
    ]
    NameChar: union AlphaDigit charset ".-_:"
    Name^: [[Alpha | #"-" | #"_"] any NameChar]
    Comment^: ["<!--" thru "-->"]
    PI^: ["<?" thru "?>"]
    Miscellaneous^: [any [PI^ | Comment^]]
    set 'Load-XML function [
        [catch] "Loads XML as a Rebol compatible block of values."
        XML [string! file!] "The XML string or file."
    ] [
        Content Stack
        Attribute Value Attribute_Value^
        Text^ Declaration^ Name Text Element^ Miscellaneous^
    ] [
        Content: make block! 4
        Stack: make block! 10
        ; Attributes are stored as an issue! (or path!) followed by the value.
        Attribute_Value^: [
            WS* copy Attribute Name^ #"=" [
                #"^"" copy Value to #"^"" skip
                | #"'" copy Value to #"'" skip
            ] (
                insert tail Content reduce [
                    Namespace_Name Attribute issue!
                    Value
                ]
            )
        ]
        Text^: complement charset #"<"
        Declaration^: [
            "<?xml" (Name: "xml") any Attribute_Value^ WS? "?>" (
                Content: reduce [
                    Namespace_Name Name word!
                    Content
                ]
            )
        ]
        Element^: [
            #"<" copy Name [
                Name^ (
                    push/only Stack reduce [
                        Namespace_Name Name word!
                        Content
                    ]
                    Content: make block! 6
                )
            ] any [Attribute_Value^] WS? [
                "/>" | #">" (push Stack Name) [
                    some [
                        Comment^ | PI^ | Element^
                        | "<![CDATA[" copy Text to "]]>" 3 skip (
                            if not empty? trim Text [
                                insert tail Content Text
                            ]
                        )
                        | copy Text some Text^ (
                            if not empty? trim/lines Text [
                                insert tail Content Text
                            ]
                        )
                    ]
                ] "</" (Name: pop Stack) Name WS? #">"
            ] (
                insert tail last first Stack reduce [
                    Namespace_Name Name word!
                    either 1 = length? Content [first Content] [Content]
                ]
                set [Name Content] pop Stack
            )
        ]
        if file? XML [
            XML: read XML
        ]
        all [
            parse/all/case XML [
                WS? Declaration^
                WS? Miscellaneous^
                WS? Element^
                WS? Miscellaneous^
                WS? end
            ]
            Content
        ]
    ]
]
]
XML: {<?xml version="1.0"?>
<document>
<h>Heading</h>
<p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
<p>this is a <span>stuff</span> paragraph</p>
<![CDATA[<greeting>Hello, world!</greeting>]]>
here's some more text <!-- A comment! -->
<p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
<PERSON foo="bar">
<NAME bar="blech">Fred</NAME>
<AGE>24</AGE>
<ADDRESS>
<STREET>123 Main Street</STREET>
<CITY>Ukiah</CITY>
<STATE>CA</STATE>
</ADDRESS>
</PERSON>
</document>
}
[xml [#version "1.0"] document [h "Heading"
p [#class "Initial" "Hi this is a" b "bold" "paragraph."]
p ["this is a" span "stuff" "paragraph"]
"<greeting>Hello, world!</greeting>"
"here's some more text"
p/p [xmlns/p "http://www.uris.org/p" "not a para"]
PERSON [#foo "bar" NAME [#bar "blech" "Fred"]
AGE "24" ADDRESS [STREET "123 Main Street" CITY "Ukiah" STATE "CA"]]]]
Andrew J Martin
[22/31] from: andrew:martin:colenso:school at: 7-Aug-2003 14:24
> This version handles name spaces better.
And this version has better formatting.
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{
789CA5566D6FDB3610FE2CFF8A338320F5304BC9BA174048EB78498B6D48BAA0
319201861CD0126369964981A4E6BAE81FDABFDC9127C97692A5C0960F117577
3C3EF7F0B993A7BD8F62AE4A98F600FF3EF09588E1E852F16CF8C7D5A5B7BD2F
4AB41DB6B6507BEBA4B0CECC9C19D0CCBC755CDB5C69348FE137B8E2DA16921C
E28A17650CFEAC33A5532543296C283F7BEF9D98C7905B5BC551B45EAF43EDC2
C2C246B7BC2CC526F241BFAFA5F0A93F5B914A55AAC586727F2C16B935E83957
D546BB17F81BBE3B3E7E0D5B14DFC2CEB690F64DF81C777DEFD7D7B5AE94690B
32AE22E00638218654AD2A6E8B7929605EAA7409EA01FEE2652D4C93EB92CB45
CD178EBC7772511626273ED2A554EB52640BB112D281249E0336D71B2E19E0E3
ACB0B230CB10CF687CB7AA5C0A0D1F0A6BD29C81F4CF2126B5E60C594B5D1152
849968E27FD6C25AF885CB0CC97239F1F52C5599409A4D5AA8DA6C932774CF42
640ECC2172B65A2909D75C1B01B754926670785D9B3CD4F85455A869D305B758
DF4FD1B85ED4C6468E616FBF15DA144AC67012FE101E7B53D2EBADF852809AFF
2952DBC7AA03272D53F154DC93C81E6A99C2D4AD616AAC2EE4A20F6BA5B33E20
D3793F81C9A64257C62DB7B8EA2789CB1288C2E6C80DED18414C09D0D17A1E54
2DB3113C1432F372860316338A08ACA2E4B0E215AEB0E4080546612E8A30DD21
8A0E53732C6D267CCE4FA6C43F926D768FD9A5EB75EEC61BB756674C888EF39C
A3A06B89E4C1B8AC727E512C0A0B69EE905960E1F03E66143AC3CB9AFA18F882
150D997FDCB304B8DC409B0CF3BAFB44A5B97876DA1F62A0CD750D6C387CCBD0
7DFD2B7946AD7DE4CD57854945597229502C2EC2659D62301ED3664C30CEC1EA
A683A7CB3AF0AEC269CA6D9A27FFA58182C045772A78C08183F7CF26B9F069C8
0E4A7B8F8F2752CF95B4880C6E2C4F97F83EB61839AF6D2364E8DEEFFDFB0C43
26E2939DC185484BAEB9C33EA3DB77767857FA2E9DC13E1D8FCF8BC16BDBD7D1
C7F911041EC0BEF9E47817500320262DDCDD7C8384549B2D400F628637FAA695
EA019B314651540CAA8B6C6659543EC429E0E869CCD14E4802AFFCB39046680B
168730B4AC6991D5A968D5BDDF9E3BC80A636AD1A7207FCAAEF4075ED141D032
1BFB8B261A3B1D1FB0537767BBA43744A00E3FAD4A06AF682430F732F08A7ECC
1CDCDD8CBC589B82BAAB78B10AFFCF372DB99B5D4F2BC07FEDE537D01C68A276
3B61A8131B044185133252B2DC90FE1E21F91A947D2C0D98E715F623F906CD8D
FAD67CCC4FE209A2B359F496C68363CBA16C003A0883169F51AB2DD6B6C57117
F57C4B46E3FF02384BA6E717E3C978DAB0E21B06F5C692048F79ED25D732837A
7B00A92C885565372340A82BDAD01EF8BC225D481790B4AB4107627BB0474FBD
FCC29951594861FEFFC98D0147D269D489B55215F13AA0BB75FC3BCAFFB5EF4A
6E2C4E306DECB382F98A5E9AAFDB09BC8152C885CD471DF829256D5E7154B5AB
5D9DBB46A4AF6D17D755B0DF083D4FA51BB5233F7E3D3E5CB85EA3DF7A6DA4FB
767A6FF7298D526EC47657E048D9EDFBCEF864C292794F772FC409996DBFB03B
9DD47C5CF1E747D2FB07F7D30821DA0A0000
}
Andrew J Martin
[23/31] from: andrew:martin:colenso:school at: 7-Aug-2003 15:11
Will wrote:
> this doesn't:
> XML: {<?xml version="1.0"?>
<<quoted lines omitted: 3>>
> }
> Am I missing something?
Wait! Let me try:
>> XML: {<?xml version="1.0"?>
{ <docu-ment>
{ <h>Heading</h>
{ </docu-ment>
{ }
== {<?xml version="1.0"?>
<docu-ment>
<h>Heading</h>
</docu-ment>
}
>> load-xml xml
== [xml [#version "1.0"] docu-ment [h "Heading"]]
Seems OK here.
Maybe it's the version you've got? Or perhaps you're missing one of my
support Values?
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{
789CA5566D6FDB3610FE2CFF8A338320F5304BC9BA174048EB78498B6D48BAA0
319201861CD0126369964981A4E6BAE81FDABFDC9127C97692A5C0960F117577
3C3EF7F0B993A7BD8F62AE4A98F600FF3EF09588E1E852F16CF8C7D5A5B7BD2F
4AB41DB6B6507BEBA4B0CECC9C19D0CCBC755CDB5C69348FE137B8E2DA16921C
E28A17650CFEAC33A5532543296C283F7BEF9D98C7905B5BC551B45EAF43EDC2
C2C246B7BC2CC526F241BFAFA5F0A93F5B914A55AAC586727F2C16B935E83957
D546BB17F81BBE3B3E7E0D5B14DFC2CEB690F64DF81C777DEFD7D7B5AE94690B
32AE22E00638218654AD2A6E8B7929605EAA7409EA01FEE2652D4C93EB92CB45
CD178EBC7772511626273ED2A554EB52640BB112D281249E0336D71B2E19E0E3
ACB0B230CB10CF687CB7AA5C0A0D1F0A6BD29C81F4CF2126B5E60C594B5D1152
849968E27FD6C25AF885CB0CC97239F1F52C5599409A4D5AA8DA6C932774CF42
640ECC2172B65A2909D75C1B01B754926670785D9B3CD4F85455A869D305B758
DF4FD1B85ED4C6468E616FBF15DA144AC67012FE101E7B53D2EBADF852809AFF
2952DBC7AA03272D53F154DC93C81E6A99C2D4AD616AAC2EE4A20F6BA5B33E20
D3793F81C9A64257C62DB7B8EA2789CB1288C2E6C80DED18414C09D0D17A1E54
2DB3113C1432F372860316338A08ACA2E4B0E215AEB0E4080546612E8A30DD21
8A0E53732C6D267CCE4FA6C43F926D768FD9A5EB75EEC61BB756674C888EF39C
A3A06B89E4C1B8AC727E512C0A0B69EE905960E1F03E66143AC3CB9AFA18F882
150D997FDCB304B8DC409B0CF3BAFB44A5B97876DA1F62A0CD750D6C387CCBD0
7DFD2B7946AD7DE4CD57854945597229502C2EC2659D62301ED3664C30CEC1EA
A683A7CB3AF0AEC269CA6D9A27FFA58182C045772A78C08183F7CF26B9F069C8
0E4A7B8F8F2752CF95B4880C6E2C4F97F83EB61839AF6D2364E8DEEFFDFB0C43
26E2939DC185484BAEB9C33EA3DB77767857FA2E9DC13E1D8FCF8BC16BDBD7D1
C7F911041EC0BEF9E47817500320262DDCDD7C8384549B2D400F628637FAA695
EA019B314651540CAA8B6C6659543EC429E0E869CCD14E4802AFFCB39046680B
168730B4AC6991D5A968D5BDDF9E3BC80A636AD1A7207FCAAEF4075ED141D032
1BFB8B261A3B1D1FB0537767BBA43744A00E3FAD4A06AF682430F732F08A7ECC
1CDCDD8CBC589B82BAAB78B10AFFCF372DB99B5D4F2BC07FEDE537D01C68A276
3B61A8131B044185133252B2DC90FE1E21F91A947D2C0D98E715F623F906CD8D
FAD67CCC4FE209A2B359F496C68363CBA16C003A0883169F51AB2DD6B6C57117
F57C4B46E3FF02384BA6E717E3C978DAB0E21B06F5C692048F79ED25D732837A
7B00A92C885565372340A82BDAD01EF8BC225D481790B4AB4107627BB0474FBD
FCC29951594861FEFFC98D0147D269D489B55215F13AA0BB75FC3BCAFFB5EF4A
6E2C4E306DECB382F98A5E9AAFDB09BC8152C885CD471DF829256D5E7154B5AB
5D9DBB46A4AF6D17D755B0DF083D4FA51BB5233F7E3D3E5CB85EA3DF7A6DA4FB
767A6FF7298D526EC47657E048D9EDFBCEF864C292794F772FC409996DBFB03B
9DD47C5CF1E747D2FB07F7D30821DA0A0000
}
You'll also need:
>> probe compress read %"/c/Rebol/Values/Common Parse Values.r"
#{
789CAD566D53DB4610FEACFB151BD10C21AD2D636801B5D3C4059CA4631A0FD0
3013233367F9B055E43B8D748A634A7F50FF6577EFF4666A7F6A3FC068F7D9D7
E7F6F63C629762A2621831E737BE103EEC9EAAC542C9D690A799687DE2712E32
E65C473A46D0B52018102CE832A71F11F87213DA4E11EFE57AAE5274EFC1AF70
C1531D49D47E5C4A61948F5A8452C56AB642ED65349BEBCCA44A562909F03774
3B9D03A89DBF83864F1B9DC4058F621F4C2B6F551A2AD99642B7E523736EC4C4
87B9D689EF79CBE5B29D924D3BD21ED6178B95877D0B31C584A397173C69A701
36CB27281F326798A789CAB6F73DE07296F319B1762E677194CD9973C635CA07
5E2F9FE599F6A872E67C12691629E943B7DD6D77981330D6C78AC798552B1072
0AD9439404CE8FD8649AC702F49C6BE0F192AF32B847CBACCDAEF230C44AC967
8B61662DD0F663A885F6219C53BD1A463BEEF855A7B3E7420BE8B3DFDF730376
16CDA28691DBD9EF1E1C7EFFC3D1F1896B312225530B01460AD8EF494207D688
DAB3113F63B4815A3E03B9051F11ECC5C99CFB904B24014C1830F616A8F218A9
B02EAAB32E4663CB68A0EB7E4591A74AEA54C5DB9BDFC7E6CDC79161E1BDF85A
66311136B4D727FBA29B7B72B9F62940CB655743FA02970D6EAEEA8CA998E6A1
80D1D510DE5F9B54BD0EA542A3D765C9F86D146F50C1E5AA90FB26F0AB4E6FCF
65DB234AB18C2329E0F41206FD8035A252903A2649EF529ECCA3B04107732845
77BF22E4E81CB319E571A7FAEA56F0F169A5AC0C4F6AEF930A3E39AF95FD52D9
DB5F9B3A1AFD611A49CD27B4322CEF458D9BBAADC9631FA4163391E2FCDBC367
67228C16DC5C22AB41EBB65BA001BB5052AC08DC71BF29B5D6A25B0E34A36B74
13C9A95AE235C32506127720845C4A85838093C4230944A5BAC79B26F0EE538D
1C6F579AF9ACAFD249349D0A59B3FBE7ADE7BF7EE3FEF4F3D35F8C0DF090EECE
E5146B288FEC892E7BC07E89B97CB8231C319A82F24C9B50561C63ADC18A17FC
41809AFC2142FD82CEF2B33231B0C96F5D8CBEE3B6DC00F6CB1651F61BFD3A54
E2EE75B41063183D33AA459598E16FEA02F41D28396B21A97A8E85A562A1BE08
58F0C42E5D2F56218F856771B8CF65082323E0646A3CF0D98B20A07A9DF27077
9F90E06405C608C3E3603857F84AE8FF3F8797E0AB6113C14191CA487E938527
68748852B318AA4D6014AE5563C7B91EAE005CF8AB669C92657A07C6A61AFA73
466806558CA2985A1E1D56553428779CA71AD9EE8CA18D31FD2B0ECF736164CF
9914342441D1F886191A8A7411692D7050C5D730CEA702AA3B0AD590DB675E9A
1F0976DB547E18B6AFE229EDFF5169045404029722E63AFA22EE86DC106E3CAD
35A2BD49A6E25C57A8A99C867ECDADE49462179CAE399A26CBC4D427D2B61660
834129FD6756CE52CC427BC63E527471829AABF176B2C60DB6C6E877FB2FBAC6
EB7C8D9F1366B7DBED06C6C6256503BA33770571B654588FB1C6CDD87211B07F
00243E18D2180A0000
}
>> probe compress read %"/c/Rebol/Values/Push.r"
#{
789C9D50CB4EC330103CDB5FB18D84B880531E97E644C543A2A280DAAA1CAA1C
DC76492C8C1D391BAA80F820FE92750A2A881BC7999DDD999D859CE0D25B5848
71AB9F3183FDFBA62EA59819B28C92881229AE4C447B1D5481896143A50F2C18
C208C63A9071CCDE6D1C76E42BE1CA79EB8B96D989294AAA993EF7551B22800F
38EEF74F60B77C003F76142FE1581B9B4197EECC879577CA2129F72AC5032E33
2889AA2C4D379B8D0A51A30CA5736D2DB629A7D74BF63B95E2BE0995AF39FADB
B5AB31500D1A5EB46D108C23CF804983CCBA3504A426B81AA8C46FBA44BD56EF
52DC685734BA88F55CBAC29AD8D08526C6478374D4D8368DEF4831C7501BEF98
567D35902297323696C163E356B1E3E4FF31B89329E9D5132CB66C0F2A1FA807
4B4335522F17C9F4EB08F184EF025F369D5DDC9D77760BEDDA436A2BECE590DE
39DB8A64C63EDB2CBFF439A78DBE8086930488E2F881D84A521FF13650D6DDE6
51BE9BFF197565E4F2137382C5E370020000
}
>> probe compress read %"/c/Rebol/Values/Pop.r"
#{
789CAD504D4F0231103D777EC54062BC982EA817F624F1E3404409183C100E05
CB6E636937ED2C048C3FC87FE97457233FC0DBBC37F3DEBC99054CF5CA5B5C80
78525B9DE3F9C457205E0C59065D065D100F2681B38464603CACA9F481DB431C
E15805328ED9E7BDD30D7924BD76DEFAE2C0ECD4142545A66F7D750809E0175E
F67A57F827BEC0138D64911E2B63736CA2DDF8B0F64E3A4DD21D41BCEA558E25
519567D97EBF9721CD4843D95C59AB0F1967572BDE770D625287CA474EFE31D5
540717914A8D1B1322E14ED95AA371A830EA607444E5DE30E8ADDF716D0837C1
6F9BF9B62D3F413C2A57D4AA484FBA778535B10471A788717F908D6A7BC8D25D
20E63A44E31DD3B22707209600FCB91C37B55B1337D2B3BBFF10891F3523B57E
C7454B74B0F2813AB83214357596A23BFBF121EE1847489E8BAA314AEA250799
A7AD27658E9561CBD6B80FA2DDDF6210F9EF38C012BE017850B5B63C020000
}
>> probe compress read %"/c/Rebol/Values/Map.r"
#{
789CA554CD4EDC30103E3B4F311B092D54DD840DD0432E94F2A31641A9680587
550EDEC49BB878EDC871D806C403F52D3BB69305AA02872A519C197FF3CDAF3D
0B2ED95C099805E42B5DB214C6E7B40EC80F6E040A210A61404EB81536AC1469
940F5A53298DDB07700AE7541B2E517BB192CC29EF0CCBA512AAEC507BC9CBCA
34A83E5475A7AD00BF21D9DEDE8147E3F7F0C4264223764EB948C185F651E95C
C9483213C9BB805CB3790A9531751AC7ABD52AD2161371135F51215817631E8C
15E870B671A0CB76C9A469229D614A748EDADD807C6B75AD1A4CE81EF3694069
A0752D386BC0540C16ADCC0D57128C0264042698E300B570FB0DD3088D1E0272
4665D9D2D2D6EC5896823755408E7FD1656D8B751F1082F48E0E6612DF763967
7A9465567A0712D72924B093418F1CC4572C9E216117F6E0C380A788AA3597C6
FEBE8E84F9132C0A6FA1217F864711AB49B00087D4B052E90E6BDD1A2E608AFA
83FC46AA956045E9CB96DAC922E1A96202B031A20BADB7F08C16BC11F436B4C2
27CD8C81CF5416D840AB419E23E44E21D98B4F5BD1C5765E0272C574839D4961
1AED45D3006101869D3EB60C5DFD574F6196539357360D5D4E315BD94D06DB51
8F1AF9DDE4E5DDF8428A0EC22F1255C607A059D30A33B85B87431BA08373CC3B
3E6931B8F048C9B1015E4AA5194825D9086EA9683D26C31C4F06FBEFCE148F89
63F74B03D74A178D479A4AABD544C909D31A6B625B8151BB95D842B81FF23491
7D485DEE3E2A2F250EB539B84D7B88F7EEA564CB61B2B7889367C4D37F1127CF
88A74F885D36B0A4370C5C3E23AC6BD1E6ACF7366E72CD6B03E39C4AA9CCA46D
6CE17F2A2E7B00B90F1F60A944E12E3908F17192E96A36E48D108F751EC90BF6
8F16C9DF16EE6B3F7D3B521F70DF2AC16469AAFD5E44946B560AEBAB0AD2A11A
B85928C8D5D25E5671C158EDC358E058D0BC82D9A6B3DDCA066E1F245FE0CC18
58B799901607D1EC63E14D6CB5E37E5CD653D4F378F066CDF31B98CDECDC6186
1E8BE71F6F094B6B27B44766037F6FC2DDBC835F628567C09BD8D3B005062FF4
F584F6B4AF95CDDF0278C2B3E00F1902F8BEA3060000
}
>> probe compress read %"/c/Rebol/Values/Arguments.r"
#{
789CAD91CD4EEB301085D7F6534C2B5DB181A4E2B2CA8ABF56085D7E54215854
5D4C936962D5B52B7B4208573C106FC938F752780096F3CD19CF9CE3859ED3CA
5B5868758B5B2AE0E02CD4ED961C47AD1E0C5B41E33D1A6B353309FDFA625910
7AD672E34392C235DC6060E384DE758E06F8CA543A6F7DDD0B9D9BBAE128F8C2
EFFA900A7887E3C9E4377C0D1FC2B7994C86E8068D2D6038F6D487D2BBCC1167
EE55AB275A15D030EF8A3CEFBA2E0B499319CE1FD15AEA73F1812BD977A2D57D
1B763E264B73E236B808DC10E0A715F0EB01AC5B57B2F10E3002C2CAFA72935A
9D0FD5089ED1B614D34D7FD0D52DD629B4A9ABAD898D56D317DCEE52427FB552
FB8CA0F896AA7A93BCCA8DF39DA5AAA68116E903D4F83C10335CA1ABE470D9A0
965A5D22CB73C727F9756BFB3CE5A4D5238528F709CE26D9448B4CEF1714C3F9
E9BD1F31398305BAFEE8533D5AC2F841A667FF6B512CD3AE86B082D2120610F9
E0666D5C056B132243318340520F6647D263F9CD7DEF9F4FF1B0D41F0CDBCED2
8F020000
}
> Thank you for the nice tool.
Thank you for trying it out and helping to make it better!
Andrew J Martin
[24/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 8:11
On Tue, 5 Aug 2003 22:04:47 +1200, "A J Martin" <[AJMartin--orcon--net--nz]>
said:
> Thanks, Bryan and Will!
> Bryan wrote:
<<quoted lines omitted: 12>>
> from
> MS Excel 2002):
As I said, I would still recommend building a DOM implementation on top
of xml-parse or some other XML parser implementation. There were (maybe
still are?) very real and significant shortcomings in REBOL's built-in
parser, so I'd recommend putting a better parser implementation
underneath your DOM.
That said, my xml-parse implementation is slower than REBOL's -- hey I'm
not the world's best REBOL developer and I basically brute-force
translated the EBNF grammar productions from the XML spec into REBOL's
most-excellent parse capability.
> XML: {<?xml version="1.0"?>
> <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
<<quoted lines omitted: 23>>
> And set it with Rebol script like:
> XML/Workbook/DocumentProperties/Author: "Andrew Martin"
The xml-object script will let you do that. Check out the web-archived
docs at:
http://web.archive.org/web/20020210063622/www3.sympatico.ca/gavin.mckenzie/rebol/xml-object-info.html
> Also we should think about several tags at the same level of nesting,
> like
<<quoted lines omitted: 8>>
> attribute in the "DocumentProperties" tag?
> XML/Workbook/DocumentProperties/________
I haven't found having a different syntax for addressing attributes to
be helpful. I just consider the attribute to be a child of the
element. Yes, in theory it is possible to have an attribute and a
child-element of the same name, but in practice I've never seen such an
XML file in five+ years of working with XML. Or, at least not within the
same namespace.
> [...]
Gavin.
[25/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 8:51
On Wed, 6 Aug 2003 15:03:10 +0200, "bryan" <[bry--itnisk--com]> said:
> >yeah I thought there might be problems, another problem is in the rare
> >occurrence of namespace prefixed attributes.
<<quoted lines omitted: 7>>
> </doc>
> note that the xml namespace does not need to be declared anywhere.
Yeah, well the xml: namespace is special. It is implied. Though, there
is actually a formal namespace URI for the xml namespace, so it *is*
possible for it to be declared in an XML file. The xml namespace is only
used for xml:lang and xml:space though. It is illegal for people to use
this namespace other than for xml:lang and xml:space.
Besides the xml namespace, others will be declared. Namespaces are
tricky to implement in a parser/dom. The following snippets of XML are
equivalent, though their syntax differs greatly:
<a:a xmlns:a="whatever">
<b xmlns="something-else">text</b>
</a:a>
<a xmlns="whatever">
<b:b xmlns:b="something-else">text</b:b>
</a>
<a:a xmlns:a="whatever">
<a:b xmlns:a="something-else">text</a:b>
</a:a>
Note the nasty third example where the namespace prefix 'a' is redeclared
locally for the element scope of 'b'.
In short, for a namespace implementation the prefix: part of a namespace
is almost entirely useless. It would be incorrect or at least very
dangerous to build code that path-ed into an XML with an assumption about
the prefixes: e.g. it would be bad to write a path like xml/a:a/b:b
because the prefixes a: and b: are only present in one of the several
ways to encode the same XML above.
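A small sketch may make this concrete. It expands each element name to a namespace-URI/local-name pair, which is the form in which the three encodings above really are identical. Everything here is an assumption made for illustration: `expand-name` is a hypothetical helper, the bindings in scope are passed as a flat block of prefix/URI strings with the innermost declarations first, and "#default" is an arbitrary key chosen for the default namespace. It also assumes that no URI is textually equal to a prefix.

```rebol
Rebol []

; Hypothetical helper: expand an element's qualified name
; to a block of [namespace-URI local-name].
expand-name: func [
    qname [string!] "Element name as written, e.g. a:b"
    bindings [block!] "Prefix/URI string pairs, innermost first"
    /local parts
][
    parts: parse/all qname ":"
    either 2 = length? parts [
        reduce [select/case bindings first parts  second parts]
    ][
        reduce [select/case bindings "#default"  qname]
    ]
]

; First encoding: prefixed outer element, default namespace on the inner one.
probe expand-name "a:a" ["a" "whatever"]
probe expand-name "b" ["#default" "something-else" "a" "whatever"]

; Second encoding: default namespace outside, prefix inside.
probe expand-name "a" ["#default" "whatever"]
probe expand-name "b:b" ["b" "something-else" "#default" "whatever"]

; Third encoding: prefix a is redeclared for the inner element.
probe expand-name "a:a" ["a" "whatever"]
probe expand-name "a:b" ["a" "something-else" "a" "whatever"]
```

In all three cases the outer element expands to "whatever" plus "a" and the inner one to "something-else" plus "b", so code that compares expanded names does not care which prefixes the author of the file happened to pick.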
Gavin.
[26/31] from: bry:itnisk at: 8-Aug-2003 15:55
>I haven't found that having a different syntax for addressing
attributes
>to be helpful. I just consider the attribute to be a child of the
>element. Yes, in theory it is possible to have an attribute and a
>child-element of the same name, but in practice I've never seen such an
>XML file in five+ years of working with XML. Or, at least not within
the
>same namespace.
Unfortunately I have to be able to differentiate between them. I'm also
more likely to handle more document-like structures, bibliographic
materials, TEI, Docbook, xhtml etc, as such structures are less regular
they increase paranoia.
So basically the following:
Xpath: "/body/a[@href and not(@class)]"
Is no more helpful than a path which treats attributes as elements? This
may be occasioned by my being set in my ways but I find it hard to
believe I wouldn't feel the lack almost immediately, and have that lack
grow even more annoying as time went on.
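For comparison, here is a minimal sketch of the same test written against the block format that Andrew's 'Load-XML produces elsewhere in this thread, where attribute names are issue! values and element names are words, so the two stay distinguishable. The sample data is made up, and the loop assumes that the body holds nothing but child elements as name/content pairs.

```rebol
Rebol []

; Made-up sample: the content of a <body> element with three <a> children.
body: [
    a [#href "one.html" "first"]
    a [#href "two.html" #class "nav" "second"]
    a [#name "top" "third"]
]

; Rough equivalent of /body/a[@href and not(@class)]
matches: copy []
foreach [name content] body [
    if all [
        name = 'a
        block? content
        find content #href
        not find content #class
    ][
        append/only matches content
    ]
]
probe matches
```

Only the first link is collected. If attributes were stored under plain words like child elements, the two `find` tests could not tell `@href` from a child element named href.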
[27/31] from: bry:itnisk at: 8-Aug-2003 16:05
>Yeah, well the xml: namespace is special. It is implied. Though,
there
>is actually a formal namespace URI for the xml namespace, so it *is*
>possible for it to be declared in an XML file.
Right, hence I noted it need not be declared, not that it is never
declared ;)
> The xml namespace is only
>used for xml:lang and xml:space though. It is illegal for people to
use
>this namespace other than for xml:lang and xml:space.
Well it's reserved, so later versions may have more in that namespace.
>it would be bad to write a path like xml/a:a/b:b
>because the prefixes a: and b: are only present in one of the several
>ways to encode the same XML above.
Hence the need to bind a namespace declaration to a namespace prefix,
also the need to differentiate between name() and local-name() in xpath
or nodeName and baseName in DOM.
[28/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 9:25
On Fri, 8 Aug 2003 15:55:02 +0200, "bryan" <[bry--itnisk--com]> said:
> >I haven't found that having a different syntax for addressing
> attributes
<<quoted lines omitted: 8>>
> materials, TEI, Docbook, xhtml etc, as such structures are less regular
> they increase paranoia.
Ahh...yes, so indeed you've fallen outside the goal of xml-object which
is not intended to be used for document-oriented XML.
I've just started working with DocBook myself lately.
> [...]
A real-live XPath implementation in REBOL would be useful, though
*waayyy* non-trivial to build. Or, at least, building a compliant XPath
implementation is non-trivial. A few years ago I contracted out to a
company (Ginger Alliance in the Czech Republic (though I'm in Canada),
see www.gingerall.com...great bunch of folks to work with) to build an
XPath processor. They had lots of experience with XPath and XSLT and
still it was a multi-week full-time-developer effort to get the XPath
functionality working properly.
Gavin.
[29/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 9:28
> [...]
> >it would be bad to write a path like xml/a:a/b:b
<<quoted lines omitted: 3>>
> also the need to differentiate between name() and local-name() in xpath
> or nodeName and baseName in DOM.
Ahh, you've clearly demonstrated that you know your XML.
Gavin.
[30/31] from: bry:itnisk at: 11-Aug-2003 11:56
>A real-live XPath implementation in REBOL would be useful, though
>*waayyy* non-trivial to build.
I think the difficulty, however, can be lessened by how the object is
structured. I've been thinking that it needs one more level of
abstraction, that is to say that each step in the object tells you what
elements, attributes, namespaces, element name, defaultNamespace, and
textnodes are contained at that step. With textnodes referenced inside
the elements block so that one can maintain their position.
Child1: make object![
Name: "h:p"
Elements: [t1 "h:p" "a" t2 "n:p"]
Attributes: ["class" "style"]
Namespaces: ["xmlns:h='http://www.w3.org/1999/xhtml'"]
defaultNamespace: "http://www.w3.org/1999/xhtml"
att1: "ParaStyle1"
att2: "position:relative;"
t1: "here's some text"
t2: "hi"
Child1: make object![
Name: "h:p"
.... and so on and so on
]
]
It is definitely not as elegant as just using Rebol path to navigate to
/p/p
but I think the abstraction offered here is necessary in order to handle
all the possible weird textnode and namespacing issues.
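A cut-down, runnable version of that idea might look like the sketch below. It keeps only the parts needed to show how text nodes referenced from the elements block preserve their position; the field names follow the outline above, but the values and the loop are assumptions made for illustration.

```rebol
Rebol []

node: make object! [
    name: "h:p"
    ; Words refer to text nodes held in this object, strings name child elements.
    elements: [t1 "a" t2]
    t1: "here's some text"
    t2: "hi"
]

; Walk the children in document order.
foreach item node/elements [
    either word? item [
        print ["text:" get in node item]
    ][
        print ["element:" item]
    ]
]
```

Because the text nodes are looked up through the elements block, mixed content such as text, then an element, then more text keeps its original order.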
[31/31] from: bry:itnisk at: 18-Aug-2003 13:26
>>A real-live XPath implementation in REBOL would be useful, though
>>*waayyy* non-trivial to build.
<<quoted lines omitted: 4>>
>textnodes are contained at that step. With textnodes referenced inside
>the elements block so that one can maintain their position.
Part of the reason for my thinking the REBOL XML engine should work as
stated above is that this is similar to how xmerl handles it
http://www.creado.com/xmerl_xs/userguide.html
Notes
- Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted