My notes about XML nodes, XmlNode.InnerText and XmlNode.InnerXml

2008-04-30


**XmlNode.InnerText Property**: Gets or sets the concatenated values of the node and all its child nodes.

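Setting this property replaces all the child nodes with a single text node holding the given string. The string is not parsed as XML, so any markup characters in it are escaped when the node is serialized.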

For leaf nodes, InnerText returns the same content as the Value property.
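A quick way to check the leaf-node statement is the minimal sketch below. The class name LeafNodeDemo and the sample XML are made up for illustration; only the XmlNode members named in these notes are used.

```csharp
using System;
using System.Xml;

public class LeafNodeDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<elem>some text</elem>");

        // The only child of <elem> is a text node, which is a leaf.
        XmlNode leaf = doc.DocumentElement.FirstChild;

        Console.WriteLine(leaf.Value);                   // some text
        Console.WriteLine(leaf.InnerText);               // some text
        Console.WriteLine(leaf.Value == leaf.InnerText); // True

        // The element itself has no Value (null), but its InnerText
        // still returns the concatenated text of its children.
        Console.WriteLine(doc.DocumentElement.Value == null); // True
        Console.WriteLine(doc.DocumentElement.InnerText);     // some text
    }
}
```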

This property is a Microsoft extension to the Document Object Model (DOM).

The following example compares the InnerText and InnerXml properties.

    using System;
    using System.Xml;

    public class Test {

      public static void Main() {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root>" +
                    "<elem>some text<child/>more text</elem>" +
                    "</root>");

        XmlNode elem = doc.DocumentElement.FirstChild;

        // Note that InnerText does not include the markup.
        Console.WriteLine("Display the InnerText of the element...");
        Console.WriteLine(elem.InnerText);

        // InnerXml includes the markup of the element.
        Console.WriteLine("Display the InnerXml of the element...");
        Console.WriteLine(elem.InnerXml);

        // Set InnerText to a string that includes markup.
        // The markup is escaped.
        elem.InnerText = "Text containing <markup/> will have char(<) and char(>) escaped.";
        Console.WriteLine(elem.OuterXml);

        // Set InnerXml to a string that includes markup.
        // The markup is not escaped.
        elem.InnerXml = "Text containing <markup/>.";
        Console.WriteLine(elem.OuterXml);
      }
    }

**XmlNode.InnerXml Property**: Gets or sets the markup representing only the child nodes of this node.

**XmlNode.OuterXml Property**: Gets the markup representing this node and all its child nodes.

Property Value: The markup containing this node and all its child nodes.

Note: OuterXml does not return default attributes.

Remarks: This property is a Microsoft extension to the Document Object Model (DOM).

Example:

The following example compares output from the InnerXml and OuterXml properties.

    using System;
    using System.IO;
    using System.Xml;

    public class Sample {

      public static void Main() {

        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book genre='novel' ISBN='1-861001-57-5'>" +
                    "<title>Pride And Prejudice</title>" +
                    "</book>");

        XmlNode root = doc.DocumentElement;

        // OuterXml includes the markup of current node.
        Console.WriteLine("Display the OuterXml property...");
        Console.WriteLine(root.OuterXml);

        // InnerXml does not include the markup of the current node.
        // As a result, the attributes are not displayed.
        Console.WriteLine();
        Console.WriteLine("Display the InnerXml property...");
        Console.WriteLine(root.InnerXml);

      }
    }


**CreateElement**:

    using System;
    using System.IO;
    using System.Xml;

    public class Sample {

      public static void Main() {

        // Create the XmlDocument.
        XmlDocument doc = new XmlDocument();
        string xmlData = "<book xmlns:bk='urn:samples'></book>";

        doc.Load(new StringReader(xmlData));

        // Create a new element and add it to the document.
        XmlElement elem = doc.CreateElement("bk", "genre", "urn:samples");
        elem.InnerText = "fantasy";
        doc.DocumentElement.AppendChild(elem);

        Console.WriteLine("Display the modified XML...");
        doc.Save(Console.Out);

      }
    }
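To see what the three arguments of CreateElement(prefix, localName, namespaceURI) end up as on the new element, here is a small sketch. The class name NamespaceElementDemo is made up; the XML and the element are the same as in the example above.

```csharp
using System;
using System.Xml;

public class NamespaceElementDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book xmlns:bk='urn:samples'></book>");

        // Same call as in the example above: prefix, local name, namespace URI.
        XmlElement elem = doc.CreateElement("bk", "genre", "urn:samples");
        elem.InnerText = "fantasy";
        doc.DocumentElement.AppendChild(elem);

        Console.WriteLine(elem.Prefix);       // bk
        Console.WriteLine(elem.LocalName);    // genre
        Console.WriteLine(elem.NamespaceURI); // urn:samples
        Console.WriteLine(elem.Name);         // bk:genre
    }
}
```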

**CreateTextNode**:

    using System;
    using System.IO;
    using System.Xml;

    public class Sample {

      public static void Main() {

        // Create the XmlDocument.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book genre='novel' ISBN='1-861001-57-5'>" +
                    "<title>Pride And Prejudice</title>" +
                    "</book>");

        // Create a new node and add it to the document.
        // The text node is the content of the price element.
        XmlElement elem = doc.CreateElement("price");
        XmlText text = doc.CreateTextNode("19.95");
        doc.DocumentElement.AppendChild(elem);
        doc.DocumentElement.LastChild.AppendChild(text);

        Console.WriteLine("Display the modified XML...");
        doc.Save(Console.Out);

      }
    }
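A text node created with CreateTextNode holds plain text, so special characters are escaped when the element is serialized, just as when InnerText is set. A minimal sketch of that, where the class name TextNodeEscapingDemo, the note element, and the sample string are invented for illustration:

```csharp
using System;
using System.Xml;

public class TextNodeEscapingDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book/>");

        // Build <note> with an explicit text node as its only child.
        XmlElement viaTextNode = doc.CreateElement("note");
        viaTextNode.AppendChild(doc.CreateTextNode("Tom & Jerry"));

        Console.WriteLine(viaTextNode.InnerText); // Tom & Jerry
        Console.WriteLine(viaTextNode.OuterXml);  // <note>Tom &amp; Jerry</note>

        // Setting InnerText creates the same single text child.
        XmlElement viaInnerText = doc.CreateElement("note");
        viaInnerText.InnerText = "Tom & Jerry";

        Console.WriteLine(viaInnerText.FirstChild.NodeType);             // Text
        Console.WriteLine(viaInnerText.OuterXml == viaTextNode.OuterXml); // True
    }
}
```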

XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes.

Here are some examples of location paths using the unabbreviated syntax:

* child::para selects the para element children of the context node

* child::* selects all element children of the context node

* child::text() selects all text node children of the context node

* child::node() selects all the children of the context node, whatever their node type

* attribute::name selects the name attribute of the context node

* attribute::* selects all the attributes of the context node

* descendant::para selects the para element descendants of the context node

* ancestor::div selects all div ancestors of the context node

* ancestor-or-self::div selects the div ancestors of the context node and, if the context node is a div element, the context node as well

* descendant-or-self::para selects the para element descendants of the context node and, if the context node is a para element, the context node as well

* self::para selects the context node if it is a para element, and otherwise selects nothing

* child::chapter/descendant::para selects the para element descendants of the chapter element children of the context node

* child::*/child::para selects all para grandchildren of the context node

* / selects the document root (which is always the parent of the document element)

* /descendant::para selects all the para elements in the same document as the context node

* /descendant::olist/child::item selects all the item elements that have an olist parent and that are in the same document as the context node

* child::para[position()=1] selects the first para child of the context node

* child::para[position()=last()] selects the last para child of the context node

* child::para[position()=last()-1] selects the last but one para child of the context node

* child::para[position()>1] selects all the para children of the context node other than the first para child of the context node

* following-sibling::chapter[position()=1] selects the next chapter sibling of the context node

* preceding-sibling::chapter[position()=1] selects the previous chapter sibling of the context node

* /descendant::figure[position()=42] selects the forty-second figure element in the document

* /child::doc/child::chapter[position()=5]/child::section[position()=2] selects the second section of the fifth chapter of the doc document element

* child::para[attribute::type="warning"] selects all para children of the context node that have a type attribute with value warning

* child::para[attribute::type='warning'][position()=5] selects the fifth para child of the context node that has a type attribute with value warning

* child::para[position()=5][attribute::type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning

* child::chapter[child::title='Introduction'] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction

* child::chapter[child::title] selects the chapter children of the context node that have one or more title children

* child::*[self::chapter or self::appendix] selects the chapter and appendix children of the context node

* child::*[self::chapter or self::appendix][position()=last()] selects the last chapter or appendix child of the context node
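A few of the location paths above can be tried from .NET with XmlNode.SelectNodes and SelectSingleNode, which accept the unabbreviated syntax. This is only a sketch: the class name UnabbreviatedXPathDemo and the small doc/chapter/para document are made up for the purpose, and the document element is used as the context node.

```csharp
using System;
using System.Xml;

public class UnabbreviatedXPathDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<doc>" +
              "<chapter><title>Introduction</title>" +
                "<para type='warning'>first</para><para>second</para>" +
              "</chapter>" +
              "<chapter><title>Details</title><para>third</para></chapter>" +
              "<appendix><title>Notes</title></appendix>" +
            "</doc>");

        // The document element <doc> is the context node for relative paths.
        XmlNode context = doc.DocumentElement;

        Console.WriteLine(context.SelectNodes("child::chapter").Count);   // 2
        Console.WriteLine(context.SelectNodes("descendant::para").Count); // 3
        Console.WriteLine(context.SelectNodes(
            "child::*[self::chapter or self::appendix]").Count);          // 3
        Console.WriteLine(context.SelectNodes(
            "child::chapter[child::title='Introduction']").Count);        // 1
        Console.WriteLine(context.SelectNodes(
            "descendant::para[attribute::type='warning']").Count);        // 1

        // Last chapter child, then its title.
        Console.WriteLine(context.SelectSingleNode(
            "child::chapter[position()=last()]/child::title").InnerText); // Details

        // Absolute path: second para element in the whole document.
        Console.WriteLine(context.SelectSingleNode(
            "/descendant::para[position()=2]").InnerText);                // second
    }
}
```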

**Nodes and the Hierarchical Tree Structures They Comprise**

Every component of the HTML code in a web page is a node, or a leaf or branch if you look at it as a tree-like hierarchy. The two node types that matter here are element nodes and text nodes.

Element Nodes

Element nodes are tags enclosed in angle brackets. Good examples are XML/HTML elements such as the paragraph tag <p> and the image tag <img>. Element nodes can contain child nodes and attributes.

Text Nodes

Text nodes are simply strings of text. They can't contain child nodes or attributes.

Let's take a look at the following example:

<p>DOM</p>

This short line of code consists of two nodes: the paragraph element node and the "DOM" text node. Since the text node is inside the element node, it is its child.

If we take a look at its hierarchical structure it would look like this:

    <p> (element node, parent node of "DOM")
     |
     |
    DOM (text node, child node of <p>)

And how about this line of code:

<p><b>DOM</b></p>

Here we have an additional element node, the HTML Bold tag <b>. Since this element is placed inside the paragraph element it is its child. The "DOM" text is now encapsulated inside the bold tag so it is the child of the <b> element node:

    <p> (element node, parent node of <b>)
     |
     |
    <b> (element node, child node of <p>, parent node of "DOM")
     |
     |
    DOM (text node, child node of <b>)

Let's complicate things even more:

<p>This is a <b>DOM</b> Tutorial</p>

We still have the same two element nodes, but now there are three text nodes: "This is a", "DOM" and "Tutorial".

The bold tag as well as the two text nodes "This is a" and "Tutorial" are the children of the paragraph tag:

                      <p> (element node, parent node of <b>, "This is a", and "Tutorial")
                       |
        +--------------+---------------+
        |              |               |
    This is a         <b>           Tutorial
    (text node,    (element node,   (text node,
    child node     child node of    child node
    of <p>)        <p>, parent      of <p>)
                   node of "DOM")
                       |
                      DOM (text node, child node of <b>)
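The same tree can be printed from .NET by walking ChildNodes recursively. This is a rough sketch, not part of the original tutorial: the class name NodeTreeDemo and the helper Dump are made up, and the paragraph is loaded as XML rather than HTML.

```csharp
using System;
using System.Xml;

public class NodeTreeDemo
{
    // Prints each node's type and label, indented by its depth in the tree.
    static void Dump(XmlNode node, int depth)
    {
        string label = node.NodeType == XmlNodeType.Text
            ? "\"" + node.Value + "\""
            : "<" + node.Name + ">";
        Console.WriteLine(new string(' ', depth * 2) + node.NodeType + " " + label);

        foreach (XmlNode child in node.ChildNodes)
            Dump(child, depth + 1);
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<p>This is a <b>DOM</b> Tutorial</p>");
        Dump(doc.DocumentElement, 0);
        // Prints:
        // Element <p>
        //   Text "This is a "
        //   Element <b>
        //     Text "DOM"
        //   Text " Tutorial"
    }
}
```

Note that the spaces next to the bold tag belong to the surrounding text nodes.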

Node Properties

The two most common properties of nodes are nodeName and nodeValue:

nodeName: For element nodes, the name is the name of the tag: img, b, etc. For text nodes, the name is always "#text".

nodeValue: For text nodes, the value is the actual string of text. Element nodes don't have a value, so this property is null.
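In .NET's XmlNode the counterparts of nodeName and nodeValue are the Name and Value properties. A minimal sketch using the <p><b>DOM</b></p> line from earlier (the class name NodePropertiesDemo is invented); note that the text node reports its name as "#text":

```csharp
using System;
using System.Xml;

public class NodePropertiesDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<p><b>DOM</b></p>");

        XmlNode bold = doc.DocumentElement.FirstChild; // the <b> element node
        XmlNode text = bold.FirstChild;                // the "DOM" text node

        Console.WriteLine(bold.Name);          // b
        Console.WriteLine(bold.Value == null); // True

        Console.WriteLine(text.Name);          // #text
        Console.WriteLine(text.Value);         // DOM
    }
}
```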


The next example uses the following data.

    <?access-control allow="*"?>
    <r>
      <ch1/>
      <ch2/>

    </r>

Prior to revision 3, the document root element (r) is considered to have just two children: the element nodes with nodeNames "ch1" and "ch2". In revision 3 and later, the document root element has five children:

  1. The text node for the new line and spaces between the root element and the element node with nodeName "ch1".
  2. The element node with nodeName "ch1".
  3. The text node for the new line and spaces between the element node with nodeName "ch1" and the element node with nodeName "ch2".
  4. The element node with nodeName "ch2".
  5. The text node for the two new lines between the element node with nodeName "ch2" and the closing tag of the root element.
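The same two-versus-five difference can be reproduced with .NET's XmlDocument through its PreserveWhitespace property. This is only an analogy to the behaviour described above, not the API that text comes from, and the class name WhitespaceChildrenDemo and the helper CountRootChildren are made up.

```csharp
using System;
using System.Xml;

public class WhitespaceChildrenDemo
{
    // The sample data from above, with its line breaks and indentation.
    const string Xml =
        "<?access-control allow=\"*\"?>\n<r>\n  <ch1/>\n  <ch2/>\n\n</r>";

    static int CountRootChildren(bool preserveWhitespace)
    {
        XmlDocument doc = new XmlDocument();
        // Must be set before loading to have any effect.
        doc.PreserveWhitespace = preserveWhitespace;
        doc.LoadXml(Xml);
        return doc.DocumentElement.ChildNodes.Count;
    }

    public static void Main()
    {
        // Whitespace-only nodes dropped: just <ch1/> and <ch2/>.
        Console.WriteLine(CountRootChildren(false)); // 2

        // Whitespace-only nodes kept: whitespace, ch1, whitespace, ch2, whitespace.
        Console.WriteLine(CountRootChildren(true));  // 5
    }
}
```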

What is the difference between XmlNode.InnerText and XmlNode.InnerXml?

InnerXml serializes all child nodes, so you get the markup for those child nodes. For example, if you have <gods><god><name>Kibo</name></god></gods> and you access the InnerXml of the DocumentElement, you get <god><name>Kibo</name></god>. InnerText concatenates the text content of all the text nodes contained in the node, so in the above case the InnerText of the DocumentElement gives you Kibo.
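The answer above is easy to confirm with a few lines (a small sketch; only the class name InnerTextVsInnerXmlDemo is invented):

```csharp
using System;
using System.Xml;

public class InnerTextVsInnerXmlDemo
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<gods><god><name>Kibo</name></god></gods>");

        // Markup of the child nodes only.
        Console.WriteLine(doc.DocumentElement.InnerXml);  // <god><name>Kibo</name></god>

        // Concatenated text of all descendant text nodes.
        Console.WriteLine(doc.DocumentElement.InnerText); // Kibo
    }
}
```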



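    // Wraps a table in a scrolling container and clones its header into a
    // separate table so the header stays visible while the body scrolls.
    function makeScrollableTable(tbl) {
        var pNode, w, w2, hdr;

        // Scrolling wrapper around the original table.
        // (createElement with markup only worked in old Internet Explorer.)
        pNode = tbl.parentNode;
        w = document.createElement('span');
        w.style.cssText = 'display: block; height: 500px; overflow: auto;';
        pNode.insertBefore(w, tbl);
        w.appendChild(tbl);

        // Fix the rendered cell widths so header and body columns line up.
        for (var i = 0; i < tbl.tHead.rows.length; i++) {
            setRowWidth(tbl.tHead.rows[i].cells);
        }
        setRowWidth(tbl.tBodies[0].rows[0].cells);

        // Clone the header into its own table, placed above the wrapper.
        hdr = tbl.cloneNode(false);
        hdr.id += 'Header';
        hdr.appendChild(tbl.tHead.cloneNode(true));
        tbl.tHead.style.display = 'none';
        pNode.insertBefore(hdr, w);
        w.style.width = (w.clientWidth + 50) + 'px';

        // Outer layout table that holds the header and the scrolling body.
        w2 = document.createElement('table');
        w2.border = '0';
        w2.cellSpacing = '0';
        w2.cellPadding = '0';
        w2.id = tbl.id + 'Wrapper';
        pNode.insertBefore(w2, hdr);

        w2.insertRow(-1).insertCell(-1).appendChild(hdr);
        w2.insertRow(-1).insertCell(-1).appendChild(w);

        w2.align = tbl.align;
        tbl.align = hdr.align = 'left';
        hdr.style.borderBottom = tbl.style.borderTop = 'medium none';
    }

    // Sets each cell's width to its client width minus horizontal padding.
    function setRowWidth(r) {
        var c, style;
        for (var i = 0; i < r.length; i++) {
            c = r[i];
            // currentStyle is IE-only; fall back to getComputedStyle elsewhere.
            style = c.currentStyle || window.getComputedStyle(c, null);
            c.width = c.clientWidth - parseInt(style.paddingLeft, 10) - parseInt(style.paddingRight, 10);
        }
    }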