Meet XmlReader

Summary

The XmlReader class is one of the core XML APIs in .NET and makes it much easier for developers to work with XML. This article introduces XmlReader and shows how to apply it in real development.

1. Overview

The XmlReader class is an abstract base class that provides non-cached, forward-only, read-only access to XML data. It conforms to the W3C Extensible Markup Language (XML) 1.0 and Namespaces in XML recommendations.

XmlReader can read XML data from a stream or a file. Its methods and properties let you move through the data and read the contents of each node.

Because XmlReader is abstract, concrete readers such as XmlTextReader, XmlValidatingReader, and XmlNodeReader derive from it. The class offers many methods and properties for reading the content of an XML file, finding the depth of an element, determining whether the current element is empty, and navigating XML attributes.

2. Creating an XmlReader

You can create an XmlReader instance with the static Create method and configure it with the XmlReaderSettings class: set properties on an XmlReaderSettings object to enable or disable specific reader features, then pass that object to Create.

MSDN recommends: although the .NET Framework 2.0 includes concrete implementations of the XmlReader class, such as XmlTextReader, XmlNodeReader, and XmlValidatingReader, you should use the Create method to create XmlReader instances.

By using the Create method and the XmlReaderSettings class, you will get the following benefits:

  • You can specify which features the created XmlReader object supports.
  • The XmlReaderSettings class can be reused to create multiple reader objects. You can use the same settings to create several readers with the same features, or modify the XmlReaderSettings object and create a new reader with a different feature set.
  • Features can be added to an existing reader. The Create method can accept another XmlReader object. The underlying reader can be a user-defined reader, an XmlTextReader object, or another XmlReader instance that you want to add features to.
  • You take full advantage of the features added to the XmlReader class in the .NET Framework 2.0. Certain features, such as better conformance checking and compliance with the XML 1.0 recommendation, are available only on XmlReader objects created by the Create method.
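
The reuse of a settings object described above can be sketched as follows. This is a minimal illustration, not code from MSDN: the class name SettingsReuseDemo, the helper CountComments, and the sample XML string are all hypothetical, and the XML is read from a string rather than a file so the example is self-contained.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SettingsReuseDemo
{
    // Counts the comment nodes that a reader created with the given
    // settings reports for the given XML text.
    public static int CountComments(string xml, XmlReaderSettings settings)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Comment)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        // Hypothetical sample document, not taken from the article.
        string xml = "<root><!-- a comment --><item/></root>";

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreComments = true;

        // The same settings object configures two independent readers.
        Console.WriteLine(CountComments(xml, settings));
        Console.WriteLine(CountComments(xml, settings));

        // Changing the settings affects only readers created afterwards.
        settings.IgnoreComments = false;
        Console.WriteLine(CountComments(xml, settings));
    }
}
```

With IgnoreComments enabled the readers never report the comment node; after the property is switched off, a newly created reader reports it.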

Tip: For the property settings of the XmlReaderSettings class, please refer to: http://msdn.microsoft.com/zh-cn/library/9khb6435(v=vs.80).aspx

Instantiating an XmlReader:

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.ConformanceLevel = ConformanceLevel.Fragment;
    settings.IgnoreWhitespace = true;
    settings.IgnoreComments = true;
    XmlReader reader = XmlReader.Create("books.xml", settings);

3. Access to external resources

The XmlResolver class is used to locate and access any resources that the XmlReader object requires. An XmlResolver can be used to do the following:

  • Locate and open the XML instance document.
  • Locate and open any external resources referenced by the XML instance document, such as entities, document type definitions (DTDs), and schemas.
  • Supply credentials: if a resource is stored on a system that requires authentication, use the System.Xml.XmlResolver.Credentials property to specify the necessary credentials.

Note: In the .NET Framework 2.0 described here, if no XmlResolver is specified, the reader uses a default XmlUrlResolver with no user credentials. XmlUrlResolver resolves external XML resources named by a Uniform Resource Identifier (URI) and is the default resolver for all classes in the System.Xml namespace.

The following code creates an XmlReader that uses an XmlUrlResolver with default credentials.

    // Create a resolver with default credentials.
    XmlUrlResolver resolver = new XmlUrlResolver();
    resolver.Credentials = System.Net.CredentialCache.DefaultCredentials;

    // Set the reader settings object to use the resolver.
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.XmlResolver = resolver;

    // Create the XmlReader object.
    XmlReader reader = XmlReader.Create("http://ServerName/data/books.xml", settings);

4. Read the data

Reading data is the whole point of processing an XML file, so this is the most important part of the article. The following sections describe in detail how to read XML data with XmlReader.

4.1 Current node position

The XmlReader class provides forward-only access to an XML stream or file. The current node is the XML node on which the reader is currently positioned. All methods called and operations performed relate to the current node, and all retrieved properties reflect the value of the current node.

The reader advances when you call one of its read methods. Calling Read repeatedly moves the reader from node to node, so these calls are usually made inside a while loop.

The following example shows how to walk through the stream and determine the type of each node.

    // Parse the file and display each of the nodes. The loop starts at the
    // beginning of the document so that the XML declaration and the root
    // element are reported too.
    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                Console.Write("<{0}>", reader.Name);
                break;
            case XmlNodeType.Text:
                Console.Write(reader.Value);
                break;
            case XmlNodeType.CDATA:
                Console.Write("<![CDATA[{0}]]>", reader.Value);
                break;
            case XmlNodeType.ProcessingInstruction:
                Console.Write("<?{0} {1}?>", reader.Name, reader.Value);
                break;
            case XmlNodeType.Comment:
                Console.Write("<!--{0}-->", reader.Value);
                break;
            case XmlNodeType.XmlDeclaration:
                Console.Write("<?xml version='1.0'?>");
                break;
            case XmlNodeType.Document:
                break;
            case XmlNodeType.DocumentType:
                Console.Write("<!DOCTYPE {0} [{1}]>", reader.Name, reader.Value);
                break;
            case XmlNodeType.EntityReference:
                Console.Write(reader.Name);
                break;
            case XmlNodeType.EndElement:
                Console.Write("</{0}>", reader.Name);
                break;
        }
    }

Tip: XmlNodeType is the node type. For more information, please refer to http://msdn.microsoft.com/zh-cn/library/3k5w5zc3(v=vs.80).aspx

4.2 Reading elements

The following table describes the methods and properties that the XmlReader class provides for processing elements.

| Member name | Description |
| --- | --- |
| IsStartElement | Checks whether the current node is a start tag or an empty element tag. |
| ReadStartElement | Checks that the current node is an element and advances the reader to the next node. |
| ReadEndElement | Checks that the current node is an end tag and advances the reader to the next node. |
| ReadElementString | Reads a text-only element. |
| ReadToDescendant | Advances the XmlReader to the next descendant element with the specified name. |
| ReadToNextSibling | Advances the XmlReader to the next sibling element with the specified name. |
| IsEmptyElement | Checks whether the current element has an empty element tag. |

The IsEmptyElement property lets you distinguish between the following:

  • <item num="123"/> (IsEmptyElement is true.)
  • <item num="123"></item> (IsEmptyElement is false, although the content of the element is empty.)

In other words, IsEmptyElement only reports whether the element in the source document has an end element tag.
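
To make the IsEmptyElement distinction concrete, here is a small sketch. The class name IsEmptyElementDemo, the helper Collect, and the sample XML string are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class IsEmptyElementDemo
{
    // Returns the IsEmptyElement value of every start tag named
    // elementName, in document order.
    public static List<bool> Collect(string xml, string elementName)
    {
        var flags = new List<bool>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    flags.Add(reader.IsEmptyElement);
                }
            }
        }
        return flags;
    }

    public static void Main()
    {
        // Hypothetical input: one empty element tag, one start/end tag pair.
        string xml = "<items><item num=\"123\"/><item num=\"123\"></item></items>";
        foreach (bool isEmpty in Collect(xml, "item"))
        {
            Console.WriteLine(isEmpty);
        }
    }
}
```

The first item is written as an empty element tag and reports true; the second has a separate end tag and reports false even though it has no content.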

The following code uses the ReadStartElement and ReadString methods to read elements.

    using (XmlReader reader = XmlReader.Create("book3.xml"))
    {
        // Parse the XML document. ReadString is used to
        // read the text content of the elements.
        reader.Read();
        reader.ReadStartElement("book");
        reader.ReadStartElement("title");
        Console.Write("The content of the title element: ");
        Console.WriteLine(reader.ReadString());
        reader.ReadEndElement();
        reader.ReadStartElement("price");
        Console.Write("The content of the price element: ");
        Console.WriteLine(reader.ReadString());
        reader.ReadEndElement();
        reader.ReadEndElement();
    }
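
The navigation members in the table above, ReadToDescendant and ReadToNextSibling, can be combined as in the following sketch. The class name NavigationDemo, the helper ReadBooks, and the library/book/title document are hypothetical. ReadElementContentAsString is used here as an alternative to ReadString, which is a substitution and not something the article itself shows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NavigationDemo
{
    // Returns one "id: title" entry per <book> child of the root element.
    public static List<string> ReadBooks(string xml)
    {
        var books = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position on the root element

            // Jump to the first <book>, then walk its siblings.
            if (reader.ReadToDescendant("book"))
            {
                do
                {
                    string id = reader.GetAttribute("id");
                    string title = string.Empty;
                    if (reader.ReadToDescendant("title"))
                    {
                        // Reads the text and moves past </title>.
                        title = reader.ReadElementContentAsString();
                    }
                    books.Add(id + ": " + title);
                }
                while (reader.ReadToNextSibling("book"));
            }
        }
        return books;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        string xml = "<library>"
                   + "<book id=\"1\"><title>A</title></book>"
                   + "<book id=\"2\"><title>B</title></book>"
                   + "</library>";
        foreach (string entry in ReadBooks(xml))
        {
            Console.WriteLine(entry);
        }
    }
}
```

ReadToNextSibling always advances from the current node before it starts matching, so the do/while form processes the element found by ReadToDescendant before moving on.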

4.3 Reading attributes

The XmlReader class provides various methods and properties for reading attributes. Attributes are most common on elements, but they are also allowed on XML declaration and document type nodes.

When the reader is positioned on an element node, use the MoveToAttribute method to move through the element's attribute list. After MoveToAttribute is called, node properties such as Name, NamespaceURI, and Prefix reflect the attribute rather than the element that contains it.

The following table describes the methods and properties specifically designed for handling attributes.

| Member name | Description |
| --- | --- |
| AttributeCount | Gets the number of attributes on the current node. |
| GetAttribute | Gets the value of the attribute. |
| HasAttributes | Gets a value indicating whether the current node has any attributes. |
| IsDefault | Gets a value indicating whether the current node is an attribute generated from a default value defined in the DTD or schema. |
| Item | Gets the value of the specified attribute (the indexer in C#). |
| MoveToAttribute | Moves to the specified attribute. |
| MoveToElement | Moves to the element that owns the current attribute node. |
| MoveToFirstAttribute | Moves to the first attribute. |
| MoveToNextAttribute | Moves to the next attribute. |
| ReadAttributeValue | Parses the attribute value into one or more Text, EntityReference, or EndEntity nodes. |

Example 1: Use the AttributeCount property to read all the attributes of an element.

    // Display all attributes.
    if (reader.HasAttributes)
    {
        Console.WriteLine("Attributes of <" + reader.Name + ">");
        for (int i = 0; i < reader.AttributeCount; i++)
        {
            Console.WriteLine(" {0}", reader[i]);
        }
        // Move the reader back to the element node.
        reader.MoveToElement();
    }

Example 2: Use the MoveToNextAttribute method in a while loop to read all the attributes of an element.

    if (reader.HasAttributes)
    {
        Console.WriteLine("Attributes of <" + reader.Name + ">");
        while (reader.MoveToNextAttribute())
        {
            Console.WriteLine(" {0}={1}", reader.Name, reader.Value);
        }
        // Move the reader back to the element node.
        reader.MoveToElement();
    }

Example 3: Get the value of an attribute by name.

    reader.ReadToFollowing("book");
    string isbn = reader.GetAttribute("ISBN");
    Console.WriteLine("The ISBN value: " + isbn);

Tip: The ReadToFollowing method reads until it finds an element with the specified qualified name. It can be used to improve performance when looking for a named element in an XML document. If a matching element is found, the reader is positioned on that element and the method returns true; otherwise the reader ends up at the end of the file and the method returns false.
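
The difference between reading an attribute value in place (GetAttribute, the indexer) and moving the reader onto the attribute node (MoveToAttribute) may be easier to see in a self-contained sketch. The class name AttributeAccessDemo, the helper Inspect, and the sample book element with its genre and ISBN values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeAccessDemo
{
    // Records a few observations about attribute access on the
    // first <book> element in the given XML text.
    public static List<string> Inspect(string xml)
    {
        var notes = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            if (!reader.ReadToFollowing("book"))
            {
                return notes; // no <book> element found
            }

            // GetAttribute and the indexer read a value without moving the reader.
            notes.Add("genre=" + reader.GetAttribute("genre"));
            notes.Add("ISBN=" + reader["ISBN"]);
            // A missing attribute yields null rather than an exception.
            notes.Add("missing is null: " + (reader.GetAttribute("missing") == null));

            // MoveToAttribute repositions the reader on the attribute node,
            // so NodeType and Name now describe the attribute.
            if (reader.MoveToAttribute("genre"))
            {
                notes.Add(reader.NodeType + " " + reader.Name);
            }

            // MoveToElement returns to the element that owns the attribute.
            reader.MoveToElement();
            notes.Add(reader.NodeType + " " + reader.Name);
        }
        return notes;
    }

    public static void Main()
    {
        // Hypothetical sample element.
        string xml = "<bookstore><book genre=\"novel\" ISBN=\"1-861003-78\"/></bookstore>";
        foreach (string note in Inspect(xml))
        {
            Console.WriteLine(note);
        }
    }
}
```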

4.4 Reading content

1. Use the Value property

The Value property can be used to get the text content of the current node. The value returned depends on the node type of the current node. The following table describes what is returned for each possible node type.

| Node type | Value |
| --- | --- |
| Attribute | The value of the attribute. |
| CDATA | The content of the CDATA section. |
| Comment | The content of the comment. |
| DocumentType | The internal subset. |
| ProcessingInstruction | The entire content, excluding the target. |
| SignificantWhitespace | The white space between markup in a mixed content model. |
| Text | The content of the text node. |
| Whitespace | The white space between markup. |
| XmlDeclaration | The content of the declaration. |
| All other node types | An empty string. |
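
A short sketch can show a few rows of this table in action. The class name NodeValueDemo, the helper Describe, and the sample XML string are hypothetical; the sample covers only the Comment, Element, CDATA, Text, and EndElement node types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeValueDemo
{
    // Returns "NodeType=[Value]" for every node the reader reports.
    public static List<string> Describe(string xml)
    {
        var lines = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                lines.Add(reader.NodeType + "=[" + reader.Value + "]");
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Hypothetical sample: a comment, an element, a CDATA section, and text.
        string xml = "<!--note--><root><![CDATA[raw]]>text</root>";
        foreach (string line in Describe(xml))
        {
            Console.WriteLine(line);
        }
    }
}
```

Element and EndElement nodes fall under "all other node types" and report an empty string, while the comment, CDATA, and text nodes report their content.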

2. Use the ReadString method

The ReadString method returns the content of an element or text node as a string.

If the XmlReader is positioned on an element, ReadString concatenates all text, significant white space, white space, and CDATA section nodes and returns the concatenated data as the element content. The reader stops when it encounters any markup, which can happen in a mixed content model or when it reaches the element's end tag.

If the XmlReader is positioned on a text node, ReadString performs the same concatenation of text, significant white space, white space, and CDATA section nodes, and stops at the first node that is not one of these types. If the reader is positioned on an attribute text node, ReadString behaves as if the reader were positioned on the element's start tag: it returns all of the element's text nodes concatenated together.
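
The "stops at markup" behavior matters most with mixed content. The following sketch, with a hypothetical class name ReadStringDemo, helper ReadLeadingText, and sample paragraph, shows ReadString returning only the text that precedes the first child element.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReadStringDemo
{
    // Calls ReadString on the root element and reports what was returned
    // and where the reader stopped.
    public static string[] ReadLeadingText(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position on the root element
            string text = reader.ReadString();
            // After ReadString the reader sits on the markup that ended the text run.
            return new[] { text, reader.NodeType.ToString(), reader.Name };
        }
    }

    public static void Main()
    {
        // Hypothetical mixed-content sample.
        string[] result = ReadLeadingText("<p>Hello <b>bold</b> world</p>");
        Console.WriteLine("Text: '{0}'", result[0]);
        Console.WriteLine("Stopped on: {0} {1}", result[1], result[2]);
    }
}
```

Only the text before the b element is returned; the rest of the paragraph has to be read with further calls.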

3. Use the ReadInnerXml method

The ReadInnerXml method returns all the content of the current node, including markup. The current node (start tag) and the corresponding end node (end tag) are not returned. For example, for the XML string <node>this<child id="123"/></node>, ReadInnerXml returns this<child id="123"/>.

| Node type | Initial position | XML fragment | Return value | Position after |
| --- | --- | --- | --- | --- |
| Element | On the start tag of item1. | <item1>text1</item1><item2>text2</item2> | text1 | On the start tag of item2. |
| Attribute | On the attr1 attribute node. | <item attr1="val1" attr2="val2">text</item> | val1 | Remains on the attr1 attribute node. |

If the reader is positioned on a leaf node, calling ReadInnerXml is equivalent to calling Read.

4. Use the ReadOuterXml method

The ReadOuterXml method returns the XML content of the current node and all its children, including markup. It behaves like ReadInnerXml, except that it also returns the start tag and the end tag.

Using the values in the table above, if the reader is positioned on the start tag of item1, ReadOuterXml returns <item1>text1</item1>. If the reader is positioned on the attr1 attribute node, ReadOuterXml returns attr1="val1".
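
The element rows of the table can be reproduced with a short sketch. The class name InnerOuterXmlDemo and the helper ReadBoth are hypothetical, and the item1/item2 fragment is wrapped in a hypothetical root element so that it forms a well-formed document.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterXmlDemo
{
    // Reads item1 with ReadInnerXml, then the following element with
    // ReadOuterXml, and returns: inner XML, the name of the element the
    // reader landed on, and that element's outer XML.
    public static string[] ReadBoth(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("item1");

            // Returns the content without the item1 tags and moves the
            // reader past </item1>, onto the start tag of item2.
            string inner = reader.ReadInnerXml();
            string landedOn = reader.Name;

            // Returns the content together with the start and end tags.
            string outer = reader.ReadOuterXml();

            return new[] { inner, landedOn, outer };
        }
    }

    public static void Main()
    {
        string xml = "<root><item1>text1</item1><item2>text2</item2></root>";
        foreach (string part in ReadBoth(xml))
        {
            Console.WriteLine(part);
        }
    }
}
```

Because both methods advance the reader past the end tag, take care when combining them with a while (reader.Read()) loop: the next Read call can step over the node the reader has just landed on.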

5. A simple example

Parse the menu data in food.xml and display it in a simple format.

The content of food.xml is as follows:

    <?xml version="1.0" encoding="utf-8" ?>
    <breakfast_menu>
      <food>
        <name>Belgian Waffles</name>
        <price>$5.95</price>
        <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
        <calories>650</calories>
      </food>
      <food>
        <name>Strawberry Belgian Waffles</name>
        <price>$7.95</price>
        <description>light Belgian waffles covered with strawberries and whipped cream</description>
        <calories>900</calories>
      </food>
    </breakfast_menu>

C# code:

    using System;
    using System.Xml;

    namespace myXmlReader
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create the XmlReader instance; the using block disposes it when done.
                using (XmlReader reader = XmlReader.Create(@"E:\kemi\CodeNow\Project\XmlReader\food.xml"))
                {
                    while (reader.Read())
                    {
                        // Only element start tags are of interest here.
                        if (reader.NodeType == XmlNodeType.Element)
                        {
                            switch (reader.Name)
                            {
                                case "breakfast_menu":
                                    Console.WriteLine("===========breakfast menu==========");
                                    break;
                                case "name":
                                    // Read the data with ReadString.
                                    Console.WriteLine("Name:{0}", reader.ReadString());
                                    break;
                                case "price":
                                    Console.WriteLine("Price:{0}", reader.ReadString());
                                    break;
                                case "description":
                                    // Read the data with ReadInnerXml. It moves past the end tag;
                                    // the white space after </description> in food.xml keeps the
                                    // next Read() from skipping <calories>.
                                    Console.WriteLine("Description:{0}", reader.ReadInnerXml());
                                    break;
                                case "calories":
                                    Console.WriteLine("Calories:{0}", reader.ReadInnerXml());
                                    break;
                                default:
                                    // Print a blank line for any other element, such as <food>.
                                    Console.WriteLine("");
                                    break;
                            }
                        }
                    }
                }

                // Wait for a key press before the console window closes.
                Console.Read();
            }
        }
    }

Reference: https://cloud.tencent.com/developer/article/1115912 (Meet XmlReader, Tencent Cloud Community)