7

 

XML Classes

 

The eXtensible Markup Language (XML) is a subset of the Standard Generalized Markup Language (SGML), which is the ISO standard numbered ISO 8879. SGML was perceived to be too colossal and too convoluted to be put to any pragmatic use. Thus, a subset of this language, XML, was developed to work seamlessly with both SGML and HTML. XML may be considered a restricted form of SGML, since every XML document conforms to the rules of an SGML document.

 

XML was created in 1996 under the auspices of the World Wide Web Consortium (W3C), by a group chaired by Jon Bosak. This group spelt out 10 ground rules for XML, with 'ease of use' as the fundamental philosophy. From then on, expectations escalated to the point where XML was expected to eradicate world poverty and generally rid the world of all its tribulations. To be precise, XML was overvalued, way beyond realistic levels. There are people who appear to be extremely infatuated with XML, even though they may not have read a single rule or specification of the language.

 

The specifications of XML, laid down by its three primary authors, Tim Bray, Jean Paoli and C. M. Sperberg-McQueen, are accessible at the web site http://www.w3.org/XML.

 

 

An XML document consists of entities, which are made up of character data and markup. An XML file is made up of a myriad of components, which shall be unravelled one at a time, after we have discerned the basic concepts of this language. We commence this chapter by introducing a program that generates an XML file.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.Flush();

a.Close();

}

}

 

In this program, we use a class called XmlTextWriter, which comes from the System.Xml namespace. An instance 'a' of the XmlTextWriter class is created, by passing two parameters to the constructor:

   The first parameter, b.xml, is a string and represents the name of the file to be created. If the file exists in the current directory, it gets deleted and then recreated, but with zero bytes.

   The second parameter represents the Encoding to be used. Here, it is specified as null.

 

Unicode is a standard that assigns a unique number to every character, so that all the languages in the world can be represented by it. In the .Net world, a char is a 16-bit Unicode value, and we are furnished with classes whose methods facilitate the conversion of arrays and strings made up of Unicode characters, to and from arrays made up of bytes alone.

 

The System.Text namespace has a large number of Encoding implementations, such as the following:

     The ASCII Encoding encodes the Unicode characters as 7-bit ASCII.

     The UTF8 Encoding class encodes Unicode characters using UTF-8 encoding.

UTF-8 stands for UCS Transformation Format, 8-bit. It supports all Unicode characters, and it stores each letter of the English alphabet in a single byte. It is normally accessed as code page 65001. Here, since we have specified the second parameter as null, the XmlTextWriter writes the file in UTF-8 and omits the encoding attribute from the XML declaration.
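The conversion between Unicode characters and bytes can be tried out without writing any XML at all. The following is a minimal sketch that uses the Encoding classes of the System.Text namespace. The class name EncodingDemo, the helper ByteCount and the sample string are our own assumptions, chosen only for illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Returns the number of bytes that the given encoding needs for the text.
    public static int ByteCount(Encoding enc, string text)
    {
        return enc.GetBytes(text).Length;
    }

    public static void Main()
    {
        string s = "A\u00e9";   // capital A followed by e-acute

        Console.WriteLine(ByteCount(Encoding.UTF8, s));     // 3: one byte for A, two for e-acute
        Console.WriteLine(ByteCount(Encoding.Unicode, s));  // 4: two bytes per character in UTF-16

        // ASCII cannot represent e-acute, so it is replaced by a question mark.
        Console.WriteLine(Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(s)));     // A?

        // UTF-8 converts the bytes back to the original string without any loss.
        Console.WriteLine(Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s)) == s);  // True
    }
}
```

The English letter occupies a single byte in UTF-8, which is why a file containing only such letters looks identical in ASCII and in UTF-8.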

 

If we execute the program at this stage, the file b.xml is created, but it contains zero bytes, since nothing has been written to it. Once we begin writing, the output is not guaranteed to reach the file until a function named Flush is called.

 

Each time we ask the class XmlTextWriter to write to a file, it may not oblige immediately, but may place the output in a buffer. Only when the buffer becomes full, will it write to the file. This approach avoids the overhead of accessing the file on the disk repeatedly, and thereby improves efficiency. The Flush function flushes the buffer to the file stream, but it does not close the file.

 

The Close function performs the twin tasks of flushing the buffer to the file and closing the file. We call Flush and then Close out of caution, even though Close alone is adequate to carry out both these tasks.
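The effect of the buffer can be observed with a small experiment. The sketch below is only an illustration under our own assumptions: it writes to a MemoryStream instead of the file b.xml, so that the number of bytes that have actually arrived can be inspected, and the names FlushDemo and WriteAndFlush are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class FlushDemo
{
    // Writes one element to an in-memory stream and returns what the
    // stream holds once Flush has been called.
    public static string WriteAndFlush(out long lengthBeforeFlush)
    {
        MemoryStream ms = new MemoryStream();
        XmlTextWriter a = new XmlTextWriter(ms, null);  // null selects UTF-8, as in the text
        a.WriteElementString("vijay", "mukhi");

        lengthBeforeFlush = ms.Length;  // the output is normally still in the buffer here
        a.Flush();                      // pushes the buffered output into the stream

        string result = Encoding.UTF8.GetString(ms.ToArray());
        a.Close();                      // flushes once more and closes the stream
        return result;
    }

    public static void Main()
    {
        long before;
        string after = WriteAndFlush(out before);
        Console.WriteLine(before);      // bytes that reached the stream before Flush
        Console.WriteLine(after);       // <vijay>mukhi</vijay>
    }
}
```

For an output as small as this, the first number is expected to be zero, because the buffer is nowhere near full when Flush is called.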

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

 

Here, we have called the function WriteStartDocument from the XmlTextWriter class, without any parameters. It produces the line <?xml version="1.0"?>, in the file b.xml.

A line that begins with <?xml is called an XML declaration. Every item in an XML file is described as a node. An XML file should begin with an XML Declaration node. There can be only one such node in an XML file, and it must be placed at the very start of the file. The declaration carries an attribute called version, which is initialized to a value of 1.0.

 

The XML 1.0 specification does not commit itself to any future version numbers. It merely states that the value 1.0 indicates conformance to that version of the specification. In other words, the only mandatory attribute in the declaration is version, and the value that we use throughout is 1.0.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.WriteDocType("vijay", null, null ,null);

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?><!DOCTYPE vijay>

 

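The next vital declaration is the DOCTYPE declaration. An XML file can have at most one DOCTYPE declaration, and it names the root tag. In our case, the root tag would be 'vijay'. The declaration is optional in a well-formed file, but it is required if the file is to be validated against a DTD.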

 

An XML file is made up of tags, which are words enclosed within angle brackets. The file also contains rules, which bind the tags. The remaining three parameters of the function WriteDocType, namely the public identifier, the system identifier and the internal subset, are presently specified as null. You may refer to the documentation to decipher the values that may be used in place of null. If this does not appeal to you, you may have to hold your horses, till we furnish the explanation at an appropriate time.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

 

In the earlier example, all the nodes were displayed on the same line. We would indubitably prefer every node to be displayed on a new line. The Formatting property of XmlTextWriter is used to accomplish this task. Formatting can be assigned one of two values: Indented or None. By default, the value assigned is None.

 

The Indented option indents the child elements by 2 spaces. The magnitude of the indent may be altered by stipulating a new value for the Indentation property. In our program, we want the indent to be 3 spaces deep. Hence, we stipulate the value as 3. As is evident, not all nodes get indented. For example, the DOCTYPE node is not indented, since it is not the child of any tag; it is merely placed on a new line.

 

The IndentChar property may be supplied with the character that is to be employed for indentation. By default, a space character is used for this purpose.
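The three formatting properties can be seen working together in the following sketch. It indents with a single tab per level instead of spaces. The output is collected in a StringWriter instead of the file b.xml, purely so that it can be printed, and the names IndentDemo and Build are our own assumptions.

```csharp
using System;
using System.IO;
using System.Xml;

public static class IndentDemo
{
    // Builds a small document that is indented with one tab per level.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.Formatting = Formatting.Indented;
        a.Indentation = 1;      // one indent character per level
        a.IndentChar = '\t';    // a tab in place of the default space

        a.WriteStartElement("vijay");
        a.WriteElementString("surname", "mukhi");
        a.WriteEndElement();

        a.Flush();
        string result = sw.ToString();
        a.Close();
        return result;
    }

    public static void Main()
    {
        // The tag surname appears on its own line, preceded by a single tab.
        Console.WriteLine(Build());
    }
}
```

The line breaks in the output follow the convention of the platform on which the program is run.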

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay />

 

The function WriteStartElement accepts a single parameter, which is the name of the tag to be written to the XML file. This is an oft-repeated instruction, to be used in almost every program, since an XML file basically consists of tags. A tag normally has a start point and an end point, and it confines its contents within these two extremities. However, there are tags that have no contents. Such tags end with a / symbol. Here, we never close the tag ourselves; the Close function closes any tag that is still open, and since 'vijay' has no contents, it is written as an empty tag.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" />

 

The newly added function WriteAttributeString accepts two parameters, which it writes in the form of a name-value pair. Thus, along with 'vijay', we see the attribute named 'wife', having a value of 'sonal'. An attribute is analogous to an adjective in the English language, in that it describes an object. In our case, it describes the tag 'vijay'. It divulges additional information about the tag.

 

XML does not interpret the contents of these tags. The word 'wife' or the value 'sonal', have no special significance for XML, which is absolutely unconcerned about the information provided within the tags.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteElementString("surname", "mukhi");

a.Flush();

a.Close();

}

}

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal">

   <surname>mukhi</surname>

</vijay>

 

The function WriteElementString writes a complete element, i.e. a start tag, the text within it and an end tag. Here, we have a tag 'surname' containing the value 'mukhi'. We can have multiple tags within the root tag.

 

We have been reiterating the fact that we need to adhere to specific rules. To see what happens when we stray from the beaten path, interchange the two newly added functions as follows:

 

a.WriteElementString("surname", "mukhi");

a.WriteAttributeString ("wife","sonal");

 

As a fallout of this interchange, the following exception will be thrown:

 

Unhandled Exception: System.InvalidOperationException: Token StartAttribute in state Content would result in an invalid XML document.

 

This exception is triggered because the attributes of a tag must be written first. Then, and only then, can the child tags within the tag be written.
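Instead of letting the program terminate, the exception can be caught and examined. The following sketch reproduces the interchange in a controlled manner. It writes to a StringWriter instead of a file, and the names OrderDemo and AttributeAfterContentFails are hypothetical, introduced only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class OrderDemo
{
    // Returns true if writing an attribute after a child tag is rejected.
    public static bool AttributeAfterContentFails()
    {
        XmlTextWriter a = new XmlTextWriter(new StringWriter());
        a.WriteStartElement("vijay");
        a.WriteElementString("surname", "mukhi");     // content has now been written
        try
        {
            a.WriteAttributeString("wife", "sonal");  // too late for an attribute
            return false;
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AttributeAfterContentFails());  // True
    }
}
```

Once this exception has been thrown, the writer is left in an error state, so the sketch makes no further attempt to write with it.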

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteAttributeString ("friend","two");

a.WriteElementString("surname", "mukhi");

a.WriteElementString("books", "67");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" friend="two">

   <surname>mukhi</surname>

   <books>67</books>

</vijay>

 

To summarize, the WriteDocType function names the root tag, WriteStartElement starts a tag, WriteAttributeString adds an attribute to the active tag, and WriteElementString writes a complete tag within a tag. We can add as many attributes as we desire; they are all clustered together in the start tag. The WriteElementString function is likewise capable of creating as many tags as are needed under a tag.

 

In the file b.xml, we see two attributes and two tags, under the root tag 'vijay'.  

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("wife","sonal");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">

   <mukhi wife="sonal" />

</vijay>

 

In the above example, 'vijay' is the root tag, with the attribute 'friend', which is assigned a value of 'two'. It also has a child tag 'mukhi', having the attribute 'wife' initialized to 'sonal'. Both the tags, 'vijay' and 'mukhi', are created using the function WriteStartElement. Unlike the function WriteElementString, which creates a start tag and an end tag, WriteStartElement creates only a start tag.

 

A child tag too can be endowed with attributes. The active tag is the one last started by the WriteStartElement function. Functions such as WriteAttributeString act on the active tag. Thus, we notice that the attribute 'wife' belongs to the tag 'mukhi' and not to 'vijay'. Finally, since the tag 'mukhi' is devoid of any contents, it ends with a / symbol on the same line.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("wife","sonal");

a.WriteFullEndElement();

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">

   <mukhi wife="sonal">

   </mukhi>

</vijay>

 

The function WriteFullEndElement marks the end of the active tag by writing a full end tag. Therefore, the tag 'mukhi' does not end with a / symbol on the same line. It has an ending tag instead. Both these forms are equally valid in this case. But, if a tag embodies any contents, then both the start and the end tags are mandatory. In such situations, a single empty tag would just not suffice.
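The two ways of ending an empty tag can be compared side by side. The sketch below is a minimal illustration. It uses the function WriteEndElement, which this chapter has not called explicitly so far, to end the tag in its short form. The names EndElementDemo and Write are our own assumptions, and the output goes to a StringWriter instead of a file.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EndElementDemo
{
    // Writes an empty tag, ending it either in the short or in the full form.
    public static string Write(bool full)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("mukhi");
        if (full)
            a.WriteFullEndElement();   // always writes a separate end tag
        else
            a.WriteEndElement();       // writes the short form if the tag is empty
        a.Flush();
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Write(false));  // <mukhi />
        Console.WriteLine(Write(true));   // <mukhi></mukhi>
    }
}
```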

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

//a.WriteComment("comment 1");

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteStartDocument();

a.WriteComment("comment 1");

a.WriteDocType("vijay", null, null ,null);

a.WriteComment("comment 2");

a.WriteStartElement("vijay");

a.WriteAttributeString ("wife","sonal");

a.WriteComment("comment 3");

a.WriteElementString("surname", "mukhi");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!--comment 1-->

<!DOCTYPE vijay>

<!--comment 2-->

<vijay wife="sonal">

   <!--comment 3-->

   <surname>mukhi</surname>

</vijay>

 

Every programming language extends the facility of writing comments, even though it may be a seldom used feature. Programmers insert comments amidst their code to document or explain the functioning of their programs. At times, comments assist in deciphering the code from the programmer's perspective. Practically, it may be easier to teach an elephant how to tap-dance, than to convince a programmer to write comments.

 

In the XML world, comments begin with <!-- and end with -->. This is the same as the HTML syntax, since both languages inherit it from SGML.

 

Comments are like a liquid, since they can be moulded to fit in almost anywhere, except at the very beginning of the file. The first line in an XML file has to be the declaration. If you remove the // from the first call to the function WriteComment, so that a comment is written before WriteStartDocument is called, an exception will be thrown with the following message:

 

Unhandled Exception: System.InvalidOperationException: WriteStartDocument should be the first call.

 

 

Thus, the function WriteComment can be used to insert comments almost anywhere in the XML file, primarily for the purpose of documentation, which would enable even an alien from outer space to decipher the file better.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteProcessingInstruction ("sonal", "mukhi=no");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>

   <?sonal mukhi=no?>

</vijay>

 

A line beginning with <? is called a Processing Instruction (PI). This line is inserted using the function WriteProcessingInstruction, which is passed two parameters:

     the first is the name of the processing instruction.

     the second is the text that is to be inserted for the processing instruction.

 

A Processing Instruction is used by XML to communicate with other programs during the performance of certain tasks. XML itself does not have the wherewithal to execute instructions. The XML processor, i.e. the program that reads the XML file, therefore hands each processing instruction over to the application that it is working for. If the application understands the instruction, it acts on it. In cases where it cannot comprehend it, the instruction is simply ignored. This is the methodology by which XML communicates with external programs.

 

In our case, the instruction 'sonal' would be ignored, since no program attaches any meaning to it.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteString("mukhi");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>mukhi</vijay>

 

An XML file mainly consists of strings and tags. The WriteString function is very extensively exploited, since it writes content/strings between tags.

 

In the above example, the text 'mukhi' is enclosed within the tags of 'vijay'. Even though we have not explicitly asked the XmlTextWriter class to close the tag, the Close function closes it for us. A full ending tag has been used, because there exists some content after the opening tag.

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString ("friend","two");

a.WriteString("hi");

//a.WriteAttributeString ("friend","three");

a.WriteStartElement("mukhi");

a.WriteAttributeString ("friend","two");

a.WriteString("bye");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay friend="two">hi<mukhi friend="two">bye</mukhi></vijay>

 

The function WriteString can be inserted almost anywhere in the program. The first WriteString function writes 'hi' between the tags of 'vijay', while the second WriteString function writes 'bye' between the tags of 'mukhi'. The WriteString function is aware of the active tag. Therefore, it inserts the text accordingly. Here too, if we uncomment the line a.WriteAttributeString("friend","three"), the following exception will be generated:

 

Unhandled Exception: System.InvalidOperationException: Token StartAttribute in state Content would result in an invalid XML document.

 

The XmlTextWriter class is very strict and meticulous, in the sense that it expects a certain order to be maintained, or else it throws an exception. For instance, an element or a tag has to be created first. Only then can all its attributes be written; and finally, the text or content has to be supplied. We are not permitted to write the text first and enter the attributes later. In the XmlTextWriter class, there is no going back. It is a one-way path, which only moves in the forward direction.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteCharEntity ('A');

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&#x41;</vijay>

 

During our exploratory journey of XML, we shall come across a number of characters that are 'reserved'. They have a special significance and cannot be used literally. One way of writing such a character is as a character reference, i.e. as its Unicode value in hex. The function WriteCharEntity performs this task. It accepts a char as a parameter and writes its value in hex, prefaced with the &#x symbol and followed by a semi-colon.

 

For those who consider hexadecimal to be Greek and Latin, 41 hex is equal to 65 in decimal, which is the ASCII value of the capital letter A. You can pass different characters to this function and see their equivalent hex values.
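A program that reads the file converts the character reference back into the original character. The following sketch demonstrates this round trip, under our own assumptions: the XML is written to a StringWriter instead of a file, it is read back using the XmlDocument class, which has not been introduced in this chapter so far, and the names CharEntityDemo, Write and ReadBack are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class CharEntityDemo
{
    // Writes the given character as a hex character reference inside a tag.
    public static string Write(char c)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("vijay");
        a.WriteCharEntity(c);
        a.WriteEndElement();
        a.Flush();
        return sw.ToString();
    }

    // Parses the XML and returns the text held by the root tag.
    public static string ReadBack(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        return d.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string xml = Write('A');
        Console.WriteLine(xml);            // <vijay>&#x41;</vijay>
        Console.WriteLine(ReadBack(xml));  // A
    }
}
```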

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteCData("mukhi & <sonal>");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay><![CDATA[mukhi & <sonal>]]></vijay>

 

The above program introduces a new function called WriteCData, which creates a node called a CDATA section. The parameter passed to this function is written as it is, enclosed between <![CDATA[ and ]]>.

 

A CDATA section is used whenever we want to use characters such as <, >, & and the like in their literal sense, which would otherwise be mistaken for markup characters. Thus, in the above program, the symbol & within the CDATA section is read as the literal character &, and not as a special character. Also, <sonal> is not recognized as a tag in this section. A CDATA section cannot be nested within another CDATA section.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteString("<A>&");

a.WriteCData("<A>&");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&lt;A&gt;&amp;<![CDATA[<A>&]]></vijay>

 

This program illustrates certain characters that are special to XML. These are the obvious characters, such as <, > and &, since they are used to mark up an XML file. Thus, whenever the WriteString function comes across the following symbols, it replaces them with the symbols depicted against each:

     < is replaced with '&lt;'

     > is replaced with '&gt;'

     & is replaced with '&amp;'.

 

If the same string containing the above-mentioned special characters is placed within a CDATA section, it gets written verbatim, without any conversions.
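Although the two forms look very different in the file, a program that reads the file receives the same text from both. The sketch below illustrates this under our own assumptions: the output goes to a StringWriter, the XmlDocument class is used to read it back, and the names EscapeDemo, Write and ReadBack are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EscapeDemo
{
    // Writes the same text twice: once as a string and once as a CDATA section.
    public static string Write(string text)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("vijay");
        a.WriteString(text);   // special characters are converted
        a.WriteCData(text);    // special characters are written verbatim
        a.WriteEndElement();
        a.Flush();
        return sw.ToString();
    }

    // Parses the XML and returns the text held by the root tag.
    public static string ReadBack(string xml)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        return d.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string xml = Write("<A>&");
        Console.WriteLine(xml);            // <vijay>&lt;A&gt;&amp;<![CDATA[<A>&]]></vijay>
        Console.WriteLine(ReadBack(xml));  // <A>&<A>&
    }
}
```

The choice between the two is therefore a matter of how readable the file should be, and not of what the file means.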

 

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteEntityRef("Hi");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>&Hi;</vijay>

 

An entity reference is very straightforward to understand. The string passed to the function WriteEntityRef is placed in the XML file, preceded by an & sign and followed by a semi-colon. An entity reference in XML is equivalent to a variable. It is included to provide flexibility.

 

Thus, in the above code, a reference to a variable called 'Hi' is written. What 'Hi' signifies has to be declared in the DTD of the XML file, which our file does not do as yet.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteRaw("<A>&");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay><A>&</vijay>

 

The WriteRaw function writes the characters passed to it, without carrying out any conversions. The above XML file is obviously erroneous, as no end tag has been specified for the tag A. Also, no name has been specified after the & sign.
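Whether a piece of XML is erroneous can be checked by handing it over to a parser. The following sketch does so with the XmlDocument class, which throws an XmlException when the text is not well-formed. This is merely one way of performing the check, and the names RawDemo, Build and IsWellFormed are our own assumptions.

```csharp
using System;
using System.IO;
using System.Xml;

public static class RawDemo
{
    // Writes the raw text inside the tag vijay, without any conversions.
    public static string Build(string raw)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("vijay");
        a.WriteRaw(raw);
        a.WriteEndElement();
        a.Flush();
        return sw.ToString();
    }

    // Returns true if a parser accepts the XML, false if it rejects it.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsWellFormed(Build("<A>&")));             // False
        Console.WriteLine(IsWellFormed(Build("<A>hi</A>&amp;")));  // True
    }
}
```

The onus of supplying correct markup to WriteRaw thus lies entirely with the programmer.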

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

Boolean b = true;

a.WriteElementString("Logical", XmlConvert.ToString(b));

Int32 c = -2147483648;

a.WriteElementString("SmallInt", XmlConvert.ToString(c));

Int64 d = 9223372036854775807;

a.WriteElementString("Largelong", XmlConvert.ToString(d));

Single e = ((Single)22)/((Single)7);

a.WriteElementString("Single", XmlConvert.ToString(e));

Double f = 1.79769313486231570E+308;

a.WriteElementString("Double", XmlConvert.ToString(f));

DateTime h = new DateTime(2001, 07, 08 ,22, 0, 30, 500);

a.WriteElementString("DateTime", XmlConvert.ToString(h));

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay>

   <Logical>true</Logical>

   <SmallInt>-2147483648</SmallInt>

   <Largelong>9223372036854775807</Largelong>

   <Single>3.142857</Single>

   <Double>1.7976931348623157E+308</Double>

   <DateTime>2001-07-08T22:00:30.5000000+05:30</DateTime>

</vijay>

 

The above example contains a plethora of data types, such as Boolean, Int32, Int64, Single, Double and DateTime.

 

The XmlConvert class has a large number of static functions that help us convert one data type to another. One such function is the ToString function. For types such as int and long, the smallest or the largest values are used, in order to check the veracity of the ToString function. The DateTime value is written along with the offset of the local time zone of the machine, which is +05:30 in our case.

 

The ToString function is overloaded to handle many more data types than we have shown. The point here is that it is possible for us to convert any of these data types into a string and write it to disk. This gains immense importance when data is being received from a database and needs to be converted into a string in an XML file.
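The XmlConvert class also works in the opposite direction, converting the strings found in an XML file back into data types. The sketch below shows a round trip for a few types, and contrasts XmlConvert with the ordinary ToString function of C#. The names ConvertDemo and RoundTrip are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Xml;

public static class ConvertDemo
{
    // Converts an int to its XML string form and back again.
    public static int RoundTrip(int value)
    {
        return XmlConvert.ToInt32(XmlConvert.ToString(value));
    }

    public static void Main()
    {
        // XML expects the lowercase 'true', unlike the ToString function of C#.
        Console.WriteLine(XmlConvert.ToString(true));   // true
        Console.WriteLine(true.ToString());             // True

        Console.WriteLine(RoundTrip(-2147483648) == Int32.MinValue);      // True
        Console.WriteLine(XmlConvert.ToBoolean("true"));                  // True

        // Infinity has its own spelling in XML.
        Console.WriteLine(XmlConvert.ToString(Double.PositiveInfinity));  // INF

        // A DateTime can be written in UTC, to avoid the local time zone offset.
        DateTime h = new DateTime(2001, 7, 8, 22, 0, 30, 500, DateTimeKind.Utc);
        Console.WriteLine(XmlConvert.ToString(h, XmlDateTimeSerializationMode.Utc));
    }
}
```

The strings produced by XmlConvert do not depend on the regional settings of the machine, which is what makes them safe to exchange between programs.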

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

a.WriteString("1-861003-78");

a.WriteEndAttribute();    

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />

 

In the above example, we have introduced the WriteStartAttribute function. As is apparent from its name, it starts an attribute. The first parameter to this function is 'hi', which is the namespace prefix of the attribute. The second parameter 'mukhi' is the name of the attribute.

 

The names assigned to attributes and tags may not always result in a unique name. A programmer may inadvertently create a tag or an attribute with a name that already exists. How then does XML decide what the tag denotes?

 

To help resolve such potential conflicts, the name of a tag or an attribute may be prefaced with a short name known as the namespace prefix. This is followed by a colon sign. Normally, meaningful names are assigned, rather than words like 'hi'. Prefixes like xmlns are reserved by XML. The concept of namespaces in XML is similar in purpose to the concept of namespaces in C#.

 

The third parameter is a Uniform Resource Identifier (URI), which is the actual name of the namespace. The prefix 'hi' is merely a shorthand for this URI, and the attribute xmlns:hi in the output ties the two together. In this case, the URI is xxx:yyy. The URI only has to be unique; nothing needs to exist at that location. As the WriteStartAttribute function does not specify any value for the attribute, the WriteString function is employed to assign the value 1-861003-78 to the attribute 'mukhi' in the namespace 'hi'.
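A program that reads the file identifies the attribute by its name and its URI, and not by the prefix. The following sketch writes the same attribute and then fetches it back using the URI alone. It rests on our own assumptions: the output goes to a StringWriter, the XmlDocument class is used for reading, and the names NamespaceDemo, Build and ReadByUri are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NamespaceDemo
{
    // Writes the tag vijay with the attribute mukhi in the namespace xxx:yyy.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter a = new XmlTextWriter(sw);
        a.WriteStartElement("vijay");
        a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");
        a.WriteString("1-861003-78");
        a.WriteEndAttribute();
        a.WriteEndElement();
        a.Flush();
        return sw.ToString();
    }

    // Fetches the value of an attribute using its name and namespace URI.
    public static string ReadByUri(string xml, string name, string uri)
    {
        XmlDocument d = new XmlDocument();
        d.LoadXml(xml);
        return d.DocumentElement.GetAttribute(name, uri);
    }

    public static void Main()
    {
        string xml = Build();
        Console.WriteLine(xml);
        Console.WriteLine(ReadByUri(xml, "mukhi", "xxx:yyy"));  // 1-861003-78
        // With a different URI, no attribute is found and an empty string is returned.
        Console.WriteLine(ReadByUri(xml, "mukhi", "aaa:bbb").Length);  // 0
    }
}
```

The prefix 'hi' could be changed to any other word without affecting the second half of the program, since only the URI is consulted.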

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");

string p = a.LookupPrefix("sonal:wife");

a.WriteStartAttribute(p, "mukhi", "sonal:wife");

a.WriteString("sonal");

a.WriteEndAttribute();    

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay xmlns:bk="sonal:wife" bk:mukhi="sonal" />

 

Here, the function WriteAttributeString is called with four parameters. The first, as always, is the prefix, i.e. xmlns. The second is the name of the attribute i.e. bk, which is suffixed to the prefix, as xmlns:bk. The third parameter is the namespace URI. In the earlier program, we had specified the value of xxx:yyy for the URI. For this program, since xmlns is a reserved prefix, the URI parameter is specified as null. The last parameter is the value of the attribute.

As a consequence, the above function produces the attribute xmlns:bk="sonal:wife". The next function, LookupPrefix, accepts a namespace URI and returns the prefix that is bound to it. As the parameter supplied to this function is sonal:wife, the prefix returned is bk, which is stored in p.

 

The WriteStartAttribute then uses the following:

     'bk' as the prefix,

     'mukhi' as the name of the attribute, and

     'sonal:wife' as the namespace URI.

 

Thus, the attribute 'mukhi' is prefaced with the prefix 'bk'. Finally, the WriteString function assigns the value of 'sonal' to the attribute bk:mukhi.
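The behaviour of the LookupPrefix function can be examined in isolation. The sketch below declares the prefix bk exactly as the program above does, and then looks up two URIs, one that has been declared and one that has not. The names PrefixDemo and Lookup, as well as the URI no:such, are our own assumptions.

```csharp
using System;
using System.IO;
using System.Xml;

public static class PrefixDemo
{
    // Declares the prefix bk for sonal:wife and then looks up the given URI.
    public static string Lookup(string uri)
    {
        XmlTextWriter a = new XmlTextWriter(new StringWriter());
        a.WriteStartElement("vijay");
        a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");
        string p = a.LookupPrefix(uri);   // searches the declarations in scope
        a.Close();
        return p;
    }

    public static void Main()
    {
        Console.WriteLine(Lookup("sonal:wife"));       // bk
        Console.WriteLine(Lookup("no:such") == null);  // True
    }
}
```

A null return value indicates that no prefix is in scope for the URI, which a program would normally check for before using the prefix.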

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns", "bk", null, "sonal:wife");

a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay xmlns:bk="sonal:wife" jjj:bk="sonal:wife" xmlns:jjj="kkk" />

 

In this version of the WriteAttributeString function, the prefix is jjj and the attribute name is bk, with the value sonal:wife. Thus, the attribute becomes jjj:bk="sonal:wife". The third parameter to the function is the namespace URI, which is now assigned a value of kkk, instead of null.

 

Thus, one more attribute xmlns:jjj gets added, which indicates that the namespace URI for the prefix jjj is kkk. We notice that no such attribute gets added for the xmlns prefix. We have chosen the name 'bk' again, just to demonstrate that the two belong to different namespaces. Therefore, this bk is considered to be a different attribute from the earlier bk.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute(null,"sonal", null);

a.WriteQualifiedName("mukhi", "http://vijaymukhi.com");

a.WriteEndAttribute();

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay sonal="n1:mukhi" xmlns:n1="http://vijaymukhi.com" />

 

In the WriteStartAttribute function, only the second of the three parameters has a value, 'sonal', which is the name of the attribute. The first parameter, which is the namespace prefix, and the third parameter, which is the URI of the namespace, are both assigned null values.

 

The next function, WriteQualifiedName assigns a value to the attribute 'sonal'. This function takes two parameters, the value 'mukhi' and the namespace URI for the value.

 

The value 'mukhi' gets prefaced by a prefix n1, which is generated by the XmlTextWriter and not by us. The prefix n1 is declared through the reserved xmlns prefix, and the URI bound to n1 is the one specified in the second parameter, http://vijaymukhi.com. The method WriteQualifiedName first looks up a prefix that is already in scope for the given namespace URI. Here, none is in scope, and hence, a new prefix is generated.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteStartElement("vijay");

a.WriteAttributeString("xmlns","mukhi",null,"xxx:yyy");

a.WriteString("Hi ");

a.WriteQualifiedName("sonal","xxx:yyy");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<vijay xmlns:mukhi="xxx:yyy">Hi mukhi:sonal</vijay>

 

In this example, we first declare the prefix 'mukhi' through the reserved xmlns prefix and bind it to the URI xxx:yyy. The WriteString function writes 'Hi ' as the content and then, the WriteQualifiedName function writes the name 'sonal'. However, since 'sonal' is a qualified name, it is prefaced by the prefix 'mukhi' and not by the URI xxx:yyy, because 'mukhi' is bound to xxx:yyy.

 

The prefix in the scope for the namespace is given precedence.
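
The lookup can also be observed directly. The following is a minimal sketch, which writes into a StringWriter held in memory instead of the file b.xml, and asks the writer for the prefix in scope through the LookupPrefix function of the XmlTextWriter class. The class name QualifiedNameDemo and the function Build are our own inventions for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public class QualifiedNameDemo
{
    // Writes one element into memory and returns the markup.
    // The out parameter receives the prefix bound to xxx:yyy.
    public static string Build(out string prefix)
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartElement("vijay");
        // Bind the prefix 'mukhi' to the URI xxx:yyy.
        w.WriteAttributeString("xmlns", "mukhi", null, "xxx:yyy");
        // Ask the writer which prefix is in scope for this URI.
        prefix = w.LookupPrefix("xxx:yyy");
        w.WriteString("Hi ");
        // The prefix found in scope is used for the qualified name.
        w.WriteQualifiedName("sonal", "xxx:yyy");
        w.WriteEndElement();
        w.Close();
        return sw.ToString();
    }

    public static void Main()
    {
        string prefix;
        string xml = Build(out prefix);
        Console.WriteLine(prefix);
        Console.WriteLine(xml);
    }
}
```

The first line displayed should be the prefix mukhi, followed by the same element that was seen in b.xml above, but without the XML declaration, since WriteStartDocument is not called here.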

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteStartElement("vijay");

a.WriteElementString("vijay","mukhi");

a.WriteElementString("vijay","sonal","mukhi");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<vijay>

  <vijay>mukhi</vijay>

  <vijay xmlns="sonal">mukhi</vijay>

</vijay>

 

As we have just observed, the WriteElementString function had only two parameters in the earlier program. However, here it has three parameters. The first and the third parameters are the same, i.e. the tag name and the value. The newly inducted second parameter indicates the namespace URI 'sonal'. The tag in the first parameter, 'vijay', belongs to the namespace sonal. Thus, the XML file contains the tag with the attribute xmlns="sonal".

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter (Console.Out);

a.WriteStartDocument();

a.WriteStartElement("vijay");

a.Close();

}

}

 

Output

<?xml version="1.0" encoding="IBM437"?><vijay />

 

The XmlTextWriter class can write to different destinations, using the constructor that accepts a single parameter. The Console class has a static property Out of datatype TextWriter that represents the console. Thus, the output is now displayed on the console. The encoding attribute takes its value from the encoding of the TextWriter that is passed. On our machine, the console uses the code page IBM437, and hence, this is the value displayed.
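
Any other TextWriter may be handed over in the same manner. As a small sketch, the program below passes a StringWriter, so that the XML is collected in memory as a string. The class name StringWriterDemo and the function Build are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public class StringWriterDemo
{
    // Writes a declaration and an empty element into a string.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.WriteStartDocument();
        w.WriteStartElement("vijay");
        w.WriteEndElement();
        w.Close();
        // The text remains available after the writer is closed.
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Since a StringWriter stores its text as Unicode characters, the declaration now reads encoding="utf-16" instead of IBM437, i.e. the output is <?xml version="1.0" encoding="utf-16"?><vijay />.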

 

One of the primary reasons for designing XML was to allow the tags in a file to be validated, so that the XML file is not merely well-formed, but also obeys a stated set of rules.

 

There are a few validations that need to be performed in an XML file, such as:

   It should be ensured that the basic rules of XML as well as our own rules are followed.

   Certain tags should be placed only within specified tags and cannot be used independently.

   The number of times a tag is being used can be regulated, since it cannot be used infinite times.

   A check should be placed on the name and the number of times an attribute is used within a tag.

 

All such rules that need to be enforced are enunciated in XML parlance and then, placed in a DTD or a Document Type Definition. The DTD may either be placed in a separate file or may be made part of the DOCTYPE declaration. In the XML file shown below, the DTD is internal.

 

Thus, a DTD stores the grammar that is permissible in an XML file. The entity references are also defined in a DTD. XHTML, which is HTML rewritten to follow the rules of XML, is a case in point, since the rules that an XHTML file must obey are available in the form of a DTD.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

String s = "<!ELEMENT vijay (#PCDATA)>";

a.WriteDocType("vijay", null, null, s);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay[<!ELEMENT vijay (#PCDATA)>]>

<vijay />

 

The WriteDocType function accepts four parameters. The first parameter is the starting or root tag 'vijay'. Hence, it must contain a value. The last parameter is the subset (as referred to by the documentation), which follows the root tag 'vijay'. If you observe the DOCTYPE statement carefully, you will notice that an extra pair of square brackets [], have been added.
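
The rules placed in such an internal subset can be put to the test by a validating reader. The sketch below is not from the framework version used in this chapter. It assumes a later release of the .NET Framework, which offers the XmlReaderSettings class along with its DtdProcessing and ValidationType properties. The class name DtdCheck and the function CountErrors are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class DtdCheck
{
    // Returns the number of validation problems reported for the markup.
    public static int CountErrors(string xml)
    {
        int errors = 0;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // allow the DOCTYPE to be read
        settings.ValidationType = ValidationType.DTD;   // validate against the DTD
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs e) { errors++; };
        using (XmlReader r = XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation takes place as the nodes are read.
            while (r.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        string dtd = "<!DOCTYPE vijay [<!ELEMENT vijay (#PCDATA)>]>";
        Console.WriteLine(CountErrors(dtd + "<vijay>hi</vijay>"));
        Console.WriteLine(CountErrors(dtd + "<vijay><aa /></vijay>"));
    }
}
```

The first document contains only text within 'vijay', and should therefore, report no problems. The second one places an undeclared tag aa inside 'vijay', and should report at least one.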

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", null, "a.dtd", null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay SYSTEM "a.dtd">

<vijay />

 

The third parameter to WriteDocType function specifies the name of the DTD file. In other words, it states the URI of the DTD. The second parameter is assigned the value of null. Hence, the word SYSTEM is displayed before the name of the file, in the XML file.

 

Whenever a validating parser wishes to ensure the validity of the XML file, it ascertains the rules from a.dtd. If both internal and external DTDs are present, both of them are checked. However, where both of them declare the same entity or attribute list, the internal DTD is accorded priority.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.WriteDocType("vijay", "mmm", "a.dtd", null);

a.WriteStartElement("vijay");

a.Flush();

a.Close();

}

}

 

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay PUBLIC "mmm" "a.dtd">

<vijay />

 

In the earlier program, SYSTEM was added in the XML file, since the second parameter had been specified as null. However, in this program, the second parameter is not null. Hence, the word PUBLIC gets added. Thereafter, the string or the id specified in the second parameter is added. And then, the dtd in the third parameter is specified.

 

Therefore, the DOCTYPE declaration carries either a SYSTEM identifier alone, or a PUBLIC identifier followed by a system literal. The XML program or the processor scanning the XML file may use the PUBLIC identifier to locate the DTD. If it fails, it falls back upon the SYSTEM literal.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument(false);

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0" standalone="no"?>

 

The WriteStartDocument function can take a boolean parameter that adds an attribute, which is either standalone="yes" or standalone="no", depending upon the value specified. This attribute states whether the XML file depends upon declarations held in an external file, or whether everything it needs is internal to the XML file. If standalone has a value of 'yes', it is suggestive of the fact that there is no external DTD, and therefore, all the grammatical rules have to be placed within the XML file itself.

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

System.Console.WriteLine(a.WriteState);

a.WriteStartDocument();

System.Console.WriteLine(a.WriteState);

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

System.Console.WriteLine(a.WriteState);

a.WriteStartElement("vijay");

System.Console.WriteLine(a.WriteState);

a.WriteAttributeString ("wife","sonal");

System.Console.WriteLine(a.WriteState);

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

System.Console.WriteLine(a.WriteState);

a.WriteString("1-861003-78");

a.WriteElementString("surname", "mukhi");

a.Flush();

System.Console.WriteLine(a.WriteState);

a.Close();

System.Console.WriteLine(a.WriteState);

}

}

 

Output

Start

Prolog

Prolog

Element

Element

Attribute

Content

Closed

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay wife="sonal" hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy">

   <surname>mukhi</surname>

</vijay>

 

The XmlTextWriter object can be in any one of six different states. The WriteState property reveals its current state. When an XmlTextWriter object is created, it is in the Start state, since no write method has been called so far. After the Close function, the writer is in the Closed state. When the WriteStartDocument and WriteDocType functions are called, the writer is in the Prolog state, because the prolog is being written.

 

The WriteStartElement function starts writing the root element to the XML file, thereby moving the writer to the Element state. The element start tag 'vijay' begins the body of the XML file. The next function WriteAttributeString does not change the state, since the start tag of 'vijay' is still open. The WriteStartAttribute function needs the WriteString to complete the attribute. Thus, after the WriteStartAttribute function executes, the writer assumes the Attribute state. The surname element, written by WriteElementString, becomes the content of 'vijay' in the XML file. Hence, the state changes to Content.

 

This goes on to prove that the XmlTextWriter can be in any one of the above six states, depending upon the entities written to the file. Note that the WriteElementString function was called while the writer was still in the Attribute state. As the output reveals, no exception was thrown. The writer first completed the open attribute hi:mukhi and the start tag of 'vijay', and only then did it write the surname element.
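
The transitions can be traced without creating a file. The following sketch records the WriteState property after each call, while writing into a StringWriter. The class name StateDemo and the function Trace are hypothetical names chosen for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public class StateDemo
{
    // Returns the states of the writer, separated by commas.
    public static string Trace()
    {
        XmlTextWriter w = new XmlTextWriter(new StringWriter());
        string t = w.WriteState.ToString();      // nothing written so far
        w.WriteStartElement("vijay");
        t += "," + w.WriteState;                  // the start tag is open
        w.WriteStartAttribute("hi", "mukhi", "xxx:yyy");
        t += "," + w.WriteState;                  // an attribute is open
        w.WriteString("1-861003-78");
        w.WriteEndAttribute();
        t += "," + w.WriteState;                  // back in the start tag
        w.WriteString("text");
        t += "," + w.WriteState;                  // element content
        w.WriteEndElement();
        w.Close();
        t += "," + w.WriteState;                  // the writer is closed
        return t;
    }

    public static void Main()
    {
        Console.WriteLine(Trace());
    }
}
```

The program displays Start,Element,Attribute,Element,Content,Closed. The Prolog state is absent, since neither WriteStartDocument nor WriteDocType is called in this sketch.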

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.Namespaces = false;

a.WriteStartDocument();

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");

a.Flush();

a.Close();

}

}

 

Output

Unhandled Exception: System.ArgumentException: Cannot set the namespace if Namespaces is 'false'.

at System.Xml.XmlTextWriter.WriteStartAttribute(String prefix, String localName, String ns)

at System.Xml.XmlWriter.WriteAttributeString(String prefix, String localName, String ns, String value)

at zzz.Main()

 

The XmlTextWriter class has a Namespaces property that is read-write, and it has a default value of true. Namespace support is turned off by setting this property to false. The above runtime exception is thrown because we have attempted to introduce a prefix jjj along with a namespace URI kkk, in the WriteAttributeString function.
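
If a program cannot be certain whether namespace support is on, the call can be guarded. The sketch below simply catches the ArgumentException shown in the output above. The class name NamespacesOffDemo and the function PrefixRejected are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class NamespacesOffDemo
{
    // Returns true if the writer refuses the namespace-qualified attribute.
    public static bool PrefixRejected()
    {
        XmlTextWriter w = new XmlTextWriter(new StringWriter());
        // The property is set before anything is written.
        w.Namespaces = false;
        w.WriteStartElement("vijay");
        try
        {
            w.WriteAttributeString("jjj", "bk", "kkk", "sonal:wife");
            return false;
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message);
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(PrefixRejected());
    }
}
```

The program now ends normally, after displaying the message of the exception followed by True.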

 

a.cs

using System;

using System.Xml;

public class zzz {

public static void Main() {

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.QuoteChar = '\'';

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteAttributeString("jjj", "bk");

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay jjj='bk' />

 

Various facets of the output can be modified. By using the property QuoteChar, we can modify the default quoting character, from double inverted commas to single inverted commas. Since a single quote cannot be enclosed within a set of single quotes in C#, we use the backslash to escape it. All attributes are now placed in single quotes instead of double quotes.
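
A natural question is what happens when the value of an attribute itself contains the quoting character. The sketch below writes such a value with QuoteChar set to a single quote, and then reads it back with an XmlTextReader. We expect the writer to escape the apostrophe, but the exact form of the escape is not relied upon here. The class name QuoteDemo and its functions are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class QuoteDemo
{
    // Writes an attribute whose value contains a single quote.
    public static string Build()
    {
        StringWriter sw = new StringWriter();
        XmlTextWriter w = new XmlTextWriter(sw);
        w.QuoteChar = '\'';
        w.WriteStartElement("vijay");
        w.WriteAttributeString("jjj", "it's");
        w.WriteEndElement();
        w.Close();
        return sw.ToString();
    }

    // Reads the attribute jjj back from the markup.
    public static string ReadBack(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        return r.GetAttribute("jjj");
    }

    public static void Main()
    {
        string xml = Build();
        Console.WriteLine(xml);
        Console.WriteLine(ReadBack(xml));
    }
}
```

The second line displayed is the original value it's, which shows that the value survives the round trip, irrespective of the quoting character in use.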

 

a.cs

using System;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextWriter a = new XmlTextWriter ("b.xml", null);

a.WriteStartDocument();

a.Formatting = Formatting.Indented;

a.Indentation = 3;

a.WriteDocType("vijay", null, null ,null);

a.WriteStartElement("vijay");

a.WriteStartAttribute("hi", "mukhi", "xxx:yyy");

a.WriteString("1-861003-78");

a.WriteEndAttribute();    

a.WriteEndElement();

a.WriteEndDocument();

a.Flush();

a.Close();

}

}

 

b.xml

<?xml version="1.0"?>

<!DOCTYPE vijay>

<vijay hi:mukhi="1-861003-78" xmlns:hi="xxx:yyy" />

 

Good programming style necessitates every 'open' to have a corresponding 'close'. Thus, the Start functions for an Element, Attribute and Document have corresponding End functions too. However, if we do not call the End functions, the open items are closed by default and no major calamity befalls them. We are using them in the above program out of an abundance of caution.

 

The WriteEndDocument function puts the XmlTextWriter back in the Start state.

 

Reading an XML file

 

b.xml

<?xml version="1.0" standalone="yes"?>

<!DOCTYPE vijay SYSTEM "a.dtd" [<!ENTITY baby "No">]>

<vijay aa="no">

<!--comment 2--><?sonal  mukhi=no?>

Hi&baby;

<![CDATA[,mukhi>]]><aa>bb</aa>

</vijay>

 

> copy con a.dtd

Enter

^Z

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextReader r;

r = new XmlTextReader("b.xml");

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3} ", r.NodeType, r.Depth, r.LineNumber, r.LinePosition );

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

}

}

}

 

Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Whitespace D=0 L=1 P=39 name= value= AC=0

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Whitespace D=0 L=2 P=54 name= value= AC=0

Element D=0 L=3 P=2 name=vijay value= AC=1

Whitespace D=1 L=3 P=16 name= value= AC=0

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

Whitespace D=1 L=5 P=9 name= value= AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

Whitespace D=1 L=6 P=31 name= value= AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0

 

In this program, we read an XML file and display all the nodes contained therein. To avoid any errors from being displayed, you should create an empty file by the name of a.dtd.

 

We have a class called XmlTextReader that accepts a filename as a parameter. We pass the filename b.xml to it. This file contains most of the entities present in an XML file. The Read function in this class picks up a single node or XML entity at a time. It returns true if a node has been read, or else, it returns false. Thus, when there are no more nodes to be read from the file, the while loop ends. The Read function makes the next node the active node, and the properties of this node are displayed in the loop.

 

The NodeType property displays the name of the node type. As an XML file normally starts with a declaration, the NodeType property displays the node type as XmlDeclaration, using the ToString function.

 

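The Depth property reports how deeply the current node is nested. At the Declaration statement and at the root tag, the depth is 0. The content of a tag is one level deeper than the tag itself, and hence, the nodes within 'vijay' have a depth of 1. At the EndElement or at the end of the tag, the value drops back to that of the matching start tag. Thus, the Depth property reveals the number of open tags that enclose the current node and it can be used for indentation.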
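
The use of Depth for indentation can be sketched as follows. The program reads the XML from a string through a StringReader, and indents each node by two spaces per level. The class name DepthDemo and the function Outline are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public class DepthDemo
{
    // Returns one line per node, indented by two spaces per level of depth.
    public static string Outline(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.WhitespaceHandling = WhitespaceHandling.None;
        StringBuilder sb = new StringBuilder();
        while (r.Read())
        {
            sb.Append(new string(' ', 2 * r.Depth));
            sb.Append(r.NodeType);
            // Nodes such as Text have no name.
            if (r.Name.Length > 0)
                sb.Append(' ').Append(r.Name);
            sb.Append('\n');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Outline("<vijay><aa>bb</aa></vijay>"));
    }
}
```

For the string passed in Main, the tag 'vijay' and its end tag are displayed at the left margin, the tag aa is indented by two spaces, and the Text node within aa is indented by four.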

 

The LineNumber property indicates the line on which the statement is positioned, while the LinePosition property displays the position on the line at which the statement begins. The Name property in the class reveals the name of the tag, xml. The output displayed by this property depends upon the active node type. On close observation, you shall notice that the word xml is not preceded by the symbol <? in the output.

 

The Value property belongs to the same node as the Name property, in this case, to the XmlDeclaration. It displays the entire text of the attributes of the node. As there exist two attributes, version and standalone, the property AttributeCount displays a value of 2.

 

If the Enter key is pressed after the declaration, it is interpreted as a Whitespace node. Whitespace characters are separators, which could consist of an enter, a space, a tab and so on. The LinePosition property specifies the character position as 39.

 

The XmlDeclaration has to be the first node in an XML file, and it cannot have any children. The DOCTYPE declaration, which is known as a DocumentType Node, displays the name as vijay, which is the root node. The value is displayed as <!ENTITY baby "No">, which includes everything except the SYSTEM and a.dtd. Thus, in the case of a DocumentType Node, value is the internal DTD.

 

We shall encounter the Whitespace node very frequently. Hence, we shall not discuss it hereinafter. The attribute counted by AttributeCount for the DocumentType node will be displayed in the next program. A DocumentType node can have Notation and Entity nodes as child nodes.

 

The next node in sequence is our very first element or tag 'vijay', which is the same value that was displayed earlier, with the Name property for the DocumentType node. The Value property for this element is empty, since tags are devoid of values. Instead, they have attributes.

 

The AttributeCount property displays a value of one. At the following Whitespace node, the Depth property has a value of one, since we are now within the tag 'vijay'. A Depth of zero is a simple way to ascertain that we are at the level of the root node. We now stumble upon a comment, which has no name. The value displayed is the text of the comment. And yet again, the <!-- characters are not displayed along with the value.

 

Thereafter, a processing instruction (PI) is encountered. No whitespace is displayed between the comment and the PI, since we have not pressed the Enter key. 'sonal' becomes the name, i.e. the target program to which the instruction is addressed. The rest turns into the Value property, and the node has no attributes. A Text node is displayed next, because the text 'Hi' is present in the XML file. This node too is not assigned any name and the value is depicted as 'Hi'.

 

What follows the text is an Entity Reference. It is assigned the name 'baby', devoid of the ampersand sign and the semicolon. Its value is empty and it does not have any attributes. The CDATA section does not have a name. Its value is the content of the CDATA section, after stripping away the <![CDATA[ and ]]> delimiters.

 

The element aa, which follows, is at a depth of 1. For the Text node within aa, the value of the Depth property is incremented by 1. This node does not have any name and it displays the value as 'bb'. In the following program, we explore the various attributes.

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r;

r = new XmlTextReader("b.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,r.LinePosition);

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

if (r.HasAttributes)

{

for ( int i =0; i < r.AttributeCount; i++)

{

r.MoveToAttribute(i);

System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);

}

}

}

}

}

 

Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Att version=1.0

Att standalone=yes

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Att SYSTEM=a.dtd

Element D=0 L=3 P=2 name=vijay value= AC=1

Att aa=no

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=

Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0

 

A property called WhitespaceHandling is initialized to None, as a result of which, the Whitespace nodes are not visible in the output.

 

The XmlTextReader has a member HasAttributes, which returns a True value if the node has attributes and False otherwise. Alternatively, we could also have used the property AttributeCount to obtain the number of attributes that the node contains.

 

 

If the node has attributes, a 'for statement' is used to display all of them. In the loop, we first use the function MoveToAttribute to initially activate the attribute. This is achieved by passing the number as a parameter to the function. Bear in mind that the index starts from Zero and not One.

 

Thereafter, the Name property is used to display the name of the attribute. If the attribute is not activated, the Name property displays the name of the node. This explains the significance of the MoveToAttribute function.

 

As you would recall, the XmlTextReader class has an indexer for the attributes, and like all indexers, it is zero based, i.e. r[0] accesses the value of the first attribute. This is how we display the details of all attributes of the node.

 

For the node DOCTYPE, the SYSTEM becomes the name of the attribute and the value becomes the name of the DTD file. For an element, the attributes are specified in name-value pairs.
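
The attributes can also be visited without an index, by using the functions MoveToFirstAttribute and MoveToNextAttribute of the XmlTextReader class, and the reader can be taken back to the tag with MoveToElement. The following is a minimal sketch that reads from a string. The class name AttributeWalk and the function List are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class AttributeWalk
{
    // Returns all name=value pairs of the first element, followed by
    // the type and the name of the node the reader finally rests on.
    public static string List(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        string s = "";
        if (r.MoveToFirstAttribute())
        {
            do
            {
                s += r.Name + "=" + r.Value + ";";
            }
            while (r.MoveToNextAttribute());
            // Return from the last attribute to the element that owns it.
            r.MoveToElement();
        }
        return s + r.NodeType + " " + r.Name;
    }

    public static void Main()
    {
        Console.WriteLine(List("<vijay mukhi=\"no\" sonal=\"yes\" aaa=\"bad\" />"));
    }
}
```

The output is mukhi=no;sonal=yes;aaa=bad;Element vijay, which confirms that after MoveToElement, the Name property once again displays the name of the tag.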

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static XmlTextReader r;

public static void Main()

{

r = new XmlTextReader("b.xml");

int declaration=0, pi=0, doc=0, comment=0, element=0, attribute=0, text=0, whitespace=0,cdata=0,endelement=0,

entityr=0,entitye=0,entity=0,swhitespace=0,notation=0;

while (r.Read())

{

Console.Write("{0} D={1} L={2} P={3}",r.NodeType,r.Depth,r.LineNumber,r.LinePosition);

Console.Write(" name={0} value={1} AC={2}",r.Name,r.Value,r.AttributeCount);

Console.WriteLine();

if (r.HasAttributes)

{

for ( int i =0; i < r.AttributeCount; i++)

{

r.MoveToAttribute(i);

System.Console.WriteLine("Att {0}={1}",r.Name,r[i]);

}

}

switch (r.NodeType)

{

case XmlNodeType.XmlDeclaration:

declaration++;

break;

case XmlNodeType.ProcessingInstruction:

pi++;

break;

case XmlNodeType.DocumentType:

doc++;

break;

case XmlNodeType.Comment:

comment++;

break;

case XmlNodeType.Element:

element++;

if (r.HasAttributes)

attribute += r.AttributeCount;

break;

case XmlNodeType.Text:

text++;

break;

case XmlNodeType.CDATA:

cdata++;

break;

case XmlNodeType.EndElement:

endelement++;

break;

case XmlNodeType.EntityReference:

entityr++;

break;

case XmlNodeType.EndEntity:

entitye++;

break;

case XmlNodeType.Notation:

notation++;

break;

case XmlNodeType.Entity:

entity++;

break;

case XmlNodeType.SignificantWhitespace:

swhitespace++;

break;

case XmlNodeType.Whitespace:

whitespace++;

break;

}

}

Console.WriteLine ();

Console.WriteLine("XmlDeclaration: {0}",declaration);

Console.WriteLine("ProcessingInstruction: {0}",pi);

Console.WriteLine("DocumentType: {0}",doc);

Console.WriteLine("Comment: {0}",comment);

Console.WriteLine("Element: {0}",element);

Console.WriteLine("Attribute: {0}",attribute);

Console.WriteLine("Text: {0}",text);

Console.WriteLine("Cdata: {0}",cdata);

Console.WriteLine("EndElement: {0}",endelement);

Console.WriteLine("Entity Reference: {0}",entityr);

Console.WriteLine("End Entity: {0}",entitye);

Console.WriteLine("Entity: {0}",entity);

Console.WriteLine("Whitespace: {0}",whitespace);

Console.WriteLine("Notation: {0}",notation);

Console.WriteLine("Significant Whitespace: {0}",swhitespace);

}

}

 

Output

XmlDeclaration D=0 L=1 P=3 name=xml value=version="1.0" standalone="yes" AC=2

Att version=1.0

Att standalone=yes

Whitespace D=0 L=1 P=39 name= value=

 AC=0

DocumentType D=0 L=2 P=11 name=vijay value=<!ENTITY baby "No"> AC=1

Att SYSTEM=a.dtd

Whitespace D=0 L=2 P=54 name= value=

 AC=0

Element D=0 L=3 P=2 name=vijay value= AC=1

Att aa=no

Whitespace D=1 L=3 P=16 name= value=

 AC=0

Comment D=1 L=4 P=5 name= value=comment 2 AC=0

ProcessingInstruction D=1 L=4 P=19 name=sonal value=mukhi=no AC=0

Text D=1 L=4 P=36 name= value=

Hi AC=0

EntityReference D=1 L=5 P=4 name=baby value= AC=0

Whitespace D=1 L=5 P=9 name= value=

 AC=0

CDATA D=1 L=6 P=10 name= value=,mukhi> AC=0

Element D=1 L=6 P=21 name=aa value= AC=0

Text D=2 L=6 P=24 name= value=bb AC=0

EndElement D=1 L=6 P=28 name=aa value= AC=0

Whitespace D=1 L=6 P=31 name= value=

 AC=0

EndElement D=0 L=7 P=3 name=vijay value= AC=0

Whitespace D=0 L=7 P=9 name= value=

 AC=0

 

XmlDeclaration: 0

ProcessingInstruction: 1

DocumentType: 0

Comment: 1

Element: 1

Attribute: 0

Text: 2

Cdata: 1

EndElement: 2

Entity Reference: 1

End Entity: 0

Entity: 0

Whitespace: 6

Notation: 0

Significant Whitespace: 0

 

The above program is a continuation from where we left off in the previous program. The initial portion of the code is identical. A large switch statement is introduced in the program to check the NodeType.

 

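For each Node Type, there is a corresponding variable, whose value is incremented by 1 whenever the Node Type matches. Then, the values contained in these variables are displayed. Observe that XmlDeclaration, DocumentType and Attribute show a count of 0, and Element shows a count of only 1. This is because the 'for statement' leaves the reader positioned on the last attribute of the node. So, by the time the switch statement executes, the NodeType property returns Attribute, for which there is no case. Only the element aa, which has no attributes, gets counted. Calling the MoveToElement function after the loop would take the reader back to the node that owns the attributes. Also, as per the documentation, the XmlTextReader does not return the following node types - Document, DocumentFragment, Entity, EndEntity, or Notation.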
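
A trimmed-down sketch of the counting logic is given below, with one change of our own: the MoveToElement function is called once the attributes have been displayed, so that the NodeType property reports the node that owns them. It reads from a string instead of b.xml, and counts only three node types to keep the listing short. The class name CountDemo and the function Count are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class CountDemo
{
    // Counts elements, attributes and comments in the markup.
    public static string Count(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        int element = 0, attribute = 0, comment = 0;
        while (r.Read())
        {
            if (r.HasAttributes)
            {
                for (int i = 0; i < r.AttributeCount; i++)
                {
                    r.MoveToAttribute(i);
                    Console.WriteLine("Att {0}={1}", r.Name, r.Value);
                }
                // Go back to the node that owns the attributes.
                r.MoveToElement();
            }
            switch (r.NodeType)
            {
                case XmlNodeType.Element:
                    element++;
                    attribute += r.AttributeCount;
                    break;
                case XmlNodeType.Comment:
                    comment++;
                    break;
            }
        }
        return string.Format("Element: {0} Attribute: {1} Comment: {2}",
            element, attribute, comment);
    }

    public static void Main()
    {
        Console.WriteLine(Count("<vijay aa=\"no\"><!--c--><aa>bb</aa></vijay>"));
    }
}
```

With this change, both the tags 'vijay' and aa are counted, along with the attribute aa of 'vijay'. Thus, the last line displayed is Element: 2 Attribute: 1 Comment: 1.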

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader("b.xml");

r.WhitespaceHandling = WhitespaceHandling.None;

while (r.Read())

{

if (r.HasValue)

Console.WriteLine("{0}  {1}={2}", r.NodeType, r.Name, r.Value);

else

Console.WriteLine("{0} {1}", r.NodeType, r.Name);

}          

}

}

 

Output

XmlDeclaration  xml=version="1.0" standalone="yes"

DocumentType  vijay=<!ENTITY baby "No">

Element vijay

Comment  =comment 2

ProcessingInstruction  sonal=mukhi=no

Text  =

Hi

EntityReference baby

CDATA  =,mukhi>

Element aa

Text  =bb

EndElement aa

EndElement vijay

 

The HasValue property simply identifies whether a node can contain a value or not. There are nine node types that can possess values. These are Attribute, CDATA, Comment, DocumentType, ProcessingInstruction, SignificantWhitespace, Whitespace, Text and XmlDeclaration. All the above nodes can have a value, but they need not necessarily have a name.
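
The property can be tried out on a smaller document. The sketch below reads a string that contains a tag, a comment and some text, and records the HasValue property for each node. The class name HasValueDemo and the function Flags are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public class HasValueDemo
{
    // Returns NodeType:HasValue for every node, separated by commas.
    public static string Flags(string xml)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        string s = "";
        while (r.Read())
        {
            if (s.Length > 0)
                s += ",";
            s += r.NodeType + ":" + r.HasValue;
        }
        return s;
    }

    public static void Main()
    {
        Console.WriteLine(Flags("<vijay><!--c-->hi</vijay>"));
    }
}
```

The output is Element:False,Comment:True,Text:True,EndElement:False. The Comment and Text nodes figure in the list of nine given above, whereas Element and EndElement do not.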

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main()

{

XmlTextReader r = new XmlTextReader("b.xml");

r.MoveToContent();

string s = r["mukhi"];

Console.WriteLine(s);

s = r.GetAttribute("sonal");

Console.WriteLine(s);

s = r[2];

Console.WriteLine(s);

}

}

 

b.xml

<vijay mukhi="no" sonal="yes" aaa="bad" />

 

Output

no

yes

bad

 

The MoveToContent function skips over nodes such as whitespace and comments, and moves to the first element in the XML file.

 

 

In this program, we display the attributes using different methods. In the first approach, the indexer is passed a string, which is the name of the attribute 'mukhi'. It receives 'no' as the return value.

 

In the second approach, the GetAttribute function is given the string 'sonal' as a parameter, and it returns the value of the attribute as 'yes'.

 

In the third approach, the indexer is passed the integer value 2 as a parameter, to access the value of the third attribute, which is 'bad'. Thus, there are multiple means to achieving the same objective.

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz

{

public static void Main() {

XmlTextReader r = new XmlTextReader("b.xml");

r.MoveToContent();

string s ;

s = r.GetAttribute("aa:bb");

Console.WriteLine(s);

s = r.GetAttribute("bb");

Console.WriteLine(s);

s = r.GetAttribute("bb","sonal:mukhi");

Console.WriteLine(s);

s = r.GetAttribute("bb","sonal:mukhi");

Console.WriteLine(s);

s = r.GetAttribute("bb","aa");

Console.WriteLine(s);

s = r.GetAttribute("xmlns:aa");

Console.WriteLine(s);

}

}

 

b.xml

<vijay xmlns:aa="sonal:mukhi" aa:bb="no" />

 

Output

no

 

no

no

 

sonal:mukhi

 

The MoveToContent function is used in this program, instead of the Read function. In the file b.xml, we have an attribute bb with the prefix aa. It is initialized to a value of 'no'. The prefix aa is bound to the URI sonal:mukhi, because of the xmlns declaration. Thus, the full name of the attribute becomes aa:bb i.e. prefix, followed by the colon, followed by the actual name. As a result, specifying aa:bb results in the display of 'no', but only specifying bb as a parameter to GetAttribute results in a null value.

 

The full name of an attribute includes the prefix too. So, we can use the second form of the GetAttribute function, an overload with two parameters, where the second parameter is the namespace URI and not the prefix. Hence, it is acceptable to call the function with the URI sonal:mukhi, but if we use the prefix aa, a null value is returned and no output is produced.

 

The last GetAttribute utilizes the full name xmlns:aa to retrieve the URI to which the prefix aa is bound. Thus, an attribute may be fetched either by its full name, prefix:name, or by its actual name together with the namespace URI.

 

a.cs

using System;

using System.IO;

using System.Xml;

public class zzz {

public static void Main() {

XmlTextReader r = new XmlTextReader("b.xml");

r.WhitespaceHandling=WhitespaceHandling.None;

r.MoveToContent();

r.MoveToAttribute("cc");

Console.WriteLine(r.Name + " " + r.Value);

Console.WriteLine(r.ReadAttributeValue());

Console.WriteLine(r.Name + " " + r.Value);

}

}

 

b.xml

<vijay aa="hi" bb="bye" cc="no" />

 

Output

cc no

True

 no

 

In this example, we directly focus on the attribute that we are interested in, i.e. cc. The Name and Value properties in XmlTextReader display 'cc' and 'no' respectively. The ReadAttributeValue function returns True, since the value of the attribute has a node that can be read. It returns False if the reader is not positioned on an attribute, or if the whole of the value has already been read. This function is normally used to read the text or entity reference nodes that constitute the value of the attribute.

 

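After the function ReadAttributeValue is called, the reader is positioned on the Text node that holds the value. Hence, the Name property of the XmlTextReader becomes empty, while the Value property continues to display 'no'.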
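
Since ReadAttributeValue returns False once the value has been consumed, it lends itself to a while loop. The following sketch collects the type and the value of every node that makes up the attribute. It reads from a string, and the class name AttributeValueDemo and the function Parts are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public class AttributeValueDemo
{
    // Returns NodeType:Value for each node within the named attribute.
    public static string Parts(string xml, string name)
    {
        XmlTextReader r = new XmlTextReader(new StringReader(xml));
        r.MoveToContent();
        string s = "";
        if (r.MoveToAttribute(name))
        {
            // Each call moves to the next node of the attribute value.
            while (r.ReadAttributeValue())
                s += r.NodeType + ":" + r.Value + ";";
        }
        return s;
    }

    public static void Main()
    {
        Console.WriteLine(Parts("<vijay aa=\"hi\" bb=\"bye\" cc=\"no\" />", "cc"));
    }
}
```

For the attribute cc, the loop executes just once and the output is Text:no; since the value consists of a single Text node.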