Meaning of - <?xml version="1.0" encoding="utf-8"?>

People also ask

What does UTF-8 mean in XML?

Unicode Transformation Format, 8-bit encoding form is designed for ease of use with existing ASCII-based systems and enables use of all the characters in the Unicode standard.

What does UTF-8 encoding means?

UTF-8 (UCS Transformation Format 8) is the World Wide Web's most common character encoding. Each character is represented by one to four bytes. UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character.

What does XML encoding mean?

Advertisements. Encoding is the process of converting unicode characters into their equivalent binary representation. When the XML processor reads an XML document, it encodes the document depending on the type of encoding.

What does the 8 stands for on UTF-8?

UTF-8 is a variable-width character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit.

To understand the "encoding" attribute, you have to understand the difference between bytes and characters.

Think of bytes as numbers between 0 and 255, whereas characters are things like "a", "1" and "Ä". The set of all characters that are available is called a character set.

Each character has a sequence of one or more bytes that are used to represent it; however, the exact number and value of the bytes depends on the encoding used and there are many different encodings.

Most encodings are based on an old character set and encoding called ASCII which is a single byte per character (actually, only 7 bits) and contains 128 characters including a lot of the common characters used in US English.

For example, here are 6 characters in the ASCII character set that are represented by the values 60 to 65.

Extract of ASCII Table 60-65
╔══════╦══════════════╗
║ Byte ║  Character   ║
╠══════╬══════════════║
║  60  ║      <       ║
║  61  ║      =       ║
║  62  ║      >       ║
║  63  ║      ?       ║
║  64  ║      @       ║
║  65  ║      A       ║
╚══════╩══════════════╝

In the full ASCII set, the lowest value used is zero and the highest is 127 (both of these are hidden control characters).

However, once you start needing more characters than the basic ASCII provides (for example, letters with accents, currency symbols, graphic symbols, etc.), ASCII is not suitable and you need something more extensive. You need more characters (a different character set) and you need a different encoding as 128 characters is not enough to fit all the characters in. Some encodings offer one byte (256 characters) or up to six bytes.

Over time a lot of encodings have been created. In the Windows world, there is CP1252, or ISO-8859-1, whereas Linux users tend to favour UTF-8. Java uses UTF-16 natively [see comments].

One sequence of byte values for a character in one encoding might stand for a completely different character in another encoding, or might even be invalid.

For example, in ISO 8859-1, â is represented by one byte of value 226, whereas in UTF-8 it is two bytes: 195, 162. However, in ISO 8859-1, 195, 162 would be two characters, Ã, ¢.

Think of XML as not a sequence of characters but a sequence of bytes.

Imagine the system receiving the XML sees the bytes 195, 162. How does it know what characters these are?

In order for the system to interpret those bytes as actual characters (and so display them or convert them to another encoding), it needs to know the encoding used in the XML.

Since most common encodings are compatible with ASCII, as far as basic alphabetic characters and symbols go, in these cases, the declaration itself can get away with using only the ASCII characters to say what the encoding is. In other cases, the parser must try and figure out the encoding of the declaration. Since it knows the declaration begins with <?xml it is a lot easier to do this.

Finally, the version attribute specifies the XML version, of which there are two at the moment (see Wikipedia XML versions. There are slight differences between the versions, so an XML parser needs to know what it is dealing with. In most cases (for English speakers anyway), version 1.0 is sufficient.

An XML declaration is not required in all XML documents; however XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding was determined by a higher-level protocol. Here is an example of an XHTML document. In this example, the XML declaration is included.

<?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html 
 PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://example.org/">example.org</a>.</p>
 </body>
</html>

Please refer to the W3 standards for XML.

This is the XML optional preamble.

version="1.0" means that this is the XML standard this file conforms to
encoding="utf-8" means that the file is encoded using the UTF-8 Unicode encoding

The encoding declaration identifies which encoding is used to represent the characters in the document.

More on the XML Declaration here: http://msdn.microsoft.com/en-us/library/ms256048.aspx

Related questions
                            
                                Simple conversion between java.util.Date and XMLGregorianCalendar
                            
                                XML Document to String
                            
                                How to add text to request body in RestSharp
                            
                                PHP: How to handle <![CDATA[ with SimpleXMLElement?
                            
                                Android: textColor of disabled button in selector not showing?
                            
                                Pandas read_xml() method test strategies
                            
                                Parse XML using JavaScript [duplicate]
                            
                                Android Design Library - Floating Action Button Padding/Margin Issues
                            
                                Using a custom typeface in Android
                            
                                How do you embed binary data in XML?
                            
                                javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"Group")
                            
                                Vim indent xml file
                            
                                Using copy-of with document() to add SVGs to XHTML output
                            
                                Validating with an XML schema in Python
                            
                                how to read all files inside particular folder
                            
                                How to check if a file exists in a folder?
                            
                                Is XSLT worth it? [closed]
                            
                                writing some characters like '<' in an xml file
                            
                                How to activate "Share" button in android app?
                            
                                how to use XPath with XDocument?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Meaning of - <?xml version="1.0" encoding="utf-8"?>

Tags:

character-encoding

xml

xml-declaration

xml-encoding

People also ask

Recent Activity

Donate For Us