Character encoding of XML

If the input is contained in a ustring value, it is already in Unicode, and the internal encoding (the encoding named in the XML declaration) is ignored if it is present. If the input is contained in an rstring or blob value, it is passed to the XMLParse operator as raw bytes, and the following rules apply. If a byte order mark (BOM) is present, it must match the internal encoding. If an internal encoding is present, it must match the actual bytes. It is an error for a single input tuple to contain parts of different XML documents that use different encodings. If there is neither an internal encoding nor a BOM, UTF-8 is assumed. If the input is contained in an xml value, the internal encoding is already correct.
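
The rules for raw-byte input can be illustrated with a short Python sketch. This is not the XMLParse implementation; it only mirrors the checks described above, and the function name decode_xml_bytes and its helpers are hypothetical. It assumes that only UTF-8 and UTF-16 BOMs need to be recognized (UTF-32 is not covered) and that a document without a BOM uses an ASCII-compatible encoding, so that its XML declaration can be read directly.

```python
import codecs
import re

# BOMs recognized by this sketch (assumption: UTF-8 and UTF-16 only).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches the encoding pseudo-attribute of an XML declaration at the start of the text.
_DECL = re.compile(
    r'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def _family(name: str) -> str:
    """Reduce a codec name to a coarse family, e.g. 'utf-16-le' -> 'utf-16'."""
    canonical = codecs.lookup(name).name
    return "utf-16" if canonical.startswith("utf-16") else canonical


def decode_xml_bytes(raw: bytes) -> str:
    """Hypothetical helper: decode raw XML bytes using the rules in the text."""
    bom_codec = None
    body = raw
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            bom_codec, body = codec, raw[len(bom):]
            break

    # Read the start of the document to look for an internal encoding.
    head = body[:200].decode(bom_codec or "latin-1", errors="replace")
    match = _DECL.match(head)
    declared = match.group(1) if match else None

    # If both are present, the BOM must match the internal encoding.
    if bom_codec and declared and _family(bom_codec) != _family(declared):
        raise ValueError(
            f"BOM indicates {bom_codec} but the declaration says {declared}"
        )

    # With neither a BOM nor an internal encoding, UTF-8 is assumed.
    codec = bom_codec or declared or "utf-8"

    # Strict decoding raises UnicodeDecodeError when the bytes do not match.
    return body.decode(codec)
```

In this sketch a mismatch between the BOM and the declaration surfaces as a ValueError, and bytes that are not valid in the chosen encoding surface as a UnicodeDecodeError. How the operator itself reports these conditions is not specified here.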

On output, all values extracted from XML attributes or text are returned as rstring values that contain UTF-8 encoded strings.