SPL character encoding

SPL uses a platform-independent character encoding for serializing and deserializing data.

This encoding is used by the FileSource, TCPSource, UDPSource, FileSink, TCPSink, and UDPSink operators. It is also used by the character serialization and deserialization functions that are provided for the SPL C++ types.

If an encoding parameter is present on these operators, the conversion to the specified character set is done after serialization, and the conversion from that character set is done before deserialization.
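The ordering can be illustrated with a minimal Python sketch. It is not the operators' implementation. It assumes that the encoding parameter names a character set (for example, UTF-16), and the function names serialize_int_list, deserialize_int_list, write_record, and read_record are hypothetical helpers introduced only for this illustration.

```python
def serialize_int_list(values):
    # Step 1 on output: produce the character form, for example "[0, 100, -40]".
    return "[" + ", ".join(str(v) for v in values) + "]"


def deserialize_int_list(text):
    # Step 2 on input: parse the character form back into a list of integers.
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("expected a bracketed list: " + repr(text))
    inner = inner[1:-1].strip()
    return [int(item) for item in inner.split(",")] if inner else []


def write_record(values, encoding):
    # Serialize first, then convert the characters to the requested character set.
    return serialize_int_list(values).encode(encoding)


def read_record(raw, encoding):
    # Convert from the character set first, then deserialize.
    return deserialize_int_list(raw.decode(encoding))


if __name__ == "__main__":
    raw = write_record([0, 100, -40], "utf-16")
    print(read_record(raw, "utf-16"))
```

The character form of the data is the same in both directions. Only the bytes on the wire or in the file depend on the character set.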

The following table describes how each SPL type is encoded in character format.
Table 1. Description of SPL types in character format

| SPL type | Character encoded value | Example result of encoded value |
|---|---|---|
| integer types | decimal value without any suffixes | 123 |
| boolean | true or false | false |
| float and decimal types | decimal value at the maximum precision for the type; scientific notation is supported | -10.34 or 1.24E+50 |
| complex | (real value, imaginary value) | (1.0, 2.0) |
| rstring, bounded rstring, ustring | SPL string literal with no suffix; rstring values are assumed to be UTF-8 | "A long string with a\n newline in it." |
| timestamp | (seconds, nanoseconds, machine ID) | (500, 1000, 0) |
| blob | pairs of hexadecimal characters, one for each byte of the blob | 5A30BF94 |
| list, bounded list | [comma-separated value] with values encoded by using the character encoding for the type | [0, 100, -40] |
| set, bounded set | {comma-separated value} with values encoded by using the character encoding for the type | {"a", "b", "c"} |
| map, bounded map | {comma-separated list of key-value pairs} with values encoded by using the character encoding for the type | {5:"hi", 6:"ho"} |
| tuple types | {comma-separated list of attribute = value} with values encoded by using the character encoding for the type | {x="abc", y=2} |
| optional types | as for SPL type T for optional type optional<T> when there is a data value, otherwise the value is set to JSON null | null |
| enumeration | Textual representation of the enumeration | For an enum {a, b, c}, the value might be: a |
| XML | SPL XML literal with escaped values and the suffix x | "<a>hi</a>"x |
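As an illustration of the table, the following Python sketch encodes Python stand-ins for several SPL types into the character format. It is a sketch under stated assumptions, not the SPL runtime's serializer. The classes Timestamp, SplTuple, and Xml and the function encode are hypothetical names introduced here. The set of escaped string characters is an assumption, sets are sorted only to make the output reproducible, and float formatting relies on Python's repr, which does not necessarily match SPL's precision or exponent style.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass
class Timestamp:  # hypothetical stand-in for an SPL timestamp
    seconds: int
    nanoseconds: int
    machine_id: int = 0


@dataclass
class SplTuple:  # hypothetical stand-in for an SPL tuple
    attributes: dict  # attribute name -> value, in declaration order


@dataclass
class Xml:  # hypothetical stand-in for an SPL XML value
    text: str


# Assumed escape set for string literals; the table shows only "\n".
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t", "\r": "\\r"}


def _quote(text):
    # String literal in double quotes, with no suffix.
    return '"' + "".join(_ESCAPES.get(ch, ch) for ch in text) + '"'


def encode(value):
    """Return the character-encoded form of a value, following Table 1."""
    if value is None:  # optional type without a data value
        return "null"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, Enum):
        return value.name
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        return repr(value)  # assumption: Python's shortest round-trip form
    if isinstance(value, complex):
        return "(" + repr(value.real) + ", " + repr(value.imag) + ")"
    if isinstance(value, str):
        return _quote(value)
    if isinstance(value, Xml):
        return _quote(value.text) + "x"
    if isinstance(value, Timestamp):
        return "({}, {}, {})".format(value.seconds, value.nanoseconds, value.machine_id)
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).hex().upper()  # two hexadecimal characters per byte
    if isinstance(value, list):
        return "[" + ", ".join(encode(v) for v in value) + "]"
    if isinstance(value, (set, frozenset)):
        return "{" + ", ".join(sorted(encode(v) for v in value)) + "}"
    if isinstance(value, dict):
        return "{" + ", ".join(encode(k) + ":" + encode(v) for k, v in value.items()) + "}"
    if isinstance(value, SplTuple):
        return "{" + ", ".join(n + "=" + encode(v) for n, v in value.attributes.items()) + "}"
    raise TypeError("no character encoding defined for " + type(value).__name__)


if __name__ == "__main__":
    print(encode(SplTuple({"x": "abc", "y": 2})))
    print(encode({5: "hi", 6: "ho"}))
```

Nested values are handled by the recursive calls, so a list of tuples or a map of lists is encoded by applying the same rules to each element.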