Operator JSONToAvro
This operator converts JSON strings into binary Avro messages.
If an input or output message attribute is not found or has an incompatible type, the operator fails. If an invalid JSON string is found in the input, the operator fails unless the parameter ignoreParsingError is set to true, in which case the offending tuple is skipped.
If the parameter embedAvroSchema is false, the operator passes window punctuation markers transparently to the output port. If the parameter embedAvroSchema is true, the operator generates window punctuation markers.
This operator must not be used inside a consistent region.
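The failure behavior described above can be modeled in a few lines. The following Python sketch is illustrative only and is not the operator's Java implementation; the function name `parse_json_records` and its `ignore_parsing_error` argument are hypothetical stand-ins for the ignoreParsingError parameter.

```python
import json


def parse_json_records(records, ignore_parsing_error=False):
    """Yield parsed JSON values from an iterable of JSON strings.

    Hypothetical model of the documented behavior:
    - ignore_parsing_error=False: the first invalid string raises (operator fails).
    - ignore_parsing_error=True: invalid strings are skipped.
    """
    for raw in records:
        try:
            yield json.loads(raw)
        except json.JSONDecodeError:
            if not ignore_parsing_error:
                raise
            # Invalid record is dropped; processing continues with the next one.
```

With `ignore_parsing_error=True`, an input such as `['{"a": 1}', 'not json', '{"b": 2}']` yields only the two valid records; with the default, the second record raises an exception.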
Summary
- Ports
- This operator has 1 input port and 1 output port.
- Windowing
- This operator does not accept any windowing configurations.
- Parameters
- This operator supports 9 parameters.
Required: avroMessageSchemaFile
Optional: bytesPerMessage, embedAvroSchema, ignoreParsingError, inputJsonMessage, outputAvroMessage, submitOnPunct, timePerMessage, tuplesPerMessage
- Metrics
- This operator does not report any metrics.
Properties
- Implementation
- Java
- Ports (0)
-
Port that ingests JSON records.
- Properties
-
- Optional: false
- ControlPort: false
- WindowingMode: NonWindowed
- WindowPunctuationInputMode: Oblivious
- Assignments
- Java operators do not support output assignments.
- Ports (0)
-
Port that produces Avro records.
- Properties
-
- Optional: false
- WindowPunctuationOutputMode: Generating
Required: avroMessageSchemaFile
Optional: bytesPerMessage, embedAvroSchema, ignoreParsingError, inputJsonMessage, outputAvroMessage, submitOnPunct, timePerMessage, tuplesPerMessage
- avroMessageSchemaFile
-
File that contains the Avro schema used to serialize the Avro message(s).
- Properties
-
- Type: rstring
- Cardinality: 1
- Optional: false
- bytesPerMessage
-
This parameter controls the minimum size in bytes that the Avro message block must reach before it is submitted to the output port. Default value is 0l (disabled). Only valid if the Avro schema is embedded in the output.
- Properties
-
- Type: int64
- Cardinality: 1
- Optional: true
- embedAvroSchema
-
Embed the schema in the generated Avro message. When generating Avro messages that must be persisted to a file system, the schema is expected to be included in the file. If this parameter is set to true, incoming JSON tuples are batched and a large binary object that contains the Avro schema and one or more messages is generated. In that case, you must also specify one of the parameters (submitOnPunct, bytesPerMessage, tuplesPerMessage, timePerMessage) that control when the Avro message block is submitted to the output port. After the Avro message block is submitted to the output port, a punctuation is generated so that the receiving operator can potentially create a new file.
- Properties
-
- Type: boolean
- Cardinality: 1
- Optional: true
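The dependency between embedAvroSchema and the submit-control parameters can be written down as a small consistency check. This is a hedged Python sketch, not the operator's own validation logic: the function `check_flush_config` and its argument names are hypothetical, and it assumes that "one of the parameters" means at least one trigger must be enabled.

```python
def check_flush_config(embed_avro_schema, submit_on_punct=False,
                       bytes_per_message=0, tuples_per_message=0,
                       time_per_message=0):
    """Return a list of configuration problems (empty when consistent).

    Assumption: a numeric trigger counts as enabled when it is greater
    than 0, matching the documented default of 0 (disabled).
    """
    has_trigger = (submit_on_punct
                   or bytes_per_message > 0
                   or tuples_per_message > 0
                   or time_per_message > 0)
    problems = []
    if embed_avro_schema and not has_trigger:
        problems.append("embedAvroSchema is true but no submit trigger is set")
    if not embed_avro_schema and has_trigger:
        problems.append("submit triggers are only valid when embedAvroSchema is true")
    return problems
```

For example, `check_flush_config(True, tuples_per_message=100)` returns an empty list, while `check_flush_config(True)` reports the missing trigger.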
- ignoreParsingError
-
Ignore any JSON or Avro parsing errors. When set to true, errors that occur when parsing the incoming JSON tuple or constructing the Avro tuple(s) will be ignored and the incoming tuple(s) will be skipped. Default is false.
- Properties
-
- Type: boolean
- Cardinality: 1
- Optional: true
- inputJsonMessage
-
The input stream attribute that contains the input JSON message string. This attribute must be of type rstring or ustring. Default is the sole input attribute when the schema has one attribute, otherwise jsonMessage.
- Properties
-
- Type: rstring
- Cardinality: 1
- Optional: true
- outputAvroMessage
-
The output stream attribute that contains the output Avro message(s). This attribute must be of type blob. Default is the sole output attribute when the schema has one attribute, otherwise avroMessage.
- Properties
-
- Type: rstring
- Cardinality: 1
- Optional: true
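The default-attribute rule shared by inputJsonMessage and outputAvroMessage can be sketched as follows. This is an illustrative Python model only; `resolve_attribute` and its arguments are hypothetical names, and the type checks on the resolved attribute (rstring/ustring or blob) are not modeled.

```python
def resolve_attribute(schema_attributes, configured=None, fallback="jsonMessage"):
    """Pick the attribute name the operator would read from or write to.

    schema_attributes: attribute names of the stream schema, in order.
    configured: explicit parameter value, or None when the parameter is unset.
    fallback: documented default name ("jsonMessage" or "avroMessage").
    """
    if configured is not None:
        return configured
    if len(schema_attributes) == 1:
        # A single-attribute schema needs no explicit parameter.
        return schema_attributes[0]
    return fallback
```

For an output schema with the attributes `["key", "avroMessage"]` and no explicit parameter, `resolve_attribute(["key", "avroMessage"], fallback="avroMessage")` selects `avroMessage`.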
- submitOnPunct
-
When set to true, the operator submits the block of Avro messages that was built and generates a punctuation so that the receiving operator can potentially create a new file. Default is false. Only valid if the Avro schema is embedded in the output.
- Properties
-
- Type: boolean
- Cardinality: 1
- Optional: true
- timePerMessage
-
This parameter controls the maximum time in seconds before the Avro message block is submitted to the output port. Default value is 0l (disabled). Only valid if the Avro schema is embedded in the output.
- Properties
-
- Type: int64
- Cardinality: 1
- Optional: true
- tuplesPerMessage
-
This parameter controls the minimum number of tuples that the Avro message block must contain before it is submitted to the output port. Default value is 0l (disabled). Only valid if the Avro schema is embedded in the output.
- Properties
-
- Type: int64
- Cardinality: 1
- Optional: true
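To make the interplay of bytesPerMessage, tuplesPerMessage and timePerMessage concrete, here is a minimal Python sketch of a batching buffer. It is a model under stated assumptions, not the operator's implementation: the class `BlockBuffer` is hypothetical, it assumes that any enabled threshold triggers a submit on its own, and it checks the time threshold only when a new record arrives (the real operator may submit on a timer instead).

```python
class BlockBuffer:
    """Collects serialized records and decides when to submit a block."""

    def __init__(self, bytes_per_message=0, tuples_per_message=0, time_per_message=0):
        # A value of 0 disables the corresponding threshold.
        self.bytes_per_message = bytes_per_message
        self.tuples_per_message = tuples_per_message
        self.time_per_message = time_per_message
        self._records = []
        self._size = 0
        self._started = None

    def add(self, record, now):
        """Add one record (bytes); return the block if a threshold is hit, else None."""
        if not self._records:
            self._started = now  # time of the first record in the current block
        self._records.append(record)
        self._size += len(record)
        return self.flush() if self._should_flush(now) else None

    def _should_flush(self, now):
        if self.bytes_per_message > 0 and self._size >= self.bytes_per_message:
            return True
        if self.tuples_per_message > 0 and len(self._records) >= self.tuples_per_message:
            return True
        if self.time_per_message > 0 and now - self._started >= self.time_per_message:
            return True
        return False

    def flush(self):
        """Return the buffered records and reset the buffer."""
        block, self._records, self._size, self._started = self._records, [], 0, None
        return block
```

A caller would pass each serialized record together with a timestamp in seconds, for example `buf.add(record, time.monotonic())`, and write out a block (schema plus records) whenever `add` returns a list.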
- Operator class library