Tuple copying

The SPL compiler uses port mutability settings to improve the performance of an application by safely eliminating unnecessary tuple copies when tuples are sent from one operator to another operator.

Tuple copy elimination occurs when the Teracloud® Streams instance is able to pass a memory reference instead of a copy to a downstream operator. This type of optimization applies only to operators that reside in the same processing element (PE), for example when operators are fused together.

In addition to port mutability settings, the SPL compiler uses the number of stream connections that are associated with the output port of an operator (called the port fan-out) to determine whether to provide a copy of the tuple to downstream operators. When the port fan-out is greater than one (that is, more than one operator consumes the operator output stream), the SPL compiler uses the port mutability settings of the downstream operators to determine whether to provide copies of the tuples.

Table 1 outlines when tuples are copied based on port mutability settings and the port fan-out. Depending on the port mutability configurations, the SPL compiler and run time can avoid copying a tuple when it is transmitted from the output port of an operator (Operator 1) to the input port of another operator (Operator 2). This information is provided to help you understand the performance implications of port mutability in fused scenarios; it does not have any impact on safety. The SPL compiler ensures that downstream operators do not change values that belong to upstream operators, unless those values are no longer used.

Table 1. Port mutability settings and tuple copying

This table explains the port mutability configurations when a tuple is transmitted from the output port of one operator to the input port of another operator.

Output port configuration of Operator 1 | Input port configuration of Operator 2 | Tuple reference conversion | Tuple copying
Mutating | Mutating | Non-constant to non-constant | Depends on the port fan-out (see the cases after the table)
Mutating | Non-mutating | Non-constant to constant | No tuple copying occurs
Non-mutating | Mutating | Constant to non-constant | Tuple copying occurs
Non-mutating | Non-mutating | Constant to constant | No tuple copying occurs

Tuple copying when both the output port of Operator 1 and the input port of Operator 2 are mutating:
  1. If the port fan-out is 1, no tuple copying occurs.
  2. If the port fan-out is more than 1, and if Operator 2 is not the last to consume the tuple results, tuple copying occurs.
  3. If the port fan-out is more than 1, and if Operator 2 is the last to consume the tuple results, no tuple copying occurs.
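
The rules in Table 1 can be read as a small decision function. The following Python sketch is only an illustration of the table, not the implementation that the SPL compiler or run time uses. The names Port and copy_required, and the parameters fan_out and is_last_consumer, are hypothetical and were chosen for this example.

```python
from enum import Enum


class Port(Enum):
    """Port mutability setting (hypothetical names for this sketch)."""
    MUTATING = "mutating"
    NON_MUTATING = "non-mutating"


def copy_required(output_port, input_port, fan_out=1, is_last_consumer=True):
    """Return True when Table 1 says that the tuple is copied.

    output_port      -- mutability of the output port of Operator 1
    input_port       -- mutability of the input port of Operator 2
    fan_out          -- number of stream connections on the output port
    is_last_consumer -- whether Operator 2 is the last to consume the tuple
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")

    # A non-mutating input port only needs a constant reference,
    # so no copy is made regardless of the output port setting.
    if input_port is Port.NON_MUTATING:
        return False

    # Constant to non-constant: the downstream operator wants to modify
    # a tuple that the upstream operator expects to stay unchanged.
    if output_port is Port.NON_MUTATING:
        return True

    # Mutating to mutating: a copy is needed only when other consumers
    # still have to see the unmodified tuple afterwards.
    return fan_out > 1 and not is_last_consumer
```

For example, copy_required(Port.MUTATING, Port.MUTATING, fan_out=4, is_last_consumer=False) returns True, while the same call with is_last_consumer=True returns False.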

The following figure shows a sample graph that contains nine operators with different port mutability settings. The figure assumes that all operators are fused in the same processing element (PE). It also assumes that the process calls of the operators are made in the order of the operator index, that is, from O1 to O9. The character m indicates a mutating port. The character i indicates a non-mutating port. The C annotation on a stream connection shows where the Teracloud® Streams instance forces a tuple to be copied before it invokes the process function of the downstream operator.

Figure 1. Port mutability.

This figure contains nine operators that are represented by circles.

The operator circles indicate different port mutability settings to show the process calls of the operators that create automatic tuple copies.

In the example in this figure, the Teracloud® Streams instance forces a tuple copy in only two situations. The first copy occurs when a tuple is transmitted from operator O2 to operator O5. The copy is made because the output port of operator O2 is set as mutating and has a port fan-out of 4, the input port of operator O5 is mutating, and O5 is not the last operator to consume the tuple. By forcing the tuple copy, the Teracloud® Streams instance ensures that operators O7 and O9 receive an unmodified version of the tuple that is produced by operator O2. The second forced copy is made when operator O5 sends a tuple to operator O6, because the process function of operator O6 expects to modify the tuple content and cannot do so through the constant reference that operator O5 sends. Furthermore, operator O5 expects that the submitted tuple is unchanged after the submit call returns.
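
The reason for the first forced copy can be illustrated with a short Python sketch. It is not Streams code. A dictionary stands in for a tuple, the functions o5_process and o7_process are hypothetical stand-ins for a mutating consumer and a later consumer, and the attribute name reading is invented for the example. The sketch compares what the later consumer sees when the mutating consumer receives the shared reference and when it receives a copy.

```python
import copy


def o5_process(tup):
    """Hypothetical mutating consumer: modifies the tuple in place."""
    tup["reading"] *= 2


def o7_process(tup, seen):
    """Hypothetical later consumer: only records the value it receives."""
    seen.append(tup["reading"])


def submit(tup, copy_for_mutating_consumer):
    """Deliver one tuple to both consumers in order and return what O7 saw."""
    seen = []
    # Either hand the mutating consumer its own copy, or share the reference.
    for_o5 = copy.deepcopy(tup) if copy_for_mutating_consumer else tup
    o5_process(for_o5)
    # The later consumer always receives the original reference.
    o7_process(tup, seen)
    return seen


without_copy = submit({"reading": 21}, copy_for_mutating_consumer=False)
with_copy = submit({"reading": 21}, copy_for_mutating_consumer=True)
print(without_copy)  # [42]: the later consumer sees the modified value
print(with_copy)     # [21]: the later consumer sees the original value
```

In this sketch, the later consumer observes the original value only when the mutating consumer works on its own copy, which is the guarantee that the forced copy between O2 and O5 provides for O7 and O9.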