Operators implemented in Python
SPL offers two approaches for implementing operators in Python: Python primitive operators and the PythonOp operator.
Python primitive operators are analogous to C++ primitive operators. Python has no analog to mixed-mode, however, so although operators implemented in Python can be parameterized to a great degree, they do not have the full flexibility of C++ operators. When you implement operators in Python, you must use the Python Operator API to receive, inspect, and submit tuples. An invocation of the PythonOp operator (provided by the SPL Standard Toolkit) is a generalized call to Python. Similar to the Custom operator, the PythonOp operator is useful for prototyping. Using it does not require defining a model that specifies the syntactic and semantic properties of the operator. In both cases, there is an additional runtime cost for serializing and deserializing tuples across the Python runtime boundary. This cost is moderate, but it becomes noticeable in high-throughput operators and in operators that do very little work per tuple.
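The receive, inspect, and submit cycle, together with the per-tuple serialization at the boundary, can be sketched in plain Python. This is only an illustration under stated assumptions: OutputPort, ThresholdFilter, and deliver are hypothetical stand-ins and are not part of the Python Operator API, and JSON is used here merely as a convenient placeholder for whatever wire format the runtime actually uses.

```python
import json


class OutputPort:
    """Hypothetical stand-in for an output port; not the real API."""

    def __init__(self):
        self.wire = []  # serialized tuples that crossed the boundary

    def submit(self, tup):
        # Serialize on the way out of Python (one cost per tuple).
        self.wire.append(json.dumps(tup))


class ThresholdFilter:
    """Hypothetical operator: forwards readings above a threshold."""

    def __init__(self, output, threshold):
        self.output = output
        self.threshold = threshold

    def process(self, tup):
        # Inspect the received tuple, then submit it if it qualifies.
        if tup["reading"] > self.threshold:
            self.output.submit(tup)


def deliver(operator, serialized_tuple):
    # Deserialize on the way into Python (the other per-tuple cost).
    operator.process(json.loads(serialized_tuple))


out = OutputPort()
op = ThresholdFilter(out, threshold=10)
for raw in ['{"id": 1, "reading": 5}', '{"id": 2, "reading": 42}']:
    deliver(op, raw)
print(out.wire)
```

Every tuple pays for one deserialization on entry and, if it is forwarded, one serialization on exit. When process does as little work as the comparison above, that fixed overhead is likely to dominate the cost of the operator.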
Both PythonOp operators and Python primitive operators use the Python Operator API. PythonOp operators, however, are not technically primitive operators. Rather, they provide a general way to call Python from SPL. Developers use the module operator parameter to specify the Python module that contains the tuple processing logic. As with single-use Custom operators, developers do not have to provide an operator model for PythonOp operators. PythonOp operators have no restrictions on the number of input and output streams, or on the number of constant parameters. However, PythonOp operators are intended only as an easy way to prototype Python operators in SPL. Because there is no operator model specialized for each invocation, the SPL compiler has no knowledge of how the operator is used. Consequently, it cannot check at compile time that a particular invocation of the PythonOp operator is correct.
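The consequence of naming the implementation module in a parameter can be illustrated with a minimal sketch. The helper invoke_generic and its arguments are hypothetical and do not reflect how PythonOp is actually implemented; the standard json module simply stands in for user tuple logic. The point is that a module named by a string is resolved only when the call executes, so a mistake in the name cannot be caught earlier.

```python
import importlib


def invoke_generic(module_name, function_name, tup):
    """Hypothetical generic call: resolve the module by name at run time."""
    module = importlib.import_module(module_name)  # may raise ImportError
    handler = getattr(module, function_name)       # may raise AttributeError
    return handler(tup)


# A valid invocation: the standard json module stands in for user logic.
ok = invoke_generic("json", "dumps", {"id": 1})

# A wrong module name is detected only when the call executes.
try:
    invoke_generic("no_such_tuple_logic_module", "dumps", {"id": 1})
    failure = None
except ImportError as exc:
    failure = type(exc).__name__

print(ok, failure)
```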
Python primitive operators are the preferred method for implementing primitive operators in Python because they are seamlessly integrated into SPL. Like C++ and Java primitive operators, each Python primitive operator has a unique name. The SPL compiler can enforce compile-time checks on operator parameters and on the number of input and output ports. Unlike the PythonOp operator, which has a generic operator model, Python primitive operators use specialized operator models that describe such constraints.
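To make the role of a specialized operator model concrete, the following sketch shows the kind of check that becomes possible once port counts and required parameters are declared up front. OperatorModel and check_invocation are hypothetical and greatly simplified; they are not the SPL operator model schema or the compiler's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorModel:
    """Hypothetical, simplified operator model (not the real SPL schema)."""
    name: str
    input_ports: int
    output_ports: int
    required_params: frozenset


def check_invocation(model, inputs, outputs, params):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    if inputs != model.input_ports:
        errors.append(f"expected {model.input_ports} input port(s), got {inputs}")
    if outputs != model.output_ports:
        errors.append(f"expected {model.output_ports} output port(s), got {outputs}")
    # Report each declared parameter that the invocation failed to supply.
    for name in sorted(model.required_params - set(params)):
        errors.append(f"missing required parameter: {name}")
    return errors


model = OperatorModel("ThresholdFilter", 1, 1, frozenset({"threshold"}))
good = check_invocation(model, inputs=1, outputs=1, params={"threshold": 10})
bad = check_invocation(model, inputs=2, outputs=1, params={})
print(good, bad)
```

With a generic model there is nothing to compare an invocation against, so a check like this has no declared constraints to enforce.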
Python primitive operators also provide better implementation encapsulation. A user of the operator does not need to know the name of the Python module that implements it, because this information is described directly in the operator model. SPL developers who use a Python primitive operator do not even need to know that the operator they are invoking is implemented in Python.