Compile-time APIs

You can optionally write a primitive operator so that it specializes at compile time, which lets the generated code take advantage of statically known information.

To specialize, write the operator in mixed-mode (MM), which combines Perl with a native language such as C++. The Perl code runs at compile time and generates pure native-language code, which runs at run time. Library developers for SPL need a mechanism that supports flexibility, performance, and reuse of legacy code. A code generator framework provides flexibility and performance together, because compile-time customization adds flexibility without affecting runtime performance. Generating native code provides performance and reuse of legacy code, because much existing code is written in native languages, and native languages provide more traditional imperative programming features than SPL aspires to support. If you do not need the flexibility of mixed-mode, you can author primitive operators without Perl and avoid the complexity of code generation.
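As a rough illustration of compile-time specialization, the following C++ sketch plays the role of the code generator (which in mixed-mode is written in Perl). The function name emitProcessBody, the filter expression, and the emitted submit(tuple, 0) call are hypothetical placeholders chosen for this example. They are not part of the SPL APIs.

```cpp
#include <string>

// Hypothetical generator: returns native code for a tuple-processing body.
// filterExpr is the native text of an optional filter parameter, known at
// compile time. An empty string means the invocation supplied no filter.
std::string emitProcessBody(const std::string& filterExpr) {
    if (filterExpr.empty()) {
        // No filter was given, so the generated code contains no check at all.
        return "submit(tuple, 0);\n";
    }
    // A filter was given, so the check is emitted inline in the generated code.
    return "if (" + filterExpr + ") {\n    submit(tuple, 0);\n}\n";
}
```

Because the decision is made while the code is generated, an invocation without a filter pays no runtime cost for the feature.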

Each primitive operator is described by an operator model, which specifies what constitutes a legal operator invocation. For example, the operator model specifies the number of ports and what kinds of streams they support, and specifies the names and expression modes for parameters. As another example, the operator model specifies whether the operator modifies tuple attributes, and if it does not, the compiler checks that parameter expressions do not call functions that can modify tuple attributes.
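The kind of check that an operator model enables can be sketched as follows. This is a simplified illustration only: the real operator model is an XML document, and the OperatorModel and Invocation structures, their fields, and the checkInvocation function are assumptions made for this example.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an operator model.
struct OperatorModel {
    std::size_t numInputPorts;
    std::size_t numOutputPorts;
    std::set<std::string> allowedParams;
    bool modifiesTupleAttributes;
};

// Hypothetical summary of one operator invocation.
struct Invocation {
    std::size_t numInputPorts;
    std::size_t numOutputPorts;
    // Parameter name -> true if its expression calls a function that can
    // modify tuple attributes.
    std::map<std::string, bool> params;
};

// Returns one message per violation; an empty result means the invocation is legal.
std::vector<std::string> checkInvocation(const OperatorModel& model,
                                         const Invocation& inv) {
    std::vector<std::string> errors;
    if (inv.numInputPorts != model.numInputPorts)
        errors.push_back("wrong number of input ports");
    if (inv.numOutputPorts != model.numOutputPorts)
        errors.push_back("wrong number of output ports");
    for (const auto& entry : inv.params) {
        if (model.allowedParams.count(entry.first) == 0)
            errors.push_back("unknown parameter: " + entry.first);
        else if (!model.modifiesTupleAttributes && entry.second)
            errors.push_back("parameter " + entry.first +
                             " can modify tuple attributes");
    }
    return errors;
}
```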

Figure 1. Build process for primitive operators.

The figure shows how primitive operators are compiled. When the SPL compiler encounters an operator invocation, it uses a lookup path to find the definition of either a composite operator or a primitive operator. Suppose that it finds a primitive operator. The compiler first consults the operator model, which is an XML document that is typically generated by wizards in the IDE. If the operator invocation violates the model, the compiler reports errors to the user. Otherwise, it combines the operator model with the information from the operator invocation into an operator context, which is a Perl object that is supplied to the Perl code generator.

The primitive operator is written as a mix of Perl snippets (the code generator) and native code (the code to be generated), hence the name mixed-mode. The Perl code is embedded in the native code by using ASP-style tags (<% . . . %> or <%= . . . %>), similar to how PHP or JSP code is embedded in HTML. At code generation time, the Perl code uses a Perl API to access the operator context. In the end, both the SPL code and the primitive operators turn into native code, which is compiled into object files and deployed as an application instance.
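To make the tag mechanism concrete, here is a minimal C++ sketch of how a <%= . . . %> tag might be expanded. It is deliberately much simpler than the real mechanism: the actual tags contain arbitrary Perl that queries the operator context, whereas this sketch only looks up a name in a map. The function expandTemplate and the map-based context are assumptions made for this example.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Removes leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const std::string ws = " \t";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Replaces every "<%= name %>" tag in tmpl with context[name].
// Text outside the tags is copied to the output unchanged.
std::string expandTemplate(const std::string& tmpl,
                           const std::map<std::string, std::string>& context) {
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = tmpl.find("<%=", pos);
        if (open == std::string::npos) {
            out += tmpl.substr(pos);  // no more tags: copy the remainder
            break;
        }
        out += tmpl.substr(pos, open - pos);
        const std::size_t close = tmpl.find("%>", open + 3);
        if (close == std::string::npos)
            throw std::runtime_error("unterminated <%= tag");
        const std::string key = trim(tmpl.substr(open + 3, close - (open + 3)));
        const auto it = context.find(key);
        if (it == context.end())
            throw std::runtime_error("unknown context key: " + key);
        out += it->second;
        pos = close + 2;  // continue after the closing "%>"
    }
    return out;
}
```

With a context that maps threshold to 42, the template text "if (x > <%= threshold %>) submit(t);" expands to "if (x > 42) submit(t);", which is pure native code with no trace of the generator left in it.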

Tip: When you write primitive operators in mixed-mode, keep in mind that parameter expressions might be expensive to evaluate or might cause side effects. Parameters with side effects are bad practice, but they are not prohibited. Program your code generator defensively: maintain a predictable execution order for parameter expressions, and document that order so that users know what to expect when they invoke your operator.
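One defensive pattern, sketched here in C++ rather than Perl, is to have the generator emit each parameter expression exactly once, in declaration order, and store the result in a local variable that the rest of the generated code reuses. The names Param and emitParamLocals and the _val suffix are hypothetical and are used only for this example.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical pair: (parameter name, native text of its expression).
using Param = std::pair<std::string, std::string>;

// Emits one local variable per parameter, in the order the parameters are
// listed. Each expression appears once in the generated code, so an
// expensive or side-effecting expression is evaluated once per execution
// of the emitted statements.
std::string emitParamLocals(const std::vector<Param>& params) {
    std::string code;
    for (const Param& p : params) {
        code += "const auto " + p.first + "_val = " + p.second + ";\n";
    }
    return code;
}
```

The generated code that follows these declarations would then refer to the locals, for example limit_val, instead of repeating the original expressions.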