Filtering large data sets
In this example, the stream application needs to filter the stock transaction data for IBM transaction records. You use the Filter operator to extract relevant information from potentially large volumes of data. As shown, the input for the Filter operator is all the transactions; the output is only the IBM transactions.

The SPL code for the Filter operator is shown. To read the code, you say that the output stream is produced by operating on the input stream. In this case, you say that IBMTransactions is produced by filtering AllTransactions.
stream<TransactionRecord> IBMTransactions = Filter(AllTransactions) {
    param
        filter : ticker == "IBM";
}
In general, the Filter operator receives tuples
from an input stream and submits a tuple to the output stream only
if the tuple satisfies the criteria that are specified by the filter
parameter.
In this example, the Filter operator performs the following steps:
1. Receives a tuple from the input stream (AllTransactions).
2. If the value of the ticker attribute is IBM, it submits the tuple to the output stream (IBMTransactions).
3. Repeats Steps 1 to 2 until all the tuples from the input stream are processed.
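
The steps above can be sketched outside SPL as a plain loop. The following Python sketch only illustrates the semantics; it is not how the Filter operator is implemented. The `price` and `volume` fields of `TransactionRecord`, the function name `filter_ibm`, and the sample data are assumptions made for the example.

```python
from typing import Iterable, Iterator, NamedTuple


class TransactionRecord(NamedTuple):
    # Only `ticker` comes from the example; `price` and `volume` are assumed fields.
    ticker: str
    price: float
    volume: int


def filter_ibm(all_transactions: Iterable[TransactionRecord]) -> Iterator[TransactionRecord]:
    """Hypothetical model of the parameter: filter : ticker == "IBM";"""
    for record in all_transactions:   # Step 1: receive a tuple from the input stream
        if record.ticker == "IBM":    # Step 2: evaluate the filter expression
            yield record              # submit the tuple to the output stream
        # Step 3: the loop repeats until the input is exhausted


# Hypothetical sample data standing in for the AllTransactions stream.
all_transactions = [
    TransactionRecord("IBM", 140.25, 100),
    TransactionRecord("ACME", 12.50, 300),
    TransactionRecord("IBM", 140.30, 50),
]
ibm_transactions = list(filter_ibm(all_transactions))
```

In this sketch, tuples that fail the test are not forwarded, and the matching tuples keep their arrival order.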
The Filter operator requires that the type of the output stream is the same as the type of the input stream. The type of the output stream is specified by the tupleType in the "stream<tupleType> OutputStream = Filter(InputStream)" declaration. In this example, the type of the output and input streams is TransactionRecord.
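
One way to picture the type requirement is a generic function whose input and output share a single type parameter. The Python sketch below is an analogy under that assumption, not SPL. The names `filter_stream` and `T` are hypothetical, and the dictionary records are made-up sample data.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")  # plays the role of tupleType


def filter_stream(input_stream: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    """Forward tuples unchanged, so the output element type equals the input element type."""
    for tup in input_stream:
        if predicate(tup):
            yield tup  # the same tuple is passed through; no attribute is added or dropped


# Hypothetical records; only the `ticker` attribute appears in the original example.
records = [{"ticker": "IBM", "price": 140.25}, {"ticker": "ACME", "price": 12.5}]
kept = list(filter_stream(records, lambda r: r["ticker"] == "IBM"))
```

Because the function selects tuples without reshaping them, a single type parameter describes both streams, which mirrors TransactionRecord appearing as the type of both AllTransactions and IBMTransactions.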