Developing another simple application
In addition to filtering large data sets, you can reduce the volume of data by shrinking the tuples in the input stream so that they carry only the attributes you need.
About this task
Suppose you have a source file with tuples that contain many attributes, but most of those attributes are not used by downstream operators. You may want to create a new stream that contains only the needed attributes. In this example, the BigDataType tuple type has 30 attributes, whereas the SmallDataType tuple type has only four attributes (ticker, date, time, price).
type
  BigDataType = rstring ticker, rstring date, rstring time, int32 gmtOffset,
                rstring ttype, rstring exCntrbID, decimal64 price,
                decimal64 volume, decimal64 vwap, rstring buyerID,
                decimal64 bidprice, decimal64 bidsize, int32 numbuyers,
                rstring sellerID, decimal64 askprice, decimal64 asksize,
                int32 numsellers, rstring qualifiers, int32 seqno,
                rstring exchtime, decimal64 blockTrd, decimal64 floorTrd,
                decimal64 PEratio, decimal64 yield, decimal64 newprice,
                decimal64 newvol, int32 newseqno, decimal64 bidimpvol,
                decimal64 askimpcol, decimal64 impvol;
  SmallDataType = rstring ticker, rstring date, rstring time, decimal64 price;
In this example, you create a stream application that reads data from a source file, reduces the size of the tuples, and then writes the reduced data to an output file.
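The read, reduce, and write steps described above can be sketched as a single SPL composite. This is a minimal sketch, not a definitive implementation: the composite name Main and the file names BigDataFile.dat and SmallDataFile.dat are hypothetical, and the input file is assumed to be in CSV format. It relies on the Functor operator forwarding each output attribute from the input attribute that has the same name and type when no explicit assignment is given.

```spl
composite Main {
  type
    // The two tuple types from this topic.
    BigDataType = rstring ticker, rstring date, rstring time, int32 gmtOffset,
                  rstring ttype, rstring exCntrbID, decimal64 price,
                  decimal64 volume, decimal64 vwap, rstring buyerID,
                  decimal64 bidprice, decimal64 bidsize, int32 numbuyers,
                  rstring sellerID, decimal64 askprice, decimal64 asksize,
                  int32 numsellers, rstring qualifiers, int32 seqno,
                  rstring exchtime, decimal64 blockTrd, decimal64 floorTrd,
                  decimal64 PEratio, decimal64 yield, decimal64 newprice,
                  decimal64 newvol, int32 newseqno, decimal64 bidimpvol,
                  decimal64 askimpcol, decimal64 impvol;
    SmallDataType = rstring ticker, rstring date, rstring time, decimal64 price;

  graph
    // Read the wide tuples from a source file (hypothetical file name).
    stream<BigDataType> BigData = FileSource() {
      param
        file   : "BigDataFile.dat";
        format : csv;
    }

    // Reduce the tuple size: the output stream carries only the four
    // SmallDataType attributes. With no output clause, each output
    // attribute takes its value from the same-named input attribute.
    stream<SmallDataType> SmallData = Functor(BigData) {
    }

    // Write the reduced tuples to an output file (hypothetical file name).
    () as Sink = FileSink(SmallData) {
      param
        file   : "SmallDataFile.dat";
        format : csv;
    }
}
```

If you prefer to make the mapping visible, the same Functor can name each attribute in an output clause, for example output SmallData : ticker = ticker, date = date, time = time, price = price;.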