Metrics access
Metrics are simple counters that are maintained at run time and can be read from outside a running job to monitor its statistics.
Two types of metrics are provided by the Teracloud® Streams instance, namely system and custom metrics. System metrics are predefined and are maintained by the Teracloud® Streams instance. Custom metrics are created and maintained by the operators.
System metrics
System metrics come in two varieties: operator-level metrics and processing element-level metrics. The Metric class is used to represent a metric. Each metric has a short name, a description, and a value of type int64. System metrics can be read, but cannot be written.
Operator-level metrics are accessed through the getMetrics() member function of the OperatorContext class, which returns an object of type OperatorMetrics. The operator context object is accessible through the getContext() member function of the operator.
Individual operator-level metrics are available through the getInputPortMetric(port, metric) and getOutputPortMetric(port, metric) methods of the OperatorMetrics class, where port is the port index and metric is the name of the metric (of OperatorMetrics::InputPortMetricName and OperatorMetrics::OutputPortMetricName type).
The available metrics include:
Name | Scope | Description |
---|---|---|
nTuplesProcessed | Per input port | The number of tuples that were processed. |
nTuplesDropped | Per input port | The number of tuples that were dropped. |
nTuplesQueued | Per input port | The number of tuples that are queued. |
nWindowPunctsProcessed | Per input port | The number of window punctuations that were processed. |
nFinalPunctsProcessed | Per input port | The number of final punctuations that were processed. |
nWindowPunctsQueued | Per input port | The number of window punctuations that are queued. |
nFinalPunctsQueued | Per input port | The number of final punctuations that are queued. |
queueSize | Per input port | The size of the queue for a threaded port, or 0 if the port is not threaded. |
maxItemsQueued | Per input port | The largest number of items queued for a threaded port, or 0 if the port is not threaded. |
recentMaxItemsQueued | Per input port | The recent largest number of items queued for a threaded port, where the number reported is the largest number in the current (not yet complete) and previous (completed) interval (given by recentMaxItemsQueuedInterval), or 0 if the port is not threaded. |
recentMaxItemsQueuedInterval | Per input port | The interval in milliseconds used to determine the recent largest number of items queued for a threaded port, or 0 if the port is not threaded. Currently each interval is 5 minutes in duration. |
nEnqueueWaits | Per input port | The number of waits due to a full queue for a threaded port, or 0 if the port is not threaded. |
nTuplesSubmitted | Per output port | The number of tuples that were submitted by the operator. |
nWindowPunctsSubmitted | Per output port | The number of window punctuations that were submitted by the operator. |
nFinalPunctsSubmitted | Per output port | The number of final punctuations that were submitted by the operator. |
relativeOperatorCost | Per operator | An integer value (1 - 100) representing the relative computational cost of the operator compared to other operators within the same PE. A value of 1 indicates negligible cost compared to the other operators. A value of 100 indicates that almost all of the observed processing time is taken by this operator. |
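As an illustration of reading operator-level metrics, the following minimal sketch estimates how full the queue of a threaded input port is. The Metric and OperatorMetrics classes below are simplified stand-ins written for this example, not the runtime classes; seed() and queueFillPercent() are hypothetical helpers. In an operator, the metrics object would instead come from getContext().getMetrics(), as described above. The sketch also assumes that comparing nTuplesQueued with queueSize is a useful rough indicator, even though queued punctuations are not counted.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Simplified stand-in for illustration only; NOT the runtime Metric class.
class Metric {
public:
    Metric(std::int64_t value = 0) : value_(value) {}
    std::int64_t getValue() const { return value_; }
private:
    std::int64_t value_;
};

// Simplified stand-in that mirrors the accessor shape described in the text.
class OperatorMetrics {
public:
    enum InputPortMetricName { nTuplesQueued, queueSize };

    // Hypothetical helper so the example can supply metric values.
    void seed(std::uint32_t port, InputPortMetricName name, std::int64_t value) {
        metrics_[std::make_pair(port, name)] = Metric(value);
    }

    // Mirrors getInputPortMetric(port, metric).
    const Metric& getInputPortMetric(std::uint32_t port, InputPortMetricName name) const {
        return metrics_.at(std::make_pair(port, name));
    }
private:
    std::map<std::pair<std::uint32_t, InputPortMetricName>, Metric> metrics_;
};

// Returns the queue fill level of an input port as a percentage.
// Returns -1 when queueSize is 0, which the table defines as "port is not threaded".
std::int64_t queueFillPercent(const OperatorMetrics& opm, std::uint32_t port) {
    const std::int64_t size =
        opm.getInputPortMetric(port, OperatorMetrics::queueSize).getValue();
    if (size == 0) {
        return -1;
    }
    const std::int64_t queued =
        opm.getInputPortMetric(port, OperatorMetrics::nTuplesQueued).getValue();
    return queued * 100 / size;
}
```

Checking queueSize first matters because every queue-related metric reports 0 for a non-threaded port, so a plain division would fail there.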
Processing element-level metrics are accessed through the getMetrics() member function of the ProcessingElement class, which returns an object of type PEMetrics. The processing element object is accessible through the getPE() member function of the operator.
Individual processing element-level metrics are available through the getInputPortMetric(port, metric) and getOutputPortMetric(port, metric) methods of the PEMetrics class, where port is the port index and metric is the name of the metric (of PEMetrics::InputPortMetricName and PEMetrics::OutputPortMetricName type).
The available metrics include:
Name | Scope | Description |
---|---|---|
nTuplesProcessed | Per input port | The number of tuples that were processed. |
nWindowPunctsProcessed | Per input port | The number of window punctuations that were processed. |
nFinalPunctsProcessed | Per input port | The number of final punctuations that were processed. |
nTupleBytesProcessed | Per input port | The number of tuple bytes that were processed. |
nTuplesSubmitted | Per output port | The number of tuples that were submitted. |
nWindowPunctsSubmitted | Per output port | The number of window punctuations that were submitted. |
nFinalPunctsSubmitted | Per output port | The number of final punctuations that were submitted. |
nTupleBytesSubmitted | Per output port | The number of tuple bytes that were submitted. |
nBrokenConnections | Per output port | The number of broken connections that were detected. |
nRequiredConnecting | Per output port | The number of required connections that are currently connecting. |
nOptionalConnecting | Per output port | The number of optional connections that are currently connecting. |
nTuplesTransmitted | Per output port | The total number of tuples that were transmitted to all connected processing elements. |
nTupleBytesTransmitted | Per output port | The total number of tuple bytes that were transmitted to all connected processing elements. |
nConnections | Per output port | The number of processing elements that are connected to this output port. |
nCPUMilliseconds | Per output port | The amount of time in milliseconds for which the CPU was used. |
nMemoryConsumption | Per output port | The amount of memory in KB that was consumed by processing elements. |
nResidentMemory | Per output port | The amount of resident memory in KB that was consumed by processing elements. |
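Processing element-level metrics can be combined in the same way as operator-level ones. The sketch below derives an average tuple size from nTupleBytesProcessed and nTuplesProcessed. Metric and PEMetrics are again simplified stand-ins for this example only, and seed() and averageTupleBytes() are hypothetical helpers; in an operator, the metrics object would come from getPE().getMetrics() as described above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Simplified stand-in for illustration only; NOT the runtime Metric class.
class Metric {
public:
    Metric(std::int64_t value = 0) : value_(value) {}
    std::int64_t getValue() const { return value_; }
private:
    std::int64_t value_;
};

// Simplified stand-in that mirrors the accessor shape described in the text.
class PEMetrics {
public:
    enum InputPortMetricName { nTuplesProcessed, nTupleBytesProcessed };

    // Hypothetical helper so the example can supply metric values.
    void seed(std::uint32_t port, InputPortMetricName name, std::int64_t value) {
        metrics_[std::make_pair(port, name)] = Metric(value);
    }

    // Mirrors getInputPortMetric(port, metric).
    const Metric& getInputPortMetric(std::uint32_t port, InputPortMetricName name) const {
        return metrics_.at(std::make_pair(port, name));
    }
private:
    std::map<std::pair<std::uint32_t, InputPortMetricName>, Metric> metrics_;
};

// Average size in bytes of the tuples processed on a PE input port.
// Returns 0 when no tuples have been processed yet.
std::int64_t averageTupleBytes(const PEMetrics& pem, std::uint32_t port) {
    const std::int64_t tuples =
        pem.getInputPortMetric(port, PEMetrics::nTuplesProcessed).getValue();
    if (tuples == 0) {
        return 0;
    }
    const std::int64_t bytes =
        pem.getInputPortMetric(port, PEMetrics::nTupleBytesProcessed).getValue();
    return bytes / tuples;  // integer division, rounds toward zero
}
```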
In addition to the processing element-level metrics described in Table 2, processing element-level connection metrics are provided by the system. The available metrics include:
Name | Scope | Description |
---|---|---|
congestionFactor | Per connection on each PE output port | An integer value (0 - 100) that represents the relative congestion for the connection. 0 means no congestion; 100 means that the connection is extremely congested. |
nTuplesFilteredOut | Per connection on each PE output port | The number of tuples that failed to meet the filter criteria. |
Each processing element output port can be connected to zero or more input ports on other processing elements. The connection information can be accessed by using ProcessingElement::getOutputPortConnections(), which returns a list of information about each connection, including the congestion factor for this connection and the nTuplesFilteredOut metric for the connection. From SPL, the information can be returned by using the native function spl.utility::getPEOutputPortConnections().
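One possible use of the per-connection information is to find the connection that is under the most back pressure. The sketch below assumes a hypothetical ConnectionInfo record that holds only the two metrics named above; the actual element type returned by getOutputPortConnections() is not described here, so the record and the mostCongested() helper are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for illustration; not the type returned by the runtime.
struct ConnectionInfo {
    std::int64_t congestionFactor;    // 0 (no congestion) to 100 (extremely congested)
    std::int64_t nTuplesFilteredOut;  // tuples that failed the filter criteria
};

// Returns the index of the most congested connection.
// Returns -1 for an empty list, since an output port can have zero connections.
// If several connections share the highest factor, the first one is returned.
std::ptrdiff_t mostCongested(const std::vector<ConnectionInfo>& connections) {
    std::ptrdiff_t best = -1;
    std::int64_t bestFactor = -1;
    for (std::size_t i = 0; i < connections.size(); ++i) {
        if (connections[i].congestionFactor > bestFactor) {
            bestFactor = connections[i].congestionFactor;
            best = static_cast<std::ptrdiff_t>(i);
        }
    }
    return best;
}
```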
If a processing element restarts, all system metrics are reset to their initial values.
In addition to the operator-level and processing element-level metrics described in tables 1-3, resource-level metrics are provided by the system. The available metrics include:
Name | Scope | Description |
---|---|---|
CpuCapacity | Per resource | The number of CPU processors per resource. |
CpuSpeed | Per resource | The number of cycles that a CPU can perform per second. |
CpuLoadAverage | Per resource | The average system load over the past minute. This is the system-reported load average multiplied by 100. |
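Because CpuLoadAverage is reported as the load average multiplied by 100, a consumer has to divide by 100 to recover the familiar value. The sketch below shows that conversion and, as an assumed use case, a per-processor load that divides by CpuCapacity. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Converts the CpuLoadAverage metric (load average x 100) back to a load average.
double loadAverage(std::int64_t cpuLoadAverageMetric) {
    return static_cast<double>(cpuLoadAverageMetric) / 100.0;
}

// Load average per processor, using the CpuCapacity metric as the processor count.
// Returns 0.0 if the capacity is not positive, to avoid dividing by zero.
double loadPerProcessor(std::int64_t cpuLoadAverageMetric, std::int64_t cpuCapacity) {
    if (cpuCapacity <= 0) {
        return 0.0;
    }
    return loadAverage(cpuLoadAverageMetric) / static_cast<double>(cpuCapacity);
}
```

For example, a CpuLoadAverage of 250 on a resource with a CpuCapacity of 4 corresponds to a load average of 2.5, or 0.625 per processor.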
Custom metrics
Custom metrics for an operator can be automatically instantiated by the Teracloud® Streams instance before the operator runs by listing them in the operator model, or created dynamically by the operator at run time. When listed in the operator model, the custom metric is instantiated automatically by the Teracloud® Streams instance if the optional dynamic attribute of the metricType element in the operator model is omitted, or set to false. Custom metrics created dynamically by the operator at run time can also be listed in the operator model, by setting the optional dynamic attribute of the metricType element to true. Listing these dynamically created custom metrics in the operator model allows them to be included in the documentation generated for the operator by spl-make-doc.
The names of all the custom metrics can be found by calling the getCustomMetricNames() member function of the OperatorMetrics class. The custom metrics can be accessed at run time through the getCustomMetricByName(name) member function of the OperatorMetrics class. More metrics can be created explicitly at run time by using the createCustomMetric(name, description, kind) member function. For both getCustomMetricByName and createCustomMetric, the returned metric objects of type Metric are owned by the Teracloud® Streams instance, but the operator can update their value freely. The value of a metric can be updated by using the setValue(newValue) and incrementValue(incValue) functions.
If a processing element restarts, custom metrics are reset to their initial value unless the operator explicitly restores them from a previous checkpoint.
The following example shows the runtime APIs for creating and updating custom metrics.
// member variable in operator
Metric * numInterestingEvents_;
...
// create the metric
OperatorMetrics & opm = getContext().getMetrics();
numInterestingEvents_ = & opm.createCustomMetric("nInterestingEvents",
"Number of very interesting events seen so far", Metric::Counter);
...
// update the metric
numInterestingEvents_->setValue(numEvents);
// increment the value
numInterestingEvents_->incrementValue(10);
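A related pattern is to look up a custom metric by name and create it only when it does not exist yet, which can be useful when a metric may or may not have been listed in the operator model. The sketch below uses simplified stand-in versions of Metric and OperatorMetrics that mirror only the member functions named above; they are not the runtime classes, and getOrCreateCounter() is a hypothetical helper. How the runtime treats a second createCustomMetric call with the same name is not covered here, so the sketch simply avoids making one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for illustration only; NOT the runtime Metric class.
class Metric {
public:
    enum Kind { Counter };  // only the kind used in the text; others omitted
    Metric() : value_(0) {}
    std::int64_t getValue() const { return value_; }
    void setValue(std::int64_t newValue) { value_ = newValue; }
    void incrementValue(std::int64_t incValue) { value_ += incValue; }
private:
    std::int64_t value_;
};

// Simplified stand-in that mirrors the custom metric functions named in the text.
class OperatorMetrics {
public:
    std::vector<std::string> getCustomMetricNames() const {
        std::vector<std::string> names;
        for (const auto& entry : metrics_) {
            names.push_back(entry.first);
        }
        return names;
    }
    Metric& getCustomMetricByName(const std::string& name) {
        return metrics_.at(name);
    }
    Metric& createCustomMetric(const std::string& name,
                               const std::string& /*description*/,
                               Metric::Kind /*kind*/) {
        return metrics_[name];  // the stand-in owns the metric, as the instance does
    }
private:
    std::map<std::string, Metric> metrics_;
};

// Returns the named custom metric, creating it as a counter if it is missing.
Metric& getOrCreateCounter(OperatorMetrics& opm, const std::string& name,
                           const std::string& description) {
    const std::vector<std::string> names = opm.getCustomMetricNames();
    if (std::find(names.begin(), names.end(), name) != names.end()) {
        return opm.getCustomMetricByName(name);
    }
    return opm.createCustomMetric(name, description, Metric::Counter);
}
```

Returning a reference rather than a copy matches the ownership rule described above: the metric stays owned by the metrics object, and the caller only updates its value.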