Status, performance, and connection information returned by streamtool capturestate
The streamtool capturestate command returns XML that describes the status of the resources and jobs in a Teracloud® Streams instance. The output also contains metrics for the resources and jobs.
The schema for the XML output is defined in $STREAMS_INSTALL/schema/streamsInstanceState.xsd.
Instance data
Field | Select type | Description |
---|---|---|
id | all | The instance identifier. For more information, see Instance identifiers. |
state | hosts=state | The state of the resources in the instance. For more information, see Instance status values. |
requestTime | all | The time the command request was processed. The time is represented as the number of seconds since the epoch (January 1, 1970 00:00:00 UTC). |
<host> | hosts | The list of resources that are configured for the instance. |
<job> | jobs | The list of jobs that are running in the Teracloud® Streams instance. |
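
As a minimal sketch of reading the instance-level fields, the following Python snippet parses a small sample document and converts requestTime from epoch seconds to a UTC timestamp. The sample XML is hypothetical: it assumes that the fields in the table are XML attributes of an `<instance>` root element, that no XML namespace is used, and that `RUNNING` is a valid state value. Check these assumptions against streamsInstanceState.xsd before you rely on them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical sample: the root element name, the attribute layout, and the
# state value are assumptions, not taken from streamsInstanceState.xsd.
SAMPLE = """<instance id="myinstance" state="RUNNING" requestTime="1700000000">
  <host id="host1"/>
  <job id="0"/>
</instance>"""


def summarize_instance(xml_text):
    """Return the instance fields, with requestTime as an ISO 8601 UTC string."""
    root = ET.fromstring(xml_text)
    # requestTime is the number of seconds since the epoch.
    request_time = datetime.fromtimestamp(int(root.get("requestTime")), tz=timezone.utc)
    return {
        "id": root.get("id"),
        "state": root.get("state"),
        "requestTime": request_time.isoformat(),
        "hosts": len(root.findall("host")),
        "jobs": len(root.findall("job")),
    }


summary = summarize_instance(SAMPLE)
print(summary)
```
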
Resource and service data
If you specify the --select hosts command parameter, the instance contains a <host> element for each resource that is configured for the instance. The data for each resource is described in the following table.
Field | Select type | Description |
---|---|---|
id | all | The resource identifier. For more information, see Resource identifiers. |
state | state | The state of the resource in the instance. For more information, see Resource status values. |
schedulableState | state | Indicates whether the resource is available for scheduling application jobs. For more information, see Resource status values. |
isMetricsStale | metrics | A Boolean value that indicates that the metrics available for this resource were not retrieved on the most recent metrics collection interval. Data from a previous metrics collection interval is provided. |
<service> | all | The list of services that are running on the resource. |
<metric> | all | The list of metrics that are available for this resource. |
Each resource includes a <service> element for each instance service that is running on the resource. The service data for the resource is described in the following table.
Field | Select type | Description |
---|---|---|
name | all | The name of the service. Possible service names are: app; data; hc; sam; srm; view. |
state | state | The state of the service. For more information, see Teracloud Streams service status values. |
reasonCode | state | The reason code provides more information about the current state of the service. For example: |
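
The host and service nesting described above can be walked to find services that are not in an expected state. The following sketch makes several assumptions: the fields are XML attributes, no namespace is used, and the state values `RUNNING` and `STOPPED` are placeholders for whatever the status value topics define. Only the service names (`hc`, `sam`) come from the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute layout and state values are assumptions.
SAMPLE = """<instance id="myinstance">
  <host id="host1" state="RUNNING">
    <service name="hc" state="RUNNING"/>
    <service name="sam" state="STOPPED"/>
  </host>
</instance>"""


def services_not_in_state(xml_text, expected="RUNNING"):
    """Return (host id, service name, state) for services not in the expected state."""
    root = ET.fromstring(xml_text)
    problems = []
    for host in root.findall("host"):
        for service in host.findall("service"):
            if service.get("state") != expected:
                problems.append((host.get("id"), service.get("name"), service.get("state")))
    return problems


problems = services_not_in_state(SAMPLE)
print(problems)
```
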
Job, processing element, operator, and port data
If you specify the --select jobs command parameter, the instance contains a <job> element for each job that is running in the Teracloud® Streams instance. The data for each returned job is described in the following table.
Field | Select type | Description |
---|---|---|
id | all | The identifier of the job. |
name | all | The job name. |
applicationName | all | The application name that is associated with the job. |
submitTime | all | The time the job was submitted to the instance. The time is represented as the number of seconds since the epoch. |
user | all | The user that submitted the job. |
state | state | The state of the job. For more information, see Monitoring jobs. |
healthSummary | state | A summary of the overall health of the job. The health summary is based on the health of the processing elements (PEs) in the job. Possible values are: |
<pe> | all | The set of PEs running in this job. |
Each <job> element contains a set of <pe> elements, which represent the processing elements that are running in the job. The data for each PE is described in the following table.
Field | Select type | Description |
---|---|---|
id | all | The identifier of the PE. |
host | all | The host that the PE is running on. |
processId | all | The process ID associated with this PE. |
state | state | The current state of the PE. For more information, see Monitoring processing elements. |
reasonCode | state | The reason code provides more information about the current state of the PE. For more information, see Monitoring processing elements. |
requiredConnections | state | Indicates the health of the required connections for the PE. Possible values are: |
optionalConnections | state | Indicates the health of the optional connections for the PE. Possible values are: |
healthSummary | state | A summary of the overall health of the PE. The health summary is based on the PE's state and the state of the connections within the PE. Possible values are: |
isMetricsStale | metrics | A Boolean value that indicates that the metrics available for this PE were not retrieved on the most recent metrics collection interval. Data from a previous metrics collection interval is provided. The frequency of metric collections is managed by the hc.metricCollectionInterval instance configuration property. |
lastMetricCollection | metrics | The most recent metric collection period in which the PE reported metrics to the domain controller service. This information is included if metrics were requested. |
<metric> | metrics | The list of metrics available for the PE. |
<operator> | all | The set of operators that is running in this PE. |
<inputPort> | all | The set of input ports for the PE. |
<outputPort> | all | The set of output ports for the PE. |
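
Because isMetricsStale signals that the reported metrics come from an earlier collection interval, a consumer might want to flag those PEs before it trusts their numbers. The following sketch assumes that the fields are XML attributes, that no namespace is used, and that the Boolean is serialized as `true` or `1`. The job name, user, and IDs in the sample are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute layout and all values are assumptions.
SAMPLE = """<instance id="myinstance">
  <job id="7" name="ingest" submitTime="1700000000" user="streamsadmin">
    <pe id="12" host="host1" processId="4321" isMetricsStale="false"/>
    <pe id="13" host="host2" processId="4400" isMetricsStale="true"/>
  </job>
</instance>"""


def stale_pes(xml_text):
    """Return (job id, PE id, host) for every PE whose metrics are stale."""
    root = ET.fromstring(xml_text)
    stale = []
    for job in root.findall("job"):
        for pe in job.findall("pe"):
            # xs:boolean permits both "true" and "1" as true values.
            if pe.get("isMetricsStale") in ("true", "1"):
                stale.append((job.get("id"), pe.get("id"), pe.get("host")))
    return stale


stale = stale_pes(SAMPLE)
print(stale)
```
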
Each <pe> element contains a set of <operator> elements, which represent the operators that are running in the PE. The operator data is described in the following table.
Field | Select type | Description |
---|---|---|
name | all | The name of the operator. If the operator is parallelized, the channelIndex value is encoded in the name. |
logicalName | all | The name of the logical operator. If the operator is not parallelized, the logicalName is the same as the name. |
<parallelChannel> | all | The list of channels of the parallel regions that are used to route tuple data for this operator. The returned channel information is ordered with the innermost region information first and the outermost last. The list is empty if the operator is not parallelized. |
<metric> | metrics | The set of metrics available for the operator. |
<inputPort> | all | The input ports for the operator. |
<outputPort> | all | The output ports for the operator. |
The command returns a set of <parallelChannel> elements, which represent the channels of the parallel regions that are used to route tuple data for this operator. The data for the elements is described in the following table.
Field | Select type | Description |
---|---|---|
index | all | The channel index within the parallel region. |
logicalName | all | The logical name of the parallelized operator that introduces the region. |
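
The channel list is ordered innermost region first, so reversing it gives an outermost-to-innermost path, which is often easier to read. The following sketch assumes that index and logicalName are XML attributes. The operator names and the encoded name format in the sample are invented for illustration and do not reflect how Teracloud® Streams encodes the channel index.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: names and attribute layout are assumptions.
# The channels appear innermost region first, as the description states.
SAMPLE = """<operator name="Work_example" logicalName="Work">
  <parallelChannel index="0" logicalName="InnerRegionOp"/>
  <parallelChannel index="1" logicalName="OuterRegionOp"/>
</operator>"""


def channel_path(operator):
    """Return (logicalName, index) pairs ordered from outermost to innermost region."""
    channels = [
        (channel.get("logicalName"), int(channel.get("index")))
        for channel in operator.findall("parallelChannel")
    ]
    # The source order is innermost first, so reverse it.
    return list(reversed(channels))


path = channel_path(ET.fromstring(SAMPLE))
print(path)
```

An operator that is not parallelized has no `<parallelChannel>` children, so the function returns an empty list.
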
The command returns a set of <inputPort> elements, which represent the input ports for the PE or operator. The data for each input port is described in the following table.
Field | Select type | Description |
---|---|---|
index | all | The index of the input port. For operators, the ports are numbered by the order that they appear in an input specification on an operator. |
name | all | The name of the port alias that is specified in the SPL source file, if it exists. If it does not exist, it is the unqualified local name that is used for the port on the operator invocation. |
<metric> | metrics | The set of metrics available for this input port. |
The command returns a set of <outputPort> elements, which represent the output ports for the PE or operator. The data for each output port is described in the following table.
Field | Select type | Description |
---|---|---|
index | all | The index of the output port. For operators, the ports are numbered by the order that they appear in an output specification on an operator. |
streamName | all | The name of the stream that is associated with this output port. |
name | all | The name of the port alias that is specified in the SPL source file, if it exists. If it does not exist, it is the unqualified local name that is used for the port on the operator invocation. |
<metric> | metrics | The set of metrics available for this output port. |
<connection> | all | Connections from the PE output port to other PE input ports. |
Each PE <outputPort> element contains a set of <connection> elements, which represent the connections from the PE output port to other PE input ports. The data for each connection is described in the following table.
Field | Select type | Description |
---|---|---|
inputPeId | all | The ID of the PE that is the target of this connection. |
inputPortIndex | all | The port for the input PE that is the target of this connection. |
state | all | The connection state. Possible values are: |
required | all | Whether the connection is required for Teracloud® Streams to start processing messages. Possible values are: |
<metric> | metrics | The set of metrics available for this connection. |
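
The inputPeId and inputPortIndex fields are enough to reconstruct which PE output ports feed which PE input ports. The following sketch collects those edges from a sample job. It assumes that the fields are XML attributes and that no namespace is used. The state and required values shown are placeholders, not documented values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: attribute layout, state values, and required values
# are assumptions.
SAMPLE = """<job id="0">
  <pe id="1">
    <outputPort index="0" streamName="Readings">
      <connection inputPeId="2" inputPortIndex="0" state="CONNECTED" required="true"/>
      <connection inputPeId="3" inputPortIndex="1" state="CONNECTING" required="false"/>
    </outputPort>
  </pe>
  <pe id="2"/>
  <pe id="3"/>
</job>"""


def connection_edges(xml_text):
    """Return one dict per connection, from (PE, output port) to (PE, input port)."""
    job = ET.fromstring(xml_text)
    edges = []
    for pe in job.findall("pe"):
        for port in pe.findall("outputPort"):
            for conn in port.findall("connection"):
                edges.append({
                    "from": (pe.get("id"), int(port.get("index"))),
                    "to": (conn.get("inputPeId"), int(conn.get("inputPortIndex"))),
                    "state": conn.get("state"),
                })
    return edges


edges = connection_edges(SAMPLE)
for edge in edges:
    print(edge)
```
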
Metrics data
Field | Select type | Description |
---|---|---|
name | metrics | The name of the metric. |
lastChangeObserved | metrics | The last time the metric was changed by the domain controller service. The time is represented in seconds since the epoch. |
userDefined | metrics | A value of true indicates that the metric was defined by an operator. If false, the metric is managed by Teracloud® Streams. |
<metricValue> | metrics | The value of the metric. Each metric value contains the following information: |
Metric name | Description |
---|---|
Parent element: <host> | |
cpuSpeed | The speed of the CPU, represented by the BogoMIPS value that the Linux™ kernel computes, when Teracloud® Streams runs on bare metal or on virtual systems. On Kubernetes, all of the system CPUs are visible, but a pod can use only the number of CPUs that it requested. For example, if a system has 56 CPUs and the pod has only two, the calculation is still based on 56 CPUs, so the displayed value overstates the capacity of the resource. The effective value in this example is 2/56 (the number of CPUs for the pod divided by the number of system CPUs) of the displayed value. |
cpuUtilization | The CPU utilization for the past minute. |
loadAverage | The load average for the past minute. This average is the system-reported load average multiplied by 100. |
memoryTotal | The total memory (KB). |
memoryUtilization | The memory utilization (KB). |
networkSpeed | The network speed (MB/second). |
networkUtilization | The network utilization for the past minute. |
nProcessors | The number of processors for the resource. |
Parent element: <pe> | |
nCpuMilliseconds | CPU time that was used by the PE, in milliseconds (user and kernel). |
nMemoryConsumption | Memory consumption (resident, text, and data) that was used by the PE (KB). |
nResidentMemoryConsumption | Resident memory consumption that was used by the PE (KB). |
Parent element: <pe>/<inputPort> | |
nFinalPunctsProcessed | The number of final punctuations that were processed. |
nTupleBytesProcessed | The number of tuple bytes that were processed. |
nTuplesProcessed | The number of tuples that were processed. |
nWindowPunctsProcessed | The number of window punctuations that were processed. |
Parent element: <pe>/<outputPort> | |
nBrokenConnections | A count of previously established connections that were later detected as being broken. This value is an incrementing counter over the lifetime of a PE's current process. |
nConnections | The number of connections to this output port. |
nFinalPunctsSubmitted | The number of final punctuations that were submitted. |
nOptionalConnecting | The current number of optional connections that are not connected and are in the process of connecting to their receiver. |
nRequiredConnecting | The current number of required connections that are not connected and are in the process of connecting to their receiver. |
nTupleBytesSubmitted | The number of bytes that were submitted. |
nTupleBytesTransmitted | The total number of bytes that were transmitted to all connected PEs. |
nTuplesSubmitted | The number of tuples that were submitted. |
nTuplesTransmitted | The total number of tuples that were transmitted to all connected PEs. |
nWindowPunctsSubmitted | The number of window punctuations that were submitted. |
Parent element: <pe>/<outputPort>/<connection> | |
congestionFactor | An integer value (0 - 100) representing the relative congestion for the connection. A value of 0 indicates no congestion, and 100 indicates that the connection is extremely congested. |
nTuplesFilteredOut | The number of tuples that were not sent because they failed to meet the filter criteria. |
Parent element: <pe>/<operator> | |
relativeOperatorCost | An integer value (1 - 100) representing the relative computational cost of the operator compared to other operators within the same PE. A value of 1 indicates negligible cost compared to the other operators. A value of 100 indicates that almost all of the observed processing time is taken by this operator. |
Parent element: <operator>/<inputPort> | |
maxItemsQueued | The largest number of items queued for a threaded port, or 0 if the port is not threaded. |
nEnqueueWaits | The number of waits due to a full queue for a threaded port, or 0 if the port is not threaded. |
nFinalPunctsProcessed | The number of final punctuations that were processed. |
nFinalPunctsQueued | The number of final punctuations that are queued. |
nTuplesDropped | The number of tuples that were dropped. |
nTuplesProcessed | The number of tuples that were processed. |
nTuplesQueued | The number of tuples that are queued. |
nWindowPunctsProcessed | The number of window punctuations that were processed. |
nWindowPunctsQueued | The number of window punctuations that are queued. |
queueSize | The size of the queue for a threaded port, or 0 if the port is not threaded. |
recentMaxItemsQueued | The recent largest number of items queued for a threaded port, where the number reported is the largest number in the current (not yet complete) and previous (completed) interval (given by recentMaxItemsQueuedInterval), or 0 if the port is not threaded. |
recentMaxItemsQueuedInterval | The interval in milliseconds used to determine the recent largest number of items queued for a threaded port, or 0 if the port is not threaded. Currently each interval is 5 minutes in duration. |
Parent element: <operator>/<outputPort> | |
nFinalPunctsSubmitted | The number of final punctuations that were submitted. |
nTuplesSubmitted | The number of tuples that were submitted. |
nWindowPunctsSubmitted | The number of window punctuations that were submitted. |
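
Several metrics in the table need a small conversion before they are easy to interpret. The following sketch shows three of them: loadAverage is the system load average multiplied by 100, cpuSpeed on Kubernetes can be scaled by the pod's share of the system CPUs (2 of 56 in the example from the table), and recentMaxItemsQueuedInterval is reported in milliseconds. The function names and the input value of 5600.0 are illustrative only.

```python
def system_load_average(load_average_metric):
    """Undo the x100 scaling that is applied to the loadAverage metric."""
    return load_average_metric / 100


def effective_cpu_speed(displayed_cpu_speed, pod_cpus, system_cpus):
    """Scale the displayed cpuSpeed by the fraction of system CPUs the pod can use."""
    return displayed_cpu_speed * pod_cpus / system_cpus


def interval_minutes(interval_ms):
    """Convert recentMaxItemsQueuedInterval from milliseconds to minutes."""
    return interval_ms / 60000


# A loadAverage metric of 150 corresponds to a system load average of 1.5.
print(system_load_average(150))

# The table's example: a pod with 2 of the system's 56 CPUs.
# The displayed value 5600.0 is an invented input.
print(effective_cpu_speed(5600.0, pod_cpus=2, system_cpus=56))

# A 5-minute interval is 300000 milliseconds.
print(interval_minutes(300000))
```
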