Teracloud® Streams glossary

Use this glossary to find terms and definitions for Teracloud® Streams.

The following cross-references are used in this glossary:
  • See refers you from a term to a preferred synonym, or from an acronym or abbreviation to the defined full form.
  • See also refers you to a related or contrasting term.
@ A B C D E F G H I J L M N O P R S T V W Z

@

@autonomous
An SPL annotation that defines operators and streams as part of an autonomous region. The @autonomous annotation is associated with one or more operators that define the start of an autonomous region. By default, operators are implicitly defined in an autonomous region.
@catch
An SPL annotation that specifies which types of exceptions are caught when an operator throws them while processing tuples.
@consistent
An SPL annotation that defines operators and streams as part of a consistent region. The @consistent annotation is associated with one or more operators that define the start of a consistent region. Consistent regions can merge if their subgraphs intersect.
@eventTime
An SPL annotation that defines that all the output ports of operators where the annotation is in effect emit a stream with event-time values and watermarks.
@parallel
An SPL annotation that enables parallel processing of streaming data in specific regions of your Teracloud® Streams application.
@view
An SPL annotation that defines that a view is to be created when the streams application is run.

A

accelerator
A set of software components that help developers implement big data solutions and that help data scientists iterate and refine data analyses that produce more meaningful results.
access control list (ACL)
Teracloud® Streams uses access control lists to enforce security. An ACL is composed of the type of domain or instance object to secure and the actions that a group, user, or role is authorized to perform against the object.
adapter
An intermediary software component that allows two other software components to communicate with one another.
analytics
The science of studying data in order to find meaningful patterns in the data and draw conclusions based on those patterns.
annotation
A modification to the invocation of an operator that alters the behavior of stream applications. Annotation names are prefixed with the at symbol (@).
application
One or more computer programs or software components that provide a function in direct support of a specific business process or processes. See stream application.
application bundle file
A file that contains all toolkit artifacts that are needed to run a stream application. You submit the application bundle file (.sab) to the Teracloud® Streams instance. See also stream application.
application configuration object
An object that is used to store information that your Teracloud® Streams environment or applications might need, for example to connect to a device management service.
application description language file (ADL file)
A configuration file that is created when a stream application is compiled. See also application bundle file and stream application.
application graph
See data flow graph.
application set project
A project where you group a set of related SPL projects together and build, graph, and deploy the related projects as a single unit.
application scope
An attribute of a stream application that limits which streams can connect to each other by using import and export operators. An application can only connect to another application if they share an application scope.
attribute
A named data value in a tuple. Each attribute has a specific data type.
autonomous region
A region where operators are either implicitly or explicitly annotated with the @autonomous annotation. The SPL compiler calculates the extent of downstream operators that are contained in the region. During run time, operators in the region process tuples as they arrive. By default, operators are in an autonomous region.

B

basic domain
A basic domain has a single resource and user. It uses Apache ZooKeeper for managing and storing configuration information. See also domain and enterprise domain.
bootstrap properties
The bootstrap properties contain information necessary to start components of Teracloud® Streams, such as the embedded ZooKeeper.
build configuration
The metadata that describes how an application is built, including compiler parameters and dependencies. See also launch configuration.

C

cluster
A logical grouping of resources in a network. See also resource.
co-location group
A collection of operators in an application that are grouped together in a processing element (partition co-location) or that are grouped together on a host.
code generation template
The mixed-mode source file that is used by a generic operator to generate specific customizations. See also generic operator.
collection
A kind of data type. The three kinds of collections are list, set, and map.
collection resource
A resource that provides access to information about a set of artifacts of the same type, such as jobs, operators, or processing elements.
composite operator
An operator that is implemented in the Streams Processing Language (SPL) that encapsulates a subgraph of a data flow graph that can be parameterized to make it reusable in multiple stream applications. See also operator, data flow graph, subgraph, and main composite operator.
config
A directive that describes how the compiler builds an operator invocation or how the runtime system executes the operator. See also operator and operator invocation.
consistent region
A region that is defined by the @consistent SPL annotation on a starting operator. The SPL compiler calculates the extent of downstream operators that are contained in the region. During run time, operators in the region process all tuples at least once.
consistent state
A point in time where all tuples for all streams in a consistent region have been fully processed by the operators in the consistent region.
Custom operator
An operator written in the Streams Processing Language (SPL). Contrast with primitive operator. See also operator.

D

data flow
The transfer of data between constants, variables, and files by running statements, procedures, modules, or programs.
data flow graph
A representation of the set of operators and the streams that connect them within a stream application. See also operator and stream.
data mining
The process of collecting critical business information from a data warehouse, correlating the information and uncovering associations, patterns, and trends.
data parallelism
A situation in which parallel tasks perform the same computation on different sets of data.
data source
The source of data itself, such as a database or XML file, and the connection information necessary for accessing the data.
default initialization
Representation of an initial value for a variable or attribute that has an associated data value. This representation is based on the type of the attribute or variable. For example, 0 for an integer, or an empty string for an rstring. The default value for an optional type is null.
deploy
To place files into an operational environment. For example, when you submit an application bundle file, the Teracloud® Streams instance automatically deploys the file to the resource where the job runs.
deserialize
To reconstruct an object from serialized data.
distributed execution environment
The Teracloud® Streams instance runtime system that executes a stream application. See also stand-alone execution environment.
distributed file system
A file system that is composed of files or directories that physically exist on more than one computer in a communications network.
domain
A Teracloud® Streams domain is a logical grouping of resources in a network for common management and administration. To use Teracloud® Streams, you must create at least one domain. See also basic domain and enterprise domain.
domain cache
The domain cache contains a mapping between domains and their ZooKeeper connection information. The domain cache is used when you run streamtool commands. There is one domain cache per user, and it is stored in the .streams/var directory.
domain controller
A domain controller service runs on every resource in the domain and manages all of the other services on that resource.
domain host installation package
A subset of the Teracloud® Streams installation package, which you can use to install Teracloud® Streams with a smaller footprint on hosts in a domain. At least one host must install the larger Teracloud® Streams product image. See also installation package.

E

edge device
A functional unit such as a router or gateway that is deployed at the border of an administrative domain. An edge device controls traffic through one point only.
engine
An engine is a program that provides essential runtime services, such as resource management, scheduling, and communication, to support and coordinate the execution of stream applications.
enterprise domain
An enterprise domain can have multiple resources and users. This type of domain is typically used for production environments. You can configure high availability to ensure that Teracloud® Streams can continue to run even if resources fail or are not available. See also domain and basic domain.
embedded ZooKeeper
An embedded version of Apache ZooKeeper is installed with the product and managed by Teracloud® Streams. You can use this embedded ZooKeeper for managing and storing configuration information in basic domains.
event time
Event time is a simple model which supports streams processing where time is not derived from the system time of the machine Teracloud® Streams is running on, but from a time value associated with each tuple.
ex-location group
A collection of operators that must not be grouped together into processing elements (partition ex-location) or on the same host (host ex-location).
exported stream
A sequence of tuples that is output from an operator, making it available for other operators and stream applications to import. The stream can only be imported by applications that are running in the same streaming middleware instance. See also stream.
expression mode
In a composite operator parameter declaration, the type of an accepted parameter value. The expression mode can be an attribute, an expression, or a constant.
external ZooKeeper
An external Apache ZooKeeper server is required for an enterprise domain. The ZooKeeper server or collection of servers (also known as an ensemble) must be installed and configured before you create the enterprise domain. See also enterprise domain and embedded ZooKeeper.

F

failover
An automatic operation that switches to a standby service in the event of a software, hardware, or network interruption.
feed
A data format that contains periodically updated content that is available to multiple users, applications, or both.
fuse
To combine multiple operator invocations in a data flow graph into the same partition and thus into the same processing element.
fusion scheme
The specification of how the operators in an application are fused into processing elements.

G

generic operator
A primitive operator that contains mixed-mode code, both C++ code and Perl code. The Perl code generates C++ code that augments the other C++ code and provides specific customization for that operator invocation. See also non-generic operator and primitive operator.

H

high availability
The process of monitoring resources and applications for errors and failing over to standby services in order to maintain availability of those resources for consumers. This concept is sometimes also known as continuous operations.
host
A host is a computer that Teracloud® Streams uses as a resource for running domain services, instance services, and stream applications. See also Teracloud® Streams resource.

I

Teracloud® Streams Console
A user interface that you can use to monitor and manage the resources, instances, and applications in a domain.
imported stream
A sequence of tuples that is imported by an operator. Imported streams are matched to exported streams. The match can be done by subscription (also known as properties) or by a stream name. The stream can only be imported from operators or applications that are running in the same streaming middleware instance. See also stream.
Teracloud® Streams instance
Each Teracloud® Streams instance operates as an autonomous unit and can be shared by multiple users. You can create multiple instances within a domain.
Teracloud® Streams resource
A resource that is allocated by the Teracloud® Streams resource manager. Teracloud® Streams resources are always hosts. See also resource.
installation package
An installable unit of a software product. Software product packages are separately installable units that can operate independently from other packages of that software product. See also domain host installation package.
instance
A specific occurrence of an object that belongs to a class.
instance graph
A graphical view of the stream applications that are running on a Teracloud® Streams instance. Color schemes communicate the health of the processing elements and streams as well as other metrics, such as data flow rates or the number of processed tuples.

J

job
An instance of a running stream application as defined in the application bundle file. See also stream application.
job group
A group of jobs that have the same authority or permissions.
job scheduler
A component that determines how operators are fused and where processing elements (PEs) are placed, and that deploys the PEs accordingly.

L

late tuples
Tuples whose event times fall within the interval of a window pane but that arrive after the pane has triggered.
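The lateness condition can be illustrated with a minimal Python sketch. It is a conceptual model only, not Teracloud® Streams code: the `Pane` class, the `on_watermark` and `on_tuple` functions, and the rule that a pane triggers once a watermark reaches its end time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pane:
    """Hypothetical window pane covering event times in [start, end)."""
    start: int
    end: int
    triggered: bool = False
    tuples: list = field(default_factory=list)


def on_watermark(pane, watermark):
    # Assumption: the pane triggers once the watermark reaches its end time.
    if watermark >= pane.end:
        pane.triggered = True


def on_tuple(pane, event_time):
    """Classify an arriving tuple as 'accepted', 'late', or 'outside'."""
    if not (pane.start <= event_time < pane.end):
        return "outside"  # event time does not belong to this pane
    if pane.triggered:
        return "late"  # belongs to the pane, but the pane already triggered
    pane.tuples.append(event_time)
    return "accepted"
```

In this model a tuple is late only when both conditions hold: its event time lies inside the pane interval, and the pane triggered before the tuple arrived.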
launch configuration
The metadata that describes how a stream application is launched, including a reference to the instance, the ADL file, and runtime parameters. See also build configuration.
logical application
A compiled stream application. See also physical application.

M

main composite operator
A composite operator that encapsulates the data flow graph, that is the root of that graph, that has no input or output ports, and that when compiled represents a stream application. See also composite operator and data flow graph.
mixed-mode application
An application that includes mixed-mode code, both Perl code and SPL code. The Perl code augments the existing SPL code. See also stream application.
mutability
The capability of modifying tuples on a port. Both input ports and output ports can be defined as mutating or non-mutating.

N

namespace
A logical container in which all the names are unique. The unique identifier for an artifact is composed of the namespace and the local name of the artifact.
native function
A function written in C++ or Java code and that can be invoked from SPL code.
node
A computer that is part of a clustered system. See host.
non-generic operator
A primitive operator implemented entirely in C++ code. See also generic operator and primitive operator.
null
A value that is used when there is no data value for an artifact (attribute or variable).

O

operator
A program that processes tuples in an incoming stream and produces an output stream as a result. An operator can have any number of input ports and any number of output ports. See also tuple, composite operator and primitive operator.
operator instance
See operator invocation.
operator invocation
An instance of an operator that was defined for a specific context. See also operator.
operator model
An XML document that describes the basic syntactic and semantic properties of a primitive operator. See also primitive operator.
optional type
A variable or attribute that might have no value associated with it because the value is unavailable or unknown.

P

parallelism
The state of a computer program in which parts of the program can be concurrently executed.
parallel region
A region that is defined by the @parallel SPL annotation on a primitive or composite operator. A parallel region allows for parallel processing of streaming data. For each channel in a parallel region, Teracloud® Streams replicates the invoked operator.
parallel transformation
The application of transformation rules for creating a physical application from a logical application in order to enable data parallelism in a stream application. See also logical application and physical application.
partition
(1) A set of operator invocations that are fused together into a processing element. See also operator invocation, fuse, and processing element.
(2) In a window, a logical set of tuples based on an expression. See also window.
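The second sense of partition, grouping the tuples in a window by an expression, can be sketched in Python as follows. This is an illustration only: the function name `partition_window` and the use of dictionaries as tuples are assumptions, and the sketch does not reflect how Teracloud® Streams stores windows internally.

```python
from collections import defaultdict


def partition_window(tuples, key_expression):
    """Group the tuples of a window by the value of a partitioning expression."""
    partitions = defaultdict(list)
    for tup in tuples:
        # Tuples that yield the same expression value share a partition.
        partitions[key_expression(tup)].append(tup)
    return dict(partitions)
```

For example, calling `partition_window(readings, lambda t: t["sensor"])` on a list of sensor readings would keep a separate logical set of tuples per sensor.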
permission
The ability to perform an action against a Teracloud® Streams object, such as a domain, instance, host, or job. Permissions are assigned by using an access control list.
physical application
An application that was submitted to the runtime system. See also logical application.
PE
See processing element.
placement scheme
The specification of how the PEs in an application are distributed among the hosts in the instance.
port
The point of connection of an operator to a stream. Input ports consume one or more streams, whereas output ports produce a stream.
primitive operator
An operator that is implemented in the C++ or Java language and that includes an operator model that describes the syntax and semantics of the operator.
process
An instance of a program running on a system and the resources that it uses.
processing element (PE)
An operating system process that includes the operators and streams that are defined in a data flow graph or subgraph of a stream application.
program
A sequence of instructions that a computer can interpret and run without a user's intervention.
punctuation
A control signal within a stream that either creates boundaries within a stream of tuples (window punctuation marks) or identifies the end of a stream (final punctuation marks). See also window.
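A small Python sketch may help show how the two kinds of punctuation act on a stream. The marker objects `WINDOW_MARK` and `FINAL_MARK` and the function `group_by_punctuation` are hypothetical names introduced for this illustration; they are not part of any Teracloud® Streams API.

```python
WINDOW_MARK = object()  # hypothetical stand-in for a window punctuation mark
FINAL_MARK = object()   # hypothetical stand-in for a final punctuation mark


def group_by_punctuation(stream):
    """Group tuples into batches bounded by window punctuation marks."""
    batches, current = [], []
    for item in stream:
        if item is FINAL_MARK:
            break  # end of stream: nothing after this mark is processed
        if item is WINDOW_MARK:
            batches.append(current)  # a window mark closes the current batch
            current = []
        else:
            current.append(item)
    if current:
        batches.append(current)  # tuples seen after the last window mark
    return batches
```

In this sketch, window marks only create boundaries between batches, while the final mark ends processing altogether.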

R

reachability graph
The set of all other operators that an operator reaches through its outgoing stream connections, that is, through the connections of all its output ports.
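As a rough illustration, reachability can be computed with a breadth-first traversal of the data flow graph. The sketch below is in Python and assumes a hypothetical adjacency mapping from each operator name to the operators connected to its output ports; it does not describe how the SPL compiler performs this calculation.

```python
from collections import deque


def reachable_operators(graph, start):
    """Return every other operator reachable from start via output connections."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        op = queue.popleft()
        if op == start or op in seen:
            continue  # skip the starting operator and already visited operators
        seen.add(op)
        queue.extend(graph.get(op, ()))  # follow this operator's output streams
    return seen
```

An operator with no outgoing connections, such as a sink, yields an empty set.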
region
A subgraph of an SPL application where the operators and streams are related. Typically, operators are related by an annotation.
resource
A physical or logical entity that Teracloud® Streams uses to run services. See also Teracloud® Streams resource.
role
A collection of permissions that can be assigned to a user or group of users.

S

sink operator
An operator that sends information as a stream to an external system, such as a dashboard, web server, mail server, or a database.
source operator
An operator that fetches information from an external system, such as a sensor, messaging system, or a database, and presents that information as a stream.
splitter
A runtime component that exists on the output port of an operator and that sends tuples to different channels in the parallel region.
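The role of a splitter can be pictured with a short Python sketch. The round-robin routing used here is an assumption chosen for simplicity, since this glossary does not state how the runtime decides which channel receives a tuple, and `split_round_robin` is a hypothetical name.

```python
def split_round_robin(tuples, channel_count):
    """Distribute tuples across channels in round-robin order (assumed policy)."""
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    channels = [[] for _ in range(channel_count)]
    for index, tup in enumerate(tuples):
        # Each tuple goes to exactly one channel of the parallel region.
        channels[index % channel_count].append(tup)
    return channels
```

Each inner list stands for the tuples that one replicated operator in the parallel region would receive.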
stand-alone execution environment
The platform that runs a stream application locally as an executable and does not require the Teracloud® Streams runtime system. See also distributed execution environment.
standby
An idle service that is available to replace another service that is currently in use. See also high availability.
stateful
Of or pertaining to a system or process that tracks the state of interaction. Stateful means the computer or program tracks the state of interaction, usually by setting values in a storage field that is designated for that purpose.
stateless
Having no record of previous interactions. A stateless server processes requests based solely on information that is provided with the request itself, and not based on memory from earlier requests.
stream
A sequence of tuples. See also tuple.
stream application
An application that consists of a main composite operator with at least one primitive operator and possibly one or more composite or primitive operators, all of which process streams of data. See also main composite operator, composite operator, and primitive operator.
Streams Processing Language (SPL)
A programming language that is used to create stream applications. See also stream application.
subgraph
A data flow graph for a composite operator that is reused by stream applications. See also data flow graph.

T

tag
An identifier that is associated with one or more resources and helps group resources that have different physical characteristics or logical uses. Resources can have any number of tags.
toolkit
A collection of artifacts that are organized into a package. A toolkit includes one or more namespaces, which contain the functions, operators, and types that are packaged as part of the toolkit, all of which can then be reused in other applications.
trigger
A mechanism that detects an occurrence and can cause processing in response.
tuple
An individual piece of data in a stream that is represented as a set of attributes and data values. Typically, the data values in a tuple represent a single observation of data, such as a stock ticker quote or a temperature reading from an individual sensor.

V

view
The metadata that describes how the runtime system samples the tuples in a stream for visualization.

W

watermarks
Watermarks flow in a data stream and carry a time value. They provide a metric of event-time progress in the stream.
window
A logical container for a defined set of tuples that were received by an input port of an operator and that are typically maintained in memory.

Z

ZooKeeper connection string
One or more host and port pairs that identify the ZooKeeper servers for use with Teracloud® Streams.
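As an illustration, the Python sketch below splits such a string into host and port pairs. It assumes the usual Apache ZooKeeper form of comma-separated `host:port` entries; the function name `parse_zk_connection_string` and the example host names are hypothetical, and optional suffixes such as a chroot path are not handled.

```python
def parse_zk_connection_string(value):
    """Split 'host1:port1,host2:port2' into a list of (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        # Split on the last colon so the port is always the final component.
        host, _, port = entry.strip().rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        pairs.append((host, int(port)))
    return pairs
```

For example, `parse_zk_connection_string("zk1.example.com:2181,zk2.example.com:2181")` returns one pair per server, with each port converted to an integer.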