Teracloud® Streams features and architecture
Teracloud® Streams consists of a programming language, an API, and a runtime system that can run applications on a single resource or on a distributed set of resources. Its main features are:
- Parallel, high-performance stream processing software platform that can scale over a range of hardware environments
- Automated deployment of stream applications on configured hardware
- Incremental deployment to extend stream applications without restarting them
- Secure and auditable runtime environment
The Teracloud® Streams architecture represents a significant change in computing system organization and capability. Teracloud® Streams provides a runtime platform, programming model, and tools for applications that must process continuous data streams. The need for such applications arises in environments where information from one or more data streams can be used to alert humans or other systems, or to populate knowledge bases for later queries. Such environments typically have the following characteristics:
- The environment must process many data streams at high rates.
- Complex processing of the data streams is required.
- Low latency is needed when processing the data streams.
Teracloud® Streams offers the Streams Processing Language (SPL) interface for users to operate on data streams. SPL provides a language and runtime framework to support stream applications. Users can create applications without needing to understand the lower-level stream-specific operations. SPL provides numerous operators, the ability to import data from outside Teracloud® Streams and export results outside the system, and a facility to extend the underlying system with user-defined operators. Many of the SPL built-in operators provide powerful relational functions such as Join and Aggregate.
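To give a feel for what a windowed relational operator such as Aggregate does conceptually, the following is a minimal sketch in plain Python. It is not SPL and does not use any Teracloud® Streams API; the function name `sliding_average` and the tuple fields `id` and `value` are hypothetical, chosen only for illustration.

```python
from collections import deque
from typing import Dict, Iterable, Iterator


def sliding_average(readings: Iterable[Dict], size: int) -> Iterator[Dict]:
    """Emit the average of the last `size` values after each incoming tuple."""
    window = deque(maxlen=size)  # the oldest value is evicted automatically
    for reading in readings:
        window.append(reading["value"])
        yield {"id": reading["id"], "avg": sum(window) / len(window)}


# A finite list stands in for a continuous stream (illustrative data only).
readings = [{"id": i, "value": v} for i, v in enumerate([10.0, 20.0, 30.0, 40.0])]
averages = list(sliding_average(readings, size=2))
for result in averages:
    print(result)
```

Because the function is a generator, each result is produced as soon as its input tuple arrives, which mirrors the tuple-at-a-time style of stream processing rather than a batch query over stored data.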
Users can also develop stream applications in other supported languages, such as C++, Python, or Java™. Toolkits like the Java™ Application API (Topology Toolkit) enhance the creation of streaming applications for Teracloud® Streams in some of these programming languages.
Deploying stream applications results in the creation of a dataflow graph, which runs across the distributed runtime environment. As new workloads are submitted, Teracloud® Streams determines where best to deploy the operators to meet the resource requirements of both newly submitted and already running applications. Teracloud® Streams continuously monitors the state and utilization of its computing resources. While stream applications are running, they can be monitored dynamically across a distributed collection of resources by using the Streams Console and streamtool commands.
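The idea of mapping a dataflow graph onto a set of resources can be illustrated with a small sketch. This is not how the Teracloud® Streams scheduler works internally; it is a simple greedy heuristic in plain Python, and the operator names, host names, and numeric demands are all hypothetical.

```python
from typing import Dict, List, Tuple


def place_operators(demands: Dict[str, float],
                    capacity: Dict[str, float]) -> Dict[str, str]:
    """Greedily assign each operator to the host with the most free capacity."""
    free = dict(capacity)
    placement: Dict[str, str] = {}
    # Place the most demanding operators first so they are not squeezed out.
    for op, need in sorted(demands.items(), key=lambda kv: -kv[1]):
        host = max(free, key=free.get)
        if free[host] < need:
            raise RuntimeError(f"no host can fit operator {op!r}")
        free[host] -= need
        placement[op] = host
    return placement


# Hypothetical dataflow graph: edges are streams between operators.
edges: List[Tuple[str, str]] = [
    ("Source", "Filter"), ("Filter", "Aggregate"), ("Aggregate", "Sink"),
]
demands = {"Source": 1.0, "Filter": 2.0, "Aggregate": 3.0, "Sink": 1.0}
capacity = {"hostA": 4.0, "hostB": 4.0}

placement = place_operators(demands, capacity)
# Streams whose endpoints sit on different hosts must cross the network.
cross_host = [(a, b) for a, b in edges if placement[a] != placement[b]]
print(placement)
print(cross_host)
```

A real placement decision would also weigh factors such as network cost between hosts and the load of jobs that are already running; the sketch only balances a single capacity number.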
Results from running applications can be made available to applications outside Teracloud® Streams by using Sink operators or edge adapters. For example, an application might use a TCPSink operator to send its results to an external application that visualizes the results on a map. Alternatively, it might alert an administrator to unusual or interesting events. Teracloud® Streams also provides many edge adapters that can connect to external data sources for consuming or storing data.
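The TCPSink example above amounts to writing result tuples to a TCP connection that an external application reads. The following plain-Python sketch shows that pattern with the standard `socket` module. It does not use the TCPSink operator or any Teracloud® Streams API; the newline-delimited JSON format, the function name `send_results`, and the sample tuple are assumptions made for illustration.

```python
import json
import socket
import threading


def send_results(host, port, results):
    """Write each result tuple as one JSON line to a TCP endpoint."""
    with socket.create_connection((host, port)) as conn:
        for result in results:
            conn.sendall((json.dumps(result) + "\n").encode("utf-8"))


# Stand-in for the external application: a local listener that collects lines.
received = []
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]


def consume():
    conn, _ = server.accept()
    with conn, conn.makefile("r", encoding="utf-8") as lines:
        for line in lines:  # ends when the sender closes the connection
            received.append(json.loads(line))


listener = threading.Thread(target=consume)
listener.start()
send_results("127.0.0.1", port, [{"lat": 40.7, "lon": -74.0, "alert": True}])
listener.join()
server.close()
print(received)
```

In this sketch the receiving side could be a map visualization or an alerting service; the sender only needs to know the host, the port, and the agreed line format.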