Developing stream applications with user-defined parallelism
The @parallel annotation allows you to take advantage of data parallelism in your Teracloud® Streams applications. With user-defined parallelism, you can:
- Define parallel regions in your application. Each region replicates all of its operators and automatically creates and connects new streams to the replicated operators.
- Specify the amount of parallelism (how many replicas to generate, also called the parallel width) at compile or submission time.
- Optionally partition the parallel region based on tuple attribute values. The SPL runtime will ensure that tuples with the same attribute values are sent to the same operators.
- Optionally broadcast tuples from selected streams to all operators in the parallel region.
- Preserve the tuple or punctuation order in a parallel region.
- Maintain state consistency across the replicated operators.
To use user-defined parallelism, add the @parallel annotation to the invocation of either a primitive or a composite operator. When applied to a primitive operator, the parallel region consists of just that operator. When applied to a composite operator, the parallel region consists of all operators inside that composite operator. Parallelized composite operators can themselves contain parallel regions, resulting in nested parallel regions. Parallel regions are composed of parallel channels, where each channel is an independent set of replicated operators. Parallel channels can contain multiple operators, with multiple input and output streams. A parallel region has the same name as the logical operator (either primitive or composite) to which the @parallel annotation is applied.
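For illustration, here is a minimal sketch of both forms. The stream names, file names, and the Score composite are hypothetical, not part of the product samples:

```
composite Score(input In; output Out) {
    graph
        stream<rstring word, float64 score> Out = Functor(In) {
            output Out : score = (float64)length(In.word);
        }
}

composite Main {
    graph
        stream<rstring line> Raw = FileSource() {
            param
                file   : "input.txt";
                format : line;
        }

        // Parallel region named "Words": just this primitive operator,
        // replicated across three channels.
        @parallel(width = 3)
        stream<rstring word> Words = Functor(Raw) {
            output Words : word = lower(Raw.line);
        }

        // Parallel region named "Scored": every operator inside the
        // Score composite is replicated across two channels.
        @parallel(width = 2)
        stream<rstring word, float64 score> Scored = Score(Words) {
        }

        () as Sink = FileSink(Scored) {
            param file : "scores.txt";
        }
}
```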
The number of channels in a parallel region is determined by the width parameter of the @parallel annotation.
The process of replicating the operators and all of the necessary streams inside a parallel region is called the parallel transformation. Because the parallel transformation is performed at submission time, you can specify the width as a submission-time value. Delaying the decision of how much parallelism to use until submission time allows you to compile an application once and change its level of parallelism on each resubmission without recompiling.
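For example, the width can be read from a named submission-time value with the SPL getSubmissionTimeValue function; the name "parallelWidth" and the default "2" below are arbitrary choices for this sketch:

```
// Width is resolved when the job is submitted, so the application can be
// compiled once and resubmitted with a different degree of parallelism.
@parallel(width = (int32)getSubmissionTimeValue("parallelWidth", "2"))
stream<rstring word> Words = Functor(Raw) {
    output Words : word = lower(Raw.line);
}
```

If your deployment provides the streamtool command, the value can then be supplied at submission, for example with streamtool submitjob -P parallelWidth=8.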
Operators that produce streams that flow into a parallel region must distribute those tuples across the channels of the region. This distribution is done by a splitter, the runtime component that splits a stream of tuples and distributes them to the different channels of a parallel region. Splitters exist on the output ports that are immediately upstream from the parallel region. Splitters can also be nested and contain other splitters; a splitter submits tuples and punctuation to all of its nested splitters.

If you want to ensure that tuples with specific attribute values are always routed to the same channel, you can partition the parallel region. The partitions are defined by a set of user-defined partition keys that are composed of tuple attributes.
- If the incoming stream is not set to broadcast, each tuple is routed to a single channel only, regardless of whether the parallel region is partitioned:
  - When the parallel region is not partitioned, the splitter routes tuples so that they are distributed evenly across the channels.
  - When the parallel region is partitioned, the splitter creates a hash from the values of the partition-key attributes of each tuple and uses it to determine which channel receives the tuple (see the sketch after this list).
- For streams that are set to broadcast, all tuples are routed to all channels.
- Each window punctuation and final punctuation is routed to all channels. Because punctuations are logical markers in a stream, all the channels require all the punctuation.
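As a sketch of both routing options, the annotation below partitions a hypothetical Txns stream on its userId attribute and broadcasts a hypothetical Rules stream to every channel; all names here are illustrative, and the upstream Txns and Rules streams are assumed to exist:

```
// Tuples from Txns with equal userId values always reach the same channel;
// every tuple from Rules is delivered to all four channels.
@parallel(width = 4,
          partitionBy = [{port = Txns, attributes = [userId]}],
          broadcast = [Rules])
stream<int64 userId, float64 amount> Flagged = Custom(Txns; Rules) {
    logic
        onTuple Txns : submit(Txns, Flagged);
        onTuple Rules : {
            // Update per-channel rule state here.
        }
}
```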
Parallel regions can contain source operators, sink operators, and Import and Export operators. Source and sink operators are replicated in the parallel region in the same way as other operators. However, because source operators have no incoming data streams, no splitter is involved. To achieve data parallelism, such parallel regions must invoke the source operators in a way that partitions the data before it enters the stream application.
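For example, a replicated source operator can use the SPL getChannel function to select a per-channel slice of the input. In this sketch, each channel reads its own pre-partitioned file; the file naming scheme is hypothetical:

```
// Each of the four FileSource replicas reads a different input file,
// so the data is partitioned without a splitter.
@parallel(width = 4)
stream<rstring line> Lines = FileSource() {
    param
        file   : "input_" + (rstring)getChannel() + ".txt";
        format : line;
}
```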
The actual mechanisms used at run time to achieve parallel execution depend on both how the application was fused and the threading model for the PEs in the application. When parallel operators are in separate PEs, they naturally execute in parallel. How parallel operators are fused depends on the fusion scheme and their partitionColocation, partitionExlocation, and partitionIsolation placement configuration options. When a PE is under the manual threading model, and the splitter is in the same PE as operators in the parallel region that it sends tuples to, the runtime injects threaded ports to ensure parallel execution. When a PE is under the dynamic threading model, the runtime does not inject threaded ports, but instead allows the parallelized operators to be executed by the dynamic thread pool. For more details about fusion schemes, see Specifying how operators are fused when you submit a job. For more on threading models, see the SPL threading annotation.
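As one illustration, a placement config can be combined with @parallel to keep the replicated operators out of a shared PE; the stream names here are hypothetical:

```
@parallel(width = 4)
stream<rstring word> Cleaned = Functor(Words) {
    output Cleaned : word = lower(Words.word);
    // Place each replicated operator in its own partition (PE),
    // so the channels run as separate processes.
    config placement : partitionIsolation;
}
```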
The parallel transformation connects all the outgoing physical streams from the end of each channel to the same input port that the logical stream originally connected to, outside of the parallel region. Because no merge operation is performed on the tuples flowing out of a parallel region, tuple order is not preserved. For example, if tuples A, B, C, and D arrive at a parallel region in that order, the operators downstream from the parallel region can receive those tuples in any order, such as D, A, C, B. In the cases where Teracloud® Streams typically preserves tuple order, the use of user-defined parallelism breaks that order. If your application requires tuple order to be preserved, you can implement your own merging operator outside of the parallel region.
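For example, one common approach is to tag each tuple with a sequence number before the parallel region and re-order tuples after it. The following sketch assumes each tuple carries an int64 seq attribute that was assigned upstream; all names are hypothetical:

```
stream<int64 seq, rstring payload> Ordered = Custom(Results) {
    logic
        state : {
            mutable map<int64, tuple<int64 seq, rstring payload>> pending = {};
            mutable int64 next = 0l;
        }
        onTuple Results : {
            // Buffer out-of-order arrivals, then release any run of
            // consecutive sequence numbers starting at `next`.
            insertM(pending, seq, Results);
            while (has(pending, next)) {
                submit(pending[next], Ordered);
                removeM(pending, next);
                next += 1l;
            }
        }
}
```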