Application directory files and structure
Learn about the different files and common directories that make up an SPL application. View the recommended structure so the SPL compiler and other Teracloud® Streams development tools work best.
An application directory is a file system directory that contains at least one main composite operator. The SPL file that defines the main composite may be located at the top level or within a subdirectory. Subdirectories that contain SPL files, native functions, or operator models are considered namespace directories.
For example, consider the following directory tree that contains two SPL files where Hello.spl contains a main composite operator and Util.spl contains application-specific types and functions:
helloworld/
├── Hello.spl
└── my.namespace/
    └── Util.spl
helloworld is an application directory and my.namespace is a namespace directory.
An application directory can also contain the following:
- An information file to provide application metadata details
- Organizational directories to organize data, configuration files, supporting libraries, and more
Application directory components
The following subsections detail the different files and subdirectories often found in the top level of application directories.
- Information file
- An information file (info.xml) is an optional, but recommended, XML file at the top level of an application directory. It defines the application name, description, version, and required Streams product version.
Here is an example of an info.xml:
<?xml version="1.0" encoding="UTF-8"?>
<toolkitInfoModel xmlns="http://www.ibm.com/xmlns/prod/streams/spl/toolkitInfo"
                  xmlns:common="http://www.ibm.com/xmlns/prod/streams/spl/common">
  <identity>
    <name>ApplicationName</name>
    <description>A description of the application</description>
    <version>1.0.0</version>
    <requiredProductVersion>7.2.0.0</requiredProductVersion>
  </identity>
</toolkitInfoModel>
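As a quick way to inspect an information file outside the Streams tools, the identity fields can be read with a general-purpose XML parser. The following is a minimal sketch using Python's standard library; the function name read_identity and the embedded sample are illustrative assumptions, and the namespace URI is copied from the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the info.xml example; the "info" prefix is
# only a local alias used for lookups in this sketch.
TOOLKIT_INFO_NS = {"info": "http://www.ibm.com/xmlns/prod/streams/spl/toolkitInfo"}

SAMPLE_INFO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<toolkitInfoModel xmlns="http://www.ibm.com/xmlns/prod/streams/spl/toolkitInfo"
                  xmlns:common="http://www.ibm.com/xmlns/prod/streams/spl/common">
  <identity>
    <name>ApplicationName</name>
    <description>A description of the application</description>
    <version>1.0.0</version>
    <requiredProductVersion>7.2.0.0</requiredProductVersion>
  </identity>
</toolkitInfoModel>
"""


def read_identity(xml_text):
    """Return the <identity> fields of an info.xml document as a dict."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    identity = root.find("info:identity", TOOLKIT_INFO_NS)
    if identity is None:
        raise ValueError("info.xml has no <identity> element")
    fields = ("name", "description", "version", "requiredProductVersion")
    return {
        field: identity.findtext(f"info:{field}", default="", namespaces=TOOLKIT_INFO_NS)
        for field in fields
    }
```

Calling read_identity(SAMPLE_INFO_XML) returns a dictionary keyed by the four identity element names.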
An info.xml can also list the toolkits that the application depends on, each with a toolkit name and either a version or a range of versions. In a version range, a bracket ([ or ]) marks an inclusive bound and a parenthesis (( or )) marks an exclusive bound.
The following example shows how to specify dependencies at a specific version and at a range of versions. The 1.0.1 version represents a dependency on the acme toolkit version 1.0.1 or later. The [5.0.0,6.0.0) version range represents a dependency on the kafka toolkit version 5.0.0 or later, but not including 6.0.0.
<?xml version="1.0" encoding="UTF-8"?>
<toolkitInfoModel xmlns="http://www.ibm.com/xmlns/prod/streams/spl/toolkitInfo"
                  xmlns:common="http://www.ibm.com/xmlns/prod/streams/spl/common">
  <identity>
    <name>AppWithDependencies</name>
    <description>This application requires several toolkits to build and run</description>
    <version>1.0.0</version>
    <requiredProductVersion>7.2.0.0</requiredProductVersion>
  </identity>
  <dependencies>
    <toolkit>
      <common:name>com.acme.toolkit</common:name>
      <common:version>1.0.1</common:version>
    </toolkit>
    <toolkit>
      <common:name>com.teracloud.streams.kafka</common:name>
      <common:version>[5.0.0,6.0.0)</common:version>
    </toolkit>
  </dependencies>
</toolkitInfoModel>
Any dependencies an application has on the version of Teracloud® Streams, toolkits, or external libraries are resolved at application compilation time.
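The version notation described above can be illustrated with a small checker. This is only a sketch of the stated semantics (a bare version means "this version or later", brackets are inclusive, parentheses are exclusive); it is not the algorithm the SPL compiler uses, and the names parse_version and satisfies are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '5.0.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Check a version against a bare version or a bracketed range.

    Assumed semantics, taken from the surrounding text:
      '1.0.1'          -> version 1.0.1 or later
      '[5.0.0,6.0.0)'  -> 5.0.0 or later, but not including 6.0.0
    """
    spec = spec.strip()
    candidate = parse_version(version)
    if spec[0] in "[(" and spec[-1] in "])":
        low_text, high_text = spec[1:-1].split(",")
        low, high = parse_version(low_text), parse_version(high_text)
        # '[' and ']' include the bound itself; '(' and ')' exclude it.
        low_ok = candidate >= low if spec[0] == "[" else candidate > low
        high_ok = candidate <= high if spec[-1] == "]" else candidate < high
        return low_ok and high_ok
    return candidate >= parse_version(spec)
```

For example, satisfies("5.3.1", "[5.0.0,6.0.0)") holds, while satisfies("6.0.0", "[5.0.0,6.0.0)") does not because the upper bound is exclusive.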
- Namespace directories
- Namespace directories are recommended for developers to group their application logic and to avoid naming collisions.
Note: Namespaces can traverse multiple directories, in which case the directory names are concatenated with the . character. For instance, com/example and com.example are both valid directory structures and result in the same namespace for SPL files or operators that are contained in the leaf directories. When creating namespace directories, follow the Namespace naming conventions for best practices.
Namespace directories may contain SPL files to define types, functions, and composite operators. If the namespace declared in an .spl file does not match the one derived from the directory structure, Streams development tools may issue an error.
For example, the Util.spl file from the first example must have the following declaration at the top of its file:
namespace my.namespace;
// Type and function definitions
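The relationship between directory layout and namespace can be sketched in a few lines. The helper names below are hypothetical and the regular expression is a simplification that only looks for a namespace declaration at the start of a line; the Streams tools perform their own, more thorough check.

```python
import re
from pathlib import PurePosixPath

# Simplified pattern: 'namespace a.b.c;' at the start of a line.
NAMESPACE_DECL = re.compile(r"^\s*namespace\s+([A-Za-z_][\w.]*)\s*;", re.MULTILINE)


def namespace_from_path(spl_path):
    """Derive the namespace from an SPL file path relative to the application directory.

    Nested directories are joined with '.', so 'com/example/Util.spl' and
    'com.example/Util.spl' give the same result. A top-level file yields ''.
    """
    return ".".join(PurePosixPath(spl_path).parent.parts)


def declared_namespace(spl_source):
    """Return the namespace declared in SPL source text, or '' if none is declared."""
    match = NAMESPACE_DECL.search(spl_source)
    return match.group(1) if match else ""


def namespace_matches(spl_path, spl_source):
    """True when the declared namespace agrees with the directory-derived one."""
    return namespace_from_path(spl_path) == declared_namespace(spl_source)
```

With the first example's layout, namespace_matches("my.namespace/Util.spl", "namespace my.namespace;") is true, and a top-level Hello.spl with no namespace declaration also matches.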
Namespace directories may also contain the following subdirectories:
- A native.function directory
- Primitive operator directories
A native.function directory is present when an application needs custom native functions. It contains one or more XML files (for example, function.xml or javaFunction.xml) that declare the functions and their dependencies.
Primitive operator directories are present when an application needs custom primitive operators. Each directory is named after the primitive operator that it declares. It contains an Operator Model XML file and, if the operator is written in C++, may also contain code generation template (.cgt) files.
- Organizational directories
- Beyond namespace directories, application directories can contain common directories to organize data, configuration files, supporting libraries, and more.
The following directories are commonly used to organize files:
- bin/ – Binaries and scripts
- data/ – Data files
- doc/ – Documentation
- etc/ – Configuration files
- impl/ – Native implementation files
- lib/ – Support libraries
- opt/ – Optional components
- Native implementation directory
- A native implementation directory (impl/) is an organizational directory recommended for applications that need to implement native functions or custom primitive operators.
While native.function and primitive operator directories within the namespace directories are used to declare artifacts for SPL usage, the native implementation directory contains the code, libraries, and scripts that make those artifacts work.
The native implementation directory follows a structure similar to the top-level organizational directories, but it specifically contains src/ and include/ directories for C++ source code and a java/src/ directory for Java source code, which keeps the two languages separate.
- Generated artifacts
- Generated artifacts are files and directories created by compile processes. These files and directories should NOT be tracked by source code management tools.
Examples of generated artifacts include:
- C++ compiler artifacts:
  - Shared object (.so) files
- Java compiler artifacts:
  - Java class (.class) files
- Java build system artifacts:
  - Java archive (.jar) files
- SPL compiler and tool artifacts:
  - Code generation (_{h,cpp}.pm) files
  - output/ directories
  - Streams application bundles (.sab)
  - toolkit.xml
  - Java primitive operator directories and files
  - javaFunction.xml
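One way to act on the list above is to flag generated artifacts by name before adding files to source control. The sketch below is an assumption-laden illustration: the pattern list is derived only from the examples in this section, the function name is_generated is hypothetical, and name matching is coarse. For instance, it also flags a third-party .jar that a project may track on purpose, and it cannot recognize generated Java primitive operator directories, which have no distinguishing name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# File-name patterns taken from the examples listed in this section.
GENERATED_FILE_PATTERNS = (
    "*.so",              # C++ shared objects
    "*.class",           # Java class files
    "*.jar",             # Java archives (coarse: also matches third-party jars)
    "*_h.pm",            # SPL code generation files
    "*_cpp.pm",
    "*.sab",             # Streams application bundles
    "toolkit.xml",
    "javaFunction.xml",
)

# Directory names whose whole contents are treated as generated.
GENERATED_DIRECTORY_NAMES = {"output"}


def is_generated(path):
    """Return True if a relative path looks like a generated artifact."""
    parts = PurePosixPath(path).parts
    if any(part in GENERATED_DIRECTORY_NAMES for part in parts):
        return True
    file_name = parts[-1] if parts else ""
    return any(fnmatchcase(file_name, pattern) for pattern in GENERATED_FILE_PATTERNS)
```

A list of candidate paths can then be filtered with a comprehension such as [p for p in paths if not is_generated(p)].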
Generic structure
Application directories should follow the generic structure below, including only the directories that are needed.
[application-directory]/
├── info.xml                      # Information file
├── [spl-files]                   # Global SPL file(s)
├── bin/                          # Compile-time binaries and scripts
├── data/                         # Data files
├── doc/                          # Documentation files
├── etc/                          # Configuration files
├── lib/                          # Support libraries
├── opt/                          # Optional components
├── impl/                         # Native implementation files
│   ├── bin/                      # Runtime binaries and scripts
│   ├── java/                     # Java implementation files
│   │   └── src/                  # Java source code files
│   ├── lib/                      # Implementation libraries
│   ├── include/                  # C++ header files
│   └── src/                      # C++ source code files
└── [namespace-directory]/        # Namespace directory
    ├── [spl-files]               # Namespaced SPL file(s)
    ├── native.function/          # Native function declaration directory
    │   ├── function.xml          # C++ native function model
    │   └── javaFunction.xml      # Java native function model; generated
    └── [operator-directory]/     # Primitive operator directory; might be generated
        ├── [operator].xml        # Operator model; might be generated
        ├── [operator]_h.cgt      # C++ code generation template header file
        └── [operator]_cpp.cgt    # C++ code generation template implementation file
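For a new project, the skeleton above can be created with a short script. The following is a minimal sketch, not a Streams tool: the function name scaffold and its parameters are hypothetical, and it creates only the directories that the caller asks for, in line with the advice to include directories only as needed.

```python
from pathlib import Path

# Organizational directories from the generic structure; callers pick a subset.
ORGANIZATIONAL_DIRECTORIES = (
    "bin", "data", "doc", "etc", "lib", "opt",
    "impl/bin", "impl/java/src", "impl/lib", "impl/include", "impl/src",
)


def scaffold(app_dir, namespace, wanted=ORGANIZATIONAL_DIRECTORIES):
    """Create an application directory skeleton and return the directories in it.

    app_dir   -- path of the application directory to create or extend
    namespace -- name of a single namespace directory, for example 'my.namespace'
    wanted    -- relative paths of the organizational directories to create
    """
    root = Path(app_dir)
    for relative in wanted:
        (root / relative).mkdir(parents=True, exist_ok=True)
    (root / namespace).mkdir(parents=True, exist_ok=True)
    # Report every directory now present, relative to the application directory.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())
```

For example, scaffold("appdir", "my.namespace", ["etc", "impl/src"]) creates etc/, impl/src/, and my.namespace/ under appdir and leaves everything else out. Files such as info.xml and the SPL sources are intentionally not generated here.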
Examples
appdir/
├── info.xml
├── Application.spl
└── my.namespace/
    └── MyC++Operator/
        ├── MyC++Operator.xml
        ├── MyC++Operator_h.cgt
        └── MyC++Operator_cpp.cgt
appdir/
├── info.xml
├── impl/
│   ├── lib/
│   │   └── libmyutils.so         # Generated artifact
│   ├── include/
│   │   └── myutils.h
│   └── src/
│       └── myutils.cpp
└── my.namespace/
    ├── Application.spl
    └── native.function/
        └── function.xml
appdir/
├── info.xml
├── Application.spl
├── etc/
│   └── broker.properties
├── lib/
│   └── third-party.jar
├── impl/
│   ├── java/
│   │   └── src/
│   │       └── my/
│   │           └── namespace/
│   │               └── Operator.java
│   └── lib/
│       └── myoperator.jar        # Generated artifact
└── my.namespace/
    └── MyJavaOperator/
        └── MyJavaOperator.xml    # Generated artifact
appdir/
├── info.xml
├── data/
│   └── models.json
├── impl/
│   └── bin/
│       └── MyPythonOp.py
└── my.namespace/
    ├── Application.spl
    └── MyPythonOperator/
        └── MyPythonOperator.xml