Running distributed
Run the "NumberedCat" application in a distributed environment.
In this tutorial, you will:
- Submit the "NumberedCat" application from the Stream processing tutorial
- Verify the health of the application PEs
- Verify the application output
- Cancel the job from the distributed Streams instance
The compiled application is packaged as an application bundle file (sab) which can be submitted to a distributed Streams instance.
Before you begin
- Complete the Stream processing tutorial.
- Obtain a started Streams domain and instance:
  - If using the QSE, validate that the instance is up and running.
  - If not using the QSE, use these commands to create, configure, and start a one-node domain and instance:
streamtool mkdomain --embeddedzk # make a domain
streamtool startdomain --embeddedzk # start the domain
streamtool genkey # generate keys
streamtool mkinstance --embeddedzk # make an instance
streamtool startinstance --embeddedzk # start the runtime instance
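If you run the setup commands often, they can be wrapped in a small shell function that stops at the first failing step, so that a later command never runs against a domain that failed to start. This is a minimal sketch, not part of the tutorial: the function name setup_streams is hypothetical, and it assumes that streamtool is on your PATH and returns a non-zero exit status when a step fails.

```shell
# Hypothetical helper: create and start a one-node domain and instance.
# Each step runs only if the previous one succeeded (&& chaining),
# and the function returns the exit status of the last step that ran.
setup_streams() {
  streamtool mkdomain --embeddedzk &&       # make a domain
  streamtool startdomain --embeddedzk &&    # start the domain
  streamtool genkey &&                      # generate keys
  streamtool mkinstance --embeddedzk &&     # make an instance
  streamtool startinstance --embeddedzk     # start the runtime instance
}
```

Calling setup_streams || echo "setup failed" reports a failure without running the remaining steps.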
Procedure
- Set up your environment for Streams.
In your command terminal, source the streamsprofile.sh file under the Streams installation directory. For example:
source streams-install-directory/bin/streamsprofile.sh
- Navigate to the numberedcat directory:
cd numberedcat
- Remove the old result file:
rm -f data/result.txt
- Submit the application to the Streams instance and make note of the job ID.
For example, use the following command:
streamtool submitjob output/NumberedCat.sab -P file="cat.txt"
Note: For this and future streamtool commands, if you need to specify a domain name other than the default, add -d domain-name. If you need to specify an instance name other than the default, add -i instance-name.
The command outputs several messages, including the job ID.
- Verify application PE health.
For example, use the following command:
streamtool lspes -j job-id
The command outputs the list of PEs for the job, their health, the resource each PE is running on, and more. All PEs should be Healthy.
Important: If PEs are not healthy, inspect the PE logs using streamtool viewlog --print --pe pe-id.
- Verify output:
cat data/result.txt
Similar to the stand-alone run, the distributed program creates a file called result.txt in the numberedcat/data directory containing the numbered lines of cat.txt.
- Cancel the job:
streamtool canceljob job-id
The command tells the Streams instance to stop and clean up all PEs and the job. Jobs run indefinitely until cancelled.
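The output check from the Verify output step can also be scripted. The following is a minimal sketch, not part of the tutorial: check_numbered_output is a hypothetical helper that only compares line counts, on the assumption that the application writes exactly one numbered line per input line. It does not validate the numbering itself, and the location data/cat.txt used in the usage comment is also an assumption.

```shell
# Hypothetical helper: succeed only if both files exist and the result
# file has the same number of lines as the input file.
check_numbered_output() {
  input=$1
  result=$2
  # Fail early if either file is missing.
  [ -f "$input" ] && [ -f "$result" ] || return 1
  # Strip whitespace because some wc implementations pad the count.
  in_lines=$(wc -l < "$input" | tr -d '[:space:]')
  out_lines=$(wc -l < "$result" | tr -d '[:space:]')
  [ "$in_lines" -eq "$out_lines" ]
}

# Usage, run from the numberedcat directory (assumes cat.txt is in data/):
#   check_numbered_output data/cat.txt data/result.txt && echo "output looks complete"
```

A line-count comparison is a cheap sanity check. For a closer look at the content, cat data/result.txt remains the direct way to inspect the numbered lines.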
What to do next
If you created a Streams domain and instance for this tutorial and no longer need them, clean up the environment:
streamtool stopinstance --embeddedzk # stop the runtime instance
streamtool rminstance --embeddedzk # remove the runtime instance
streamtool stopdomain --embeddedzk # stop the domain
streamtool rmdomain --embeddedzk # remove the domain
Continue to the Types and functions tutorial to learn how to create user-defined SPL types and functions to improve code reuse.