Running distributed

Run the "NumberedCat" application in a distributed environment.

In this tutorial, you will:

  1. Submit the "NumberedCat" application from the Stream processing tutorial
  2. Verify the health of the application PEs
  3. Verify the application output
  4. Cancel the job from the distributed Streams instance.
Note: When the "NumberedCat" application was compiled, the SPL compiler created not only a stand-alone executable file but also a Streams application bundle (sab file), which can be submitted to a distributed Streams instance.

Before you begin

  • Complete the Stream processing tutorial.
  • Obtain a started Streams domain and instance:
    • If using the QSE, validate that the instance is up and running.
    • If not using the QSE, use these commands to create, configure, and start a one-node domain and instance:
      streamtool mkdomain --embeddedzk      # make a domain
      streamtool startdomain --embeddedzk   # start the domain
      streamtool genkey                     # generate keys 
      streamtool mkinstance --embeddedzk    # make an instance
      streamtool startinstance --embeddedzk # start the runtime instance

Procedure

  1. Set up your environment for Streams.

    In your command terminal, source the streamsprofile.sh file under the Streams installation directory. For example:

    source streams-install-directory/bin/streamsprofile.sh
  2. Navigate to the numberedcat directory:
    cd numberedcat
  3. Remove the old result file:
    rm -f data/result.txt
  4. Submit the application to the Streams instance and make note of the job ID.

    For example, use the following command:

    streamtool submitjob output/NumberedCat.sab -P file="cat.txt"
    Note: For this and future streamtool commands, if you need to specify a domain name other than the default, add -d domain-name. If you need to specify an instance name other than the default, add -i instance-name.

    The command prints several messages, including the job ID.

  5. Verify application PE health.

    For example, use the following command:

    streamtool lspes -j job-id

    The command lists the PEs for the job, along with each PE's health, the resource it is running on, and more. All PEs should be Healthy.

    Important: If any PE is not healthy, inspect its log using streamtool viewlog --print --pe pe-id.
  6. Verify output:
    cat data/result.txt

    As in the stand-alone run, the distributed program creates a file called result.txt in the numberedcat/data directory that contains the numbered lines of cat.txt.

  7. Cancel the job:
    streamtool canceljob job-id

    The command tells the Streams instance to stop and clean up the job and all of its PEs. A job keeps running until it is canceled.
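
The output check in step 6 can also be done without reading the file by eye. The following is a minimal sketch, not part of the tutorial itself: the function name check_numbered_output is hypothetical, and it assumes that cat.txt and result.txt both sit in the data directory and that the application writes exactly one output line per input line. It compares line counts only and does not inspect the numbering.

```shell
#!/bin/sh
# Hypothetical helper (not part of Streams): compare the number of lines in
# the input file with the number of lines in the result file.
# Assumption: cat.txt and result.txt are both in the given data directory.
# Usage, from the numberedcat directory: check_numbered_output data
check_numbered_output() {
    data_dir="${1:-data}"
    input="$data_dir/cat.txt"
    result="$data_dir/result.txt"

    if [ ! -f "$input" ]; then
        echo "FAIL: $input not found"
        return 1
    fi
    if [ ! -s "$result" ]; then
        echo "FAIL: $result is missing or empty"
        return 1
    fi

    # awk counts a final line even when it lacks a trailing newline.
    in_lines=$(awk 'END { print NR }' "$input")
    out_lines=$(awk 'END { print NR }' "$result")

    if [ "$in_lines" -eq "$out_lines" ]; then
        echo "OK: $out_lines lines in $result"
    else
        echo "FAIL: expected $in_lines lines, found $out_lines"
        return 1
    fi
}
```

If the counts differ while the job is still running, the sink may not have finished writing yet, so it can be worth repeating the check before canceling the job.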

What to do next

If you created a Streams domain and instance for this tutorial and no longer need them, clean up the environment:

streamtool stopinstance --embeddedzk # stop the runtime instance
streamtool rminstance --embeddedzk   # remove the runtime instance
streamtool stopdomain --embeddedzk   # stop the domain
streamtool rmdomain --embeddedzk     # remove the domain

Continue to the Types and functions tutorial to learn how to create user-defined SPL types and functions to improve code reuse.