Examples

The following example uses the HDFS2DirectoryScan operator to scan an HDFS directory every two seconds and the HDFS2FileSource operator to read the files that the HDFS2DirectoryScan operator discovers.


stream<rstring filename> Files = HDFS2DirectoryScan() {
    param
        directory : "/user/streamsadmin/works";
        hdfsUri   : "hdfs://hdfsServer:8020";
        hdfsUser  : "streamsadmin";
        sleepTime : 2.0;
}

// The HDFS2FileSource operator reads the files discovered by the HDFS2DirectoryScan operator.
// If the keyStorePath and keyStorePassword parameters are omitted, the operator accepts all certificates as valid.
// Always specify keyStorePath and keyStorePassword in a non-development environment.
stream<rstring data> FileContent = HDFS2FileSource(Files) {
    param
        hdfsUri: "hdfs://hdfsServer:8020";
        hdfsUser: "streamsadmin";
        hdfsPassword: "Password";
}
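
Downstream operators consume the FileContent stream like any other SPL stream. As a minimal sketch (the Custom operator below is from the SPL standard toolkit and is illustrative, not part of the HDFS toolkit), each line read from HDFS can be printed to the console:

```spl
// Illustrative sketch: print every line that HDFS2FileSource emits.
// Assumes the FileContent stream defined in the example above.
() as PrintLines = Custom(FileContent) {
    logic
        onTuple FileContent :
            printStringLn(FileContent.data);
}
```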

The following example shows the operator configured to access an HDFS instance on IBM Analytics Engine and read the file specified by the file parameter. The hdfsUser and hdfsPassword parameters supply the user name and password of a user with access to the Hadoop instance.


stream<rstring data> FileContent = HDFS2FileSource() {
    param
        hdfsUri: "webhdfs://server_host_name:port";
        file   : "/user/streamsadmin/myfile.txt";
        hdfsUser: "streamsadmin";
        hdfsPassword: "Password";
        keyStorePassword: "storepass";
        keyStorePath: "etc/store.jks";
}
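
The tuples produced by this invocation can also be written to the local file system with the FileSink operator from the SPL standard toolkit; the output file name below is a placeholder:

```spl
// Sketch: copy the content read over webhdfs to a local file.
// "myfile_copy.txt" is a hypothetical output path, relative to the data directory.
() as LocalCopy = FileSink(FileContent) {
    param
        file : "myfile_copy.txt";
}
```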