Examples

This example uses the HDFS2DirectoryScan operator to scan an HDFS directory every two seconds and the HDFS2FileSource operator to read the files whose names are output by the HDFS2DirectoryScan operator.

// The HDFS2DirectoryScan operator scans the /user/streamsadmin/works directory on HDFS every 2.0 seconds
(stream<rstring filename> Files) as HDFS2DirectoryScan_1 = HDFS2DirectoryScan(){
    param
        directory : "/user/streamsadmin/works";
        hdfsUri   : "hdfs://hdfsServer:8020";
        hdfsUser  : "streamsadmin";
        sleepTime : 2.0;
}

// The HDFS2FileSource operator reads the files discovered by the HDFS2DirectoryScan operator
(stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(Files){
    param
        hdfsUri      : "hdfs://hdfsServer:8020";
        hdfsUser     : "streamsadmin";
        hdfsPassword : "Password";
}
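
Taken together, these invocations run inside the graph clause of a composite. The following is a minimal sketch of such an application; the use directive assumes the com.ibm.streamsx.hdfs toolkit namespace, and the FileSink that writes the retrieved lines to a local file is illustrative and not part of the original example:

use com.ibm.streamsx.hdfs::*; // assumed toolkit namespace for the HDFS2* operators

composite ScanAndRead {
    graph
        // Scan the HDFS directory every 2.0 seconds and emit the names of discovered files
        (stream<rstring filename> Files) as HDFS2DirectoryScan_1 = HDFS2DirectoryScan(){
            param
                directory : "/user/streamsadmin/works";
                hdfsUri   : "hdfs://hdfsServer:8020";
                hdfsUser  : "streamsadmin";
                sleepTime : 2.0;
        }

        // Read each discovered file line by line
        (stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(Files){
            param
                hdfsUri      : "hdfs://hdfsServer:8020";
                hdfsUser     : "streamsadmin";
                hdfsPassword : "Password";
        }

        // Illustrative sink: write the retrieved lines to a local file
        () as LocalSink = FileSink(FileContent){
            param
                file : "retrieved.txt";
        }
}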

The following example shows the HDFS2FileSource operator configured to access an HDFS instance on IBM Analytics Engine and read the file specified by the file parameter. The hdfsUser and hdfsPassword parameters are the user name and password that have access to the Hadoop instance. The keyStorePath and keyStorePassword parameters specify the keystore and its password for the SSL connection; if they are omitted, the operator accepts all certificates as valid.


(stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(){
    param
        hdfsUri          : "webhdfs://server_host_name:port";
        file             : "/user/streamsadmin/myfile.txt";
        hdfsUser         : "streamsadmin";
        hdfsPassword     : "Password";
        keyStorePassword : "storepass";
        keyStorePath     : "etc/store.jks";
}
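
As with the first example, this invocation belongs in a composite's graph clause. A minimal sketch, again assuming the com.ibm.streamsx.hdfs toolkit namespace; the Custom operator that prints each line is illustrative and not part of the original example:

use com.ibm.streamsx.hdfs::*; // assumed toolkit namespace for the HDFS2* operators

composite ReadFromAnalyticsEngine {
    graph
        // Read a single file over webhdfs, authenticating with a user name and password
        (stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(){
            param
                hdfsUri          : "webhdfs://server_host_name:port";
                file             : "/user/streamsadmin/myfile.txt";
                hdfsUser         : "streamsadmin";
                hdfsPassword     : "Password";
                keyStorePassword : "storepass";
                keyStorePath     : "etc/store.jks";
        }

        // Illustrative sink: print each line as it is read
        () as Printer = Custom(FileContent){
            logic
                onTuple FileContent : printStringLn(data);
        }
}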