Examples
This example uses the HDFS2DirectoryScan operator to scan an HDFS directory every two seconds and the HDFS2FileSource operator to read the files that are discovered by the HDFS2DirectoryScan operator.
// The HDFS2DirectoryScan operator scans the /user/streamsadmin/works directory on HDFS every 2.0 seconds
(stream<rstring filename> Files) as HDFS2DirectoryScan_1 = HDFS2DirectoryScan(){
    param
        directory : "/user/streamsadmin/works";
        hdfsUri : "hdfs://hdfsServer:8020";
        hdfsUser : "streamsadmin";
        sleepTime : 2.0;
}
// The HDFS2FileSource operator reads the files discovered by the HDFS2DirectoryScan operator
(stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(Files){
    param
        hdfsUri : "hdfs://hdfsServer:8020";
        hdfsUser : "streamsadmin";
        hdfsPassword : "Password";
}
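A downstream operator can then process each retrieved line. The following is a minimal sketch (the PrintContent operator name is illustrative and not part of the toolkit example) that prints every line read from HDFS by using the standard Custom operator:
// Illustrative only: print each line that HDFS2FileSource reads from the discovered files
() as PrintContent = Custom(FileContent){
    logic
        onTuple FileContent :
            printStringLn(data);
}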
The following example shows the operator configured to access an HDFS instance on IBM Analytics Engine and read the file that is specified by the file parameter. The hdfsUser and hdfsPassword parameters specify the user name and password of a user that has access to the Hadoop instance. If the keyStorePath and keyStorePassword parameters are omitted, the operator accepts all certificates as valid.
(stream<rstring data> FileContent) as HDFS2FileSource_2 = HDFS2FileSource(){
    param
        hdfsUri : "webhdfs://server_host_name:port";
        file : "/user/streamsadmin/myfile.txt";
        hdfsUser : "streamsadmin";
        hdfsPassword : "Password";
        keyStorePassword : "storepass";
        keyStorePath : "etc/store.jks";
}
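The snippets above are operator invocations and assume that the toolkit operators are already in scope. As a rough sketch, where the composite name, the local output file name, and the use of the standard FileSink operator are illustrative choices rather than part of the original example, the IBM Analytics Engine configuration could be embedded in a main composite as follows:
use com.ibm.streamsx.hdfs::HDFS2FileSource;

composite ReadFromIAE {
    graph
        // Read the file from the webhdfs endpoint (same parameters as in the example above)
        stream<rstring data> FileContent = HDFS2FileSource(){
            param
                hdfsUri : "webhdfs://server_host_name:port";
                file : "/user/streamsadmin/myfile.txt";
                hdfsUser : "streamsadmin";
                hdfsPassword : "Password";
        }

        // Write each retrieved line to a local file with the SPL standard toolkit FileSink operator
        () as LocalSink = FileSink(FileContent){
            param
                file : "retrieved_content.txt";
                format : txt;
        }
}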