Parameter reference
The following Lookup Manager and ITE application parameters enable, disable, and configure application features to suit your needs.
These parameters have the following properties:
- Type
Parameters can be integer, float, string, or enum type. The integer and float types are for numeric values.
- Default
An optional parameter can have a default value, which is used if the parameter is omitted from the configuration file.
- Cardinality
Specifies how many values a parameter accepts at compile time.
0..1 means that the parameter is optional and can take only one value.
0..n means that the parameter is optional and can take multiple comma-separated values.
1 means that the parameter is mandatory and can take only one value.
1..n means that the parameter is mandatory and can take multiple comma-separated values.
- Application scope
Specifies which application (Lookup Manager, ITE application, or both) evaluates the parameter.
- Provisioning time
Specifies whether the application evaluates the parameter during compile time, submission time, or both.
Normally, if the compile-time parameter is not provided, or if a default is overridden with an empty value, the submission-time parameter is mandatory. Otherwise, the compile-time parameter is used as the default for the submission-time parameter.
- Valid values
For enumerations, the list of supported named values is provided. The named values are case-insensitive, which means that you can specify, for example, ite.embeddedSampleCode=off or ite.embeddedSampleCode=OFF.
For numeric values, you can provide a value that satisfies the constraint. For example, a constraint might be >=1 (global.multiHost.numberOfHosts). If you provide a value that is >= 1, your value is accepted. If you provide 0 or a negative value, you see an error message.
For string values, you can provide a value that matches the Perl regular expression. For example, the description of ite.cleanup.schedule.minute shows the (([1-5]?[0-9]-)?[1-5]?[0-9]) regular expression. You must provide a value that matches this regular expression, for example, 9-59 or 10.
- Related parameters
Some parameters are related to others. For example, a parameter that can be switched on and off may have sub-parameters. If the parameter is switched off, sub-parameters are inactive. Or, if a parameter has a certain value, it may require that another parameter is either not present or also has a certain value.
- Details
For some parameters, technical details are provided. For example, a parameter enables customized code that is stored in a certain composite operator, or administrative actions are required.
Note: In this topic, <namespace> is the namespace of the application. You specify this namespace when you create an application project with the wizard or the teda-create-project script.
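The three kinds of "Valid values" checks described above can be sketched in a few lines of Python. This is only an illustration of the rules, not the validation code of the applications. The function names are hypothetical, and the sketch assumes that a string value must match the regular expression as a whole. Python's re module is used in place of Perl, which behaves the same for this simple pattern.

```python
import re

def check_enum(value, allowed):
    """Return True if value is one of the allowed names, ignoring case."""
    return value.lower() in {name.lower() for name in allowed}

def check_min(value, minimum):
    """Return True if an integer value satisfies a '>= minimum' constraint."""
    return int(value) >= minimum

def check_pattern(value, pattern):
    """Return True if the whole value matches the regular expression (assumption)."""
    return re.fullmatch(pattern, value) is not None

# Regular expression quoted above for ite.cleanup.schedule.minute.
MINUTE_PATTERN = r"(([1-5]?[0-9]-)?[1-5]?[0-9])"

print(check_enum("OFF", ["off", "on"]))       # enum names are case-insensitive
print(check_min("0", 1))                      # 0 violates the >=1 constraint
print(check_pattern("9-59", MINUTE_PATTERN))  # a range of minutes
```

A value such as 60 is rejected by the minute pattern because it is neither a single minute from 0 to 59 nor a range of such minutes.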
global.applicationControlDirectory
Specifies the path of the directory that is used by the applications to store and exchange status information. The same path must be used for the Lookup Manager application and its controlled ITE applications.
If the applications are running on multiple hosts, the directory must be located in a shared file system.
A relative path is relative to the data directory.
Properties
Type: string
Cardinality: 1
Application scope: ITE, Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
global.multiHost
Specifies whether the application bundle runs on a single host or on multiple hosts. An application bundle can consist of a single ITE application or of a single Lookup Manager application with multiple ITE applications.
If you want to run the application bundle on multiple hosts, turn the parameter on. If you want to run the application bundle on a single host only, turn it off.
If the parameter is turned off, the child parameters are inactive.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE, Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Children: global.multiHost.customHostTags, global.multiHost.numberOfHosts
Details
If the parameter is turned on, the application uses host tags to ensure, for example, that the enrichment data is updated on every host. The required host tags are stored in the hosttags.txt file, which is located in the application config directory. The host tags must be created and assigned to hosts using the streamtool command, for example, streamtool mktag or streamtool chhost.
global.multiHost.customHostTags
Specifies host tags that you want to use in your customized code to place operators on specific hosts.
The parameter is active only if the parent parameter is turned on.
Properties
Type: string
Default: empty list
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of values that match the \w+ regular expression
- Parent: global.multiHost
- Other: global.multiHost.numberOfHosts
Details
The provided host tags are stored in the hosttags.txt file, which is located in the application config directory. The host tags must be created and assigned to hosts using the streamtool command, for example, streamtool mktag or streamtool chhost.
global.multiHost.numberOfHosts
Specifies the number of hosts that will hold enrichment data. This number must be identical to the number of hosts that are assigned the <namespace>_lookup_host_writer host tag.
The parameter is active only if the parent parameter is turned on.
Properties
Type: integer
Default: 1
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any integer value that is equal to or greater than 1
- Parent: global.multiHost
- Other: global.multiHost.customHostTags
Details
The application uses the UDP feature to create as many operator instances as needed, initializing and updating the enrichment data on every host. Each host has its own operator instance. In other words, a host exlocation is used.
If the number of hosts that have the <namespace>_lookup_host_writer host tag assigned is less than this parameter value, the job submission fails. If it is greater, it is not predictable which hosts hold the enrichment data.
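The rule above can be expressed as a small check that you might run before submitting a job. This is a sketch only: the function name and the host names are hypothetical, and the list of tagged hosts is assumed to come from your own inventory of hosts that carry the <namespace>_lookup_host_writer host tag.

```python
def check_lookup_hosts(tagged_hosts, number_of_hosts):
    """Compare the hosts that carry the lookup host tag with numberOfHosts.

    Returns "fail" if too few hosts are tagged (job submission fails),
    "unpredictable" if too many are tagged (it is not predictable which
    hosts hold the enrichment data), and "ok" if the numbers are identical.
    """
    tagged = len(set(tagged_hosts))
    if tagged < number_of_hosts:
        return "fail"
    if tagged > number_of_hosts:
        return "unpredictable"
    return "ok"

# Hypothetical host names, for illustration only.
print(check_lookup_hosts(["hostA", "hostB"], 2))
```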
ite.archive.inputFilesIntoDateDirectory
Specifies whether the ITE application archives processed input files in a per-day directory or in a directory that receives all files.
If the parameter is off, the archive directory receives all files. The archive directory is relative to the data directory.
If the parameter is on, the ITE application creates a directory for every day that receives the processed input files for that day. The directory path is archive/YYYYMMDD, where YYYY is the year, MM is the month, and DD is the day. The archive/YYYYMMDD directory is relative to the data directory.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
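The directory layout for the two settings can be illustrated with a short Python sketch. The function name and the /data path are hypothetical. Which date the ITE application uses for a given file is not stated here, so the sketch takes the date as an argument.

```python
from datetime import date
from pathlib import PurePosixPath

def archive_directory(data_dir, in_date_directory, day):
    """Return the archive directory for a processed input file.

    data_dir is the data directory, in_date_directory mirrors the on/off
    value of the parameter, and day is the date to use (assumption).
    """
    base = PurePosixPath(data_dir) / "archive"
    if in_date_directory:
        # Per-day directory named YYYYMMDD below the archive directory.
        return base / day.strftime("%Y%m%d")
    return base

print(archive_directory("/data", True, date(2024, 3, 5)))
print(archive_directory("/data", False, date(2024, 3, 5)))
```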
ite.businessLogic.group
Specifies whether tuples are grouped.
If the parameter is off, the ITE application does not group tuples.
If the parameter is on, the ITE application groups tuples, and at least one of the built-in correlations must be enabled. This means that either the tuple deduplication, the custom correlation, or both must be enabled.
CAUTION: If checkpointing for the group logic is enabled, the ITE applications regularly run internal maintenance tasks that pause file processing for a few seconds up to several minutes.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Children: ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
ite.businessLogic.group.custom
Specifies whether the ITE application groups tuples by using the custom correlation logic.
If you want to group tuples using your correlation logic, set this and the parent parameter to on and implement your correlation logic in the <namespace>.context.custom::ContextDataProcessor composite operator. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.context.sample::ContextDataProcessor composite operator.
CAUTION: If checkpointing for the group logic is enabled, the ITE applications regularly run internal maintenance tasks that pause file processing for a few seconds up to several minutes.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Children: ite.businessLogic.group.custom.checkpointing
- Other: ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.embeddedSampleCode, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
ite.businessLogic.group.custom.checkpointing
Specifies whether checkpoint files for the custom logic of group processing are stored. If this parameter is off, the state of the custom logic cannot be recovered if the application is restarted. For example, if your custom logic aggregates data across file boundaries, data that has been collected is lost.
Committed checkpoint files are named custom/<groupId>/committed/<input-filename>.bin and are located in the output directory that is specified in the ite.checkpointing.directory parameter.
CAUTION: If checkpointing for the group logic is enabled, the ITE applications regularly run internal maintenance tasks that pause file processing for a few seconds up to several minutes.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group.custom
- Children: ite.businessLogic.group.custom.timeToKeep
Details
The ite.businessLogic.group.custom.timeToKeep parameter is active only if both this parameter and its parent parameter are set to on.
ite.businessLogic.group.custom.timeToKeep
Specifies the time after which tuples are removed from the stateful custom group.
This parameter is active only if the parent and the ite.businessLogic.group.custom.checkpointing parameters are set to on.
Properties
Type: string
Default: "1d"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the (\d+d)?\s*(\d+h)?\s*(\d+m)? regular expression
- Parent: ite.businessLogic.group.custom.checkpointing
Details
If the ite.businessLogic.group.custom.checkpointing parameter is on, the ITE application automatically saves all tuples that are received by the custom correlation logic to the hard disk. If the application restarts, for example because of maintenance or an automatic data refreshment and eviction cycle, the ITE application removes old tuples from the saved tuple set and processes only the valid tuples to rebuild an updated state of the custom correlation logic.
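A timeToKeep value can be checked against the regular expression shown under "Valid values" and converted into a duration. The following sketch assumes that d, h, and m stand for days, hours, and minutes, and that the whole value must match the expression. The function name is hypothetical.

```python
import re
from datetime import timedelta

# Regular expression quoted under "Valid values".
TIME_TO_KEEP = re.compile(r"(\d+d)?\s*(\d+h)?\s*(\d+m)?")

def parse_time_to_keep(value):
    """Convert a value such as '1d' or '2d 12h 30m' into a timedelta."""
    match = TIME_TO_KEEP.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid timeToKeep value: {value!r}")
    # Each group is either None or digits followed by a one-letter unit.
    days, hours, minutes = (int(part[:-1]) if part else 0
                            for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)

print(parse_time_to_keep("1d"))
print(parse_time_to_keep("2d 12h 30m"))
```

Because every part of the expression is optional, the units must appear in the order days, hours, minutes. A value such as "12h 1d" does not match.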
ite.businessLogic.group.debug
Enables additional file outputs that help you troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
When this parameter is on, you receive information about the commands and data that are processed in the group logic.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
Details
The following files are created only if the ite.businessLogic.group and ite.businessLogic.group.debug parameters are turned on:
- CONTEXT_CMD_<GROUP_ID>.txt: Receives log entries for internal checkpoint commands (clear, read, write) that are received by the group logic.
- CONTEXT_CMD_RESP_<GROUP_ID>.txt: Receives log entries for start and stop responses that leave the group logic.
- CONTEXT_DATA_IN_<GROUP_ID>.txt: Receives log entries for data tuples that are received by the group logic.
- CONTEXT_DATA_OUT_<GROUP_ID>.txt: Receives log entries for valid data tuples that leave the group logic.
- DEDUP_CMD_<GROUP_ID>.txt: Receives log entries for refresh and shutdown signals that are received by the deduplication.
- DEDUP_CMD_RESP_<GROUP_ID>.txt: Receives log entries for refresh and shutdown responses that leave the deduplication.
- DEDUP_IN_<GROUP_ID>.txt: Receives log entries for data tuples that are received by the deduplication.
- DEDUP_OUT_<GROUP_ID>.txt: Receives log entries for data tuples that leave the deduplication. Each entry indicates whether the tuple is unique or a duplicate.
- BLOOM_OUT_<GROUP_ID>.txt: Receives log entries for data tuples that leave the deduplication during the training phase that starts during the initialization phase or after receiving a refresh signal.
ite.businessLogic.group.deduplication
Specifies whether the ITE application groups tuples according to the built-in deduplication logic.
To enable the tuple deduplication, set this and the parent parameter to on.
CAUTION: If checkpointing for the group logic is enabled, the ITE applications regularly run internal maintenance tasks that pause file processing for a few seconds up to several minutes.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Children: ite.businessLogic.group.deduplication.checkpointing, ite.businessLogic.group.deduplication.probability
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
Details
The deduplication uses a memory-efficient algorithm that can lead to false positives, which means that unique tuples are marked as duplicates. For more information, see the child parameters or the BloomFilter operator.
ite.businessLogic.group.deduplication.checkpointing
Specifies whether to store checkpoint files for the deduplication of the group processing. If this parameter is off, the state of the deduplication cannot be recovered if the application is restarted. For example, previously seen unique tuples are not restored in the deduplication logic, so later duplicates of those tuples are detected as unique tuples.
The committed checkpoint files are named <groupId>/committed/<input-filename>.chk and are located in the output directory that is specified in the ite.checkpointing.directory parameter.
CAUTION: If checkpointing for the group logic is enabled, the ITE applications regularly run internal maintenance tasks that pause file processing for a few seconds up to several minutes.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group.deduplication
- Children: ite.businessLogic.group.deduplication.timeToKeep
- Other: ite.businessLogic.group.deduplication.probability
Details
The ite.businessLogic.group.deduplication.timeToKeep parameter is active only if both this parameter and its parent parameter are set to on.
ite.businessLogic.group.deduplication.partitioning
For more details about the partitioning, see the BloomFilter operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Children: ite.businessLogic.group.deduplication.partitioning.count, ite.businessLogic.group.deduplication.partitioning.searchAllPartitions
- Other: ite.businessLogic.group.deduplication.checkpointing, ite.checkpointing.directory, ite.ingest.loadDistribution.groupConfigFile
ite.businessLogic.group.deduplication.partitioning.count
Specifies the maximum number of partitions. As soon as the number of active partitions exceeds this count, the partition with the minimum partitionId expression value is evicted.
Properties
Type: integer
Default: 1
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any integer value that is equal to or greater than 1
- Parent: ite.businessLogic.group.deduplication.partitioning
- Other: ite.businessLogic.group.deduplication.partitioning.searchAllPartitions
Details
For more details about the partitioning and the counter, see the BloomFilter operator.
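The eviction rule stated above can be pictured with a toy model in which each partition is a set of tuple keys, indexed by its partitionId value. This is a sketch of the rule only, not of the BloomFilter operator. The function name and the date-like partition IDs are hypothetical.

```python
def add_partition(partitions, partition_id, max_count):
    """Activate a partition and evict partitions while the limit is exceeded.

    partitions maps a partitionId value to the data of that partition.
    Returns the list of evicted partitionId values.
    """
    partitions.setdefault(partition_id, set())
    evicted = []
    while len(partitions) > max_count:
        # The partition with the minimum partitionId value is evicted.
        lowest = min(partitions)
        del partitions[lowest]
        evicted.append(lowest)
    return evicted

active = {}
for day in (20240101, 20240102, 20240103):
    print(day, add_partition(active, day, max_count=2))
```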
ite.businessLogic.group.deduplication.partitioning.searchAllPartitions
Specifies whether the unique/duplicate detection algorithm evaluates all partitions or only the selected partition, as defined and described in the BloomFilter description. If the parameter is switched on, the algorithm evaluates all partitions. If the tuple is found to be unique in the partition that is selected with the "partitionId", the number of stored unique tuples is increased for this partition even if the tuple is marked as a duplicate because of another partition.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group.deduplication.partitioning
- Other: ite.businessLogic.group.deduplication.partitioning.count
Details
For more details about the partitioning and the search options, see the BloomFilter operator.
ite.businessLogic.group.deduplication.probability
Specifies the probability of false positives that are allowed for duplicate detection.
A false positive occurs when a tuple is marked as a duplicate even though it is unique.
The expected number of unique tuples, for which this probability is ensured, is specified in the file that is specified in the ite.ingest.loadDistribution.groupConfigFile parameter.
Properties
Type: float
Default: 0.001
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any float value from 0 to 0.1, inclusive
- Parent: ite.businessLogic.group.deduplication
- Other: ite.businessLogic.group.deduplication.checkpointing, ite.ingest.loadDistribution.groupConfigFile
Details
For more details about the probability and the number of expected unique tuples, see the BloomFilter operator.
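To get a feeling for how the probability and the expected number of unique tuples affect memory, the textbook sizing formulas for a standard Bloom filter can be used: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below applies these general formulas. It is an assumption that they approximate the BloomFilter operator, whose actual sizing may differ. The function name is hypothetical.

```python
import math

def bloom_filter_size(expected_uniques, probability):
    """Estimate bits and hash functions for a standard Bloom filter."""
    if expected_uniques < 1 or not 0.0 < probability < 1.0:
        raise ValueError("need expected_uniques >= 1 and 0 < probability < 1")
    # m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(-expected_uniques * math.log(probability) / math.log(2) ** 2)
    # k = (m / n) * ln 2, at least one hash function
    hashes = max(1, round(bits / expected_uniques * math.log(2)))
    return bits, hashes

bits, hashes = bloom_filter_size(1_000_000, 0.001)
print(f"{bits / 8 / 1024 / 1024:.1f} MiB, {hashes} hash functions")
```

With these formulas, lowering the probability by a factor of ten adds roughly 4.8 bits per expected unique tuple.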
ite.businessLogic.group.deduplication.timeToKeep
Specifies the time after which tuples are removed from the stateful deduplication.
The parameter is active only if the parent and the ite.businessLogic.group.deduplication.checkpointing parameters are set to on.
Properties
Type: string
Default: "1d"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the (\d+d)?\s*(\d+h)?\s*(\d+m)? regular expression
- Parent: ite.businessLogic.group.deduplication.checkpointing
Details
If the ite.businessLogic.group.deduplication.checkpointing parameter is on, the ITE application automatically saves all tuples that are received by the deduplication logic to the hard disk. If the application restarts, for example because of maintenance or an automatic data refreshment and eviction cycle, the ITE application removes old tuples from the saved tuple set and processes valid tuples to rebuild an updated state of the deduplication logic.
ite.businessLogic.group.startupControlFile
Specifies the name of the text file that delays the initialization of the ITE application. As soon as the file exists and contains the done value in the first row, the initialization begins.
You use this file to indicate completed external activities that are required before the ITE application starts its initialization, for example, creating files that are needed for the custom or deduplication initialization from a database.
The specified file is expected in the control directory that is identified by the global.applicationControlDirectory parameter.
Properties
Type: string
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the [^\/]+ regular expression
- Parent: ite.businessLogic.group
- Other: global.applicationControlDirectory, ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.tap, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
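An external job that prepares the initialization data can check the same condition that is described above: the file exists and its first row contains done. The following sketch assumes that surrounding whitespace is ignored and that the comparison is case-sensitive. The function name and the file name are hypothetical.

```python
from pathlib import Path

def startup_allowed(control_dir, file_name):
    """Return True if the startup control file exists and its first row is 'done'."""
    path = Path(control_dir) / file_name
    if not path.is_file():
        return False
    with path.open(encoding="utf-8") as handle:
        first_row = handle.readline()
    # Assumption: leading and trailing whitespace is not significant.
    return first_row.strip() == "done"

# Hypothetical control directory and file name.
print(startup_allowed("/tmp/control", "startup.ctl"))
```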
ite.businessLogic.group.tap
Turns the post-group data processor tap on or off.
If this tap is turned on, another stream that contains the tuples that passed the business logic, including the group logic (for example, deduplication), is activated. You may use these tuples to implement features that do not alter the data stored in the files by the main business logic. For example, the tap logic filters for tuples and sends an event to another application or another system if the filter condition is met. The spl.adapter::Export operator or any sink operator like the spl.adapter::TCPSink operator may be used with the tap data tuples.
Implement your tap logic in the <namespace>.tap.custom::PostContextDataProcessorTap composite operator. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.tap.sample::PostContextDataProcessorTap composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.transformation.tap, ite.embeddedSampleCode, ite.fuse.group.operators, ite.fuse.groupWithChain.operators
Details
- The first tap is turned on with the ite.businessLogic.transformation.tap parameter and normally used only if the ite.businessLogic.group parameter is turned off.
- The second tap is turned on with the ite.businessLogic.group.tap parameter and normally used only if the ite.businessLogic.group parameter is turned on.
ite.businessLogic.sink.debug
Specifies whether to enable additional file outputs that are used to troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
When this parameter is set to on, you receive information about the storage stage.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
Details
The following files are created:
- CHAIN_TRANSFORMER_IN_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for data tuples that are received by the <namespace>.chainprocessor.transformer::ChainprocessorTransformerCore composite operator.
- CHAIN_TRANSFORMER_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for data tuples that are sent by the <namespace>.chainprocessor.transformer::ChainprocessorTransformerCore composite operator.
- CHAIN_TRANSFORMER_STAT_IN_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for statistics tuples that are received by the <namespace>.chainprocessor.transformer::ChainprocessorTransformerCore composite operator at the end of each file.
- CHAIN_TRANSFORMER_STAT_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for enriched statistics tuples that are sent by the <namespace>.chainprocessor.transformer::ChainprocessorTransformerCore composite operator at the end of each file.
ite.businessLogic.transformation.debug
Specifies whether to enable additional file outputs that are used to troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
When this parameter is set to on, you receive information about the transformation stage.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
Details
The following files are created:
- SINK_FILE_WRITER_STAT_IN_<GROUP_ID>_<CHAIN_ID>.txt: statistic tuple sent by FileReader at end of file
- SINK_FILE_WRITER_IN_<GROUP_ID>_<CHAIN_ID>.txt: data tuples to write to file at RecordFileWriter or TableFileWriter
- CHAIN_POSTCONTEXT_IN_<GROUP_ID>_<CHAIN_ID>.txt: data tuples received from context
- CHAIN_POSTCONTEXT_OUT_<GROUP_ID>_<CHAIN_ID>.txt: data tuples sent to FileWriter Sink
- CHAIN_POSTCONTEXT_STAT_IN_<GROUP_ID>_<CHAIN_ID>.txt: statistic tuple sent by FileReader at end of file
- CHAIN_POSTCONTEXT_STAT_OUT_<GROUP_ID>_<CHAIN_ID>.txt: statistic tuple sent by FileReader at end of file
ite.businessLogic.transformation.lookup
Specifies whether the ITE application performs data enrichment using the lookup functionality.
If you want to use the lookup functionality, set the parameter to on. If not, set the parameter to off. In this case, the ITE application runs independently of the Lookup Manager application.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: off, on
- Other: ite.ingest.reader.schemaExtensionForLookup
Details
The ite.ingest.reader.schemaExtensionForLookup parameter is related because, when it is switched on, all attributes that the lookup introduces are already added to the stream definition of the parser output. This setting creates a stream schema that is used throughout the application, from beginning to end.
The Lookup Manager application controls the initialization and updates of the enrichment data. During the initialization and the updates, the ITE application is paused.
ite.businessLogic.transformation.outputType
Specifies the output schema of the <namespace>.chainprocessor.transformer::ChainprocessorTransformerCore composite that is handled by the <namespace>.streams::TypesCommon.TransformerOutType while considering the value of the ite.storage.type parameter. The streams are defined in "TypesCommon" and "TypesCustom" and used in the "DataProcessor" composites.
If tuple deduplication is enabled, the hash code must be part of the defined tuple.
Valid values of this parameter are:
- tableStream: This output stream becomes the input of the TableRowGenerator. One tuple contains a single table row and one hash code for deduplication. If an input record results in multiple table rows or input to different tables, several tuples must be sent by the Transformer.
- extendedTableStream: Extends the table schema, for example, if lookup data is evaluated in custom PostDedupProcessor or in CustomContext. That is, the 'tableStream' selection is extended with the <namespace>.streams::TypesCustom.ExtendedTableStream or <namespace>.streams.custom::TypesCustom.ExtendedTableStream streams.
- recordStream: Enables the RecordStreamType that contains the TransformedRecord tuple. It is used when ite.storage.type is set to 'recordFile' or 'custom'. The PostContextDataProcessor composite creates the row tuples.
Properties
Type: enum
Default: recordStream
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: extendedTableStream, recordStream, tableStream
- Other: ite.storage.type
ite.businessLogic.transformation.postprocessing.custom
Enables the custom logic that runs after the group processing but before the storage stage.
If you want to implement this custom logic, set this parameter to on and adapt the <namespace>.chainsink.custom::PostContextDataProcessor composite. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.chainsink.sample::PostContextDataProcessor composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.embeddedSampleCode
ite.businessLogic.transformation.tap
Turns the post-transformation data processor tap on or off.
If this tap is turned on, another stream that contains the tuples that passed the business logic, excluding the group logic (for example, deduplication), is activated. You may use these tuples to implement features that do not alter the data stored in the files by the main business logic. For example, the tap logic filters for tuples and sends an event to another application or another system if the filter condition is met. The spl.adapter::Export operator or any sink operator like the spl.adapter::TCPSink operator may be used with the tap data tuples.
Implement your tap logic in the <namespace>.tap.custom::TransformerTap composite operator. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.tap.sample::TransformerTap composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.businessLogic.group.tap, ite.embeddedSampleCode
Details
- The first tap is turned on with the ite.businessLogic.transformation.tap parameter and normally used only if the ite.businessLogic.group parameter is turned off.
- The second tap is turned on with the ite.businessLogic.group.tap parameter and normally used only if the ite.businessLogic.group parameter is turned on.
ite.businessLogic.transformation.tupleGroupSplit
Enables tuple grouping based on tuple attributes to increase parallelization, improve throughput, or overcome memory limitations.
For example, suppose that you want to run deduplication on several billion unique records. Even with memory-efficient deduplication, you exceed the available memory. Tuple grouping allows you to build smaller record subsets that are distributed to different instances of the deduplication logic on different hosts. The tuple grouping also ensures that tuples with the same identification, also called group ID, are routed to the same instance. The memory requirement for deduplication that runs with a subset of records is less than the memory requirement for deduplication that runs with the complete record set.
If this parameter is set to on, tuple grouping based on tuple attributes is enabled.
As a developer, you implement your custom business logic in the <namespace>.chainprocessor.transformer.custom::DataProcessor composite. As part of this implementation, you provide the destination group ID in the groupID SPL output attribute. The groupID is a 2-digit rstring attribute that supports a range from 00 to 99. The default groupID value is 00. Tuples that have the same identification must result in the same groupID value. For example, a key attribute of the tuple has a range from 0 to 255. You want to divide this range into two subranges, 0 to 127 and 128 to 255. If the key attribute is in the first range, you provide the 00 groupID. If it is in the second range, you provide the 01 groupID.
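The mapping in this example can be sketched as follows. The real implementation belongs in your SPL composite. Python is used here only to illustrate the logic, and the function name is hypothetical.

```python
def group_id_for_key(key):
    """Map a key in the range 0 to 255 to a 2-digit group ID.

    Keys 0 to 127 go to group "00", keys 128 to 255 go to group "01".
    The same key always yields the same group ID.
    """
    if not 0 <= key <= 255:
        raise ValueError(f"key out of range: {key}")
    subrange = 0 if key <= 127 else 1
    # Format as a 2-digit string, as required for the groupID attribute.
    return f"{subrange:02d}"

for key in (0, 127, 128, 255):
    print(key, group_id_for_key(key))
```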
If this parameter is set to on, the ite.businessLogic.group parameter must be set to on, and the ite.ingest.fileGroupSplit parameter must be set to off. In other words, this parameter can only be set to on for an ITE application that uses variant B. For ITE applications that use variant A or C, this parameter must be set to off.
Properties
Type: enum
Cardinality: 1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.businessLogic.group, ite.ingest.fileGroupSplit
Details
When you created the SPL project using the wizard or the teda-create-project command-line tool, you selected a variant for your ITE application. The wizard or command-line tool set this parameter to the value that is appropriate for the selected variant. Typically, you do not change this value.
ite.checkpointing.directory
Specifies the directory that receives checkpoint files.
A relative path is relative to the data directory.
For more information about the checkpoint files, see the related parameters.
Properties
Type: string
Default: "./checkpoint"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Other: ite.businessLogic.group.custom.checkpointing, ite.businessLogic.group.deduplication.checkpointing
ite.cleanup.schedule.dayOfMonth
Specifies the day or days of the month on which automated cleanup operations run. To enable automated cleanup operations, the other schedule parameters must also be specified.
Automated cleanup operations are required, for example, to remove old information from the file or tuple deduplication.
See the ScheduledBeacon operator for more information about the schedule.
Properties
Type: string
Default: empty list
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: a comma-separated list of values that match the (([1-2]?[0-9]|3[01])-)?([1-2]?[0-9]|3[01]) regular expression
Details
Before the automated cleanup runs, the ITE application suspends file processing. The cleanup operations can run from a few seconds to several hours, depending on your configuration and, for example, the number of active records in your deduplication logic.
ite.cleanup.schedule.dayOfWeek
Specifies the day or days of the week on which automated cleanup operations run. To enable automated cleanup operations, the other schedule parameters must also be specified.
Automated cleanup operations are required, for example, to remove old information from the file or tuple deduplication.
See the ScheduledBeacon operator for more information about the schedule.
Properties
Type: enum
Default: *
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: a comma-separated list of the following values: *, 0, 1, 2, 3, 4, 5, 6, Fri, Friday, Mon, Monday, Sat, Saturday, Sun, Sunday, Thu, Thursday, Tue, Tuesday, Wed, Wednesday
Details
Before the automated cleanup runs, the ITE application suspends file processing. The cleanup operations can run from a few seconds to several hours, depending on your configuration and, for example, the number of active records in your deduplication logic.
ite.cleanup.schedule.hour
Specifies the hour or hours of the day during which automated cleanup operations run. To enable automated cleanup operations, the other schedule parameters must also be specified.
Automated cleanup operations are required, for example, to remove old information from the file or tuple deduplication.
See the ScheduledBeacon operator for more information about the schedule.
Properties
Type: string
Default: "0"
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: a comma-separated list of values that match the (([0-9]|1[0-9]|2[0-3])-)?([0-9]|1[0-9]|2[0-3]) regular expression
Details
Before the automated cleanup runs, the ITE application suspends file processing. The cleanup operations can run from a few seconds to several hours, depending on your configuration and, for example, the number of active records in your deduplication logic.
ite.cleanup.schedule.minute
Specifies the minute or minutes of the hour at which automated cleanup operations run. To enable automated cleanup operations, the other schedule parameters must also be specified.
Automated cleanup operations are required, for example, to remove old information from the file or tuple deduplication.
See the ScheduledBeacon operator for more information about the schedule.
Properties
Type: string
Default: "0"
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: a comma-separated list of values that match the ([1-5]?[0-9]-)?[1-5]?[0-9] regular expression
Details
Before the automated cleanup runs, the ITE application suspends file processing. The cleanup operations can run from a few seconds to several hours, depending on your configuration and, for example, the number of active records in your deduplication logic.
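To check schedule values before you put them into a configuration file, the documented patterns of the string-typed schedule parameters (dayOfMonth, hour, and minute) can be applied as in the following Python sketch. It assumes that each comma-separated item must match the whole pattern. The function name is_valid_schedule is hypothetical, and the enum-typed dayOfWeek parameter is not covered.

```python
import re

# Patterns copied from the "Valid values" entries of the schedule parameters.
SCHEDULE_PATTERNS = {
    "dayOfMonth": r"(([1-2]?[0-9]|3[01])-)?([1-2]?[0-9]|3[01])",
    "hour": r"(([0-9]|1[0-9]|2[0-3])-)?([0-9]|1[0-9]|2[0-3])",
    "minute": r"([1-5]?[0-9]-)?[1-5]?[0-9]",
}


def is_valid_schedule(field: str, value: str) -> bool:
    """Return True if every comma-separated item matches the field's pattern.

    Assumption: an item is valid only if the pattern matches it completely.
    """
    pattern = re.compile(SCHEDULE_PATTERNS[field])
    return all(pattern.fullmatch(item) is not None for item in value.split(","))


if __name__ == "__main__":
    print(is_valid_schedule("minute", "9-59"))     # single range
    print(is_valid_schedule("hour", "0,12,23"))    # list of single values
    print(is_valid_schedule("dayOfMonth", "32"))   # out of range
```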
ite.control.debug
Enables additional file outputs that are used to troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
If this parameter is set to on, you get information about the status and status changes of the ITE application.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.businessLogic.group, ite.cleanup.schedule.dayOfMonth, ite.cleanup.schedule.dayOfWeek, ite.cleanup.schedule.hour, ite.cleanup.schedule.minute
Details
If this parameter is enabled, the following files are created:
- CONTROLLER_APPL_CTRL_OUT.txt: Receives log entries for each start or stop command that is sent to the chains.
- CONTROLLER_APPL_CTRL_RESP_IN.txt: Receives log entries for each start or stop response.
- CONTROLLER_CONTEXT_CTRL_OUT.txt: Receives log entries for each shutdown or refresh signal that is sent to the group logic. This file is created only if the ite.businessLogic.group parameter is enabled.
- CONTROLLER_CONTEXT_READY_IN.txt: Receives log entries for each shutdown or refresh response. This file is created only if the ite.businessLogic.group parameter is enabled.
- CONTROLLER_FILE_INGEST_CLEANUP_OUT.txt: Receives log entries for the initialization phase and for the automated cleanup operations that are scheduled with, for example, the ite.cleanup.schedule.dayOfMonth parameter.
- CONTROLLER_FILE_INGEST_CTRL_OUT.txt: Receives log entries for the start of the file ingestion.
ite.embeddedSampleCode
Activates sample code in created ITE projects. By default, this parameter is enabled (on), so that new projects contain a ready-to-run implementation. When you start to implement your own code in the composites of the custom namespace, you must disable this parameter. If you disable the parameter, you must also assign your parsers to ite.ingest.reader.parserList.
If this parameter is set to on, all customized code is disabled.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.tap, ite.businessLogic.transformation.postprocessing.custom, ite.ingest.customFileTypeValidator, ite.ingest.reader.preprocessing, ite.ingest.reader.schemaExtensionForLookup, ite.storage.auditOutputs, ite.storage.rejectWriter.custom
ite.export.streams
Configures one or more interfaces that export a stream. For each selected interface, an spl.adapter::Export operator is connected to the corresponding output port, which makes the stream available to spl.adapter::Import operators of applications that run in the same streaming middleware instance. The spl.adapter::Export operators in the ITE application are configured to prevent back-pressure. If an importing client does not keep up, data is lost because the connection is dropped and re-established automatically.
The following interfaces are supported:
Value | Export property | Exported SPL Schema |
---|---|---|
reader | ite="<namespace>.chainprocessor.reader_output_RecordValidator" | TypesCommon.ReaderOutStreamType |
transformer | ite="<namespace>.chainprocessor.transformer_output_DataProcessor" | TypesCommon.TransformerOutType |
writer | ite="<namespace>.chainsink_input_Writer" | TypesCommon.ChainSinkStreamType |
dedup | ite="<namespace>.context_output_Dedup" | TypesCommon.TransformerOutType |
In your custom application, the output stream of the Import operator must use the schema that the table lists for the selected interface, and the export property must be set as the subscription parameter.
- The parameter value reader,writer selects two interfaces to be exported in each chain.
- The parameter value dedup selects the output stream of the BloomFilter operator to be exported in each group.
NOTE: If the dedup interface is selected, the ite.businessLogic.group.deduplication configuration parameter must be set to on. Otherwise, the value dedup is ignored.
Properties
Type: enum
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of the following values: dedup, reader, transformer, writer
ite.fuse.chain.operators
- <namespace>.chainprocessor.reader
- <namespace>.chainprocessor.reader.custom
- <namespace>.chainprocessor.transformer
- <namespace>.chainprocessor.transformer.custom
- <namespace>.chainsink
- <namespace>.chainsink.custom
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.fuse.group.operators, ite.fuse.groupWithChain.operators
ite.fuse.group.operators
- <namespace>.context
- <namespace>.context.custom
- <namespace>.housekeeping.context.custom
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.fuse.chain.operators, ite.fuse.groupWithChain.operators
ite.fuse.groupWithChain.operators
- <namespace>.chainprocessor.reader
- <namespace>.chainprocessor.reader.custom
- <namespace>.chainprocessor.transformer
- <namespace>.chainprocessor.transformer.custom
- <namespace>.chainsink
- <namespace>.chainsink.custom
- <namespace>.context
- <namespace>.context.custom
- <namespace>.housekeeping.context.custom
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.businessLogic.group
- Other: ite.businessLogic.group.custom, ite.businessLogic.group.debug, ite.businessLogic.group.deduplication, ite.businessLogic.group.startupControlFile, ite.businessLogic.group.tap, ite.fuse.chain.operators, ite.fuse.group.operators
ite.ingest.archiveMode
Specifies the base directory that is used for the following subdirectories:
- archive: Receives successfully processed input files.
- duplicate: Receives duplicate input files (files that are already processed).
- invalid: Receives files that do not match the allowed file types and formats.
- failed: Receives files for which unexpected problems occurred that were not resolved automatically.
If you set this parameter to single, then the ite.ingest.directory.input parameter is used as base directory.
If the file that is specified with the ite.ingest.directory.inputListFile parameter contains multiple directories and ite.ingest.archiveMode is set to multiple, the subdirectories are created in each corresponding input directory.
Properties
Type: enum
Default: single
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: multiple, single
- Other: ite.ingest.directory.inputListFile
ite.ingest.customFileControl
Enables fine-grained control over the file name distribution mechanism. If this parameter is set to on, the custom composite operator <namespace>.fileingestion.custom::FileControl must be implemented. It receives all acknowledgement tuples from the chain processors after a file is processed. The developer can delay these acknowledgement tuples to control when a certain chain starts processing the next input file. The type of the acknowledgement tuples is <namespace>.streams::TypesCommon.AcknowledgedFilesType. It contains the chain number, the file name, and a few other attributes (see TypesCommon.splmm for details).
This option can only be used if the ite.ingest.loadDistribution parameter is set to equalLoad.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.ingest.loadDistribution
ite.ingest.customFileTypeValidator
Enables file-type validation. File-type validation distinguishes between different file types and data formats, for example CSV or ASN.1. Depending on the determined file type, the ITE application sends the file name to the appropriate parse logic.
If file-type validation is turned off, every file is processed. Only one parse logic exists that processes all files.
If the file-type validation is turned on, file names are determined to be valid or invalid. If a file is invalid, it is not processed but logged as invalid and moved to the invalid directory, which is a subdirectory of the input directory that is specified with the ite.ingest.directory.input parameter.
If the filename is valid, a unique file type ID is stored in the fileType SPL output attribute of the <namespace>.fileingestion.custom::FileTypeValidator composite operator. As a developer, you want to implement an algorithm that validates the file name and determines the file type in the <namespace>.fileingestion.custom::FileTypeValidator composite operator. To activate your algorithm, set this parameter to on. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.fileingestion.sample::FileTypeValidator composite operator.
The unique file type IDs that can occur as a result of your algorithm must be consistent with the types that are specified with the ite.ingest.reader.parserList parameter. Any inconsistency is reported as soon as it occurs, either leading to an unhealthy processing element or a log message for this file, depending on the ite.resilienceOptimization parameter.
The easiest algorithm checks for a file name pattern. A more complicated algorithm could read and analyze the file contents.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.embeddedSampleCode, ite.ingest.reader.parserList, ite.resilienceOptimization
ite.ingest.debug
Enables additional file outputs that are used to troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
When this parameter is on, you get information about file detection.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
Details
If this parameter is enabled, the following files are created:
- FILEINGESTION_DROPPED_FILES.txt: Receives log entries for files that are dropped because their names are either invalid or duplicates.
- FILEINGESTION_FILES.txt: Receives log entries for files that have valid and unique filenames.
- FILEINGESTION_IN_ACK_FILES.txt: Receives log entries for files that are processed and are committed to the file name deduplication logic.
- FILEINGESTION_IN_CTRL.txt: Receives log entries for start and stop commands that enable or disable the directory scan.
- FILEINGESTION_OUT_FILES.txt: Receives log entries for files that must be processed. For an ITE application in variant C, the groupID SPL attribute is set. This attribute distributes this tuple to the correct group logic instance.
- RawFiles_<sequence>.txt: Receives log entries for each detected input file. After 100,000 entries, a new log file is created with an incremented sequence number.
ite.ingest.deduplication
Enables file name deduplication.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Children: ite.ingest.deduplication.reprocessFilePattern, ite.ingest.deduplication.timeToKeep
- Other: ite.ingest.directoryScan.processFilePattern
ite.ingest.deduplication.reprocessFilePattern
Defines the file name pattern for files to reprocess. Matching file names bypass the duplicate check of the file ingestion logic, and the files are processed again. The pattern should not match the same set of files as the pattern configured for parameter ite.ingest.directoryScan.processFilePattern, because this would allow all processed files to bypass the duplicate check.
Properties
Type: string
Default: ""
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any string
- Parent: ite.ingest.deduplication
- Other: ite.ingest.deduplication.timeToKeep
Details
This parameter is active only if the parent parameter, ite.ingest.deduplication, is set to on.
ite.ingest.deduplication.timeToKeep
Specifies the time after which a file name is removed from the set of unique file names in the file name deduplication logic.
Properties
Type: string
Default: "1d"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the (\d+d)?\s*(\d+h)?\s*(\d+m)? regular expression
- Parent: ite.ingest.deduplication
- Other: ite.ingest.deduplication.reprocessFilePattern
Details
This parameter is active only if the parent parameter, ite.ingest.deduplication, is set to on.
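The following Python sketch shows how a value that matches the documented timeToKeep pattern can be broken down into a duration. It assumes that the d, h, and m suffixes stand for days, hours, and minutes, which the pattern suggests but this reference does not state explicitly. The function name parse_time_to_keep is hypothetical.

```python
import re
from datetime import timedelta

# Pattern copied from the "Valid values" entry of
# ite.ingest.deduplication.timeToKeep.
TIME_TO_KEEP = re.compile(r"(\d+d)?\s*(\d+h)?\s*(\d+m)?")


def parse_time_to_keep(value: str) -> timedelta:
    """Convert a value such as "1d" or "2d 12h 30m" into a timedelta.

    Assumption: d = days, h = hours, m = minutes.
    """
    match = TIME_TO_KEEP.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid timeToKeep value: {value!r}")
    # Each group is either None or digits followed by a one-letter suffix.
    days, hours, minutes = (
        int(part[:-1]) if part else 0 for part in match.groups()
    )
    return timedelta(days=days, hours=hours, minutes=minutes)


if __name__ == "__main__":
    print(parse_time_to_keep("1d"))
    print(parse_time_to_keep("2d 12h 30m"))
```

Note that every part of the pattern is optional, so the pattern as written also matches an empty string. The sketch returns a zero duration in that case; how the ITE application treats an empty value is not covered here.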
ite.ingest.directory.input
Specifies the path of the directory that receives the input files. A relative path is relative to the data directory.
The input files must occur in this directory as a result of an atomic action. In other words, it is recommended that you move input files into this directory instead of copying or creating them. Over time, copying or creating input files might result in incompletely processed or failed files.
Properties
Type: string
Default: "./in"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
Details
The ITE application creates the following subdirectories during the startup phase:
- archive: Receives successfully processed input files.
- duplicate: Receives duplicate input files (files that are already processed).
- invalid: Receives files that do not match the allowed file types and formats.
- failed: Receives files for which unexpected problems occurred that were not resolved automatically.
- reprocess: Contains files that will be reprocessed, for example after a correction. Move the necessary files into this directory.
ite.ingest.directory.inputListFile
Configures the path to the file that contains a list of several input directories. This file is a text file that contains one absolute or relative directory path per line. Comment lines start with a pound symbol ('#') in column 1. The list must not contain duplicates. This parameter is optional.
If this parameter is used, all files from the first directory in the list are considered urgent files. Urgent files are queued in a separate file queue, which has precedence over the normal file queue.
Properties
Type: string
Default: ""
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: any string
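The list file format that is described for ite.ingest.directory.inputListFile (one directory path per line, comment lines that start with '#' in column 1, no duplicates) can be checked with a short script before the application is submitted. The following Python sketch is an illustration only. The function name and the sample paths are hypothetical, and skipping blank lines is an assumption that this reference does not confirm.

```python
def parse_input_list(text: str) -> list:
    """Return the directory paths from the contents of an input list file."""
    directories = []
    for line in text.splitlines():
        # Comment lines start with a pound symbol in column 1.
        if line.startswith("#"):
            continue
        path = line.strip()
        if not path:
            continue  # assumption: blank lines are ignored
        if path in directories:
            raise ValueError(f"duplicate directory: {path}")
        directories.append(path)
    return directories


SAMPLE = """# first entry: directory for urgent files
./in_urgent
/data/in_a
./in_b
"""

if __name__ == "__main__":
    dirs = parse_input_list(SAMPLE)
    # Files from the first directory in the list are treated as urgent.
    print("urgent:", dirs[0])
    print("normal:", dirs[1:])
```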
ite.ingest.directoryScan.nanoSecondsPrecision
Enables scanning of files with nanosecond precision. When this parameter is turned off, all nanoseconds fields are set to zero in the directory scanner. If your file system does not support nanosecond precision, this parameter can be turned off.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
ite.ingest.directoryScan.processFilePattern
Defines a file name pattern. The directory scanner reports matching file names to the following ingestion logic. If file name deduplication is turned on, these files are checked to determine whether they have been processed. If so, the files are moved to the duplicate files folder.
Properties
Type: string
Default: ".*\.DAT$"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the .+ regular expression
- Other: ite.ingest.deduplication
ite.ingest.directoryScan.sleepTime
Specifies the time, in seconds, that the directory scanner waits after each directory scan. This parameter optimizes the scan load. For example, there is no need to scan the input directories every second if new files arrive only once per hour.
Properties
Type: float
Default: 5.0
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any float value from 1 to 3600, inclusive
ite.ingest.directoryScan.sort
Specifies the sort mode for file name tuples.
If this parameter is set to off, sorting is disabled, in contrast to the spl.adapter::DirectoryScan operator, which always sorts by file time.
If this parameter is set to ascending, file name tuples are sorted in ascending order. The sort attribute must be provided in the ite.ingest.directoryScan.sort.attribute parameter. The sort window is one scan cycle of the directory scanner.
If the parameter is set to descending, file name tuples are sorted in descending order. The sort attribute must be provided in the ite.ingest.directoryScan.sort.attribute parameter. The sort window is one scan cycle of the directory scanner.
If this parameter is set to custom, you must provide the sort logic in the custom <namespace>.fileIngestion.custom::FileSort composite operator. You can provide the sort attribute in the ite.ingest.directoryScan.sort.attribute parameter or in the <namespace>.fileIngestion.custom::FileSort composite operator itself.
The input schema of the <namespace>.fileIngestion.custom::FileSort composite operator depends on the setting of the related ite.ingest.directoryScan.specialFileTime parameter.
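The effect of the ascending and descending modes on the file name tuples of one scan cycle can be pictured with the following Python sketch. It is not the ITE implementation. The dictionary keys name, size, and time mirror the values of the ite.ingest.directoryScan.sort.attribute parameter, and treating asc and desc as short forms of ascending and descending is an assumption. The custom mode is not modeled.

```python
from operator import itemgetter


def sort_scan_cycle(files, mode, attribute=None):
    """Sort the file tuples of one scan cycle (hypothetical helper).

    files: list of dicts with the keys "name", "size", and "time".
    mode: "off", "asc", "ascending", "desc", or "descending".
    """
    if mode == "off":
        return list(files)  # sorting disabled: keep the scan order
    if mode in ("asc", "ascending"):
        reverse = False
    elif mode in ("desc", "descending"):
        reverse = True
    else:
        raise ValueError(f"unsupported sort mode: {mode}")
    if attribute not in ("name", "size", "time"):
        raise ValueError("a sort attribute is mandatory for this mode")
    return sorted(files, key=itemgetter(attribute), reverse=reverse)


if __name__ == "__main__":
    cycle = [
        {"name": "b.DAT", "size": 10, "time": 3.0},
        {"name": "a.DAT", "size": 30, "time": 1.0},
        {"name": "c.DAT", "size": 20, "time": 2.0},
    ]
    print([f["name"] for f in sort_scan_cycle(cycle, "ascending", "time")])
```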
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: asc, ascending, custom, desc, descending, off
- Children: ite.ingest.directoryScan.sort.attribute
- Other: ite.ingest.directoryScan.specialFileTime
ite.ingest.directoryScan.sort.attribute
Specifies the file-sort attribute. The file-sort attribute is used by the downstream sort operator. This parameter is an enumeration parameter with the following values:
- off: No file-sort attribute is selected.
- time: The file time is used as the sort attribute and depends on the ite.ingest.directoryScan.specialFileTime parameter.
- name: The file name is used as the sort attribute.
- size: The file size is used as the sort attribute.
If the parent parameter is set to ascending or descending, this parameter is mandatory. If the parent parameter is set to custom, it is optional. If the parent parameter is set to off, this parameter is forbidden.
If this parameter is required for the application and the related ite.ingest.directoryScan.specialFileTime parameter is turned on, this parameter must be set to time.
Properties
Type: enum
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: name, off, size, time
- Parent: ite.ingest.directoryScan.sort
- Other: ite.ingest.directoryScan.specialFileTime
ite.ingest.directoryScan.specialFileTime
Enables a user-selected source for file time data. File time data is used in the file name deduplication logic to implement the eviction policy and to sort file name tuples.
The parameter is closely related to the ite.ingest.directoryScan.sort.attribute parameter.
If this parameter is set to off, the file time attribute is determined from the modification time of the file object. If this parameter is set to on, the file time is determined from the file name. The file time generation is controlled by the ite.ingest.directoryScan.specialFileTime.regexp and ite.ingest.directoryScan.specialFileTime.format parameters.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Children: ite.ingest.directoryScan.specialFileTime.format, ite.ingest.directoryScan.specialFileTime.regexp
- Other: ite.ingest.directoryScan.sort.attribute
Details
If this parameter is set to on and file name sorting is used, the ite.ingest.directoryScan.sort.attribute parameter must be set to 'time'.
ite.ingest.directoryScan.specialFileTime.format
Provides a list of date and time formats for special file-time conversion.
Formats with a '_' separator accept any kind of separator.
Properties
Type: enum
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of the following values: DDMMYYYY, DDMMYYYYhhmmss, DD_MM_YYYY, DD_MM_YYYY_hh_mm_ss, DD_MM_YYYY_hh_mm_ss_mmm, MMDDYYYY, MMDDYYYYhhmmss, MM_DD_YYYY, MM_DD_YYYY_hh_mm_ss, MM_DD_YYYY_hh_mm_ss_mmm, YYYYMMDD, YYYYMMDDhhmmss, YYYY_MM_DD, YYYY_MM_DD_hh_mm_ss_mmm, YYY_MM_DD_hh_mm_ss
- Parent: ite.ingest.directoryScan.specialFileTime
- Other: ite.ingest.directoryScan.specialFileTime.regexp
Details
If the ite.ingest.directoryScan.specialFileTime parent parameter is set to on, this parameter is mandatory. If not, this parameter is forbidden.
The cardinality of the parameter must match the cardinality of the ite.ingest.directoryScan.specialFileTime.regexp related parameter.
ite.ingest.directoryScan.specialFileTime.regexp
If the ite.ingest.directoryScan.specialFileTime parameter is set to on, this parameter is required. The values of this parameter are a list of regular expressions. The file name is tested against this list of regular expressions. The first match is used and converted into a time, which overrides the file time attribute. The date and time format is taken from the corresponding position in the format list that is defined in the ite.ingest.directoryScan.specialFileTime.format parameter.
Each regular expression must contain one group (pair of parentheses) that isolates the date and time from the rest of the file name. If no match is found with a particular file name, the file is considered invalid and moved to the invalid files directory.
Valid values are a comma-separated list of regular expressions that contain one pair of parentheses. A comma must not be part of a regular expression.
Example:
If a file name contains a date and time substring in the last 8 digits in front of the filename extension, for example cdr_cid1234_20120405.txt, the following regular expression can extract the date and time portion: .*_([0-9]{8}).txt$
The appropriate format parameter is: YYYYMMDD
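To try out a regular expression and format pair before you configure it, the extraction can be reproduced outside the application. The following Python sketch uses the regular expression and file name from the example. The mapping of the YYYYMMDD format to the strptime pattern %Y%m%d and the function name extract_file_time are assumptions made for this illustration.

```python
import re
from datetime import datetime

# Regular expression from the example; the single group isolates the date.
FILE_TIME_REGEXP = re.compile(r".*_([0-9]{8}).txt$")
# Assumed strptime equivalent of the YYYYMMDD format value.
STRPTIME_FORMAT = "%Y%m%d"


def extract_file_time(file_name: str):
    """Return the time encoded in the file name, or None if there is no match."""
    match = FILE_TIME_REGEXP.match(file_name)
    if match is None:
        return None  # the ITE application treats such a file as invalid
    return datetime.strptime(match.group(1), STRPTIME_FORMAT)


if __name__ == "__main__":
    print(extract_file_time("cdr_cid1234_20120405.txt"))
    print(extract_file_time("cdr_cid1234.txt"))
```

In this expression, the dot in front of txt is unescaped and therefore matches any character. Writing it as \.txt$ restricts the match to a literal dot.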
Properties
Type: string
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of values that match the .+ regular expression
- Parent: ite.ingest.directoryScan.specialFileTime
- Other: ite.ingest.directoryScan.specialFileTime.format
Details
If the ite.ingest.directoryScan.specialFileTime parent parameter is set to on, this parameter is mandatory. If not, this parameter is forbidden.
The cardinality of this parameter must match the cardinality of the ite.ingest.directoryScan.specialFileTime.format related parameter.
ite.ingest.fileGroupSplit
Enables the file ingestion group split.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Children: ite.ingest.fileGroupSplit.pattern
ite.ingest.fileGroupSplit.pattern
Defines a regular expression that extracts the group ID from the file name. The expression must have exactly one group (a pair of parentheses), which isolates the group ID from the rest of the file name. If the file name does not match the pattern, it is assigned to the default group. The group configuration is defined in the group configuration file that is specified in the ite.ingest.loadDistribution.groupConfigFile parameter.
If the ite.ingest.fileGroupSplit parameter is set to on, this parameter is required.
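The following Python sketch illustrates how a pattern with exactly one group yields a group ID and how non-matching file names fall back to the default group. The pattern, the file names, and the function name are hypothetical. The default group name is assumed to be "default", as in the group configuration file example for ite.ingest.loadDistribution.groupConfigFile.

```python
import re

# Hypothetical pattern: the single group isolates the group ID that follows
# the "cdr_g" prefix of the file name.
GROUP_PATTERN = re.compile(r"^cdr_g([0-9]+)_.*\.DAT$")
DEFAULT_GROUP = "default"  # assumed name of the default group


def group_id_for_file(file_name: str) -> str:
    """Return the group ID from the file name, or the default group."""
    match = GROUP_PATTERN.search(file_name)
    return match.group(1) if match else DEFAULT_GROUP


if __name__ == "__main__":
    print(group_id_for_file("cdr_g2_20120405.DAT"))
    print(group_id_for_file("other_20120405.DAT"))
```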
Properties
Type: string
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the .+ regular expression
- Parent: ite.ingest.fileGroupSplit
- Other: ite.ingest.loadDistribution.groupConfigFile
ite.ingest.loadDistribution
Selects the distribution method for the input files to the parallel processing chains.
Properties
Type: enum
Default: equalLoad
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: equalLoad, roundRobin
- Children: ite.ingest.loadDistribution.groupConfigFile, ite.ingest.loadDistribution.numberOfParallelChains, ite.ingest.loadDistribution.udp
ite.ingest.loadDistribution.groupConfigFile
Changes the name of the group configuration file. This parameter is obsolete in variants that do not use file groups.
Relative paths are relative to the data directory.
Properties
Type: string
Default: "./config/groups.cfg"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any value that matches the .+ regular expression
- Parent: ite.ingest.loadDistribution
- Other: ite.ingest.loadDistribution.numberOfParallelChains, ite.ingest.loadDistribution.udp
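The group configuration file format that is documented for this parameter (a comment line, followed by one line per group with the group identifier, the chains per group, and the maximum BloomFilter entries) can be read with a few lines of code, for example to check a file before deployment. The following Python sketch is an illustration only. It assumes that fields are separated by commas, that group identifiers contain no commas, and that lines starting with '#' are comments. The function name parse_group_config is hypothetical.

```python
def parse_group_config(text: str) -> dict:
    """Map each group identifier to (chains per group, max BloomFilter entries)."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Remove surrounding whitespace and double quotes from every field.
        group_id, chains, max_entries = (
            field.strip().strip('"') for field in line.split(",")
        )
        groups[group_id] = (int(chains), int(max_entries))
    return groups


SAMPLE = '''#Group identifier, Chains per group, Maximum BloomFilter entries
"default" , 2 , 10000000
"2" , 1 , 10000000
"3" , 1 , 10000000
'''

if __name__ == "__main__":
    for group_id, (chains, max_entries) in parse_group_config(SAMPLE).items():
        print(group_id, chains, max_entries)
```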
Details
The following example shows the expected file format (with added whitespace for readability):
#Group identifier, Chains per group, Maximum BloomFilter entries
"default" , 2 , 10000000
"2" , 1 , 10000000
"3" , 1 , 10000000
ite.ingest.loadDistribution.numberOfParallelChains
Defines the number of parallel processing chains for application variants that do not build groups based on file names.
Properties
Type: integer
Default: 3
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: any integer value that is equal to or greater than 1
- Parent: ite.ingest.loadDistribution
- Other: ite.ingest.loadDistribution.groupConfigFile, ite.ingest.loadDistribution.udp
ite.ingest.loadDistribution.udp
Enables the user-defined parallelism feature.
If this parameter is set to on, the number of parallel chains can be increased at job submission time with one or more submission-time parameters, depending on the application variant that is used. Otherwise, the number of chains is determined at compile time and cannot be changed at submission time.
If this parameter is set to on, you need to select the distribution method roundRobin with the ite.ingest.loadDistribution parameter.
If you are using variant A or B, use the ite.ingest.loadDistribution.groupConfigFile.chains parameter. If you are using variant C, use the ite.ingest.loadDistribution.groupConfigFile.chains.00 through ite.ingest.loadDistribution.groupConfigFile.chains.99 parameters.
If the user-defined parallelism feature is used in custom code, this parameter must be turned off since nested parallel regions are not supported.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Parent: ite.ingest.loadDistribution
- Other: ite.ingest.loadDistribution.groupConfigFile, ite.ingest.loadDistribution.numberOfParallelChains
ite.ingest.reader.compression
Enables the compression parameter for the spl.adapter::FileSource operator in the specified composite operators. The default compression mode is gzip but can be changed in the <namespace>.chainprocessor.reader.custom::FileReaderCustom composite operator by setting the compression parameter for the used composite.
Enable this parameter only if your input files are compressed.
Properties
Type: enum
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of the following values: FileReaderASN1, FileReaderCSV, FileReaderStructure
ite.ingest.reader.customFileStatistics
Enables custom file statistics. To add attributes to the statistics schema, use TypesCustom::CustomFileStatisticsStreamType. If the ite.storage.type parameter is not set to 'tableFile', this parameter should be used.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
ite.ingest.reader.customParserStatistics
Enables custom parser statistics. Use TypesCustom::CustomParserStatisticsStreamType to define the parser statistic output stream type. It should be used to integrate your own parser.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
ite.ingest.reader.debug
Enables additional file outputs that are used to troubleshoot your ITE application. The files are located in the debug directory, which is a subdirectory of the configured data directory.
When you set this parameter to on, you receive information about the parsed files.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
Details
When this parameter is enabled, the following files are created:
- CHAIN_READER_FILES_IN_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for files that must be processed.
- CHAIN_READER_FILES_ACK_IN_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for files that are processed and are committed to the file name deduplication logic. The chain can then receive and process a new file.
- CHAIN_READER_FILES_APP_CTRL_IN_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each start or stop command that is received by the chain control logic.
- CHAIN_READER_REC_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each valid data tuple that leaves the record validation, which is the <namespace>.chainprocessor.reader.custom::RecordValidator, or, if the ite.embeddedSampleCode parameter is turned on, the <namespace>.chainprocessor.reader.sample::RecordValidator composite operator.
- CHAIN_READER_REJ_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each rejected data tuple that leaves the record validation, which is the <namespace>.chainprocessor.reader.custom::RecordValidator, or, if the ite.embeddedSampleCode parameter is turned on, the <namespace>.chainprocessor.reader.sample::RecordValidator composite operator.
- CHAIN_READER_STAT_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives statistics log entries for each completed file.
- CHAIN_READER_STATUS_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each status change of the chain that is initiated with a start or stop command.
- CHAIN_READER_APP_CTRL_RESP_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each start or stop response that leaves the chain control logic.
- FILE_READER_OUT_<GROUP_ID>_<CHAIN_ID>.txt: Receives log entries for each data tuple that is sent by the FileReader composites that are specified in the ite.ingest.reader.parserList parameter.
- FILE_READER_STAT_<GROUP_ID>_<CHAIN_ID>.txt: Receives statistics log entries for each completed file. The statistics are a subset of the statistics that are stored in the CHAIN_READER_STAT_OUT_<GROUP_ID>_<CHAIN_ID>.txt file. The FileReader generates these statistics.
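As a troubleshooting aid, the file names in the list above can be generated for one chain and compared with the contents of the debug directory. The following is a minimal sketch, not part of the product. The expected_debug_files helper is hypothetical, and it assumes that the group ID and chain ID are passed in exactly as the application writes them into the file names.

```python
from pathlib import PurePosixPath

# File name prefixes taken from the list above.
DEBUG_FILE_PREFIXES = [
    "CHAIN_READER_FILES_IN",
    "CHAIN_READER_FILES_ACK_IN",
    "CHAIN_READER_FILES_APP_CTRL_IN",
    "CHAIN_READER_REC_OUT",
    "CHAIN_READER_REJ_OUT",
    "CHAIN_READER_STAT_OUT",
    "CHAIN_READER_STATUS_OUT",
    "CHAIN_READER_APP_CTRL_RESP_OUT",
    "FILE_READER_OUT",
    "FILE_READER_STAT",
]


def expected_debug_files(data_dir, group_id, chain_id):
    """Return the debug file paths expected for one chain.

    Assumption: group_id and chain_id are already formatted the way the
    application writes them; this sketch does not pad or convert them.
    """
    debug_dir = PurePosixPath(data_dir) / "debug"
    return [
        debug_dir / f"{prefix}_{group_id}_{chain_id}.txt"
        for prefix in DEBUG_FILE_PREFIXES
    ]
```

For example, expected_debug_files("/data", "0", "1") returns ten paths under /data/debug.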
ite.ingest.reader.encoding
Enables the encoding parameter for the spl.adapter::FileSource operator in the specified composite operators.
Properties
Type: enum
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of the following values: FileReaderCSV
ite.ingest.reader.parserList
Enables one or more parsers and specifies the file type ids for which the parsers are responsible.
If you disable the parameter ite.embeddedSampleCode to start your customizing work, you must immediately assign your parsers to this parameter.
Properties
Type: string
Default: "*|FileReaderCustom"
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of values that match the [^|]+\|[A-Z][\w_]* regular expression
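The value format can be checked before compilation. The following minimal sketch splits a parserList value into its entries and matches each entry against the documented regular expression. The parse_parser_list helper is hypothetical, and the sketch assumes that individual entries do not contain commas.

```python
import re

# Pattern copied from the "Valid values" line above.
PARSER_ENTRY = re.compile(r"[^|]+\|[A-Z][\w_]*")


def parse_parser_list(value):
    """Split a parserList value into (file type id part, parser name) pairs.

    Raises ValueError for any entry that does not match the documented pattern.
    """
    pairs = []
    for entry in value.split(","):
        if not PARSER_ENTRY.fullmatch(entry):
            raise ValueError(f"invalid parserList entry: {entry!r}")
        file_type, parser = entry.split("|", 1)
        pairs.append((file_type, parser))
    return pairs
```

With the default value, parse_parser_list("*|FileReaderCustom") returns a single pair that assigns the FileReaderCustom parser to "*".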
ite.ingest.reader.preprocessing
Enables file preprocessing that is used to determine attribute values once per file or to determine the file type if the file type cannot be derived from the file name.
Implement your code in the <namespace>.chainprocessor.reader.custom::PreFileReader composite operator. To activate your code, set this parameter to on. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.chainprocessor.reader.sample::PreFileReader composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.embeddedSampleCode
ite.ingest.reader.schemaExtensionForLookup
If this parameter is set to on, the stream schema, which is the output of the parsing and the input to the data enrichment, is extended with the attributes that are specified in the <namespace>.streams.custom::TypesCustom.LookupType type.
These additional attributes are commonly used during the enrichment. In other words, the custom lookup code assigns the enrichment data to these attributes.
If you require additional attributes to assign your enrichment data, set this parameter to on and adapt the <namespace>.streams.custom::TypesCustom.LookupType type. To activate the customized type, you must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses the customized type instead of the sample <namespace>.streams.sample::TypesCustom.LookupType type.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.embeddedSampleCode
ite.jobName
Changes the job name of the ITE application at submission time. By default, the job name is the namespace that you specified when you created the ITE project. Each ITE job needs a unique job name to communicate with the Lookup Manager. Use this parameter to launch multiple ITE applications and assign a unique name to each of them at submission time. You must ensure that each job uses a different set of input and output directories. Specify the directories by using the relevant submission-time parameters.
You also need to tell the Lookup Manager application which ITE jobs it needs to control, by setting the submission time parameter lm.controlledApplications to contain the desired job names.
NOTE: If you use the multihost feature, the ITE jobs still share the same host tag definitions. These definitions are created at compile time and cannot be overwritten by using the ite.jobName parameter at submission time. The names of the generated host tags are still derived from the original namespace of the application, which you set up during project creation.
As a consequence, all jobs run on the same set of hosts, with the same host placements. If this is not the desired behavior, create multiple Streams instances with different sets of hosts and submit the jobs to different instances. Alternatively, you can forgo the multihost feature (do not set global.multiHost=on) and let Streams decide the host placement on its own. In that case, you must ensure that all shared resources, such as file systems and lookup segments, are accessible by all hosts.
Properties
Type: string
Cardinality: 0..1
Application scope: ITE
Provisioning time: submission time
Valid values: any value that matches the (?:[a-z][a-z0-9_]*)(?:\.[a-z][a-z0-9_]*)* regular expression
- Other: global.multiHost, ite.checkpointing.directory, ite.ingest.directory.input, ite.storage.directory.outputs, ite.storage.directory.statistics, lm.controlledApplications
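The naming rules described for this parameter can be checked before the jobs are submitted. The following minimal sketch tests a planned set of job names against the documented regular expression, verifies that the names are unique, and verifies that each name appears in the value planned for lm.controlledApplications. The check_job_names helper is hypothetical and does not talk to Streams.

```python
import re

# Pattern copied from the "Valid values" line above.
JOB_NAME = re.compile(r"(?:[a-z][a-z0-9_]*)(?:\.[a-z][a-z0-9_]*)*")


def check_job_names(job_names, controlled_applications):
    """Return a list of problems found in a planned set of ITE job names.

    controlled_applications is the comma-separated value that is planned
    for the lm.controlledApplications submission-time parameter.
    """
    controlled = {name for name in controlled_applications.split(",") if name}
    problems = []
    seen = set()
    for name in job_names:
        if not JOB_NAME.fullmatch(name):
            problems.append(f"{name}: does not match the documented pattern")
        if name in seen:
            problems.append(f"{name}: job name is not unique")
        seen.add(name)
        if name not in controlled:
            problems.append(f"{name}: missing from lm.controlledApplications")
    return problems
```

An empty result means that none of the three checks found a problem. It does not verify that the jobs use different input and output directories.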
ite.resilienceOptimization
Enables resilience against unexpected errors.
An unexpected error is, for example, a file that is deleted while being processed or a custom business logic that accesses data arrays out of bounds. For such problems, most SPL operators or functions raise exceptions and abort the processing element.
If resilience is enabled, the ITE application catches these unexpected errors and reports them in the rejected/<input-filename>.rej.csv rejection file. The rejection file is located in the output directory that is specified in the ite.storage.directory.outputs parameter. If resilience is disabled, errors lead to unhealthy processing elements (PEs) that stop tuple processing.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
ite.storage.auditOutputs
Enables an additional processing step for file statistics that you can use to, for example, write the statistics to a database or export the statistics to another application.
Implement your code in the <namespace>.chainsink.custom::AuditTableWriter composite operator. To activate your code, set this parameter to on. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.chainsink.sample::AuditTableWriter composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
ite.storage.directory.outputs
Specifies the base directory for output files. This base directory may contain load, rejected, and statistics subdirectories.
A relative path is relative to the data directory.
Properties
Type: string
Default: "./out"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Other: ite.storage.directory.statistics
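The rule that a relative path is relative to the data directory can be illustrated with a short sketch. The resolve_directory helper and the /data/ite data directory are hypothetical. The sketch assumes POSIX-style paths and that an absolute value is used unchanged.

```python
from pathlib import PurePosixPath


def resolve_directory(data_dir, value):
    """Resolve a directory parameter value against the data directory.

    A relative value is joined to data_dir. An absolute value is returned
    unchanged (assumption: absolute paths are not rebased).
    """
    return PurePosixPath(data_dir) / value
```

With these assumptions, the default value "./out" resolves to /data/ite/out when the data directory is /data/ite.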
ite.storage.directory.statistics
Specifies the base directory for the statistics log files. For each file that is processed by an ITE application, an entry is written to the statistics log file. Job statistics logs are written with the date as the first part of the file name.
The application creates an archive subdirectory. When the date changes, the log files are moved to this archive directory.
A relative path is relative to the data directory.
Properties
Type: string
Default: "./out/statistics"
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
ite.storage.outputDirectoryStructure
Specifies the structure of the output directories. Output files can reside in one directory, in different subdirectories (according to the input file that created the output files), or in subdirectories that contain all the files of one day.
Properties
Type: enum
Default: allInOne
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: allInOne, perDay, perFile
ite.storage.rejectWriter.custom
If this parameter is set to on, you can implement your own handling for rejected records, for example, to create alarms or write different files.
Implement your code in the <namespace>.chainsink.custom::RejectWriterCustom composite operator. To activate your code, set this parameter to on. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.chainsink.sample::RejectWriterCustom composite operator.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: off, on
- Other: ite.embeddedSampleCode
ite.storage.tableNames
Configures the table names that are used in the TableFileWriter. For each table name, a dedicated spl.adapter::FileSink operator is used. If the ite.storage.type parameter is set to 'tableFile', this parameter is mandatory.
Properties
Type: string
Cardinality: 0..n
Application scope: ITE
Provisioning time: compile time
Valid values: a comma-separated list of values that match the (?:[\w$]+\.)?[\w$]+ regular expression
- Other: ite.storage.type
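The two constraints on this parameter (the regular expression for each table name and the dependency on ite.storage.type) can be checked together. The following is a minimal sketch with a hypothetical check_table_names helper. It compares the storage type case-insensitively, because enumeration values are described as case-insensitive.

```python
import re

# Pattern copied from the "Valid values" line above.
TABLE_NAME = re.compile(r"(?:[\w$]+\.)?[\w$]+")


def check_table_names(storage_type, table_names):
    """Validate an ite.storage.tableNames value against ite.storage.type.

    Returns the list of table names, or raises ValueError if the value is
    missing for the tableFile type or if a name does not match the pattern.
    """
    names = [name for name in table_names.split(",") if name]
    if storage_type.lower() == "tablefile" and not names:
        raise ValueError(
            "ite.storage.tableNames is mandatory when ite.storage.type is tableFile"
        )
    for name in names:
        if not TABLE_NAME.fullmatch(name):
            raise ValueError(f"invalid table name: {name!r}")
    return names
```

The pattern allows at most one qualifier in front of the table name, so a value such as a.b.c is rejected.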
ite.storage.type
Selects the output type for your application.
You can specify tableFile to write CSV files, which can be consumed by another application, for example, to load the content of these CSV files into a database. Choose this type if you want to create many output files.
You can specify recordFile to write an output file for each input file.
Or, you can specify custom to implement your own file writer. Implement your code in the <namespace>.chainsink.custom::FileWriterCustom composite operator. To activate your code, set this parameter to custom. You must also set the ite.embeddedSampleCode parameter to off, so the ITE application uses your implementation instead of the sample logic that is provided with the <namespace>.chainsink.sample::FileWriterCustom composite operator.
If you specify the noFile option, the ITE application does not write output files for each input file. ITE applications that use, for example, variant B or C, can select this option if only <namespace>.context.custom::ContextDataProcessor creates output files. One use case for writing output files only in <namespace>.context.custom::ContextDataProcessor is that you need to aggregate data across files and the <namespace>.context.custom::ContextDataProcessor triggers events.
Properties
Type: enum
Default: recordFile
Cardinality: 0..1
Application scope: ITE
Provisioning time: compile time
Valid values: custom, noFile, recordFile, tableFile
- Other: ite.storage.tableNames
lm.applicationConfiguration
Specifies the name of the application configuration instance that stores the database name, database user, and database password in the Streams instance or Streams domain. The user must create the instance with the Streams Console.
Properties
Type: string
Default: ""
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any string
lm.commandsDirectory
Specifies the directory that is scanned for command input files. Successfully processed command input files are moved to the archive subdirectory. Input files that could not be processed are moved to the failed subdirectory. If these subdirectories do not exist, they are created during the startup phase.
A relative path is relative to the data directory.
Properties
Type: string
Default: "./in/cmd"
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
Details
The directory does not need to be in a shared file system because the Lookup Manager application always scans for new command input files on the host that has the <namespace>_lookup_writer host tag assigned.
lm.controlledApplications
Restricts the list of ITE applications that are controlled by the Lookup Manager application to a subset of the ITE applications that are defined in the LookupMgrCustomizing.xml file. The file is located in the Lookup Manager application directory.
Provide a comma-separated list of namespaces as defined in the LookupMgrCustomizing.xml file.
If the submission-time parameter is omitted, the Lookup Manager application controls all ITE applications that are defined in the LookupMgrCustomizing.xml file.
Properties
Type: string
Cardinality: 0..n
Application scope: Lookup Manager
Provisioning time: submission time
Valid values: a comma-separated list of values that match the (?:[a-z][a-z0-9_]*)(?:\.[a-z][a-z0-9_]*)* regular expression
lm.db
Specifies whether the Lookup Manager application reads enrichment data from a database source. The read enrichment data is distributed to the lookup repositories on all configured hosts.
If the Lookup Manager application reads enrichment data from a database source, set this parameter to on. If not, set it to off.
If you set this parameter to on, the child parameters must be configured according to their descriptions. If the parameter is turned off, the child parameters are inactive, and the related lm.file parameters must be turned on.
When you create a project, a connections.xml sample file is created in the application directory.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Children: lm.db.connectionName, lm.db.name, lm.db.password, lm.db.user, lm.db.vendor
- Other: lm.file
Details
The Lookup Manager uses the com.ibm.streams.db::ODBCRun operator from the Database toolkit to read the enrichment data. All required Database toolkit settings must be provided.
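The relationship between lm.db and lm.file (if one is turned off, the other must be turned on) can be expressed as a small check. The following is a minimal sketch. The enrichment_sources helper and the dictionary that stands in for the configuration are hypothetical. The defaults are the ones listed in this reference.

```python
# Defaults as documented for the two parameters.
DEFAULTS = {"lm.db": "off", "lm.file": "on"}


def enrichment_sources(config):
    """Return the enabled enrichment sources, or raise if none is enabled.

    config maps parameter names to their configured values. Omitted
    parameters fall back to the documented defaults, and enum values are
    compared case-insensitively.
    """
    def is_on(name):
        return config.get(name, DEFAULTS[name]).lower() == "on"

    sources = [name for name in ("lm.db", "lm.file") if is_on(name)]
    if not sources:
        raise ValueError("either lm.db or lm.file must be turned on")
    return sources
```

With an empty configuration, the sketch returns ["lm.file"], which reflects the documented defaults.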
lm.db.connectionName
Specifies the connection name that will be used to access the database source.
Use one of the names that is specified in the connections.xml file of the database toolkit. The XPath for these names is /connections/connection_specifications/connection_specification/@name.
The parameter is active only if the parent parameter is turned on.
Properties
Type: string
Default: "SAMPLE"
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: any value that matches the .+ regular expression
- Parent: lm.db
- Other: lm.db.name, lm.db.password, lm.db.user, lm.db.vendor
Details
The specified connection name is passed as connection parameter to the com.ibm.streams.db::ODBCRun operator.
lm.db.name
Specifies the data source name (DSN) of the target database.
Important: If this parameter is provided as a compile-time parameter, its value is visible in the SPL files that are compiled from the mixed-mode SPLMM files. To prevent security concerns, it is recommended that you provide all database access information as submission-time parameters only.
This parameter is active only if the parent parameter is turned on.
Properties
Type: string
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Parent: lm.db
- Other: lm.db.connectionName, lm.db.password, lm.db.user, lm.db.vendor
Details
This parameter is passed as a database parameter to the com.ibm.streams.db::ODBCRun operator.
Any value that is specified on the <ODBC> element of the <connection_specification> element in the connections.xml document is ignored.
For additional information, see the Database toolkit description.
lm.db.password
Specifies the password that is used to connect to the target database.
Important: If this parameter is provided as a compile-time parameter, its value is visible in the SPL files that are compiled from the mixed-mode SPLMM files. To prevent security concerns, it is recommended that you provide all database access information as submission-time parameters only.
The parameter is active only if the parent parameter is turned on.
Properties
Type: string
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Parent: lm.db
- Other: lm.db.connectionName, lm.db.name, lm.db.user, lm.db.vendor
Details
This parameter is passed as a password parameter to the com.ibm.streams.db::ODBCRun operator.
Any value that is specified on the <ODBC> element of the <connection_specification> element in the connections.xml document is ignored.
For additional information, see the Database toolkit description.
lm.db.user
Specifies the user name that is used to connect to the target database.
Important: If this parameter is provided as a compile-time parameter, its value is visible in the SPL files that are compiled from the mixed-mode SPLMM files. To prevent security concerns, it is recommended that you provide all database access information as submission-time parameters only.
The parameter is active only if the parent parameter is turned on.
Properties
Type: string
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Parent: lm.db
- Other: lm.db.connectionName, lm.db.name, lm.db.password, lm.db.vendor
Details
This parameter is passed as a user parameter to the com.ibm.streams.db::ODBCRun operator.
Any value that is specified on the <ODBC> element of the <connection_specification> element in the connections.xml document is ignored.
For additional information, see the Database toolkit description.
lm.db.vendor
Specifies the database vendor (product). The Lookup Manager application supports only a subset of the database products (DB2 and Oracle) that are supported by the Database toolkit.
All required database toolkit settings and drivers for the selected product must be provided. For example, set the STREAMS_ADAPTERS_ODBC_DB2 environment variable for DB2 or STREAMS_ADAPTERS_ODBC_ORACLE for Oracle. All other environment variables that are required by the Database toolkit must also be set, for example, STREAMS_ADAPTERS_ODBC_INCPATH and STREAMS_ADAPTERS_ODBC_LIBPATH.
The parameter is active only if the parent parameter is turned on.
Properties
Type: enum
Default: DB2
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: DB2, ORACLE
- Parent: lm.db
- Other: lm.db.connectionName, lm.db.name, lm.db.password, lm.db.user
Details
For additional information, see the Database toolkit description.
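Before you compile, you can check that the environment variables named in this description are set. The following is a minimal sketch with a hypothetical missing_odbc_variables helper. It covers only the variables that this reference mentions, and the Database toolkit might require more.

```python
import os

# Vendor-specific variables named in the description above.
VENDOR_VARIABLE = {
    "DB2": "STREAMS_ADAPTERS_ODBC_DB2",
    "ORACLE": "STREAMS_ADAPTERS_ODBC_ORACLE",
}
# Further variables that the description gives as examples.
COMMON_VARIABLES = [
    "STREAMS_ADAPTERS_ODBC_INCPATH",
    "STREAMS_ADAPTERS_ODBC_LIBPATH",
]


def missing_odbc_variables(vendor, environ=None):
    """Return the names of the listed variables that are unset or empty.

    vendor is the lm.db.vendor value (compared case-insensitively).
    environ defaults to the environment of the current process.
    """
    environ = os.environ if environ is None else environ
    required = [VENDOR_VARIABLE[vendor.upper()]] + COMMON_VARIABLES
    return [name for name in required if not environ.get(name)]
```

An empty list means that all variables covered by the sketch are set. It does not verify that their values are correct.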
lm.file
Specifies whether the Lookup Manager application reads enrichment data from files. The read enrichment data is distributed to the lookup repositories on all configured hosts.
If the Lookup Manager application reads enrichment data from files, set this parameter to on. If not, set it to off.
If this parameter is turned on, the child parameters can be configured according to their descriptions. If the parameter is turned off, the child parameters are inactive and the related lm.db parameters must be turned on.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Children: lm.file.directory, lm.file.eolMarker, lm.file.ignoreEmptyLines, lm.file.ignoreHeaderLines, lm.file.quoted, lm.file.separator
- Other: lm.db
lm.file.directory
Specifies the directory that holds enrichment data input files.
A relative path is relative to the data directory.
Enrichment data input files have either the .csv or .del.csv extension.
The parameter is active only if the parent parameter is turned on.
Properties
Type: string
Default: "."
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
- Parent: lm.file
- Other: lm.file.eolMarker, lm.file.ignoreEmptyLines, lm.file.ignoreHeaderLines, lm.file.quoted, lm.file.separator
Details
The basename (the file name without the extension) is a segment name that is provided as part of an update or delete command. The provided segment name must match one of the segment names that are defined in the LookupMgrCustomizing.xml file. The file is located in the Lookup Manager application directory.
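The mapping from an enrichment data file name to a segment name can be sketched as follows. The segment_name helper is hypothetical. It assumes that the complete .del.csv or .csv suffix counts as the extension, so the longer suffix is checked first.

```python
def segment_name(file_name):
    """Return the segment name for an enrichment data file name.

    Returns None if the file name has neither of the documented extensions.
    """
    # Check the longer extension first, because ".del.csv" also ends in ".csv".
    for extension in (".del.csv", ".csv"):
        if file_name.endswith(extension):
            return file_name[: -len(extension)]
    return None
```

The returned name must still match one of the segment names that are defined in the LookupMgrCustomizing.xml file, which this sketch does not check.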
lm.file.eolMarker
Specifies the end of line marker of the CSV lines when the Lookup Manager reads the enrichment data from files.
Properties
Type: string
Default: "\n"
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any string
- Parent: lm.file
- Other: lm.file.directory, lm.file.ignoreEmptyLines, lm.file.ignoreHeaderLines, lm.file.quoted, lm.file.separator
Details
This parameter value defaults to "\n". Valid values include strings with one or two characters, such as "\r" and "\r\n". For more details, look at the eolMarker parameter in the FileSource reference.
lm.file.ignoreEmptyLines
Specifies how to handle empty CSV lines when the Lookup Manager reads the enrichment data from files. If this parameter is turned on, empty lines are dropped. If it is turned off, empty lines produce tuples with default attribute values, and multiple empty lines cause warnings when the data is written to the lookup repositories.
Properties
Type: enum
Default: on
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Parent: lm.file
- Other: lm.file.directory, lm.file.eolMarker, lm.file.ignoreHeaderLines, lm.file.quoted, lm.file.separator
lm.file.ignoreHeaderLines
Specifies how to handle header lines in the CSV input file that contains the enrichment data. A header line is the first line after punctuation. If this parameter is turned on, the header is dropped. Otherwise, the header is handled like a normal line.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Parent: lm.file
- Other: lm.file.directory, lm.file.eolMarker, lm.file.ignoreEmptyLines, lm.file.quoted, lm.file.separator
lm.file.quoted
Specifies the quoting mode for attribute values of the CSV input file that contains the enrichment data. If the parameter is turned on, some or all values are quoted. If it is turned off, no value is quoted. If you are sure that no input value is quoted, turn off the parameter to improve performance.
Properties
Type: enum
Default: off
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: off, on
- Parent: lm.file
- Other: lm.file.directory, lm.file.eolMarker, lm.file.ignoreEmptyLines, lm.file.ignoreHeaderLines, lm.file.separator
lm.file.separator
Specifies the separator that delimits the attribute values in the CSV input file that contains the enrichment data. The valid value is a string of one or more characters.
Properties
Type: string
Default: ","
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time
Valid values: any string
- Parent: lm.file
- Other: lm.file.directory, lm.file.eolMarker, lm.file.ignoreEmptyLines, lm.file.ignoreHeaderLines, lm.file.quoted
lm.statisticsDirectory
Specifies the directory that is used to store log and statistics files.
The Lookup Manager application collects statistics for the lookup repository, for example, the amount of available memory, and generates log information, for example, the starting and ending times of processed commands. The Lookup Manager application writes this information to the <date>_LookupManagerStatistics.txt file, where <date> has the YYYYMMDD format.
The specified directory holds only one statistics log file. If a new file is created because a new day begins, the old file is moved to the archive directory. The archive directory is created during the startup phase.
A relative path is relative to the data directory.
Properties
Type: string
Default: "./out/statistics"
Cardinality: 0..1
Application scope: Lookup Manager
Provisioning time: compile time, submission time
Valid values: any value that matches the .+ regular expression
Details
The directory does not need to be in a shared file system because the Lookup Manager application always runs on the host that has the <namespace>_lookup_writer host tag assigned.
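To locate the statistics log file for a given day, the file name can be derived from the date. The following is a minimal sketch with a hypothetical statistics_file_name helper. It only formats the name in the documented <date>_LookupManagerStatistics.txt pattern with a YYYYMMDD date.

```python
from datetime import date


def statistics_file_name(day):
    """Return the Lookup Manager statistics file name for the given date."""
    return f"{day:%Y%m%d}_LookupManagerStatistics.txt"
```

For example, statistics_file_name(date(2024, 3, 5)) returns "20240305_LookupManagerStatistics.txt". Files for earlier days are expected in the archive directory.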