Data Processing
- Grouping with tuple deduplication
- Configure your application to group tuples and remove duplicate tuples from the data.
- Grouping with custom correlation
- Configure your application to group tuples and correlate them with your own business logic.
- Grouping with custom correlation and tuple deduplication
- Configure your application to group tuples, correlate them with your own business logic, and remove duplicate tuples from the data.
- Processing files without group processing (Variant A)
- When your data requires neither record deduplication nor group processing, use the framework in its simplest configuration.
- Processing files with file content based group processing (Variant B)
- When the files contain group information as part of the tuple contents, use a configuration that splits groups based on file contents, so that each tuple determines the context.
- Processing files with file name based group processing (Variant C)
- When the file names contain the information that determines the group, configure the framework to send scanned files to the group-specific processing chains.
- Enabling data enrichment
- To enrich incoming data with already known data, enable the lookup function in your application.
- Using checkpointing for custom logic
- To preserve the state of your customizations or of the deduplication across an application restart, enable checkpointing for the custom logic, the deduplication logic, or both.
- Cleaning up data
- The framework provides parameters that define regular, automated clean-up cycles, which remove old data and free resources.
- Tapping internal data streams
- If your use case must process data on an extra path apart from the standard data flow, the framework provides two taps that you enable through parameters.
- Rejecting tuples by using your own implementation
- If you need special business logic to reject data records, enable a composite in which you implement your rejection logic.
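
The difference between the grouping variants and deduplication listed above can be illustrated with a small, framework-independent sketch. It is not the framework's API: the function names, the record layout, and the file-name pattern are all hypothetical and chosen only for illustration. It assumes records are plain dictionaries and that a group name, when encoded in a file name, precedes the first underscore.

```python
import re
from collections import defaultdict


def group_by_content(records, key_field):
    """Variant B idea: each record carries its own group key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_field]].append(rec)
    return dict(groups)


def group_from_file_name(file_name, pattern=r"^(?P<group>[A-Za-z0-9]+)_"):
    """Variant C idea: the file name decides the group for the whole file.

    The default pattern is an assumption for this sketch only.
    Returns None when the name does not match the pattern.
    """
    match = re.match(pattern, file_name)
    return match.group("group") if match else None


def deduplicate(records, key_fields):
    """Keep the first record for each combination of key fields."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical sample data
records = [
    {"region": "north", "id": 1},
    {"region": "south", "id": 2},
    {"region": "north", "id": 1},  # duplicate of the first record
]
unique = deduplicate(records, ["region", "id"])
groups = group_by_content(unique, "region")
```

In this sketch, content-based grouping inspects every record, whereas file-name-based grouping makes one decision per file, which is why the latter can route a scanned file to a group-specific chain before any record is read.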