Data ingestion

Identifying input files
Applications read files from one or more configured input directories.
Sorting input files by file name
When your business logic relies on reading the input files in a certain order, you configure the directory scan to report the identified files sorted by file name.
Sorting input files by file size
When your business logic relies on reading the input files in a certain order, you configure the directory scan to report the identified files sorted by file size.
Sorting input files by file time
When your business logic relies on reading the input files in a certain order, you configure the directory scan to report the identified files sorted by file time.
Sorting input files by special file time
When your business logic relies on reading the input files in a certain order, you configure the directory scan to report the identified files sorted by a special file time.
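The following is a minimal Python sketch of a sorted directory scan. It is not the framework's API: the function name `scan_sorted` and the key names `name`, `size`, and `mtime` are hypothetical, and "file time" is assumed here to mean the file's last-modification time.

```python
from pathlib import Path


def scan_sorted(directory, key="name"):
    """Return the regular files in `directory`, sorted by the chosen key.

    Hypothetical helper for illustration only. The file name is used as a
    tie-breaker so that the order is deterministic.
    """
    sort_keys = {
        "name": lambda p: p.name,
        "size": lambda p: (p.stat().st_size, p.name),
        # Assumption: "file time" means the last-modification time.
        "mtime": lambda p: (p.stat().st_mtime, p.name),
    }
    if key not in sort_keys:
        raise ValueError(f"unknown sort key: {key!r}")
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return sorted(files, key=sort_keys[key])
```

Calling `scan_sorted("/data/in", key="size")` would return the files in that directory from smallest to largest; the path is only an example.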
Finding file duplicates
Occasionally, input files arrive more than once in your application's landing zones.
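As an illustration, duplicate detection can be sketched in Python as follows. The text above does not state how duplicates are recognized, so this sketch assumes that two files are duplicates when their contents have the same SHA-256 digest. The names `file_digest` and `DuplicateFilter` are hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class DuplicateFilter:
    """Remembers content digests of files that were already seen.

    Assumption: a duplicate is a file with identical content,
    regardless of its name.
    """

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, path):
        digest = file_digest(path)
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False
```

A name-based check would be cheaper but would miss a file that is delivered again under a different name; which trade-off is right depends on how the landing zones are fed.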
Using load distribution to distribute files
To increase the throughput of your application, you distribute the detected input files to many processing chains, which all work on the data in parallel.
Distributing files to processing chains defined on job submission
To increase the throughput of your application, you distribute the detected input files to processing chains that are defined on job submission and that all work on the data in parallel.
Using file group split to distribute files
To increase the throughput of your application, you use a file group split to distribute the detected input files to many processing chains, which all work on the data in parallel.
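The general idea of distributing files to parallel processing chains can be sketched in Python as follows. This is a simplified illustration, not the framework's mechanism: it assumes a round-robin assignment and models each chain as a worker thread. The names `distribute`, `run_chains`, and `process_file` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute(files, chain_count):
    """Assign files to `chain_count` chains in round-robin order."""
    chains = [[] for _ in range(chain_count)]
    for index, name in enumerate(files):
        chains[index % chain_count].append(name)
    return chains


def run_chains(files, chain_count, process_file):
    """Process each chain's files in its own worker thread.

    Returns one result list per chain, in chain order.
    """
    chains = distribute(files, chain_count)

    def run_chain(chain):
        # Within a chain, files are processed sequentially and in order.
        return [process_file(name) for name in chain]

    with ThreadPoolExecutor(max_workers=chain_count) as pool:
        return list(pool.map(run_chain, chains))
```

With five files and two chains, the first chain receives the first, third, and fifth file, and the second chain receives the other two. Round-robin keeps the chains evenly loaded by file count but not necessarily by file size.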
Choosing a parser
To read the data from the input files, you need to configure a parser.
Using many parsers
Sometimes a single parser is not sufficient, for example in use cases where the logic needs to read several different file types.
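One common way to combine several parsers is to select the parser by file type. The Python sketch below assumes that the file type can be derived from the file name extension. The two formats (CSV and JSON Lines) and all names, such as `PARSERS` and `parse_file_content`, are chosen for illustration only and are not the framework's API.

```python
import csv
import io
import json
from pathlib import Path


def parse_csv(text):
    """Parse CSV text with a header row into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_json_lines(text):
    """Parse text that holds one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical registry: file name extension -> parser function.
PARSERS = {
    ".csv": parse_csv,
    ".jsonl": parse_json_lines,
}


def parse_file_content(file_name, text):
    """Select a parser by extension and apply it to the file content."""
    suffix = Path(file_name).suffix.lower()
    parser = PARSERS.get(suffix)
    if parser is None:
        raise ValueError(f"no parser registered for {file_name!r}")
    return parser(text)
```

Supporting another file type then only requires one more entry in the registry.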
Activating the compression parameter for file readers
Sometimes the input files that you receive are compressed, and you need to decompress them before processing.
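What a reader does when compression is enabled can be sketched in Python as follows. The sketch assumes gzip compression, which the text above does not specify, and detects it from the two leading gzip magic bytes instead of a configuration parameter. The name `open_maybe_gzip` is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzip(path):
    """Open `path` for binary reading, decompressing gzip transparently.

    Assumption: gzip is the only compression format in use.
    """
    with open(path, "rb") as f:
        magic = f.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

The caller reads from the returned file object in the same way in both cases, so the parsing logic does not need to know whether the file was compressed.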
Activating the encoding parameter for file readers
Sometimes the text input files that you receive use a different encoding than the reader expects, and you need to configure the reader accordingly.
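The effect of an encoding setting can be illustrated with a short Python sketch. The function name `read_text_records` and the UTF-8 default are assumptions for this example and do not describe the framework's reader.

```python
def read_text_records(path, encoding="utf-8"):
    """Read a text file with an explicit encoding, one record per line.

    Raises UnicodeDecodeError if the bytes are not valid in `encoding`.
    """
    # newline="" keeps the original line endings so that they can be
    # stripped explicitly, whether they are "\n" or "\r\n".
    with open(path, "r", encoding=encoding, newline="") as f:
        return [line.rstrip("\r\n") for line in f]
```

For example, a file that was written in Latin-1 and contains the word "Zürich" is read correctly with `encoding="latin-1"`, while reading it as UTF-8 fails with a `UnicodeDecodeError`.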
Using file preprocessing
If you cannot derive the file type easily from the file name and some more processing is necessary, the framework provides a preprocessing class that you customize to your needs.
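As an illustration of the kind of logic that such preprocessing might contain, the Python sketch below derives a file type from the first bytes of the content instead of the file name. The function `detect_file_type`, the recognized types, and the detection rules are assumptions for this example and are not part of the framework.

```python
def detect_file_type(head):
    """Guess a file type from the leading bytes of a file.

    `head` is the first few bytes of the file content. The rules are
    deliberately simple heuristics, not a complete format detection.
    """
    if head.startswith(b"\x1f\x8b"):
        return "gzip"  # gzip magic bytes
    stripped = head.lstrip()
    if stripped.startswith(b"<"):
        return "xml"  # XML declaration or root element
    if stripped.startswith((b"{", b"[")):
        return "json"  # JSON object or array
    return "unknown"
```

A caller would typically read a small prefix of the file, for example the first 16 bytes, pass it to the function, and use the result to select the parser.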