Finding file duplicates

Occasionally, input files arrive more than once in your application’s landing zones. You can configure your application to ignore such duplicate files by setting a few parameters.

About this task

Enable detection of duplicate input files by setting two parameters. This prevents those duplicates from being fed into your application more than once.

Procedure

  1. In the file <PathToYourApplication>/config/config.cfg, find the description of the ite.ingest.deduplication parameter.
  2. To enable duplicate detection, set the parameter to on (this is the default value): ite.ingest.deduplication=on
  3. In the file <PathToYourApplication>/config/config.cfg, find the description of the ite.ingest.deduplication.timeToKeep parameter.
  4. To define how many days the application checks for duplicate files, set the parameter to the required number of days, for example: ite.ingest.deduplication.timeToKeep=7d
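
For reference, the two settings from the procedure might sit together in <PathToYourApplication>/config/config.cfg roughly as follows. This is only a sketch that repeats the example values from the steps above; the 7d value is illustrative, and where the lines belong in your file depends on your existing configuration.

```
ite.ingest.deduplication=on
ite.ingest.deduplication.timeToKeep=7d
```

With these values, duplicate detection is enabled and the application checks for duplicates over a period of seven days.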