What does it do?
Allows you to handle duplicate records within a dataset.
The module gives you a high degree of control over the deduplication method and what to do with duplicate records.
Settings – Dedupe Columns
|Column||Defines which column or columns the deduplication is based on||You can select more than one column. Two records are considered duplicates only if their values match in every selected column.|
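The matching rule above can be illustrated with a minimal Python sketch. It is not the module's implementation: the function name `group_by_key` and the sample records are hypothetical, and the sketch only shows how selecting more columns narrows what counts as a duplicate.

```python
from collections import defaultdict


def group_by_key(records, dedupe_columns):
    """Group record positions by the tuple of values in the selected columns.

    Hypothetical helper for illustration only. Records that share a key
    match in every selected column and are therefore duplicates of each other.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(record[column] for column in dedupe_columns)
        groups[key].append(index)
    return dict(groups)


records = [
    {"name": "Ann", "city": "Leeds", "age": 31},
    {"name": "Ann", "city": "York", "age": 31},
    {"name": "Ann", "city": "Leeds", "age": 45},
]

# One dedupe column: all three records share the same name.
by_name = group_by_key(records, ["name"])

# Two dedupe columns: only the first and third records match on both.
by_name_city = group_by_key(records, ["name", "city"])
```

With `name` alone, all three records fall into one group. Adding `city` splits them, so only the two Leeds records are treated as duplicates.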
Settings – Options
|Dedup Behaviour||Defines what to do if a duplicate record is found||Determines which records are excluded when duplicates are found (e.g. exclude first, exclude randomly)|
|Output Type||Defines the method for outputting duplicates||You can send duplicate records to a separate output or mark them with a flag|
|Missing Column Behaviour||Defines the behaviour in the event of a missing dedupe column|
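As a rough illustration of how these options interact, the sketch below shows one way the behaviour and output type could combine. Everything here is an assumption made for the example: the function `dedupe`, the option values `"first"`, `"last"`, `"random"`, `"flag"` and `"separate"`, and the `is_duplicate` field are hypothetical names, and the module's own option names and semantics may differ. The sketch does not model the Missing Column Behaviour setting; a missing dedupe column simply raises `KeyError`.

```python
import random


def dedupe(records, dedupe_columns, keep="first", output="flag", seed=None):
    """Hypothetical sketch: pick one record to keep per duplicate group.

    keep   -- which record in each group survives: "first", "last" or "random"
    output -- "flag" marks duplicates in place; "separate" returns two lists
    """
    rng = random.Random(seed)

    # Group record positions by the values of the selected columns.
    groups = {}
    for index, record in enumerate(records):
        key = tuple(record[column] for column in dedupe_columns)
        groups.setdefault(key, []).append(index)

    # Decide which position in each group is kept.
    kept = set()
    for indices in groups.values():
        if keep == "first":
            kept.add(indices[0])
        elif keep == "last":
            kept.add(indices[-1])
        elif keep == "random":
            kept.add(rng.choice(indices))
        else:
            raise ValueError(f"unknown keep option: {keep!r}")

    if output == "flag":
        # Every record is returned, with a boolean flag on each copy.
        return [
            dict(record, is_duplicate=(index not in kept))
            for index, record in enumerate(records)
        ]
    if output == "separate":
        # Kept records and excluded duplicates are returned as two lists.
        unique = [r for i, r in enumerate(records) if i in kept]
        duplicates = [r for i, r in enumerate(records) if i not in kept]
        return unique, duplicates
    raise ValueError(f"unknown output option: {output!r}")


records = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": "a@x.com"},
    {"id": 3, "email": "b@x.com"},
]

flagged = dedupe(records, ["email"], keep="first", output="flag")
unique, duplicates = dedupe(records, ["email"], keep="last", output="separate")
```

Flagging keeps every record in a single output, which is useful when you want to review duplicates before discarding them. A separate output suits cases where the duplicates feed a different downstream step.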
Tips & Tricks