"Data deduplication refers to the elimination of redundant data. In the deduplication process, duplicate data is deleted or linked together, leaving only one copy of the data to be stored" (Ellicium).
"Deduplication and data linkage are important tasks in the preprocessing step for many data mining projects. It is important to improve data quality before data is loaded into a data warehouse" (International Journal of Computer Applications).
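As a rough illustration of the "deleted or linked together" idea in the definitions above, the following minimal Python sketch stores each distinct record once and keeps a link from every input record to its single stored copy. It is one possible approach under simple assumptions, not a method taken from either quoted source: the function name `deduplicate`, the sample email records, and the use of a SHA-256 content hash for exact-match detection are all hypothetical choices made for this example.

```python
import hashlib


def deduplicate(records):
    """Store each distinct record once and link every input record to that copy.

    Assumption for this sketch: two records are duplicates only if their
    text is byte-for-byte identical.
    """
    store = {}  # content hash -> the single stored copy
    links = []  # one content hash per input record, pointing into `store`
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
        store.setdefault(digest, record)  # keep only the first copy seen
        links.append(digest)              # later duplicates become links
    return store, links


# Hypothetical sample data: the first and third records are identical.
records = ["alice@example.com", "bob@example.com", "alice@example.com"]
store, links = deduplicate(records)
print(len(records), "records in,", len(store), "stored")
```

Because the links preserve the original order, the input can be rebuilt with `[store[d] for d in links]`. Exact hashing only catches identical records; near-duplicates such as differing capitalization or whitespace would need a normalization or record-linkage step before hashing.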
A definitive guide to data definitions and trends, from the team at Stitch.