Checking for duplicate rows is the purpose of CONTROL_M4. At this stage we need to check whether the tuple (PFI-KEY, SN-KEY, T-KEY) is unique (first chart below). As mentioned before, we can also check the uniqueness of the other fields.
Finding duplicates here could mean one of the following:
- There was an issue in gathering the data.
- One or more of the three fields in the tuple is not a good candidate for the key. We may need to review this with the business.
- There is no business problem. Some systems or applications pull data logs regularly, so it is normal to get the same information several times. In this case it is good practice to check whether the other fields are identical:
  - If they are identical → filter out the duplicates, since the rows are exact copies.
  - If they differ → review them with the business to decide what to do with the data.
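
The check described above can be sketched in a few lines of Python. This is only an illustration, not the actual CONTROL_M4 implementation: it assumes the rows are available as dictionaries, the `value` column and the sample data are hypothetical, and `split_duplicates` is a name made up for this example. It groups rows by the key tuple, keeps a single copy when all rows for a key are identical, and sets aside the keys whose rows disagree so they can be reviewed with the business.

```python
from collections import defaultdict

# Key columns taken from the tuple named in the text.
KEY_FIELDS = ("PFI-KEY", "SN-KEY", "T-KEY")


def split_duplicates(rows, key_fields=KEY_FIELDS):
    """Group rows by key and separate exact copies from conflicting rows.

    Returns (deduplicated, conflicts):
      - deduplicated: one row per key whose rows are all identical
      - conflicts: dict mapping a key to its distinct, disagreeing rows
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        groups[key].append(row)

    deduplicated, conflicts = [], {}
    for key, group in groups.items():
        # Keep distinct rows only, preserving first-seen order.
        distinct = []
        for row in group:
            if row not in distinct:
                distinct.append(row)
        if len(distinct) == 1:
            deduplicated.append(distinct[0])
        else:
            conflicts[key] = distinct
    return deduplicated, conflicts


# Hypothetical sample data: "value" stands in for the non-key fields.
rows = [
    {"PFI-KEY": 1, "SN-KEY": "A", "T-KEY": "2020-01", "value": 10},
    {"PFI-KEY": 1, "SN-KEY": "A", "T-KEY": "2020-01", "value": 10},  # exact copy
    {"PFI-KEY": 2, "SN-KEY": "B", "T-KEY": "2020-01", "value": 5},
    {"PFI-KEY": 2, "SN-KEY": "B", "T-KEY": "2020-01", "value": 7},   # conflicting row
    {"PFI-KEY": 3, "SN-KEY": "C", "T-KEY": "2020-02", "value": 1},
]

deduplicated, conflicts = split_duplicates(rows)
print(len(deduplicated), "clean rows;", len(conflicts), "keys to review")
```

Keys that end up in `conflicts` are not dropped automatically. They are the cases that either point to a data-gathering issue or to a key that needs to be revisited with the business.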