Checking the existence
The screenshot below shows the NULL detection we must perform on the T-KEY (the Date field):
Table Profiling with Ataccama Data Quality Analyzer
Like the PFI-KEY, this field must not be empty. In the case above, 1% of the dataset has a NULL T-KEY.
We'll have to check with the business users what to do with this 1% of records:
- Do we filter out these rows?
- Do we set a fixed date ourselves (such as the first or the latest one in the same Process Flow)?
- Is there another date and time in the other fields that we could use?
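To make the first two options concrete, here is a minimal Python sketch. It assumes the records are plain dictionaries, that the columns are called `PFI_KEY` and `T_KEY` (hypothetical names, not taken from any tool), and that dates are ISO-style strings so that they sort chronologically. It only illustrates the idea and is not how Ataccama performs the check.

```python
from collections import defaultdict


def is_missing(value):
    """Treat None and blank strings as NULL."""
    return value is None or str(value).strip() == ""


def null_ratio(rows, field="T_KEY"):
    """Share of rows whose `field` is missing (0.0 for an empty dataset)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if is_missing(r.get(field))) / len(rows)


def drop_missing(rows, field="T_KEY"):
    """Option 1: filter out the rows without a date."""
    return [r for r in rows if not is_missing(r.get(field))]


def fill_from_flow(rows, key="PFI_KEY", field="T_KEY", use="first"):
    """Option 2: reuse the first or latest date of the same process flow.

    Assumes ISO-style date strings, which compare in chronological order.
    Rows whose flow has no known date at all are left untouched.
    """
    known = defaultdict(list)
    for r in rows:
        if not is_missing(r.get(field)):
            known[r[key]].append(r[field])
    pick = min if use == "first" else max
    filled = []
    for r in rows:
        if is_missing(r.get(field)) and known[r[key]]:
            r = {**r, field: pick(known[r[key]])}  # copy, do not mutate input
        filled.append(r)
    return filled


# Hypothetical sample data
rows = [
    {"PFI_KEY": "A", "T_KEY": "2021-03-01 08:00:00"},
    {"PFI_KEY": "A", "T_KEY": None},
    {"PFI_KEY": "A", "T_KEY": "2021-03-02 09:30:00"},
    {"PFI_KEY": "B", "T_KEY": "2021-03-05 10:00:00"},
]

print(null_ratio(rows))                      # 0.25
print(len(drop_missing(rows)))               # 3
print(fill_from_flow(rows)[1]["T_KEY"])      # 2021-03-01 08:00:00
```

Which option is right remains a business decision; the code only shows what each choice does to the data.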
Checking the uniqueness
Checking uniqueness does not really make sense for this field. The only warning sign is an unusually high number of duplicates. In this case a business review may be necessary.
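As a rough illustration of what "too many duplicates" could mean, the sketch below counts how often each timestamp occurs and reports the values that exceed a given share of the dataset. The function name and the `max_share` threshold are assumptions made for this example; the right threshold depends on the process and should be agreed with the business.

```python
from collections import Counter


def duplicate_report(values, max_share=0.05):
    """Return timestamps that repeat and exceed `max_share` of non-NULL values."""
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        value: n
        for value, n in counts.items()
        if n > 1 and n / total > max_share
    }


# Hypothetical sample: a midnight timestamp repeated suspiciously often
timestamps = (
    ["2021-03-01 00:00:00"] * 6
    + ["2021-03-01 08:15:00", "2021-03-02 09:30:00",
       "2021-03-02 09:30:00", "2021-03-03 11:45:00"]
)

print(duplicate_report(timestamps, max_share=0.5))
# {'2021-03-01 00:00:00': 6}
```

A value such as midnight that dominates the column often points to a default or truncated date rather than a real event time, which is the kind of finding worth raising in a business review.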
The Data format
As the date field can come from different data sources, it can also arrive in different formats. Date formats are always a tricky part of data integration, so profiling the date format is a good practice to ensure we'll have an accurate date as the T-KEY.
For example, ABBYY Timeline accepts these formats:
So checking that our T-KEY field respects these formats is a prerequisite.
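A simple way to profile date formats is to try each value against a list of candidate patterns and count how many values match each one. The sketch below does this with Python's standard `datetime.strptime`. The patterns in `CANDIDATE_FORMATS` are placeholders chosen for illustration and are not the list of formats accepted by ABBYY Timeline; replace them with the formats your target tool documents.

```python
from collections import Counter
from datetime import datetime

# Placeholder patterns for illustration only; order matters because
# the first matching pattern wins.
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%d/%m/%Y %H:%M",
    "%m/%d/%Y",
]


def detect_format(value, formats=CANDIDATE_FORMATS):
    """Return the first pattern that parses `value`, or None if none does."""
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return fmt
        except (ValueError, TypeError):
            continue
    return None


def profile_formats(values, formats=CANDIDATE_FORMATS):
    """Count how many values match each pattern (None = unrecognised)."""
    return Counter(detect_format(v, formats) for v in values)


# Hypothetical sample values coming from different sources
dates = [
    "2021-03-01 08:00:00",
    "2021-03-01T09:15:00",
    "01/03/2021 10:30",
    "03/25/2021",
    "March 1st, 2021",
]

for fmt, count in profile_formats(dates).items():
    print(fmt, count)
```

Any value reported under `None` does not match a known pattern and needs to be converted or reviewed before loading. Note that a value such as `01/03/2021` is ambiguous between day-first and month-first patterns, so a successful parse alone does not prove the date was interpreted correctly.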