Building an additional storage area is an important initiative when the goal is to industrialize Process Mining usage and follow-up. In other words, this initiative is mandatory if the goal is to scale. Indeed, a Process Mining initiative must not be a one-shot project that ends once the process has been improved. As we saw with the DMAIC methodology (Lean Six Sigma), it is equally important to control and monitor the process throughout its life.
As a direct consequence, we will need to integrate the data on a regular basis. All the data quality checks and data preparation steps we have seen before will therefore be repeated on the same regular basis. To manage this efficiently, we will need to set up or leverage at least:
- A storage area (database, file system)
- Many Data Pipelines to feed the storage area accordingly
- A dedicated and specialized team (Data Integration team) in charge of managing the two elements above
- A Data Governance policy
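To make the idea of a recurring integration more concrete, here is a minimal sketch of one pipeline run that re-applies quality checks before feeding the storage area. Everything in it is an assumption made for illustration: SQLite stands in for whatever storage area is actually used, the `events` table and its `case_id`, `activity` and `timestamp` columns are hypothetical names, and the two checks (required fields present, timestamp parseable) are only examples of the rules seen earlier.

```python
import sqlite3
from datetime import datetime

# Hypothetical column names; adapt them to the real event data.
REQUIRED_FIELDS = ("case_id", "activity", "timestamp")


def check_quality(rows):
    """Split rows into (valid, rejected) using two illustrative rules."""
    valid, rejected = [], []
    for row in rows:
        try:
            # Rule 1: every required field must be present and non-empty.
            if any(not row.get(field) for field in REQUIRED_FIELDS):
                raise ValueError("missing required field")
            # Rule 2: the timestamp must be a valid ISO 8601 string.
            datetime.fromisoformat(row["timestamp"])
        except (ValueError, TypeError):
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected


def load_batch(conn, rows):
    """Run the checks, then append the valid rows to the storage area.

    Returns (number of valid rows, number of rejected rows).
    The UNIQUE constraint makes re-running the same batch harmless.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "case_id TEXT, activity TEXT, timestamp TEXT, "
        "UNIQUE (case_id, activity, timestamp))"
    )
    valid, rejected = check_quality(rows)
    conn.executemany(
        "INSERT OR IGNORE INTO events "
        "VALUES (:case_id, :activity, :timestamp)",
        valid,
    )
    conn.commit()
    return len(valid), len(rejected)
```

In practice a scheduler would call something like `load_batch` at each refresh, and the rejected rows would be reported to the Data Integration team rather than silently dropped.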
This is an enterprise-grade Process Mining initiative, and therefore a real strategic program. We can contrast this implementation with the tactical approach that most companies currently take when they simply launch a one-shot, isolated Process Mining project.
The objective here is clearly to put in place a solid, long-lasting company initiative capable of managing and monitoring the enterprise’s processes.
Here are some characteristics and consequences of setting up such an implementation:
- Data must be controlled and provided by IT (connectivity / accessibility)
- Complex flows and transformations are needed
- The trade-off between performance and volume can be managed
- Operational monitoring is required (different latencies must be managed)
- Scalability is enabled
- The architecture is open, as the Process Mining and data storage solutions can be independent
- Skills and duties are separated (between the people in charge of the data and those who analyze the processes). This brings more consistency and flexibility.
Here are some basic benefits of this approach:
- Business users can be fully autonomous when leading their Process Mining discoveries and monitoring (aka Citizen Process Miners)
- Data can be made accessible in many formats, and therefore to several Process Mining solutions (file, database via JDBC, etc.)
- Once picked, the data from the Process Warehouse needs no transformation, or only very simple ones (rules and joins only)
- No Data Ops (no need to export/import the transformation pipelines, manage versioning, etc.)
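As an illustration of the "rules and joins only" point, the sketch below picks data from a warehouse table with one join and one filter, then serves the same result as a file. It is only a sketch under stated assumptions: SQLite stands in for the real database, and the table names (`mart_order_events`, `dim_orders`), the columns and the sample rows are all invented for the example.

```python
import csv
import io
import sqlite3


def build_demo_warehouse():
    """Create an in-memory stand-in for the warehouse with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mart_order_events (case_id TEXT, activity TEXT, timestamp TEXT);
        CREATE TABLE dim_orders (case_id TEXT PRIMARY KEY, region TEXT);
        INSERT INTO mart_order_events VALUES
            ('C1', 'Create order', '2024-01-01T09:00:00'),
            ('C1', 'Ship order',   '2024-01-02T14:30:00'),
            ('C2', 'Create order', '2024-01-03T08:15:00');
        INSERT INTO dim_orders VALUES ('C1', 'EMEA'), ('C2', 'APAC');
    """)
    return conn


def fetch_event_log(conn, region):
    """One join (events with order attributes) and one rule (region filter)."""
    query = """
        SELECT e.case_id, e.activity, e.timestamp, o.region
        FROM mart_order_events AS e
        JOIN dim_orders AS o ON o.case_id = e.case_id
        WHERE o.region = ?
        ORDER BY e.case_id, e.timestamp
    """
    return conn.execute(query, (region,)).fetchall()


def to_csv(rows):
    """Serialize the same rows as CSV text for file-based tools."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["case_id", "activity", "timestamp", "region"])
    writer.writerows(rows)
    return buffer.getvalue()
```

A tool that connects to the database can run the query directly, while a tool that only reads files can consume the output of `to_csv`; in both cases the heavy preparation has already happened upstream.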
Most of the time, a Process Warehouse is a database in which we can create a specific table for each Process Mart (see later). But it is also possible to use the file system, since the information the Process Mining solution needs can be summarized as a single table with at least 3 fields (more if there is optional information):