How many times per day do you share information about yourself? No need to dig deep: how many times do you enter your email address on websites? Indeed, today online privacy is as important as keeping your front door locked at night.
Back in 2018, privacy made the front pages of newspapers and magazines when one of the global corporations was found to have used its clients' data illegitimately. Holding information on more than 2 billion active users at that time, Facebook allowed the data of 87 million users to leak without their consent.
Cases like this coincided with the arrival of the General Data Protection Regulation (GDPR), which the European authorities had adopted in 2016 to give all EU citizens equal data protection. You might remember how the new policy shook up vendors in every sector when it took effect in May 2018. Businesses had to restructure how they used their contact databases and rethink the way privacy is treated.
Today any new software implementation requires a thorough check-up for safety. It’s a top priority to prevent customer and employee data leaks and reduce the risk of fraud. How do we manage this in the case of process mining?
Process mining is an innovative technology. It enables you to construct a reliable visual process based on your company’s existing “digital traces”. And it does so by extracting the information from your IT systems and combining it. Thus, looking at your processes through a process mining tool, you get a new perspective on what’s going on. This is a strong core for further analysis and optimization.
By looking at what is actually happening in your process, you can draw straightforward conclusions and take action. All in all, process mining is a proven way to increase efficiency and reduce complexity.
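To make the idea of "digital traces" more concrete, here is a minimal sketch of one common way to turn an event log into a process picture: counting which activity directly follows which within each case. The log rows, field layout and function name are illustrative assumptions, not the output of any particular process mining tool.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, timestamp) rows as they might
# be extracted from an IT system. All values are made up for illustration.
EVENT_LOG = [
    ("order-1", "Create order", "2024-01-02T09:00"),
    ("order-1", "Approve order", "2024-01-02T11:30"),
    ("order-1", "Ship goods", "2024-01-03T08:15"),
    ("order-2", "Create order", "2024-01-02T10:00"),
    ("order-2", "Ship goods", "2024-01-04T14:00"),
    ("order-3", "Create order", "2024-01-05T09:45"),
    ("order-3", "Approve order", "2024-01-05T10:00"),
    ("order-3", "Ship goods", "2024-01-06T16:20"),
]


def directly_follows(event_log):
    """Count how often activity A is directly followed by B within a case."""
    traces = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for events in traces.values():
        events.sort()  # ISO 8601 timestamps sort chronologically as strings
        activities = [activity for _, activity in events]
        counts.update(zip(activities, activities[1:]))
    return counts


if __name__ == "__main__":
    for (source, target), n in sorted(directly_follows(EVENT_LOG).items()):
        print(f"{source} -> {target}: {n}")
```

In this made-up log, one order goes straight from creation to shipping without approval, which is exactly the kind of deviation a process graph makes visible.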
When you, as a process owner, understand the need for transparency and decide to sharpen the workflows, you will first consider:
- what is the data you’ll need,
- how to extract that data,
- how to assure its safety throughout the process.
As process mining tools work with corporate information, keeping the privacy of stakeholders under control is challenging. Indeed, corporate data contains information not only on performed activities but also on people: names, addresses, positions, departments, user IDs, etc. Everyone involved in the IT system is sharing their personal data, information on organizational structure and cooperation between departments. This information is interesting to the outside world.
The Privacy by Design principle helps keep a healthy balance between the transparency of a dataset and stakeholders' privacy. This approach to engineering encourages technology providers to put privacy before everything else, by default. Thus, privacy must be treated in a proactive and preventive manner. In other words: think about privacy before developing anything at all.
Privacy decisions for process mining implementation
To ensure that the newly implemented technology does not violate the personal data policy, there are several points to consider.
First: Access to raw data
As the first step, think of data extraction from the IT systems or data warehouse. The process mining implementation team needs access to the corporate data so that it can help you extract what's most important for analysis. Thus, it's your responsibility to decide what data you want to analyze further and grant access to it. This will also speed up and simplify the implementation. We need to be on the same page, don't we?
Second: Filter, pseudonymize, anonymize
Moving on! The process mining implementation team works on a translation of the raw data. They convert it into comprehensible terms and a format suitable for process mining. Prepared data then gets transformed into dashboards, and you, as the user, will decide what features to focus on. At this point, there are various ways to handle personal information: filter, pseudonymize or anonymize.
Filtering is the simplest option. Sometimes you track information that's not needed for a specific process analysis, and the team would simply recommend removing it. So, if data is sensitive and does not influence the business analysis outcome, get rid of it. Remember, at this point, we take it seriously: we want to analyze only valid and relevant data.
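As a rough illustration of filtering, the sketch below drops sensitive columns from an event log before it reaches the analysis. The field names and the `filter_fields` helper are hypothetical; a real implementation would follow whatever schema your extraction produces.

```python
# Hypothetical field names; adapt them to your own event log.
SENSITIVE_FIELDS = frozenset({"employee_name", "home_address"})


def filter_fields(rows, drop=SENSITIVE_FIELDS):
    """Return copies of the rows without the fields listed in `drop`."""
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


if __name__ == "__main__":
    log = [
        {
            "case_id": "inv-7",
            "activity": "Approve invoice",
            "employee_name": "Jane Doe",
            "home_address": "1 Example Street",
        },
    ]
    print(filter_fields(log))
```

The function returns new rows and leaves the input untouched, so the raw extract can stay in its protected location while only the filtered copy moves on.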
Pseudonymization is the most common way of handling sensitive data. Simply put, it replaces identifying information with artificial identifiers so that the platform users can't relate it to specific names or positions.
Looking at the sensitive data we keep, we replace the information with pseudonyms where possible. Imagine you don't want all the analysts to see the names of employees who perform process-related tasks. To keep it discreet, we replace the names with numbers. This way individual performance is hidden. Only you or a dedicated group of people will be able to access the translation table and identify names. Of course, it's always possible to modify access to the translations: in case of fraud or a change in circumstances, IT can grant access to translate the pseudonyms back.
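A minimal sketch of this idea, assuming a hypothetical `performer` field and an `EMP-` numbering scheme: names are swapped for numbered pseudonyms, and the translation table is returned separately so that it can be stored under stricter access than the event log itself.

```python
def pseudonymize(rows, field, prefix="EMP"):
    """Replace values of `field` with numbered pseudonyms.

    Returns the rewritten rows and the translation table
    (original value -> pseudonym), which should be stored separately.
    """
    translation = {}
    out = []
    for row in rows:
        original = row[field]
        if original not in translation:
            translation[original] = f"{prefix}-{len(translation) + 1:03d}"
        out.append({**row, field: translation[original]})
    return out, translation


def reidentify(pseudonym, translation):
    """Look up the original value; only for people allowed to see the table."""
    reverse = {alias: original for original, alias in translation.items()}
    return reverse[pseudonym]


if __name__ == "__main__":
    log = [
        {"case_id": "inv-7", "activity": "Approve invoice", "performer": "Jane Doe"},
        {"case_id": "inv-8", "activity": "Approve invoice", "performer": "Omar Ali"},
        {"case_id": "inv-9", "activity": "Pay invoice", "performer": "Jane Doe"},
    ]
    pseudonymized, table = pseudonymize(log, "performer")
    print(pseudonymized)
```

Note that sequential numbers reveal the order in which names first appear in the log; if that matters, random identifiers are a possible alternative.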
Often during the process analysis, you would like to discover the interactions between employees, rather than the individuals. In this case, we replace personal information with more generic descriptions, such as workgroup, department, entity, etc. Even if the workgroups are as small as two people, we keep this in mind and pseudonymize.
This way we prevent premature conclusions about individuals and encourage top management to act strategically. However, people can still be identified using the data, although it becomes a lot harder. Therefore, under some regulations, GDPR for instance, pseudonymization alone is not enough: pseudonymized data still counts as personal data, because the information remains accessible, only in a more secure way.
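Replacing individuals with their workgroup or department, as described above, could look roughly like this. The `performer` field, the directory mapping and both function names are assumptions made for the example; `small_groups` is an illustrative extra that flags groups small enough to point at individuals anyway.

```python
from collections import Counter


def generalize(rows, field, group_of, new_field="department"):
    """Replace a personal field with the group the person belongs to."""
    out = []
    for row in rows:
        generalized = {k: v for k, v in row.items() if k != field}
        generalized[new_field] = group_of[row[field]]
        out.append(generalized)
    return out


def small_groups(group_of, min_size=3):
    """Return the groups with fewer than `min_size` members."""
    sizes = Counter(group_of.values())
    return {group for group, size in sizes.items() if size < min_size}


if __name__ == "__main__":
    # Hypothetical directory: person -> department.
    directory = {"Jane Doe": "Finance", "Omar Ali": "Finance", "Li Wei": "Logistics"}
    log = [
        {"case_id": "inv-7", "activity": "Approve invoice", "performer": "Jane Doe"},
        {"case_id": "inv-8", "activity": "Ship goods", "performer": "Li Wei"},
    ]
    print(generalize(log, "performer", directory))
    print(small_groups(directory))
```

Groups reported by `small_groups` are candidates for merging into a larger unit or for additional pseudonymization.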
Anonymization is almost identical to the pseudonymization procedure. As in the previous method, we replace the data with unique pseudonyms, but this time there is no “translation table”. Thus, the original entry is safe and “locked”, as re-identification is impossible. In this way people can no longer be identified using the available data.
The trick with faceless data is how it complicates the process analysis. There is a risk of over-securing the data so that the insights are no longer useful.
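Anonymization as described above can be sketched as pseudonymization whose translation table is thrown away. The example below uses random tokens from Python's `secrets` module; the `performer` field and the `anon-` prefix are assumptions for illustration.

```python
import secrets


def anonymize(rows, field):
    """Replace values of `field` with random tokens and keep no lookup table.

    The same original value gets the same token within one call, so
    interactions stay analyzable, but the mapping is never returned or stored.
    """
    mapping = {}  # exists only for the duration of this call
    used = set()
    out = []
    for row in rows:
        original = row[field]
        if original not in mapping:
            token = "anon-" + secrets.token_hex(8)
            while token in used:  # practically never happens, but stay safe
                token = "anon-" + secrets.token_hex(8)
            used.add(token)
            mapping[original] = token
        out.append({**row, field: mapping[original]})
    return out


if __name__ == "__main__":
    log = [
        {"case_id": "inv-7", "activity": "Approve invoice", "performer": "Jane Doe"},
        {"case_id": "inv-9", "activity": "Pay invoice", "performer": "Jane Doe"},
    ]
    print(anonymize(log, "performer"))
```

Keep in mind that replacing one field does not by itself guarantee anonymity: a rare combination of the remaining attributes, such as department and timestamp, may still point to a single person.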
Third: Access to finalized data
When the data is ready and final, it's important to think of access again. This time you are defining the business users who will have access to the dashboards, process graphs, and reports. This information still represents a valuable corporate asset, doesn't it?
In fact, this is where the Governed Self Service (GSS) capability of your chosen process mining tool comes in handy. With GSS you can grant broad access to the IT team while giving business analysts enough room to work. Both can pursue their goals independently while analyzing a common source of truth.
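The general idea of role-dependent access to the same dataset can be sketched as follows. This is not the API of any process mining tool or of a GSS feature; the role names, field names and `view_for` function are invented purely for illustration.

```python
# Hypothetical roles and the fields each one may see.
ROLE_FIELDS = {
    "it_admin": {"case_id", "activity", "timestamp", "performer", "department"},
    "business_analyst": {"case_id", "activity", "timestamp", "department"},
}


def view_for(role, rows):
    """Return the rows restricted to the fields the given role may see."""
    if role not in ROLE_FIELDS:
        raise PermissionError(f"unknown role: {role}")
    allowed = ROLE_FIELDS[role]
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


if __name__ == "__main__":
    log = [
        {
            "case_id": "inv-7",
            "activity": "Approve invoice",
            "timestamp": "2024-01-02T09:00",
            "performer": "EMP-001",
            "department": "Finance",
        },
    ]
    print(view_for("business_analyst", log))
```

Both roles read from the same underlying rows, so everyone analyzes the same facts and only the visible level of detail differs.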
It’s all about agreeing on what’s relevant to you
As with everything in life, balance is key, especially the balance between quality of analysis and privacy. It's important to guarantee your employees and customers the safety of their personal information while analyzing data that has gone through as few changes as possible. Set privacy as a priority for any project and plan accordingly.
And there you go – now you can analyze the processes and feel comfortable privacy-wise!