
When Data Analytics meets Quality Assurance

November 14, 2019 | November 20th, 2019

Mistakes were made. Let’s acknowledge them

The Atlantbh Data Analytics team went through a long and strenuous process of change over the past few years. This evolution, which saw us move from writing reports in Word documents to using complex systems for analysis, was bound to produce mistakes. However, the approach you take in dealing with those mistakes is what sets problem-solvers apart from problem-avoiders.

There are essentially two approaches you can take: you can treat a mistake as your enemy or as your friend. You can fight it, resent the fact that it appeared in the first place, and let it erode your own and your team’s confidence. Or you can acknowledge its existence, try to explain it (that is, find a pattern in its appearance), and use it to accelerate your own and your team’s learning.

 

Mistakes were made. Let’s learn from them

When you search for the best ways to solve problems and learn, a few recommendations come up consistently. Our team’s quality assurance evolution intuitively followed them.

First of all, when you are focused on a problem and dive head first into it, it’s easy to miss the obvious. We tried to overcome this by designating a team member to verify the report someone else had written. However, an even better way of noticing flaws in an analysis approach or overall logic is by considering multiple perspectives. This is why we introduced a multi-peer-review practice whereby two team members verified a written report produced by a third colleague. Subsequently, the whole analysis flow performed better.

Secondly, educators widely recommend taking notes. We applied this idea by creating a system in which we record a mistake the first time it appears and then log each recurrence. Our rule of thumb is: if an error appears frequently, something is wrong with our overall process. That is when we reevaluate and modify the process, with the aim of reducing the frequency of a given mistake or stopping it from repeating altogether.
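As a rough illustration of this kind of record-keeping, here is a minimal Python sketch. It is not our actual system: the class name `MistakeLog`, the `REVIEW_THRESHOLD` value, and the sample mistake names are all hypothetical, chosen only to show the idea of counting recurrences and flagging the ones that call for a process review.

```python
from collections import Counter

# Hypothetical threshold: how many occurrences trigger a process review.
REVIEW_THRESHOLD = 3


class MistakeLog:
    """Counts how often each named mistake has been observed."""

    def __init__(self, review_threshold=REVIEW_THRESHOLD):
        self.counts = Counter()
        self.review_threshold = review_threshold

    def record(self, mistake):
        """Record one occurrence and return the updated count."""
        self.counts[mistake] += 1
        return self.counts[mistake]

    def needs_process_review(self):
        """Return mistakes seen at least `review_threshold` times, most frequent first."""
        return [
            name
            for name, count in self.counts.most_common()
            if count >= self.review_threshold
        ]


# Illustrative entries only, not real data.
log = MistakeLog()
for name in ["wrong encoding", "copy-paste error", "wrong encoding", "wrong encoding"]:
    log.record(name)

flagged = log.needs_process_review()
```

With these made-up entries, only the mistake that was recorded three times reaches the threshold. Where to set the threshold is a judgment call for each team.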

 

Mistakes were made. Let’s not repeat them

We have outlined three steps for dealing with mistakes constructively: acknowledging them, taking notes, and explaining their occurrence. These three steps are crucial for overcoming flaws in the process; however, they are simply a means to an end. Our goal was, and still is, to stop already detected mistakes from reappearing, which should result in the evolution of our overall process.

But first, let’s see how we approached the third of these steps: explaining the mistakes, i.e. identifying the reasons for their occurrence. We analyzed the overall statistics in the mistake-tracking system mentioned above. Specifically, we sorted the mistakes from most to least frequent and then grouped them into conceptually similar types.
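The sorting and grouping described here can be sketched in a few lines of Python. The records, category names, and counts below are invented for illustration and do not come from our tracking system. The function names `rank_by_frequency` and `totals_by_category` are likewise hypothetical.

```python
from collections import defaultdict

# Illustrative records only: (mistake, category, occurrences). Not real figures.
mistakes = [
    ("mis-encoded characters", "data preparation", 5),
    ("copy-and-paste error", "report writing", 7),
    ("calculation error", "analysis", 2),
    ("report format error", "report writing", 4),
]


def rank_by_frequency(records):
    """Sort mistakes from most to least frequent."""
    return sorted(records, key=lambda record: record[2], reverse=True)


def totals_by_category(records):
    """Sum occurrences per category, largest category first."""
    totals = defaultdict(int)
    for _, category, count in records:
        totals[category] += count
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


ranked = rank_by_frequency(mistakes)
by_category = totals_by_category(mistakes)
```

Ranking individual mistakes shows what to fix first, while the category totals show which part of the process produces the most trouble overall.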

Looking at the results, we found that the majority of mistakes are connected to parts of the process that are performed routinely and require repetitive work. This is the less glamorous part of the data analytics job. Examples include not noticing that some characters are encoded incorrectly, copy-and-paste errors while writing reports, calculation errors, and report format errors.
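Routine checks like these are good candidates for automation. As one example, the sketch below flags text values that look mis-encoded. It is a simple heuristic, not a method described in this post: the marker list and the function name `find_suspect_values` are assumptions, and the markers cover only a few common symptoms of UTF-8 text decoded with the wrong codec.

```python
# Assumed markers of encoding damage: the Unicode replacement character and
# two sequences that typically appear when UTF-8 is decoded as Latin-1/cp1252.
SUSPICIOUS_MARKERS = ("\ufffd", "\u00c3", "\u00e2\u20ac")


def find_suspect_values(values):
    """Return the values that contain any of the suspicious markers."""
    return [
        value
        for value in values
        if any(marker in value for marker in SUSPICIOUS_MARKERS)
    ]


# Illustrative sample: "Caf\u00c3\u00a9" is "Café" after a wrong decode.
sample = ["Sarajevo", "Caf\u00c3\u00a9", "Mostar", "Let\u2019s"]
suspects = find_suspect_values(sample)
```

A heuristic like this can raise false alarms on legitimate text (for instance, Portuguese words containing "Ã"), so it is best treated as a prompt for a human to look, not as a verdict.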

To overcome this, the whole team took part in creating a system that covers the widest possible range of steps needed to prepare a dataset for analysis. After identifying those steps, we classified them into groups and created checklists using those groups as a guideline. This system helps us in two major ways. On the one hand, it prevents mistakes from recurring; on the other, it frees the analyst to perform the more complex parts of the process without worrying about forgetting those little steps along the way. The system also serves as a source of feedback for new team members, who can review the comments senior teammates write on skipped or poorly performed steps. Finally, it can be used as a learning guide, since having an overview of errors already made should help you identify your own mistake, explain its occurrence, and prevent it from happening again.
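To make the checklist idea concrete, here is a minimal Python sketch of grouped steps with reviewer comments. It is an assumption about how such a checklist could be modeled, not a description of our tooling: the `Step` and `Checklist` classes, the group names, and the example steps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    description: str
    done: bool = False
    comment: str = ""  # reviewer feedback, e.g. from a senior teammate


@dataclass
class Checklist:
    # Maps a group name to the list of steps in that group.
    groups: dict = field(default_factory=dict)

    def add(self, group, description):
        """Add a step to a group and return it so it can be updated later."""
        step = Step(description)
        self.groups.setdefault(group, []).append(step)
        return step

    def skipped(self):
        """Return (group, description) pairs for steps not yet marked done."""
        return [
            (group, step.description)
            for group, steps in self.groups.items()
            for step in steps
            if not step.done
        ]


# Hypothetical groups and steps, for illustration only.
checklist = Checklist()
encoding_step = checklist.add("data cleaning", "check character encoding")
checklist.add("data cleaning", "check for duplicate rows")
format_step = checklist.add("report", "verify report format")

encoding_step.done = True
format_step.comment = "Table headers did not match the template."

remaining = checklist.skipped()
```

Keeping the reviewer's comment on the step itself means the same structure can double as feedback for the analyst and as a record of where steps tend to be missed.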

We hope that these lessons learned will help your data analytics team evolve too, minimizing mistakes and ensuring a higher quality of data analysis.