Big data can be essential when making business decisions, but sometimes bias (whether from humans or the data itself) can skew results and detrimentally affect decision-making. Lisa Morgan at InformationWeek has explained seven common types of bias which could influence the way you present your data and draw conclusions.
Here’s a summary of her presentation:
Confirmation Bias
When incorrect conclusions, such as a spurious causal link, are drawn from data because the analyst wants to ‘prove a hypothesis, assumption, or opinion’.
Selection Bias
When the data itself is selected on a subjective or non-random basis, so the sample no longer represents the whole population and the results are distorted.
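As a rough illustration of non-random selection, the sketch below compares a random sample with a self-selected one. The population, the satisfaction scores, and the "only scores above 6 respond" rule are all hypothetical and invented for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 customer satisfaction scores centred on 5.
population = [random.gauss(5, 2) for _ in range(10_000)]

# Random sample: every customer has the same chance of being picked.
random_sample = random.sample(population, 500)

# Non-random sample (assumption): only customers scoring above 6 respond.
self_selected = [score for score in population if score > 6][:500]

population_mean = statistics.mean(population)
random_mean = statistics.mean(random_sample)
biased_mean = statistics.mean(self_selected)

print(f"population mean:    {population_mean:.2f}")
print(f"random sample mean: {random_mean:.2f}")
print(f"self-selected mean: {biased_mean:.2f}")
```

The random sample should land close to the population mean, while the self-selected sample overstates it, because the selection rule itself depends on the value being measured.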
Outliers
Data points that lie far outside the rest of the distribution. They can lead analysts to an incorrect conclusion if they are allowed to influence ‘the bigger picture’.
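A minimal sketch of how one extreme value can distort a summary statistic. The order values are hypothetical, and comparing the mean with the median is just one simple way to spot the effect.

```python
import statistics

# Hypothetical daily order values; the last entry is a single extreme outlier.
order_values = [48, 50, 51, 49, 52, 50, 500]

mean_with_outlier = statistics.mean(order_values)
median_with_outlier = statistics.median(order_values)
mean_without_outlier = statistics.mean(order_values[:-1])  # drop the outlier

print(f"mean with outlier:    {mean_with_outlier:.1f}")
print(f"median with outlier:  {median_with_outlier}")
print(f"mean without outlier: {mean_without_outlier}")
```

Here the single value of 500 pulls the mean above 100, while the median stays at 50, the same as the mean of the remaining values.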
Simpson’s Paradox
An intriguing bias, in which a trend that appears in several separate groups of data reverses or vanishes when the groups are combined.
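The reversal is easier to see with numbers. In this sketch the options "A" and "B", the two segments, and all counts are hypothetical, chosen only so that the effect appears.

```python
# Hypothetical (successes, trials) per option and segment.
results = {
    "A": {"segment_1": (90, 100), "segment_2": (300, 1000)},
    "B": {"segment_1": (800, 1000), "segment_2": (20, 100)},
}

# Success rate of each option within each segment.
per_segment = {
    option: {seg: wins / trials for seg, (wins, trials) in segments.items()}
    for option, segments in results.items()
}

# Success rate of each option once the segments are pooled together.
combined = {
    option: sum(wins for wins, _ in segments.values())
    / sum(trials for _, trials in segments.values())
    for option, segments in results.items()
}

for option in results:
    print(option, per_segment[option], f"combined: {combined[option]:.3f}")
```

Option A has the higher rate in both segments, yet B has the higher rate overall, because most of A's trials fall in the segment where both options do poorly.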
Overfitting and Underfitting
When a data model is too complex and fits unnecessary detail such as noise (overfitting), or is too simple and misses important information (underfitting).
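One way to see both failure modes is to vary a single complexity setting and compare the error on the training data with the error on fresh data. The sketch below uses a k-nearest-neighbours average as a stand-in model. That choice, the sine-shaped signal, and the noise level are assumptions made for illustration and do not come from the original presentation.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

def make_data(n):
    """Noisy samples of a sine curve: the real signal plus random noise."""
    xs = [random.uniform(0, 2 * math.pi) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared error of the k-nearest-neighbours prediction on data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

train, test = make_data(100), make_data(100)

# k=1 memorises every training point (overfitting);
# k=100 always predicts the overall average (underfitting).
errors = {k: (mse(train, train, k), mse(train, test, k)) for k in (1, 10, 100)}

for k, (train_error, test_error) in errors.items():
    print(f"k={k:3d}  train error={train_error:.3f}  test error={test_error:.3f}")
```

With k=1 the training error is zero but the error on new data is not, which is the signature of overfitting. With k=100 both errors are high, because the model ignores the shape of the data altogether.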
Confounding Variables
Factors that correlate with both the dependent and independent variables in a data model but are often overlooked.
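A small simulation can show how an overlooked factor creates a misleading correlation. The variable names (temperature, ice cream sales, sunburn cases) and the noise levels are hypothetical. The partial correlation formula used to control for the confounder is the standard first-order one.

```python
import math
import random

random.seed(2)  # fixed seed so the sketch is repeatable

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

n = 2000
# Hypothetical confounder: temperature drives both of the other variables.
temperature = [random.gauss(0, 1) for _ in range(n)]
ice_cream_sales = [t + random.gauss(0, 0.5) for t in temperature]
sunburn_cases = [t + random.gauss(0, 0.5) for t in temperature]

r_xy = pearson(ice_cream_sales, sunburn_cases)
r_xz = pearson(ice_cream_sales, temperature)
r_yz = pearson(sunburn_cases, temperature)

# Correlation between sales and sunburn after controlling for temperature.
partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"raw correlation:     {r_xy:.2f}")
print(f"partial correlation: {partial:.2f}")
```

The raw correlation is strong even though neither variable affects the other. Once temperature is accounted for, the relationship largely disappears.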
Non-Normality
It’s easy to assume that a set of data follows a normal distribution (a bell curve), an assumption that some statistical tests build in. When the data is not normal, however, these tests can give misleading results.
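As a sketch of what goes wrong, the code below applies a normal-based rule of thumb (roughly 95% of values within two standard deviations of the mean, with about 2.3% above the upper bound) to skewed data. The exponentially distributed "wait times" are a hypothetical dataset chosen for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical wait times: exponentially distributed, never negative, right-skewed.
wait_times = [random.expovariate(1.0) for _ in range(5000)]

mean = statistics.mean(wait_times)
stdev = statistics.stdev(wait_times)
median = statistics.median(wait_times)

# Sample skewness: close to 0 for normal data, clearly positive here.
skewness = sum(((x - mean) / stdev) ** 3 for x in wait_times) / len(wait_times)

# Bounds that would be reasonable if the data were normal.
lower_bound = mean - 2 * stdev
upper_bound = mean + 2 * stdev
share_above_upper = sum(x > upper_bound for x in wait_times) / len(wait_times)

print(f"mean={mean:.2f}  median={median:.2f}  skewness={skewness:.2f}")
print(f"normal-based range: {lower_bound:.2f} to {upper_bound:.2f}")
print(f"share above upper bound: {share_above_upper:.1%}")
```

The normal-based range extends below zero even though no wait time can be negative, and noticeably more values fall above the upper bound than a bell curve would predict. Checking skewness, or comparing the mean with the median, is a cheap first test before relying on normal-based methods.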
Find out more about these different types of bias and how they can affect decision-making here.