Big data can be essential when making business decisions, but sometimes bias (whether from humans or the data itself) can skew results and detrimentally affect decision-making. Lisa Morgan at InformationWeek has explained seven common types of bias which could influence the way you present your data and draw conclusions.

Here’s a summary of her presentation:

Confirmation Bias

When incorrect conclusions, such as a spurious causal link, are drawn from data because of the analyst’s desire to ‘prove a hypothesis, assumption, or opinion’.

Selection Bias

When the data itself is selected on a subjective or non-random basis, so that the sample no longer represents the whole and the results are skewed.

Outliers

These are data points that lie far outside the range of the rest of the data, and they can lead analysts to an incorrect conclusion if they are allowed to influence ‘the bigger picture’.
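As a rough illustration of how a single extreme value can distort a summary, the sketch below compares the mean and the median of a small list. The order values are hypothetical and chosen only for the example; they do not come from the article.

```python
import statistics

# Hypothetical daily order values; the final entry is an extreme outlier.
orders = [20, 22, 19, 21, 23, 20, 500]

mean_with = statistics.mean(orders)           # pulled upwards by the 500
median_with = statistics.median(orders)       # middle value, barely affected
mean_without = statistics.mean(orders[:-1])   # same data with the outlier dropped

print("mean with outlier:   ", round(mean_with, 2))
print("median with outlier: ", median_with)
print("mean without outlier:", round(mean_without, 2))
```

The median stays close to a typical order, while the mean is dragged far above every value except the outlier itself. Whether the outlier should be removed or investigated depends on why it is there.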

Simpson’s Paradox

An intriguing bias, in which a trend that appears within each of several groups of data reverses or vanishes when these groups are combined.
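The reversal is easier to believe with numbers in front of you. The sketch below uses illustrative success counts for two options, A and B, across two groups; the figures and names are assumptions made for this example, not data from the article.

```python
# Illustrative (successes, trials) counts per group; not from the article.
data = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    """Success rate for one option within one group."""
    return successes / trials

def combined_rate(option):
    """Success rate for one option after pooling every group together."""
    successes = sum(group[option][0] for group in data.values())
    trials = sum(group[option][1] for group in data.values())
    return successes / trials

for name, group in data.items():
    print(name, "A:", round(rate(*group["A"]), 3), "B:", round(rate(*group["B"]), 3))
print("combined", "A:", round(combined_rate("A"), 3), "B:", round(combined_rate("B"), 3))
```

With these counts, A has the higher success rate in both groups, yet B looks better once the groups are pooled. The reason is that A was mostly used on the harder "large" group, so the group sizes, not the options themselves, drive the combined figure.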

Overfitting and Underfitting

When a data model is too complex and fits noise or irrelevant detail in the data (overfitting), or too simple and misses important information (underfitting).
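One way to see the difference is to fit three models to the same small training set and then score them on data they have not seen. The sketch below is a minimal illustration under stated assumptions: the data points are hypothetical (roughly y = 2x with some fixed noise), and the three models (a constant, a least-squares line, and a polynomial through every training point) are stand-ins chosen for the example.

```python
# Hypothetical data: y is roughly 2 * x plus a little fixed "noise".
train = [(0, 1.5), (2, 2.5), (4, 9.5), (6, 10.5), (8, 17.5)]
test = [(1, 1.5), (3, 6.5), (5, 9.5), (7, 14.5)]

def mse(model, points):
    """Mean squared error of a model over a list of (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)

def too_simple(x):
    """Underfit: ignores x and always predicts the training mean."""
    return mean_y

# Ordinary least-squares line fitted to the training data.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def straight_line(x):
    return slope * x + intercept

def too_complex(x):
    """Overfit: Lagrange polynomial passing through every training point."""
    total = 0.0
    for i, (xi, yi) in enumerate(train):
        term = yi
        for j, (xj, _) in enumerate(train):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

for name, model in [("too simple", too_simple),
                    ("straight line", straight_line),
                    ("too complex", too_complex)]:
    print(name, "train:", round(mse(model, train), 2), "test:", round(mse(model, test), 2))
```

On this data the constant model is poor everywhere, the polynomial reproduces the training points exactly but does noticeably worse than the line on the unseen points, and the straight line generalises best. A near-perfect score on training data alone is therefore not evidence of a good model.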

Confounding Variables

Factors which correlate with both the dependent and independent variables in a data model, and which can create a misleading association between the two if they are overlooked.
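A small sketch can show how a confounder manufactures a correlation. The figures below are hypothetical: ice-cream sales and swimming accidents are both assumed to rise on hot days, with temperature playing the role of the overlooked variable. The helper `pearson` is written out by hand for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical daily figures, split by the confounder (temperature).
cold_days = {"ice_cream": [10, 12, 10, 12], "accidents": [1, 1, 2, 2]}
hot_days = {"ice_cream": [30, 32, 30, 32], "accidents": [5, 5, 6, 6]}

# Ignoring temperature: pool all days together.
all_ice_cream = cold_days["ice_cream"] + hot_days["ice_cream"]
all_accidents = cold_days["accidents"] + hot_days["accidents"]

overall = pearson(all_ice_cream, all_accidents)
within_cold = pearson(cold_days["ice_cream"], cold_days["accidents"])
within_hot = pearson(hot_days["ice_cream"], hot_days["accidents"])

print("all days: ", round(overall, 3))
print("cold days:", round(within_cold, 3))
print("hot days: ", round(within_hot, 3))
```

Pooled together, the two series look strongly related, but once the days are separated by temperature the relationship disappears in each group. Comparing like with like in this way is one simple check for a confounder, assuming the candidate variable has been recorded at all.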

Non-Normality

It’s easy to assume that a set of data follows a normal distribution (or bell curve), an assumption which many statistical tests build in. However, when the data is non-normal, these tests can give misleading results.

Find out more about these different types of bias and how they can affect decision-making here.