What is Dark Data?

Every company collects, processes, and stores data that it doesn’t use; this is referred to as “dark data”. Some might say that this is because the data has no use, but others, those who are more likely to be working in data-driven companies, will say that ignoring dark data is a missed opportunity. The more data a company uses to understand its business, the more accurate the information used to provide insights will be.

Our Founder, Ian Cray, is an advocate of using every last element of data to make companies think differently about how they do business and what opportunities they may have for improving performance and delivering future growth. Below, Ian shares his thoughts on dark data by making an interesting comparison to one of the “coolest” areas of medical science.

Dark Data is Insight’s Cryonics

Cryonics is all about preserving a human body in the hope that one day in the future technology will have advanced sufficiently to resuscitate it. At that stage, it can be cured of illnesses, rejuvenated, and given an extended life span.

Dark Data is insight’s Cryonics. Over the last few years, businesses have generated so much data so fast that they have not been able to process it. Instead, with the advent of affordable cloud solutions, they have effectively placed their data in cold storage, waiting for technology to catch up. 

In the same way that a frozen body requires protection, our Dark Data needs care too – in the form of security and data quality checks. We need to be sure that when we freeze this data, it is in as good a shape as possible – no need to put more pressure on future tech! This ever-growing glut of data also needs to be kept secure.

Laws and ethics are changing in the Cryonics field, which impacts where bodies can be stored, who can request the process and who determines when vitrification can take place. In the data world, there are similar changes with a focus on data governance and rules on which data we are allowed to keep. 

That’s probably as far as I can take this analogy, which is a good thing; otherwise, we would be unlikely to see our data for many decades! We are a lot closer to being able to vitrify our data than the Cryonics field is, and we have two significant advantages.

Firstly, we can inspect and modify our data while it is frozen. We can remove the ROT (Redundant, Outdated and Trivial) data to reduce costs, and we can catalogue what data we have. In fact, it is essential that we catalogue our Dark Data to ensure we are compliant with regulations such as GDPR and with retention rules.
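
As a rough illustration of that first advantage, here is a minimal Python sketch of flagging ROT in a catalogue of stored files. The FileRecord fields, the classify_rot function and both thresholds are assumptions made up for this example; they are not a standard, and the retention window in particular is not guidance on what any regulation requires.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FileRecord:
    """One catalogue entry for a stored file (hypothetical schema)."""
    path: str
    size_bytes: int
    last_modified: date
    checksum: str  # e.g. a hash of the file contents, computed elsewhere


def classify_rot(records, today, retention_days=365 * 7, trivial_bytes=1024):
    """Label each record 'redundant', 'outdated', 'trivial' or 'keep'.

    Both thresholds are illustrative assumptions only.
    """
    cutoff = today - timedelta(days=retention_days)
    seen_checksums = set()
    labels = {}
    for record in records:
        if record.checksum in seen_checksums:
            labels[record.path] = "redundant"  # same contents as an earlier record
        elif record.last_modified < cutoff:
            labels[record.path] = "outdated"   # older than the retention window
        elif record.size_bytes < trivial_bytes:
            labels[record.path] = "trivial"    # below the size threshold
        else:
            labels[record.path] = "keep"
        seen_checksums.add(record.checksum)
    return labels


# Made-up catalogue entries, purely for demonstration.
catalogue = [
    FileRecord("sales/2023.csv", 50_000, date(2023, 12, 31), "aaa"),
    FileRecord("sales/2023_copy.csv", 50_000, date(2024, 1, 5), "aaa"),
    FileRecord("logs/2009.txt", 80_000, date(2009, 6, 1), "bbb"),
    FileRecord("tmp/note.txt", 12, date(2024, 2, 1), "ccc"),
]
labels = classify_rot(catalogue, today=date(2024, 6, 1))
```

In practice the labels would feed a review step rather than an automatic delete, since what counts as outdated or trivial depends on the business and on the retention rules that apply to it.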

Secondly, we can work on a copy of our data or use a backup in case we mess up the vitrification process, which is certainly not an option in Cryonics. However, we still need to bring it back with care. We need to have a deep understanding of where it came from, how it inter-relates, who owns it and how accurate it is. And yes, this is how we should treat all data, all the time, so every piece is part of our data DNA.
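
To make those four questions concrete, here is a small Python sketch of a catalogue entry that records them and reports which are still unanswered before a data set is brought back into use. The CatalogueEntry fields and the missing_context helper are hypothetical names chosen for this illustration, not part of any particular cataloguing tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """Context we want to know about a data set (hypothetical schema)."""
    name: str
    source_system: str = ""                               # where it came from
    related_to: List[str] = field(default_factory=list)   # how it inter-relates
    owner: str = ""                                       # who owns it
    accuracy_score: Optional[float] = None                # 0.0 to 1.0, None if unassessed


def missing_context(entry: CatalogueEntry) -> List[str]:
    """Return the names of the questions that are still unanswered."""
    gaps = []
    if not entry.source_system:
        gaps.append("source_system")
    if not entry.related_to:
        gaps.append("related_to")
    if not entry.owner:
        gaps.append("owner")
    if entry.accuracy_score is None:
        gaps.append("accuracy_score")
    return gaps


# A made-up entry with a known source and relationships but no owner or accuracy check.
entry = CatalogueEntry(
    name="customer_orders_2015",
    source_system="legacy_crm",
    related_to=["customers", "products"],
)
gaps = missing_context(entry)
```

An empty list from missing_context would suggest the data set has enough context to be handed to an analyst; anything else is a prompt to find the answers first.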

Now is the time to tuck into that Dark Data: catalogue and understand what you have, remove the ROT, and let the Data Scientists loose!