Bad Data, Outrageous Bots, the Meaning of Life & AI
Whilst we all know bad data to be an issue, the extent to which it is an issue, and is set to become more of one with AI, hasn't really been covered in the stark terms it should be. The big money still lies with transactional software systems and, to be frank, their priority is not master data. And yet we are entering the age of data: the next step in the technological revolution.
To provide some background to this assertion, every two years we create ten times the amount of data created in the previous 30 years. Put another way, 90% of all data has been created in the last two years. These stats are from 2013, so you can be sure the pace has accelerated significantly in the five years since.
There are two other stats I’d like to bring to your attention:
- Less than 0.5% of all data is ever analysed or used.
- Retailers who leverage the full power of big data could increase their operating margins by as much as 60%.
If you are reading this, it is likely that you work in the world of procurement. With that in mind, it is quite possible that your interest is in supplier master data, not the big data used by retailers. After all, retailers will have far more data on their customers than a procurement department will ever be able to collect on its suppliers. And yet it is an important statistic, in that it shows the game-changing power of data when used at scale.
In order for organisations to take advantage of the opportunities of data, they need to feed it into AI systems, more specifically, Machine Learning or Deep Learning algorithms. Once this is done, they can begin to realise exciting new outputs such as Robotic Process Automation, Predictive Analytics, Risk Analysis and numerous other efficiencies. I'm sure you have read about a fair few yourself.
Whilst we are talking about opportunity, let me introduce you to one more statistic: in procurement specifically, 45% of CPOs believe the quality of data is one of the main barriers to the effective application of digital technology. So, in short, there is a vast, blank canvas that numerous departments across organisations globally can look to take advantage of – an opportunity to beat the competition.
The focus of this article, though, is not the opportunity in itself; it is to consider what happens when the plan of action for new AI technologies isn't thought through properly – as, seemingly, is the case at 97% of companies. Towards Data Science reports, ‘As AI-powered technologies become more prevalent, and the quality demands of ML (Machine Learning) become clearer, the “garbage in, garbage out” principle from the early days of computing is suddenly incredibly relevant.’
Thomas C. Redman, one of the leading lights in data quality management, goes further: “Poor data quality is enemy number one to the widespread, profitable use of machine learning.”
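To make the principle concrete, here is a minimal sketch (assuming Python with NumPy and scikit-learn, and a synthetic dataset standing in for real business data) of the same model trained twice – once on clean labels, once with 40% of the labels corrupted:

```python
# A minimal "garbage in, garbage out" sketch: train the same model on
# clean and on corrupted labels, then compare accuracy on clean test data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in for a real dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Garbage in": flip 40% of the training labels to simulate bad data.
noisy = y_train.copy()
flip = rng.random(len(noisy)) < 0.40
noisy[flip] = 1 - noisy[flip]

for name, labels in [("clean labels", y_train), ("40% corrupted", noisy)]:
    model = LogisticRegression(max_iter=1000).fit(X_train, labels)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.2f}")
```

On a run like this, the corrupted labels typically knock a substantial chunk off the test accuracy: the model faithfully learns whatever garbage it is fed.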
Those of us who constantly expound the importance of first building a solid platform of high-quality data can often feel like we are banging our heads against a brick wall. Perhaps the answer is to provide more examples of just how outlandish ‘garbage out’ can be.
With that in mind, these examples should help:
- Is the LinkedIn search function sexist?
  - Data issues
- A Nazi Microsoft chatbot
  - Data issues
- Wiki bots arguing in perpetuity
  - Data issues
- An Uber self-driving car running a red light
  - Data issues
- Bots arguing about the meaning of life
  - Data issues
If the data itself is prejudiced or flawed, then the outcome is bound to be the same. Only now it is not just flawed data; it is flawed decision-making, and quite possibly very dangerous outcomes for a company. Either that, or other companies will take advantage of these new innovations, do it well, and eat up the market share of their competition.
The good news, though, as the examples above capably demonstrate, is that there is still time to prepare for the opportunities AI will bring – and by then, we will have even more data.
Petabytes and petabytes. Let's hope it's clean and ready, underpinned by the crucial backbone of master data.
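What "clean and ready" means in practice is often mundane: normalised fields and duplicate detection in the supplier master. Here is a hypothetical sketch in plain Python; the field names and records are illustrative, not from any real system:

```python
# Simple normalisation and duplicate detection on supplier master records.
import re

suppliers = [
    {"name": "ACME Ltd.", "country": "GB", "vat": "GB 123 4567 89"},
    {"name": "Acme Ltd",  "country": "gb", "vat": "GB123456789"},
    {"name": "Widget Co", "country": "DE", "vat": None},
]

def normalise(record):
    """Canonicalise the fields that duplicate checks depend on."""
    name = re.sub(r"[^a-z0-9]", "", record["name"].lower())
    vat = re.sub(r"\s", "", record["vat"]) if record["vat"] else None
    return {"name": name, "country": record["country"].upper(), "vat": vat}

seen, issues = {}, []
for raw in suppliers:
    rec = normalise(raw)
    if rec["vat"] is None:
        issues.append(f"missing VAT number: {raw['name']}")
    key = (rec["name"], rec["country"])
    if key in seen:
        issues.append(f"probable duplicate: {raw['name']} vs {seen[key]}")
    else:
        seen[key] = raw["name"]

print("\n".join(issues) or "no issues found")
```

Run against these toy records, the check flags ‘ACME Ltd.’ and ‘Acme Ltd’ as probable duplicates and ‘Widget Co’ as missing a VAT number – exactly the kind of quiet rot that turns into garbage out downstream.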